@blamejs/core 0.12.26 → 0.12.28

Files changed (7)

package/CHANGELOG.md CHANGED

@@ -8,6 +8,10 @@ upgrading across more than a few patches at a time.
 ## v0.12.x
+- v0.12.28 (2026-05-24) — **`b.ai.capability` — model-capability registry + cheapest-satisfying-model router.** `b.ai.capability.create({ models })` turns a fleet of AI model descriptors into a routing decision: given a set of requirements (context window, input/output modalities, tool use, structured output, reasoning tier, citation support, prompt-caching size), it picks the cheapest model that satisfies all of them. NIST AI RMF (AI 100-1) MAP 2.x requires documenting each model's capabilities and limitations; the Model Cards convention (Mitchell et al., 2019) formalizes that descriptor — this primitive makes the descriptor actionable. Routing to the cheapest sufficient model is a front-line defense against over-provisioning spend and composes directly with `b.ai.quota`'s `cost-usd` dimension (the chosen descriptor's rate feeds the budget charge); refusing to route a request to a model that cannot satisfy it (missing modality, too-small context window, no tool use) catches a capability mismatch before the inference call burns tokens on a guaranteed-bad result. Cost ranking uses a supplied `costBasis` (`{ inputTokens, outputTokens }`) for real per-call spend, else the sum of the per-1k rates; ties break by model id so the choice is deterministic across calls and nodes. **Added:** *`b.ai.capability.create({ models })` — capability registry + router* — Returns `{ describe, list, register, satisfies, route }`. A descriptor carries `maxContextTokens`, `maxOutputTokens`, `modalitiesIn` / `modalitiesOut` (arrays), `toolUse`, `structuredOutput`, `fineTunable`, `reasoningTier` (`none` / `basic` / `standard` / `advanced`, ordered), `citationSupport`, `promptCachingMaxTokens`, and the cost rates `costPer1kInputTokens` / `costPer1kOutputTokens`. Descriptors are validated + frozen at registration so a typo (negative cost, unknown reasoning tier, non-array modality list) surfaces at config time rather than as a silent mis-route. 
`describe(modelId)` returns the frozen descriptor; `register(modelId, descriptor)` adds or replaces one at runtime. · *`route({ requirements, fallback?, costBasis? })` — cheapest-satisfying selection* — Collects every model whose descriptor satisfies all requirements, then returns the cheapest (`{ modelId, descriptor, estimatedCost, reason }`). Requirements: `minContextTokens`, `minOutputTokens`, `modalitiesIn` / `modalitiesOut` (model must support every listed modality), `toolUse`, `structuredOutput`, `fineTunable`, `minReasoningTier` (tier ordering — `standard` is met by `standard` or `advanced`), `citationSupport`, `minPromptCachingTokens`. When no model matches, `fallback` (a registered model id) is returned with `reason: "fallback"`, or the call refuses with `aiCapability/no-candidate` if no fallback was supplied. Routing decisions emit `ai/capability-routed` / `ai/capability-fallback` / `ai/capability-no-candidate` through the drop-silent audit chain. · *`satisfies(modelId, requirements)` — precise capability-mismatch reasons* — Returns `{ ok, failures }` where each failure names the `requirement`, the `need`, and what the model `have`s — so a caller surfaces a precise reason (e.g. `minReasoningTier need advanced have basic`) instead of a bare boolean. Use it to explain a routing miss or to gate a request against a specific model before calling it.
+- v0.12.27 (2026-05-24) — **`b.ai.quota` — per-tenant, per-model AI usage budgets with atomic consume-and-check.** `b.ai.quota.create(opts)` builds an enforcer that caps AI inference usage per `(tenant, model, dimension, period)` and defends OWASP LLM Top 10 2025 LLM10 (Unbounded Consumption) — the class that includes denial-of-wallet, where an attacker drives a high volume of pay-per-use inferences until the bill itself is the attack. Meter by `tokens`, `requests`, `cost-usd`, or `compute-hours` over a calendar-aligned UTC window (`second` through `month`). `consume(tenant, model, amount)` is a single atomic check-and-charge: under the default `hard` enforcement it reserves the amount only if it fits under the ceiling, otherwise it refuses without charging — the limit test and the charge are one indivisible operation, so there is no charge-then-refund window for a concurrent call to observe. The in-memory counter is per-process; multi-node deployments supply an `opts.store` adapter whose `reserve` (an atomic conditional test-and-charge — a Redis Lua script, a SQL `UPDATE ... WHERE used + :amt <= :limit RETURNING used`) and `add` are atomic on the shared backend to enforce one aggregate ceiling across the cluster without false denials under contention. Limit resolution is most-specific-first: `perTenantModel` over `perTenant` over `perModel` over the default `limit`; tenant and model identifiers are percent-encoded into the counter key so a hostile tenant name cannot collide with another tenant's budget. **Added:** *`b.ai.quota.create(opts)` — per-tenant AI usage-budget enforcer* — Returns `{ consume, check, snapshot, reset }` scoped to one `dimension` (`tokens` / `requests` / `cost-usd` / `compute-hours`) and one `period` (`second` / `minute` / `hour` / `day` / `week` (Monday-aligned) / `month` (1st-of-month), all UTC-aligned). `consume(tenant, model, amount, opts?)` returns `{ used, limit, remaining, allowed, exceeded, windowStart, resetsAt, ... }`. 
`check(tenant, model)` is the read-only snapshot. Spin up one enforcer per dimension you meter — a monthly `cost-usd` budget and a per-minute `tokens` burst cap coexist as two `create()` calls sharing one store. Defends OWASP LLM10:2025 Unbounded Consumption / denial-of-wallet; maps to NIST AI RMF (AI 100-1) MANAGE 2.x and EU AI Act Art. 15 (robustness / resource-exhaustion resilience). · *`hard` / `soft` / `warn` enforcement* — `hard` (default) refuses the over-budget call and throws `aiQuota/exceeded` without charging — the conditional reserve never applies the amount, so the counter is untouched. `soft` admits the charge but reports `allowed: false` so the caller decides whether to honor it. `warn` admits and allows (advisory), flagging `exceeded: true`. A per-call `consume(..., { enforcement })` override lets one endpoint soften the mode for a trusted internal caller without a second enforcer. Every over-budget event emits `ai/quota-exceeded` through the drop-silent audit chain (`ai/quota-applied` on success), tagged with the active cluster node id for attribution. · *Cross-node aggregate budgets via `opts.store`* — The default counter is in-memory (per-process). Supply `opts.store` exposing atomic `reserve` / `add` / `get` / `reset` (a Redis Lua script, a shared SQL row) and the ceiling is enforced on the cluster-wide aggregate. `hard` mode goes through `reserve`, an atomic conditional test-and-charge that adds the amount only if it fits — so a concurrent over-budget call cannot transiently inflate the counter and falsely deny a smaller call that should fit. Per-tenant and per-model limit overrides (`perTenant` / `perModel` / `perTenantModel`) are validated at config time so a malformed cap surfaces at boot, not as a silent fall-through to the default.
 - v0.12.26 (2026-05-24) — **`b.compliance` posture cascades — `eu-ai-act` + `ca-ab-853` + `cac-genai-label` POSTURE_DEFAULTS + backup encryption refusal.** Three new posture cascades wired into `b.compliance.POSTURE_DEFAULTS` + `KNOWN_POSTURES` + `REGIME_MAP` so operators globally pinning the EU AI Act / California AB-853 / China CAC GenAI postures get the right floors automatically: backupEncryptionRequired:true, auditChainSignedRequired:true, tlsMinVersion:TLSv1.3, requireVacuumAfterErase:true. `b.backup.bundleAdapterStorage` extends the encryption-required posture list to include the three new postures so `cryptoStrategy: "none"` is refused upfront under any of them (parity with HIPAA + PCI-DSS, which the operator surface has carried since v0.12.10). The canonical `eu-ai-act` posture is the production name; the legacy `ai-act` short name stays in KNOWN_POSTURES for back-compat with operators who pinned it pre-v0.12.26. **Added:** *`eu-ai-act` posture cascade — Regulation (EU) 2024/1689* — POSTURE_DEFAULTS entry: backupEncryptionRequired:true (Art. 12 logging + Art. 15 robustness/cybersecurity demand encryption-at-rest for high-risk system training logs), auditChainSignedRequired:true (Art. 12 + Art. 13 audit-chain integrity), tlsMinVersion:TLSv1.3, requireVacuumAfterErase:true (Art. 50(4) synthetic-content provenance — residual EXIF / metadata pointing at the generating model must be cleared on erase). REGIME_MAP entry under jurisdiction:"EU" domain:"ai-governance". KNOWN_POSTURES carries both `eu-ai-act` (canonical) and `ai-act` (legacy short name). · *`ca-ab-853` posture cascade — California AB-853 effective 2026* — Same encryption + audit floor as eu-ai-act; jurisdiction:"US-CA". Model-generated content watermarking + disclosure regime. Operators serving California traffic pin this posture for the AB-853 §22949.91 obligations the v0.12.12 deepfake primitive's crossWalk references. 
· *`cac-genai-label` posture cascade — China CAC GenAI Service Measures* — Synthetic-content labelling per Art. 12 + algorithm filing per Art. 4. Same backup encryption + signed audit chain floor. Operators serving Chinese traffic pin this posture so the bundleAdapterStorage refuses plaintext bundles and the disclosure primitive's `jurisdiction: "cn"` cross-walk produces the right legal-reference array. · *`bundleAdapterStorage` BACKUP_ENCRYPTION_REQUIRED_POSTURES extended* — `hipaa` + `pci-dss` (the v0.12.10 baseline) joined by the three AI postures. `cryptoStrategy: "none"` refused upfront under any of `eu-ai-act` / `ca-ab-853` / `cac-genai-label` with `backup/posture-requires-encryption`. Operators wiring backup storage in a regulated AI deployment now get the same posture-driven gate that the storage primitive has always applied to health + payment data.
 - v0.12.25 (2026-05-24) — **`b.ai.disclosure.applyAll(scenario)` — bundle Art. 50(1) / 50(3) / 50(4) disclosures for mixed-modality AI systems.** Composes the three v0.12.12 disclosure primitives (chatbot / deepfake / emotion) into a single bundled emit. Operators running mixed-modality AI systems (e.g. a chatbot that also generates images, or an emotion-recognition system embedded in a chat flow) declare which Art. 50 obligations apply via `scenario.kinds` and the primitive fans out to the per-obligation emit calls in one pass. Shared opts (jurisdiction, language, audit, correlationId) propagate to every per-kind emission so the cross-walk + audit-chain entries stay correlated across the bundle. **Added:** *`b.ai.disclosure.applyAll(scenario)` — multi-obligation bundled emit* — `scenario.kinds: ["chatbot", "deepfake", "emotion"]` (subset) selects which Art. 50 obligations to satisfy. Per-kind required fields (session for chatbot, content + contentType for deepfake) refused upfront when missing. Returns `{ disclosures: { chatbot?, deepfake?, emotion? } }` with each entry being the corresponding primitive's emission payload. Shared opts propagate: `scenario.jurisdiction` / `scenario.language` / `scenario.audit` / `scenario.correlationId` reach every per-kind call so a US-CA deployment serving chat + image gets both the AB-853 cross-walk AND the Art. 50(1) audit event under the same correlationId.
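
The v0.12.27 `b.ai.quota` entry above describes an atomic check-and-charge with `hard` / `soft` / `warn` enforcement, most-specific-first limit resolution, and percent-encoded counter keys. The following is a minimal standalone sketch of those ideas, not the package's implementation: `createQuota`, `own`, and the nested `perTenantModel[tenant][model]` shape are assumptions made for illustration, and the calendar-aligned period windows, the audit events, and the `opts.store` adapter are left out.

```javascript
// Standalone sketch, NOT the @blamejs/core implementation. createQuota,
// own, and the nested perTenantModel shape are hypothetical.
function createQuota(opts) {
  var counters = new Map();                 // in-memory, per-process
  var defaultMode = opts.enforcement || "hard";

  // Read an own property only, so inherited keys are never treated as limits.
  function own(obj, k) {
    return obj != null && Object.prototype.hasOwnProperty.call(obj, k) ? obj[k] : undefined;
  }

  // Percent-encode both identifiers so a tenant named "a|b" cannot
  // collide with tenant "a" using model "b".
  function key(tenant, model) {
    return encodeURIComponent(tenant) + "|" + encodeURIComponent(model);
  }

  // Most-specific-first: perTenantModel, then perTenant, then perModel,
  // then the default limit.
  function resolveLimit(tenant, model) {
    var tm = own(own(opts.perTenantModel, tenant), model);
    if (tm !== undefined) return tm;
    var t = own(opts.perTenant, tenant);
    if (t !== undefined) return t;
    var m = own(opts.perModel, model);
    if (m !== undefined) return m;
    return opts.limit;
  }

  // The limit test and the charge happen in one synchronous step, so no
  // other call in this process can observe a half-applied charge.
  function consume(tenant, model, amount, callOpts) {
    var mode = (callOpts && callOpts.enforcement) || defaultMode;
    var limit = resolveLimit(tenant, model);
    var k = key(tenant, model);
    var used = counters.get(k) || 0;
    var fits = used + amount <= limit;
    if (!fits && mode === "hard") {
      // Refuse before charging: the counter is never touched.
      var err = new Error("quota exceeded for " + k);
      err.code = "aiQuota/exceeded";
      throw err;
    }
    used += amount;                          // soft and warn admit the charge
    counters.set(k, used);
    return {
      used: used,
      limit: limit,
      remaining: Math.max(0, limit - used),
      allowed: fits || mode === "warn",      // soft reports allowed: false
      exceeded: !fits,
    };
  }

  return { consume: consume };
}
```

In a multi-node deployment the same test-and-charge would have to be atomic on the shared backend, which is the role the changelog assigns to the store's `reserve`; this single-process sketch gets that property only because the function body runs synchronously.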

package/README.md CHANGED

@@ -162,6 +162,8 @@ The framework bundles the surface a typical Node app reaches for. Every primitiv
 - **Prompt-injection classification** — OWASP LLM01:2025 / NIST COSAIS RFI (`b.ai.input.classify`)
 - **Agent identity** — A2A signed agent-card primitive (Linux Foundation Agentic AI Foundation v1.x, ML-DSA-87) (`b.a2a`)
 - **Content provenance** — C2PA 2.1 + California SB-942 / AB-853 manifest builder for AI-generated media (provider, model id + version, timestamp, content ID, signed) (`b.contentCredentials`)
+- **AI usage quotas** — per-tenant / per-model budgets metered by tokens / requests / cost-usd / compute-hours over calendar-aligned windows, with an atomic conditional reserve (no charge-then-refund race) + hard/soft/warn enforcement and an optional cross-node store; defends OWASP LLM10:2025 unbounded consumption / denial-of-wallet (`b.ai.quota`)
+- **AI capability routing** — model-capability registry (context window / modalities / tool use / reasoning tier / cost rates) + a router that picks the cheapest model satisfying a request's requirements, refusing capability mismatches before the inference call (NIST AI RMF MAP + Model Cards); composes with `b.ai.quota` cost budgets (`b.ai.capability`)
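
The capability-routing bullet above comes down to three steps: filter the fleet by requirements, rank the survivors by cost, and break ties by model id. The following is a minimal standalone sketch of that selection, not the package's implementation: `unmet`, `estimateCost`, and `routeCheapest` are hypothetical names, only a subset of the requirement keys is checked, and descriptors are assumed to be fully populated (no defaults are applied).

```javascript
// Standalone sketch, NOT the @blamejs/core implementation. unmet,
// estimateCost, and routeCheapest are hypothetical helper names.
var TIERS = ["none", "basic", "standard", "advanced"];   // ordered, weakest first

// Collect every requirement the model fails, naming need and have.
function unmet(model, req) {
  var failures = [];
  if (req.minContextTokens != null && model.maxContextTokens < req.minContextTokens) {
    failures.push({ requirement: "minContextTokens", need: req.minContextTokens, have: model.maxContextTokens });
  }
  (req.modalitiesIn || []).forEach(function (m) {
    if (model.modalitiesIn.indexOf(m) === -1) {
      failures.push({ requirement: "modalitiesIn", need: m, have: model.modalitiesIn });
    }
  });
  if (req.toolUse === true && model.toolUse !== true) {
    failures.push({ requirement: "toolUse", need: true, have: false });
  }
  if (req.minReasoningTier != null &&
      TIERS.indexOf(model.reasoningTier) < TIERS.indexOf(req.minReasoningTier)) {
    failures.push({ requirement: "minReasoningTier", need: req.minReasoningTier, have: model.reasoningTier });
  }
  return failures;
}

// With a costBasis: per-call spend at the model's per-1k rates.
// Without one: the sum of the two rates, as a rough "cheaper model" proxy.
function estimateCost(model, costBasis) {
  if (costBasis) {
    return (costBasis.inputTokens / 1000) * model.costPer1kInputTokens +
           (costBasis.outputTokens / 1000) * model.costPer1kOutputTokens;
  }
  return model.costPer1kInputTokens + model.costPer1kOutputTokens;
}

// Cheapest satisfying model, ties broken by model id; null when none match.
function routeCheapest(models, req, costBasis) {
  var candidates = Object.keys(models)
    .filter(function (id) { return unmet(models[id], req).length === 0; })
    .map(function (id) { return { modelId: id, cost: estimateCost(models[id], costBasis) }; });
  candidates.sort(function (x, y) {
    if (x.cost !== y.cost) return x.cost - y.cost;
    return x.modelId < y.modelId ? -1 : (x.modelId > y.modelId ? 1 : 0);
  });
  return candidates.length > 0 ? candidates[0] : null;
}

// Example fleet with illustrative rates.
var exampleFleet = {
  haiku: { maxContextTokens: 200000, modalitiesIn: ["text"], toolUse: false,
           reasoningTier: "basic", costPer1kInputTokens: 0.001, costPer1kOutputTokens: 0.005 },
  opus:  { maxContextTokens: 200000, modalitiesIn: ["text", "image"], toolUse: true,
           reasoningTier: "advanced", costPer1kInputTokens: 0.015, costPer1kOutputTokens: 0.075 },
};
var examplePick = routeCheapest(exampleFleet,
  { toolUse: true, modalitiesIn: ["text", "image"] },
  { inputTokens: 4000, outputTokens: 500 });
console.log(examplePick && examplePick.modelId);
```

Returning the list of failures rather than a boolean is what lets a caller report a precise mismatch such as a missing modality; in this sketch an empty requirements object matches every model, so the cheaper model wins by default.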
 ### Compliance regimes
 - **Posture coordinator** — `b.compliance` cascades operator-declared regime into retention / audit / db / cryptoField via POSTURE_DEFAULTS:

package/index.js CHANGED

@@ -443,6 +443,8 @@ module.exports = {
     aiContentDetect: require("./lib/ai-content-detect"),
     modelManifest:   require("./lib/ai-model-manifest"),
     disclosure:      require("./lib/ai-disclosure"),
+    quota:           require("./lib/ai-quota"),
+    capability:      require("./lib/ai-capability"),
   },
   promisePool:      require("./lib/promise-pool"),
   sdNotify:         require("./lib/sd-notify"),

package/lib/ai-capability.js ADDED

@@ -0,0 +1,482 @@
+"use strict";
+/**
+ * @module b.ai.capability
+ * @nav    AI
+ * @title  AI capability routing
+ *
+ * @intro
+ *   A capability registry + capability-aware router for AI model
+ *   fleets. NIST AI RMF (AI 100-1) MAP 2.x requires documenting each
+ *   model's capabilities and limitations; the Model Cards convention
+ *   (Mitchell et al., 2019) formalizes that descriptor. This module
+ *   turns those descriptors into a routing decision: given a set of
+ *   requirements (context window, modalities, tool use, reasoning
+ *   tier, …), pick the <em>cheapest</em> model in the fleet that
+ *   satisfies all of them, or fall back deterministically.
+ *
+ *   <code>b.ai.capability.create({ models })</code> builds a registry
+ *   from operator-supplied descriptors and returns:
+ *
+ *   - <code>describe(modelId)</code> — the frozen descriptor.
+ *   - <code>list()</code> — every registered model id.
+ *   - <code>register(modelId, descriptor)</code> — add / replace one.
+ *   - <code>satisfies(modelId, requirements)</code> —
+ *     <code>{ ok, failures }</code> where each failure names the
+ *     requirement, the need, and what the model has.
+ *   - <code>route({ requirements, fallback?, costBasis? })</code> —
+ *     the cheapest satisfying model, or the fallback, or a refusal.
+ *
+ *   A descriptor carries: <code>maxContextTokens</code>,
+ *   <code>maxOutputTokens</code>, <code>modalitiesIn</code> /
+ *   <code>modalitiesOut</code> (arrays — e.g. <code>"text"</code>,
+ *   <code>"image"</code>, <code>"audio"</code>, <code>"video"</code>),
+ *   <code>toolUse</code>, <code>structuredOutput</code>,
+ *   <code>fineTunable</code>, <code>reasoningTier</code>
+ *   (<code>"none" | "basic" | "standard" | "advanced"</code>,
+ *   ordered), <code>citationSupport</code>,
+ *   <code>promptCachingMaxTokens</code>, and the cost rates
+ *   <code>costPer1kInputTokens</code> / <code>costPer1kOutputTokens</code>.
+ *
+ *   <strong>Routing picks the cheapest match.</strong> When a
+ *   <code>costBasis</code> (<code>{ inputTokens, outputTokens }</code>)
+ *   is supplied the router estimates the per-call cost and ranks by
+ *   it; otherwise it ranks by the sum of the per-1k rates. Ties break
+ *   by model id so the choice is deterministic. Routing to the
+ *   cheapest sufficient model is the front-line defense against
+ *   over-provisioning spend — it composes with
+ *   <code>b.ai.quota</code>'s <code>cost-usd</code> dimension, where
+ *   the chosen descriptor's rate feeds the budget charge.
+ *
+ *   Refusing to route a request to a model that cannot satisfy it
+ *   (missing modality, too-small context window, no tool use) catches
+ *   a capability mismatch before the inference call burns tokens on a
+ *   guaranteed-bad result.
+ *
+ * @card
+ *   Capability registry + cheapest-satisfying-model router for AI
+ *   model fleets (context / modalities / tool use / reasoning tier /
+ *   cost). Composes with b.ai.quota cost budgets.
+ */
+var lazyRequire = require("./lazy-require");
+var validateOpts = require("./validate-opts");
+var { defineClass } = require("./framework-error");
+var AiCapabilityError = defineClass("AiCapabilityError", { alwaysPermanent: true });
+var audit = lazyRequire(function () { return require("./audit"); });
+// Ordered reasoning tiers — a requirement of `minReasoningTier:
+// "standard"` is satisfied by "standard" or "advanced", not "basic".
+var REASONING_TIERS = ["none", "basic", "standard", "advanced"];
+// Cost rates are quoted per 1000 tokens (industry convention; the
+// descriptor fields are costPer1kInputTokens / costPer1kOutputTokens).
+// A token count divided by this unit is the number of 1k-token blocks,
+// which the per-1k rate then multiplies; a denominator, not a byte size.
+var COST_RATE_TOKEN_UNIT = 1000;   // allow:raw-byte-literal — per-1k-token cost-rate denominator, not a byte count
+var DESCRIPTOR_KEYS = [
+  "maxContextTokens", "maxOutputTokens", "modalitiesIn", "modalitiesOut",
+  "toolUse", "structuredOutput", "fineTunable", "reasoningTier",
+  "citationSupport", "promptCachingMaxTokens",
+  "costPer1kInputTokens", "costPer1kOutputTokens", "provider", "version",
+];
+var REQUIREMENT_KEYS = [
+  "minContextTokens", "minOutputTokens", "modalitiesIn", "modalitiesOut",
+  "toolUse", "structuredOutput", "fineTunable", "minReasoningTier",
+  "citationSupport", "minPromptCachingTokens",
+];
+function _isPositiveInt(n) {
+  return typeof n === "number" && isFinite(n) && n > 0 && Math.floor(n) === n;
+}
+function _isNonNegFinite(n) {
+  return typeof n === "number" && isFinite(n) && n >= 0;
+}
+function _isStringArray(a) {
+  if (!Array.isArray(a)) return false;
+  for (var i = 0; i < a.length; i++) {
+    if (typeof a[i] !== "string" || a[i].length === 0) return false;
+  }
+  return true;
+}
+// Normalize + validate one descriptor at registration time so a typo
+// (negative cost, unknown reasoning tier, non-array modality list)
+// surfaces at config time rather than as a silent mis-route.
+function _normalizeDescriptor(modelId, d) {
+  if (!d || typeof d !== "object" || Array.isArray(d)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: descriptor for '" + modelId + "' must be a plain object");
+  }
+  validateOpts(d, DESCRIPTOR_KEYS, "ai.capability descriptor['" + modelId + "']");
+  if (!_isPositiveInt(d.maxContextTokens)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.maxContextTokens must be a positive integer");
+  }
+  var maxOut = (d.maxOutputTokens == null) ? d.maxContextTokens : d.maxOutputTokens;
+  if (!_isPositiveInt(maxOut)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.maxOutputTokens must be a positive integer");
+  }
+  var modIn = (d.modalitiesIn == null) ? ["text"] : d.modalitiesIn;
+  var modOut = (d.modalitiesOut == null) ? ["text"] : d.modalitiesOut;
+  if (!_isStringArray(modIn) || !_isStringArray(modOut)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.modalitiesIn / modalitiesOut must be arrays of non-empty strings");
+  }
+  var tier = (d.reasoningTier == null) ? "standard" : d.reasoningTier;
+  if (REASONING_TIERS.indexOf(tier) === -1) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.reasoningTier must be one of " + REASONING_TIERS.join(" / "));
+  }
+  var cachingMax = (d.promptCachingMaxTokens == null) ? 0 : d.promptCachingMaxTokens;
+  var costIn = (d.costPer1kInputTokens == null) ? 0 : d.costPer1kInputTokens;
+  var costOut = (d.costPer1kOutputTokens == null) ? 0 : d.costPer1kOutputTokens;
+  if (!_isNonNegFinite(cachingMax) || !_isNonNegFinite(costIn) || !_isNonNegFinite(costOut)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.promptCachingMaxTokens / costPer1kInputTokens / " +
+      "costPer1kOutputTokens must be non-negative finite numbers");
+  }
+  return Object.freeze({
+    modelId:                modelId,
+    maxContextTokens:       d.maxContextTokens,
+    maxOutputTokens:        maxOut,
+    modalitiesIn:           Object.freeze(modIn.slice()),
+    modalitiesOut:          Object.freeze(modOut.slice()),
+    toolUse:                d.toolUse === true,
+    structuredOutput:       d.structuredOutput === true,
+    fineTunable:            d.fineTunable === true,
+    reasoningTier:          tier,
+    citationSupport:        d.citationSupport === true,
+    promptCachingMaxTokens: cachingMax,
+    costPer1kInputTokens:   costIn,
+    costPer1kOutputTokens:  costOut,
+    provider:               (typeof d.provider === "string") ? d.provider : null,
+    version:                (typeof d.version === "string") ? d.version : null,
+  });
+}
+/**
+ * @primitive b.ai.capability.create
+ * @signature b.ai.capability.create(opts)
+ * @since     0.12.28
+ * @status    stable
+ * @compliance soc2
+ * @related   b.ai.quota.create, b.ai.modelManifest.build
+ *
+ * Build a capability registry + router from operator-supplied model
+ * descriptors. Returns <code>{ describe, list, register, satisfies,
+ * route }</code>. Pair it with <code>b.ai.quota</code>:
+ * <code>route()</code> picks the cheapest model that meets the
+ * request, and the chosen descriptor's cost rate feeds the
+ * <code>cost-usd</code> budget charge.
+ *
+ * @opts
+ *   {
+ *     models: {                       // required, ≥ 1 entry
+ *       [modelId: string]: {
+ *         maxContextTokens:        number,    // required, positive int
+ *         maxOutputTokens?:        number,    // default: maxContextTokens
+ *         modalitiesIn?:           string[],  // default: ["text"]
+ *         modalitiesOut?:          string[],  // default: ["text"]
+ *         toolUse?:                boolean,   // default: false
+ *         structuredOutput?:       boolean,   // default: false
+ *         fineTunable?:            boolean,   // default: false
+ *         reasoningTier?:          string,    // none|basic|standard|advanced
+ *         citationSupport?:        boolean,   // default: false
+ *         promptCachingMaxTokens?: number,    // default: 0
+ *         costPer1kInputTokens?:   number,    // default: 0
+ *         costPer1kOutputTokens?:  number,    // default: 0
+ *         provider?:               string,
+ *         version?:                string,
+ *       }
+ *     },
+ *     audit?: boolean,                // default: true (route decisions)
+ *   }
+ *
+ * @example
+ *   var fleet = b.ai.capability.create({
+ *     models: {
+ *       "haiku":  { maxContextTokens: 200000, reasoningTier: "basic",
+ *                   costPer1kInputTokens: 0.001, costPer1kOutputTokens: 0.005 },
+ *       "opus":   { maxContextTokens: 200000, reasoningTier: "advanced",
+ *                   toolUse: true, modalitiesIn: ["text", "image"],
+ *                   costPer1kInputTokens: 0.015, costPer1kOutputTokens: 0.075 },
+ *     },
+ *   });
+ *   var pick = fleet.route({
+ *     requirements: { minContextTokens: 100000, toolUse: true,
+ *                     modalitiesIn: ["text", "image"] },
+ *     costBasis:    { inputTokens: 4000, outputTokens: 500 },
+ *   });
+ *   // → { modelId: "opus", descriptor: {...}, estimatedCost: 0.0975, reason: "cheapest-of-1" }
+ */
+function create(opts) {
+  validateOpts.requireObject(opts, "ai.capability.create", AiCapabilityError);
+  validateOpts(opts, ["models", "audit"], "ai.capability.create");
+  if (!opts.models || typeof opts.models !== "object" || Array.isArray(opts.models)) {
+    throw new AiCapabilityError("aiCapability/bad-models",
+      "ai.capability.create: models must be a plain object { modelId: descriptor }");
+  }
+  var ids = Object.keys(opts.models);
+  if (ids.length === 0) {
+    throw new AiCapabilityError("aiCapability/bad-models",
+      "ai.capability.create: models must declare at least one model");
+  }
+  var registry = new Map();
+  for (var i = 0; i < ids.length; i++) {
+    registry.set(ids[i], _normalizeDescriptor(ids[i], opts.models[ids[i]]));
+  }
+  var auditOn = opts.audit !== false;
+  function _emitAudit(action, outcome, metadata) {
+    if (!auditOn) return;
+    try {
+      audit().safeEmit({ action: action, outcome: outcome, metadata: metadata || {} });
+    } catch (_e) { /* audit best-effort — drop-silent */ }
+  }
+  function describe(modelId) {
+    var d = registry.get(modelId);
+    if (!d) {
+      throw new AiCapabilityError("aiCapability/unknown-model",
+        "ai.capability.describe: unknown model '" + modelId + "'");
+    }
+    return d;
+  }
+  function list() {
+    return Array.from(registry.keys());
+  }
+  function register(modelId, descriptor) {
+    validateOpts.requireNonEmptyString(modelId,
+      "ai.capability.register: modelId", AiCapabilityError, "aiCapability/bad-model");
+    registry.set(modelId, _normalizeDescriptor(modelId, descriptor));
+    return registry.get(modelId);
+  }
+  // Returns { ok, failures } — every unmet requirement names what was
+  // needed and what the model has, so a caller can surface a precise
+  // capability-mismatch reason instead of a bare boolean.
+  function _evaluate(descriptor, requirements) {
+    var failures = [];
+    function fail(requirement, need, have) {
+      failures.push({ requirement: requirement, need: need, have: have });
+    }
+    if (requirements.minContextTokens != null &&
+        descriptor.maxContextTokens < requirements.minContextTokens) {
+      fail("minContextTokens", requirements.minContextTokens, descriptor.maxContextTokens);
+    }
+    if (requirements.minOutputTokens != null &&
+        descriptor.maxOutputTokens < requirements.minOutputTokens) {
+      fail("minOutputTokens", requirements.minOutputTokens, descriptor.maxOutputTokens);
+    }
+    if (requirements.modalitiesIn != null) {
+      for (var a = 0; a < requirements.modalitiesIn.length; a++) {
+        if (descriptor.modalitiesIn.indexOf(requirements.modalitiesIn[a]) === -1) {
+          fail("modalitiesIn", requirements.modalitiesIn[a], descriptor.modalitiesIn);
+        }
+      }
+    }
+    if (requirements.modalitiesOut != null) {
+      for (var b = 0; b < requirements.modalitiesOut.length; b++) {
+        if (descriptor.modalitiesOut.indexOf(requirements.modalitiesOut[b]) === -1) {
+          fail("modalitiesOut", requirements.modalitiesOut[b], descriptor.modalitiesOut);
+        }
+      }
+    }
+    if (requirements.toolUse === true && descriptor.toolUse !== true) {
+      fail("toolUse", true, false);
+    }
+    if (requirements.structuredOutput === true && descriptor.structuredOutput !== true) {
+      fail("structuredOutput", true, false);
+    }
+    if (requirements.fineTunable === true && descriptor.fineTunable !== true) {
+      fail("fineTunable", true, false);
+    }
+    if (requirements.citationSupport === true && descriptor.citationSupport !== true) {
+      fail("citationSupport", true, false);
+    }
+    if (requirements.minReasoningTier != null &&
+        REASONING_TIERS.indexOf(descriptor.reasoningTier) <
+        REASONING_TIERS.indexOf(requirements.minReasoningTier)) {
+      fail("minReasoningTier", requirements.minReasoningTier, descriptor.reasoningTier);
+    }
+    if (requirements.minPromptCachingTokens != null &&
+        descriptor.promptCachingMaxTokens < requirements.minPromptCachingTokens) {
+      fail("minPromptCachingTokens", requirements.minPromptCachingTokens, descriptor.promptCachingMaxTokens);
+    }
+    return { ok: failures.length === 0, failures: failures };
+  }
+  function _validateRequirements(requirements) {
+    if (requirements == null) return {};
+    if (typeof requirements !== "object" || Array.isArray(requirements)) {
+      throw new AiCapabilityError("aiCapability/bad-requirements",
+        "ai.capability: requirements must be a plain object");
+    }
+    validateOpts(requirements, REQUIREMENT_KEYS, "ai.capability requirements");
+    if (requirements.minReasoningTier != null &&
+        REASONING_TIERS.indexOf(requirements.minReasoningTier) === -1) {
+      throw new AiCapabilityError("aiCapability/bad-requirements",
+        "ai.capability: minReasoningTier must be one of " + REASONING_TIERS.join(" / "));
+    }
+    if (requirements.modalitiesIn != null && !_isStringArray(requirements.modalitiesIn)) {
+      throw new AiCapabilityError("aiCapability/bad-requirements",
+        "ai.capability: requirements.modalitiesIn must be an array of non-empty strings");
+    }
+    if (requirements.modalitiesOut != null && !_isStringArray(requirements.modalitiesOut)) {
+      throw new AiCapabilityError("aiCapability/bad-requirements",
+        "ai.capability: requirements.modalitiesOut must be an array of non-empty strings");
+    }
+    // Numeric minimums are compared with `<` against the descriptor; a
+    // non-numeric value (NaN, "128k", a bad parse) makes that compare
+    // false and SILENTLY satisfies the requirement, so an undersized
+    // model could be selected. Reject non-finite / negative here so a
+    // malformed requirement fails fast instead of fail-open.
+    var numericMins = ["minContextTokens", "minOutputTokens", "minPromptCachingTokens"];
+    for (var ni = 0; ni < numericMins.length; ni++) {
+      var nk = numericMins[ni];
+      if (requirements[nk] != null && !_isNonNegFinite(requirements[nk])) {
+        throw new AiCapabilityError("aiCapability/bad-requirements",
+          "ai.capability: requirements." + nk + " must be a non-negative finite number");
+      }
+    }
+    // Boolean opt-in requirements are matched with `=== true`; a
+    // non-boolean (truthy 1, "false") would silently fail to require
+    // the capability. Reject non-booleans so the intent is explicit.
+    var booleanReqs = ["toolUse", "structuredOutput", "fineTunable", "citationSupport"];
+    for (var bi = 0; bi < booleanReqs.length; bi++) {
+      var bk = booleanReqs[bi];
+      if (requirements[bk] != null && typeof requirements[bk] !== "boolean") {
+        throw new AiCapabilityError("aiCapability/bad-requirements",
+          "ai.capability: requirements." + bk + " must be a boolean");
+      }
+    }
+    return requirements;
+  }
+  function satisfies(modelId, requirements) {
+    return _evaluate(describe(modelId), _validateRequirements(requirements));
+  }
+  // Per-call cost estimate. With a costBasis the estimate is the
+  // real per-call spend (input + output tokens at the model's rates);
+  // without one it is the sum of the per-1k rates — a stable proxy
+  // for "cheaper model" when the caller hasn't sized the request.
+  function _estimateCost(descriptor, costBasis) {
+    if (costBasis) {
+      var inTok = _isNonNegFinite(costBasis.inputTokens) ? costBasis.inputTokens : 0;
+      var outTok = _isNonNegFinite(costBasis.outputTokens) ? costBasis.outputTokens : 0;
+      return (inTok / COST_RATE_TOKEN_UNIT) * descriptor.costPer1kInputTokens +
+             (outTok / COST_RATE_TOKEN_UNIT) * descriptor.costPer1kOutputTokens;
+    }
+    return descriptor.costPer1kInputTokens + descriptor.costPer1kOutputTokens;
+  }
+  function route(routeOpts) {
+    routeOpts = routeOpts || {};
+    validateOpts(routeOpts, ["requirements", "fallback", "costBasis"], "ai.capability.route");
+    var requirements = _validateRequirements(routeOpts.requirements);
+    var costBasis = null;
+    if (routeOpts.costBasis != null) {
+      if (typeof routeOpts.costBasis !== "object" || Array.isArray(routeOpts.costBasis)) {
+        throw new AiCapabilityError("aiCapability/bad-requirements",
+          "ai.capability.route: costBasis must be a plain object { inputTokens, outputTokens }");
+      }
+      validateOpts(routeOpts.costBasis, ["inputTokens", "outputTokens"],
+        "ai.capability.route costBasis");
+      // A malformed costBasis field silently underprices a candidate
+      // and biases the "cheapest" choice toward the wrong model — fail
+      // fast instead. An absent field is fine (treated as 0 tokens on
+      // that side); a present-but-non-numeric field is rejected.
+      var cbFields = ["inputTokens", "outputTokens"];
+      for (var ci = 0; ci < cbFields.length; ci++) {
+        var ck = cbFields[ci];
+        if (routeOpts.costBasis[ck] != null && !_isNonNegFinite(routeOpts.costBasis[ck])) {
+          throw new AiCapabilityError("aiCapability/bad-requirements",
+            "ai.capability.route: costBasis." + ck + " must be a non-negative finite number");
+        }
+      }
+      costBasis = routeOpts.costBasis;
+    }
+    // Collect every satisfying model, then pick the cheapest. Tie
+    // break by model id (lexicographic) so the choice is deterministic
+    // across calls and across nodes.
+    var candidates = [];
+    var modelIds = Array.from(registry.keys());
+    for (var i = 0; i < modelIds.length; i++) {
+      var d = registry.get(modelIds[i]);
+      if (_evaluate(d, requirements).ok) {
+        candidates.push({ modelId: modelIds[i], descriptor: d, cost: _estimateCost(d, costBasis) });
+      }
+    }
+    candidates.sort(function (x, y) {
+      if (x.cost !== y.cost) return x.cost - y.cost;
+      return x.modelId < y.modelId ? -1 : (x.modelId > y.modelId ? 1 : 0);
+    });
+    if (candidates.length > 0) {
+      var pick = candidates[0];
+      _emitAudit("ai/capability-routed", "allowed", {
+        modelId: pick.modelId, candidateCount: candidates.length,
+        estimatedCost: pick.cost, requirements: requirements,
+      });
+      return {
+        modelId:       pick.modelId,
+        descriptor:    pick.descriptor,
+        estimatedCost: pick.cost,
+        reason:        "cheapest-of-" + candidates.length,
+      };
+    }
+    // No model satisfies the requirements.
+    if (routeOpts.fallback != null) {
+      var fb = registry.get(routeOpts.fallback);
+      if (!fb) {
+        throw new AiCapabilityError("aiCapability/unknown-model",
+          "ai.capability.route: fallback '" + routeOpts.fallback + "' is not a registered model");
+      }
+      _emitAudit("ai/capability-fallback", "allowed", {
+        modelId: routeOpts.fallback, requirements: requirements,
+      });
+      return {
+        modelId:       routeOpts.fallback,
+        descriptor:    fb,
+        estimatedCost: _estimateCost(fb, costBasis),
+        reason:        "fallback",
+      };
+    }
+    _emitAudit("ai/capability-no-candidate", "denied", { requirements: requirements });
+    throw new AiCapabilityError("aiCapability/no-candidate",
+      "ai.capability.route: no registered model satisfies the requirements " +
+      "and no fallback was supplied");
+  }
+  return {
+    describe:  describe,
+    list:      list,
+    register:  register,
+    satisfies: satisfies,
+    route:     route,
+  };
+}
+module.exports = {
+  create:             create,
+  REASONING_TIERS:    REASONING_TIERS,
+  AiCapabilityError:  AiCapabilityError,
+};
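
The routing rule in `route()` above (filter to the satisfying models, rank by estimated cost, break ties by model id) can be sketched in isolation. This is a minimal illustration, not the package's implementation: `pickCheapest`, `estimateCost`, `TOKENS_PER_RATE_UNIT`, and the sample models are hypothetical names, the only requirement checked is a minimum context window, and the per-1k divisor is an assumption based on the `costPer1k…` field names.

```javascript
// Hypothetical sketch of cheapest-satisfying-model routing.
// It does not call the @blamejs/core API; every name here is illustrative.
var TOKENS_PER_RATE_UNIT = 1000; // assumption: rates are quoted per 1k tokens

// With a costBasis, estimate real per-call spend; without one, fall back
// to the sum of the two rates as a rough "cheaper model" proxy.
function estimateCost(model, costBasis) {
  if (costBasis) {
    return (costBasis.inputTokens / TOKENS_PER_RATE_UNIT) * model.costPer1kInputTokens +
           (costBasis.outputTokens / TOKENS_PER_RATE_UNIT) * model.costPer1kOutputTokens;
  }
  return model.costPer1kInputTokens + model.costPer1kOutputTokens;
}

// Keep only models whose context window is large enough, then sort by
// cost ascending with a lexicographic id tie-break so the pick is stable.
function pickCheapest(models, minContextTokens, costBasis) {
  var candidates = models
    .filter(function (m) { return m.maxContextTokens >= minContextTokens; })
    .map(function (m) { return { id: m.id, cost: estimateCost(m, costBasis) }; });
  candidates.sort(function (x, y) {
    if (x.cost !== y.cost) return x.cost - y.cost;
    return x.id < y.id ? -1 : (x.id > y.id ? 1 : 0);
  });
  return candidates.length > 0 ? candidates[0] : null;
}

// Made-up descriptors with integer rates to keep the arithmetic exact.
var sampleModels = [
  { id: "small",       maxContextTokens: 8000,   costPer1kInputTokens: 1, costPer1kOutputTokens: 2 },
  { id: "large-b",     maxContextTokens: 200000, costPer1kInputTokens: 3, costPer1kOutputTokens: 9 },
  { id: "large-a",     maxContextTokens: 200000, costPer1kInputTokens: 3, costPer1kOutputTokens: 9 },
  { id: "input-cheap", maxContextTokens: 200000, costPer1kInputTokens: 1, costPer1kOutputTokens: 12 },
];
```

In this sketch, supplying a `costBasis` can change the winner: for an input-heavy request, a model with a low input rate may beat one whose summed rates are lower.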

package/lib/ai-quota.js ADDED

@@ -0,0 +1,526 @@
+"use strict";
+/**
+ * @module b.ai.quota
+ * @nav    Compliance
+ * @title  AI usage quota
+ *
+ * @intro
+ *   Per-tenant, per-model usage budgets for AI inference endpoints.
+ *   OWASP LLM Top 10 2025 ranks <strong>LLM10: Unbounded
+ *   Consumption</strong> — the class that includes "denial of
+ *   wallet" (DoW), where an attacker drives a high volume of
+ *   pay-per-use inferences until the bill itself becomes the
+ *   attack — as a top application risk. A single misbehaving (or
+ *   compromised) tenant can saturate context windows, exhaust GPU
+ *   minutes, or run up an unbounded cloud-inference bill long
+ *   before a human notices.
+ *
+ *   This primitive enforces a hard ceiling per
+ *   <code>(tenant, model, dimension, period)</code>:
+ *
+ *   - <code>dimension</code> — what is being metered:
+ *     <code>"tokens"</code> (context + completion tokens),
+ *     <code>"requests"</code> (inference calls),
+ *     <code>"cost-usd"</code> (provider spend), or
+ *     <code>"compute-hours"</code> (GPU / accelerator time).
+ *   - <code>period</code> — the budget window, calendar-aligned in
+ *     UTC: <code>"second"</code>, <code>"minute"</code>,
+ *     <code>"hour"</code>, <code>"day"</code>, <code>"week"</code>
+ *     (Monday-aligned), or <code>"month"</code> (1st-of-month).
+ *   - <code>enforcement</code> — <code>"hard"</code> (default,
+ *     refuse the over-budget call), <code>"soft"</code> (admit but
+ *     report <code>allowed:false</code> so the caller decides), or
+ *     <code>"warn"</code> (admit + audit only).
+ *
+ *   <code>consume(tenant, model, amount)</code> is the single
+ *   atomic check-and-charge entry point: in <code>"hard"</code>
+ *   mode it reserves <code>amount</code> only if it fits under the
+ *   limit, otherwise it refuses without charging. There is no
+ *   separate "check then add" two-call shape to race against — the
+ *   reservation and the limit test happen in one operation.
+ *
+ *   <strong>Single-process by default; cross-node via store.</strong>
+ *   The in-memory counter is per-process. Multi-node deployments
+ *   that need an aggregate ceiling across the cluster supply an
+ *   <code>opts.store</code> adapter whose <code>reserve</code> (an
+ *   atomic conditional test-and-charge — "add only if current +
+ *   amount fits under the limit") and <code>add</code> are atomic on
+ *   the shared backend: a Redis Lua script, or a SQL
+ *   <code>UPDATE ... SET used = used + :amt WHERE used + :amt &lt;= :limit
+ *   RETURNING used</code>. The conditional reserve is what keeps
+ *   <code>hard</code> enforcement correct under cross-node
+ *   contention — there is no charge-then-refund window for a
+ *   concurrent call to observe. The framework records the active
+ *   cluster node id on every breach event so a denial-of-wallet
+ *   spike is attributable.
+ *
+ *   Limit resolution is most-specific-first:
+ *   <code>perTenantModel[t|m]</code> →
+ *   <code>perTenant[t]</code> → <code>perModel[m]</code> →
+ *   <code>limit</code> (the default). Tenant and model identifiers
+ *   are percent-encoded into the counter key so a hostile tenant
+ *   name cannot collide with another tenant's budget.
+ *
+ *   Audit emissions (drop-silent via <code>b.audit.safeEmit</code>):
+ *     - <code>ai/quota-applied</code>  — a consume succeeded.
+ *     - <code>ai/quota-exceeded</code> — a consume hit the ceiling
+ *       (refused under <code>"hard"</code>; reported under
+ *       <code>"soft"</code> / <code>"warn"</code>).
+ *
+ *   NIST AI RMF (AI 100-1) MANAGE 2.x ("AI system performance and
+ *   trustworthiness are monitored") and EU AI Act Art. 15
+ *   (accuracy, robustness and cybersecurity of high-risk systems —
+ *   resource-exhaustion resilience) map onto this primitive;
+ *   operators wire its emissions into the same audit chain auditors
+ *   read.
+ *
+ * @card
+ *   Per-tenant, per-model AI usage budgets (tokens / requests /
+ *   cost-usd / compute-hours) with atomic consume-and-check.
+ *   Defends OWASP LLM10 unbounded consumption / denial-of-wallet.
+ */
+var C = require("./constants");
+var lazyRequire = require("./lazy-require");
+var validateOpts = require("./validate-opts");
+var { defineClass } = require("./framework-error");
+var AiQuotaError = defineClass("AiQuotaError", { alwaysPermanent: true });
+var audit = lazyRequire(function () { return require("./audit"); });
+var observability = lazyRequire(function () { return require("./observability"); });
+var cluster = lazyRequire(function () { return require("./cluster"); });
+var DIMENSIONS   = ["tokens", "requests", "cost-usd", "compute-hours"];
+var PERIODS      = ["second", "minute", "hour", "day", "week", "month"];
+var ENFORCEMENTS = ["hard", "soft", "warn"];
+// ---- Calendar-aligned period windows (UTC) ----
+//
+// Fixed-duration periods (second / minute) align to the epoch, which
+// is itself UTC midnight, so a modulo is exact. Hour / day / week /
+// month align to human UTC boundaries via Date.UTC truncation —
+// week starts Monday, month starts on the 1st — so "100k tokens per
+// day" resets at 00:00 UTC, not at a rolling 24h offset from first
+// use.
+function _windowStartFor(period, now) {
+  var d = new Date(now);
+  switch (period) {
+    case "second": return now - (now % C.TIME.seconds(1));
+    case "minute": return now - (now % C.TIME.minutes(1));
+    case "hour":
+      return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate(), d.getUTCHours());
+    case "day":
+      return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
+    case "week": {
+      var dayMid = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
+      var dow = new Date(dayMid).getUTCDay();            // 0=Sun .. 6=Sat
+      var sinceMonday = (dow + 6) % 7;                   // 0=Mon .. 6=Sun
+      return dayMid - sinceMonday * C.TIME.days(1);
+    }
+    case "month":
+      return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), 1);
+    default:
+      // unreachable — period validated at create()
+      return now;
+  }
+}
+function _resetsAtFor(period, windowStart) {
+  var d = new Date(windowStart);
+  switch (period) {
+    case "second": return windowStart + C.TIME.seconds(1);
+    case "minute": return windowStart + C.TIME.minutes(1);
+    case "hour":   return windowStart + C.TIME.hours(1);
+    case "day":    return windowStart + C.TIME.days(1);
+    case "week":   return windowStart + C.TIME.weeks(1);
+    case "month":  return Date.UTC(d.getUTCFullYear(), d.getUTCMonth() + 1, 1);
+    default:       return windowStart;
+  }
+}
+// ---- Default in-memory atomic counter store ----
+//
+// Single-threaded JS makes each operation below one indivisible step,
+// so a concurrent caller never observes a partial update. An entry
+// expires one window length after its first write, and reads past
+// expiry return 0; the key embeds windowStart, so each new window
+// starts on a fresh key. _keysWithPrefix backs reset(tenant).
+function _memoryStore() {
+  var m = new Map();                                     // key -> { value, expiresAt }
+  function _slot(key, windowMs) {
+    var now = Date.now();
+    var e = m.get(key);
+    if (!e || e.expiresAt <= now) {
+      e = { value: 0, expiresAt: now + windowMs };
+      m.set(key, e);
+    }
+    return e;
+  }
+  return {
+    // Atomic conditional reserve — tests current + amount <= limit and
+    // charges only if it fits, as one indivisible operation. Returns
+    // { allowed, used }; on refusal the amount is NOT charged, so a
+    // concurrent over-budget call cannot transiently inflate the
+    // counter and falsely deny a smaller call that should fit.
+    reserve: function (key, amount, limit, windowMs) {
+      var e = _slot(key, windowMs);
+      if (e.value + amount > limit) return { allowed: false, used: e.value };
+      e.value += amount;
+      return { allowed: true, used: e.value };
+    },
+    // Unconditional add — for soft / warn modes, which always charge.
+    add: function (key, amount, windowMs) {
+      var e = _slot(key, windowMs);
+      e.value += amount;
+      return e.value;
+    },
+    get: function (key) {
+      var e = m.get(key);
+      if (!e || e.expiresAt <= Date.now()) return 0;
+      return e.value;
+    },
+    reset: function (key) {
+      m.delete(key);
+    },
+    _keysWithPrefix: function (prefix) {
+      var out = [];
+      m.forEach(function (_e, k) { if (k.indexOf(prefix) === 0) out.push(k); });
+      return out;
+    },
+    _clear: function () { m.clear(); },
+  };
+}
+/**
+ * @primitive b.ai.quota.create
+ * @signature b.ai.quota.create(opts)
+ * @since     0.12.27
+ * @status    stable
+ * @compliance soc2, gdpr
+ * @related   b.tenantQuota.budget, b.ai.disclosure.chatbot
+ *
+ * Build a per-tenant AI usage-budget enforcer scoped to one
+ * <code>dimension</code> and one <code>period</code>. Returns an
+ * object exposing <code>consume(tenant, model, amount, opts?)</code>
+ * (the atomic check-and-charge), <code>check(tenant, model)</code>
+ * (read-only snapshot), <code>snapshot(tenant, model)</code> (alias
+ * of <code>check</code>), and <code>reset(tenant?, model?)</code>
+ * (drop the current window's counters).
+ *
+ * Spin up one enforcer per dimension you meter — e.g. a
+ * <code>"cost-usd"</code> monthly budget and a
+ * <code>"tokens"</code> per-minute burst cap can coexist as two
+ * <code>create()</code> calls sharing the same store.
+ *
+ * @opts
+ *   {
+ *     dimension:        string,    // required, one of:
+ *                                  //   "tokens" | "requests" |
+ *                                  //   "cost-usd" | "compute-hours"
+ *     period:           string,    // required, one of:
+ *                                  //   "second" | "minute" | "hour" |
+ *                                  //   "day" | "week" | "month"
+ *     limit:            number,    // required, default ceiling (> 0)
+ *     perTenant?:       { [tenantId: string]: number },
+ *     perModel?:        { [model: string]: number },
+ *     perTenantModel?:  { [tenantPipeModel: string]: number },
+ *                                  // key is `tenantId + "|" + model`
+ *     enforcement?:     string,    // "hard" (default) | "soft" | "warn"
+ *     store?:           object,    // { reserve, add, get, reset };
+ *                                  // default in-memory (per-process)
+ *     audit?:           boolean,   // default: true
+ *   }
+ *
+ * @example
+ *   var budget = b.ai.quota.create({
+ *     dimension:  "cost-usd",
+ *     period:     "month",
+ *     limit:      500,
+ *     perTenant:  { "tenant-vip": 5000 },
+ *     enforcement: "hard",
+ *   });
+ *   var r = budget.consume("tenant-acme", "opus-4", 0.42);
+ *   // → { tenantId: "tenant-acme", model: "opus-4",
+ *   //     dimension: "cost-usd", period: "month", used: 0.42,
+ *   //     limit: 500, remaining: 499.58, allowed: true, exceeded: false,
+ *   //     enforcement: "hard", windowStart: ..., resetsAt: ... }
+ */
+function create(opts) {
+  validateOpts.requireObject(opts, "ai.quota.create", AiQuotaError);
+  validateOpts(opts, [
+    "dimension", "period", "limit", "perTenant", "perModel",
+    "perTenantModel", "enforcement", "store", "audit",
+  ], "ai.quota.create");
+  var dimension = opts.dimension;
+  if (DIMENSIONS.indexOf(dimension) === -1) {
+    throw new AiQuotaError("aiQuota/bad-dimension",
+      "ai.quota.create: dimension must be one of " + DIMENSIONS.join(" / ") +
+      " (got " + JSON.stringify(dimension) + ")");
+  }
+  var period = opts.period;
+  if (PERIODS.indexOf(period) === -1) {
+    throw new AiQuotaError("aiQuota/bad-period",
+      "ai.quota.create: period must be one of " + PERIODS.join(" / ") +
+      " (got " + JSON.stringify(period) + ")");
+  }
+  if (typeof opts.limit !== "number" || !isFinite(opts.limit) || opts.limit <= 0) {
+    throw new AiQuotaError("aiQuota/bad-limit",
+      "ai.quota.create: limit must be a positive finite number");
+  }
+  var defaultLimit = opts.limit;
+  var perTenant      = _validateLimitMap(opts.perTenant, "perTenant");
+  var perModel       = _validateLimitMap(opts.perModel, "perModel");
+  var perTenantModel = _validateLimitMap(opts.perTenantModel, "perTenantModel");
+  var enforcement = (opts.enforcement == null) ? "hard" : opts.enforcement;
+  if (ENFORCEMENTS.indexOf(enforcement) === -1) {
+    throw new AiQuotaError("aiQuota/bad-enforcement",
+      "ai.quota.create: enforcement must be one of " + ENFORCEMENTS.join(" / ") +
+      " (got " + JSON.stringify(enforcement) + ")");
+  }
+  var store = opts.store || _memoryStore();
+  _validateStore(store);
+  var storeIsDefault = !opts.store;
+  var auditOn = opts.audit !== false;
+  function _limitFor(tenantId, model) {
+    var tmKey = tenantId + "|" + model;
+    if (Object.prototype.hasOwnProperty.call(perTenantModel, tmKey)) return perTenantModel[tmKey];
+    if (Object.prototype.hasOwnProperty.call(perTenant, tenantId))   return perTenant[tenantId];
+    if (Object.prototype.hasOwnProperty.call(perModel, model))       return perModel[model];
+    return defaultLimit;
+  }
+  // Counter key — tenant + model percent-encoded so a value
+  // containing the ":" separator cannot collide with another
+  // (tenant, model) pair's budget.
+  function _keyFor(tenantId, model, windowStart) {
+    return "aiq:" + dimension + ":" + period + ":" +
+      encodeURIComponent(tenantId) + ":" + encodeURIComponent(model) + ":" + windowStart;
+  }
+  function _keyPrefixForTenant(tenantId) {
+    return "aiq:" + dimension + ":" + period + ":" + encodeURIComponent(tenantId) + ":";
+  }
+  function _nodeId() {
+    try {
+      if (cluster().isClusterMode()) return cluster().currentNodeId();
+    } catch (_e) { /* cluster optional */ }
+    return null;
+  }
+  function _emitAudit(action, outcome, metadata) {
+    if (!auditOn) return;
+    try {
+      audit().safeEmit({ action: action, outcome: outcome, metadata: metadata || {} });
+    } catch (_e) { /* audit best-effort — drop-silent */ }
+  }
+  function _emitMetric(name, n) {
+    try { observability().safeEvent(name, n || 1, {}); }
+    catch (_e) { /* drop-silent */ }
+  }
+  // `mode` is the enforcement actually applied to this call (the
+  // per-call override when present, else the instance default) so the
+  // returned `enforcement` reflects how the call was evaluated.
+  function _result(tenantId, model, used, limit, windowStart, resetsAt, mode, allowed, exceeded) {
+    var remaining = limit - used;
+    return {
+      tenantId:    tenantId,
+      model:       model,
+      dimension:   dimension,
+      period:      period,
+      used:        used,
+      limit:       limit,
+      remaining:   remaining < 0 ? 0 : remaining,
+      allowed:     allowed,
+      exceeded:    exceeded,
+      enforcement: mode,
+      windowStart: windowStart,
+      resetsAt:    resetsAt,
+    };
+  }
+  function consume(tenantId, model, amount, consumeOpts) {
+    validateOpts.requireNonEmptyString(tenantId,
+      "ai.quota.consume: tenantId", AiQuotaError, "aiQuota/bad-tenant");
+    validateOpts.requireNonEmptyString(model,
+      "ai.quota.consume: model", AiQuotaError, "aiQuota/bad-model");
+    if (typeof amount !== "number" || !isFinite(amount) || amount < 0) {
+      throw new AiQuotaError("aiQuota/bad-amount",
+        "ai.quota.consume: amount must be a non-negative finite number");
+    }
+    consumeOpts = consumeOpts || {};
+    // Per-call enforcement override lets a single endpoint dial a
+    // softer mode for a trusted internal caller without a second
+    // enforcer; still validated against the allowlist.
+    var mode = (consumeOpts.enforcement == null) ? enforcement : consumeOpts.enforcement;
+    if (ENFORCEMENTS.indexOf(mode) === -1) {
+      throw new AiQuotaError("aiQuota/bad-enforcement",
+        "ai.quota.consume: enforcement override must be one of " + ENFORCEMENTS.join(" / "));
+    }
+    var now = Date.now();
+    var windowStart = _windowStartFor(period, now);
+    var resetsAt = _resetsAtFor(period, windowStart);
+    var windowMs = resetsAt - windowStart;
+    var limit = _limitFor(tenantId, model);
+    var key = _keyFor(tenantId, model, windowStart);
+    if (mode === "hard") {
+      // Atomic conditional reserve — the store tests current + amount
+      // <= limit and charges only if it fits, as one indivisible
+      // operation. Charging first and refunding the overage (a
+      // read-then-add or add-then-refund shape) would let a concurrent
+      // over-budget call transiently inflate the counter and falsely
+      // deny a smaller call that should fit; the conditional reserve
+      // never charges on refusal, so there is no transient to race.
+      var rv = store.reserve(key, amount, limit, windowMs);
+      if (rv.allowed) {
+        _emitAudit("ai/quota-applied", "allowed", {
+          tenantId: tenantId, model: model, dimension: dimension,
+          period: period, amount: amount, used: rv.used, limit: limit,
+          nodeId: _nodeId(),
+        });
+        _emitMetric("ai.quota.applied", 1);
+        return _result(tenantId, model, rv.used, limit, windowStart, resetsAt, mode, true, false);
+      }
+      _emitAudit("ai/quota-exceeded", "denied", {
+        tenantId: tenantId, model: model, dimension: dimension,
+        period: period, amount: amount, used: rv.used, limit: limit,
+        enforcement: mode, nodeId: _nodeId(),
+      });
+      _emitMetric("ai.quota.exceeded", 1);
+      throw new AiQuotaError("aiQuota/exceeded",
+        "ai.quota.consume: tenant '" + tenantId + "' model '" + model +
+        "' is at " + rv.used + " of " + limit + " " + dimension +
+        " this " + period + "; consuming " + amount + " would exceed the budget — call refused");
+    }
+    // soft / warn always charge — the call proceeds regardless of the
+    // ceiling; the mode only changes how the overage is reported.
+    var used = store.add(key, amount, windowMs);
+    if (used > limit) {
+      _emitAudit("ai/quota-exceeded", "allowed", {
+        tenantId: tenantId, model: model, dimension: dimension,
+        period: period, amount: amount, used: used, limit: limit,
+        enforcement: mode, nodeId: _nodeId(),
+      });
+      _emitMetric("ai.quota.exceeded", 1);
+      // soft reports allowed:false so the caller can choose to honor
+      // the ceiling; warn reports allowed:true (advisory only).
+      return _result(tenantId, model, used, limit, windowStart, resetsAt, mode, mode === "warn", true);
+    }
+    _emitAudit("ai/quota-applied", "allowed", {
+      tenantId: tenantId, model: model, dimension: dimension,
+      period: period, amount: amount, used: used, limit: limit,
+      nodeId: _nodeId(),
+    });
+    _emitMetric("ai.quota.applied", 1);
+    return _result(tenantId, model, used, limit, windowStart, resetsAt, mode, true, false);
+  }
+  function check(tenantId, model) {
+    validateOpts.requireNonEmptyString(tenantId,
+      "ai.quota.check: tenantId", AiQuotaError, "aiQuota/bad-tenant");
+    validateOpts.requireNonEmptyString(model,
+      "ai.quota.check: model", AiQuotaError, "aiQuota/bad-model");
+    var now = Date.now();
+    var windowStart = _windowStartFor(period, now);
+    var resetsAt = _resetsAtFor(period, windowStart);
+    var limit = _limitFor(tenantId, model);
+    var used = store.get(_keyFor(tenantId, model, windowStart));
+    return _result(tenantId, model, used, limit, windowStart, resetsAt, enforcement, used < limit, used >= limit);
+  }
+  function reset(tenantId, model) {
+    var now = Date.now();
+    var windowStart = _windowStartFor(period, now);
+    if (tenantId === undefined) {
+      // Clear everything. The default store supports a full clear;
+      // an external store gets a no-arg reset() if it offers one.
+      if (storeIsDefault) { store._clear(); return; }
+      if (typeof store.reset === "function") { store.reset(); return; }
+      return;
+    }
+    validateOpts.requireNonEmptyString(tenantId,
+      "ai.quota.reset: tenantId", AiQuotaError, "aiQuota/bad-tenant");
+    if (model !== undefined) {
+      validateOpts.requireNonEmptyString(model,
+        "ai.quota.reset: model", AiQuotaError, "aiQuota/bad-model");
+      store.reset(_keyFor(tenantId, model, windowStart));
+      return;
+    }
+    // tenant-wide reset needs key enumeration. The default in-memory
+    // store can scan its own keys; an external store would need a
+    // server-side prefix delete the framework can't portably issue.
+    if (storeIsDefault) {
+      var prefix = _keyPrefixForTenant(tenantId);
+      var keys = store._keysWithPrefix(prefix);
+      for (var i = 0; i < keys.length; i++) store.reset(keys[i]);
+      return;
+    }
+    throw new AiQuotaError("aiQuota/reset-unsupported",
+      "ai.quota.reset: tenant-wide reset with an external store requires " +
+      "an explicit model argument (per-key) or a store-side prefix delete");
+  }
+  return {
+    consume:   consume,
+    check:     check,
+    snapshot:  check,
+    reset:     reset,
+    dimension: dimension,
+    period:    period,
+  };
+}
+// Per-tenant / per-model / per-tenant-model limit-override maps are
+// validated at config time so a typo (negative cap, NaN) surfaces at
+// boot, not as a silent fall-through to the default ceiling.
+function _validateLimitMap(map, label) {
+  if (map == null) return {};
+  if (typeof map !== "object" || Array.isArray(map)) {
+    throw new AiQuotaError("aiQuota/bad-override",
+      "ai.quota.create: " + label + " must be a plain object { key: limit }");
+  }
+  var keys = Object.keys(map);
+  for (var i = 0; i < keys.length; i++) {
+    var v = map[keys[i]];
+    if (typeof v !== "number" || !isFinite(v) || v <= 0) {
+      throw new AiQuotaError("aiQuota/bad-override",
+        "ai.quota.create: " + label + "['" + keys[i] +
+        "'] must be a positive finite number");
+    }
+  }
+  return map;
+}
+function _validateStore(store) {
+  if (!store || typeof store !== "object" ||
+      typeof store.reserve !== "function" ||
+      typeof store.add !== "function" ||
+      typeof store.get !== "function" ||
+      typeof store.reset !== "function") {
+    throw new AiQuotaError("aiQuota/bad-store",
+      "ai.quota.create: store must expose reserve / add / get / reset functions");
+  }
+}
+module.exports = {
+  create:       create,
+  DIMENSIONS:   DIMENSIONS,
+  PERIODS:      PERIODS,
+  ENFORCEMENTS: ENFORCEMENTS,
+  AiQuotaError: AiQuotaError,
+};
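
Two ideas from the file above are easier to see in isolation: the conditional reserve (charge only when the amount fits under the limit) and the Monday-aligned UTC week window. The following is a minimal sketch under stated assumptions, not the package's code: `makeCounter` and `weekStartUtc` are hypothetical names, and window expiry, audit emission, and limit overrides are left out.

```javascript
// Hypothetical sketch; it does not call the @blamejs/core API.
var DAY_MS = 24 * 60 * 60 * 1000;

// Start of the UTC week containing `now`, with weeks beginning on Monday.
// getUTCDay() returns 0 for Sunday, so (dow + 6) % 7 counts days since Monday.
function weekStartUtc(now) {
  var d = new Date(now);
  var dayMid = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
  var sinceMonday = (new Date(dayMid).getUTCDay() + 6) % 7;
  return dayMid - sinceMonday * DAY_MS;
}

// In-memory counter with a conditional reserve: the limit test and the
// charge happen in one synchronous step, and a refused call is not charged.
function makeCounter() {
  var used = new Map(); // key -> amount consumed so far
  return {
    reserve: function (key, amount, limit) {
      var current = used.get(key) || 0;
      if (current + amount > limit) return { allowed: false, used: current };
      used.set(key, current + amount);
      return { allowed: true, used: current + amount };
    },
  };
}

// Usage: an oversized request is refused without inflating the counter,
// so a later request that fits is still admitted.
var counter = makeCounter();
var first = counter.reserve("tenant-acme", 60, 100);
var tooBig = counter.reserve("tenant-acme", 50, 100);
var fits = counter.reserve("tenant-acme", 40, 100);
```

Because the refused call leaves the counter untouched, there is no charge-then-refund window for a concurrent caller to observe. A shared backend would need to provide the same test-and-charge as a single atomic operation.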

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@blamejs/core",
-  "version": "0.12.26",
+  "version": "0.12.28",
   "description": "The Node framework that owns its stack.",
   "license": "Apache-2.0",
   "author": "blamejs contributors",

package/sbom.cdx.json CHANGED

@@ -2,10 +2,10 @@
   "$schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
   "bomFormat": "CycloneDX",
   "specVersion": "1.5",
-  "serialNumber": "urn:uuid:f16ae992-0fc9-4ad6-b05b-b6dfd629e058",
+  "serialNumber": "urn:uuid:2bff79a1-ab38-4b20-8cdc-37b5e80872a3",
   "version": 1,
   "metadata": {
-    "timestamp": "2026-05-24T10:57:05.537Z",
+    "timestamp": "2026-05-24T16:44:14.267Z",
     "lifecycles": [
       {
         "phase": "build"
@@ -19,14 +19,14 @@
       }
     ],
     "component": {
-      "bom-ref": "@blamejs/core@0.12.26",
+      "bom-ref": "@blamejs/core@0.12.28",
       "type": "application",
       "name": "blamejs",
-      "version": "0.12.26",
+      "version": "0.12.28",
       "scope": "required",
       "author": "blamejs contributors",
       "description": "The Node framework that owns its stack.",
-      "purl": "pkg:npm/%40blamejs/core@0.12.26",
+      "purl": "pkg:npm/%40blamejs/core@0.12.28",
       "properties": [],
       "externalReferences": [
         {
@@ -54,7 +54,7 @@
   "components": [],
   "dependencies": [
     {
-      "ref": "@blamejs/core@0.12.26",
+      "ref": "@blamejs/core@0.12.28",
       "dependsOn": []
     }
   ]