npm - volume-anomaly - Versions diffs - 0.1.0 → 1.2.3 - Mend

volume-anomaly 0.1.0 → 1.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md CHANGED

@@ -124,6 +124,44 @@ interface PredictionResult {
 }
 ```
+**Practical usage with `getAggregatedTrades` from `backtest-kit`:**
+> ⚠️ **Never pass the same trades in both `historical` and `recent`.** Training calibrates the baseline. If `recent` overlaps with `historical`, any anomaly in that period is absorbed into the baseline — the detector learns to treat it as normal and misses it. Always slice so that `recent` starts where `historical` ends.
+```typescript
+import { predict } from 'volume-anomaly';
+import type { IAggregatedTradeData } from 'volume-anomaly';
+// Your data-fetching function — returns the last `limit` trades, oldest first:
+declare function getAggregatedTrades(
+  symbol: string,
+  limit:  number,
+): Promise<IAggregatedTradeData[]>;
+// ── One-shot (single API call, zero overlap) ──────────────────────────────────
+const N_train  = 1200;  // calibration window
+const N_detect = 200;   // window to evaluate
+const all        = await getAggregatedTrades('BTCUSDT', N_train + N_detect);
+const historical = all.slice(0, N_train);  // older 1200 trades — baseline
+const recent     = all.slice(N_train);     // newest 200 trades — no overlap
+const result = predict(historical, recent, 0.75);
+// {
+//   anomaly:    true,
+//   confidence: 0.83,
+//   direction:  'long',   // 'long' | 'short' | 'neutral'
+//   imbalance:  0.61,
+// }
+if (result.anomaly) {
+  console.log(`direction=${result.direction}  confidence=${result.confidence.toFixed(2)}`);
+}
+```
+`predict()` trains a fresh detector on every call. For continuous monitoring (many `detect()` calls from one trained model) use `VolumeAnomalyDetector` directly — see the class API below.
 ---
 ### `new VolumeAnomalyDetector(config?)`
@@ -350,7 +388,7 @@ S⁻ₜ = max(0,  S⁻_{t-1} − xₜ + μ₀ − k)
 μ₀   = mean(|imbalance|)       over the training window
 σ₀²  = var(|imbalance|)        sample variance
 k    = cusumKSigmas · σ₀       (default 0.5σ)
-h    = cusumHSigmas · σ₀       (default 4σ)
+h    = cusumHSigmas · σ₀       (default 5σ)
 ```
 **Average run length under H₀ (ARL₀):** the expected number of observations before a false alarm. For Gaussian series, the approximate relationship between h, k and ARL₀ is:
@@ -511,11 +549,12 @@ import {
   hawkesLogLikelihood,
   hawkesFit,
   hawkesLambda,
+  hawkesPeakLambda,  // max λ(tᵢ) over window — used by the detector
   hawkesAnomalyScore,
   // CUSUM
   cusumFit,
-  cusumUpdate,       // returns { state, alarm }
+  cusumUpdate,       // returns { state, alarm, preResetState }
   cusumInitState,
   cusumAnomalyScore,
   cusumBatch,
@@ -544,6 +583,10 @@ Returns `{ params, logLik, stationarity, converged }`. `stationarity = α/β`. I
 Evaluates `λ(t)` at a specific time given a history of prior events. All timestamps must be `< t`.
+### `hawkesPeakLambda(timestamps, params)`
+Returns the **maximum** `λ(tᵢ)` over all events in `timestamps` using the O(n) recursive A(i) trick. This is what the detector uses internally instead of `hawkesLambda` — a burst that decayed by the last event is still captured. `hawkesLambda` evaluates at a single point; `hawkesPeakLambda` scans the full window.
 ### `cusumUpdate(state, x, params)`
 Pure function. Returns `{ state: CusumState, alarm: boolean, preResetState: CusumState }`. Does **not** mutate the input state. `preResetState` holds the accumulator values *before* the alarm reset — use it for scoring, since `state.sPos/sNeg` are zeroed when `alarm = true`.
@@ -572,26 +615,47 @@ BOCPD update is technically O(r_max) where r_max is the number of surviving run-
 ---
-## Training data guidance
+## Training and detection window sizes
+### `train()` — historical window
-| Trades in historical window | Quality |
-|----------------------------|---------|
-| < 50 | Rejected (throws) |
-| 50–200 | Minimal — CUSUM μ₀/σ₀ estimates unreliable |
-| 200–500 | Adequate for typical use |
-| 500–2000 | Good — stable Hawkes MLE, representative CUSUM baseline |
-| 2000+ | Best — especially important for low-activity pairs |
+The rolling imbalance series used to calibrate CUSUM and BOCPD has length `max(0, N − windowSize + 1)`. Too few trades → empty or near-empty calibration series → CUSUM baseline is a fallback (μ₀ = 0, σ₀ = 1) and Hawkes MLE is unreliable.
+| Trades in `historical` | Rolling windows for calibration¹ | Hawkes MLE | Notes |
+|------------------------|----------------------------------|------------|-------|
+| < 50 | — | — | **Rejected — `train()` throws** |
+| 50–99 | 1–50 | Borderline | CUSUM/BOCPD barely calibrated; Hawkes fallback path active (< 10 events triggers flat Poisson) |
+| 100–199 | 51–150 | Adequate | Practical minimum; mean/σ estimates reasonable |
+| 200–499 | 151–450 | Good | Stable MLE; recommended baseline for liquid pairs |
+| 500–2000 | 451–1951 | Robust | Best calibration; use for low-activity or volatile pairs |
+| > 2000 | > 1951 | Robust | Beware regime staleness — window may span multiple market conditions |
+¹ Assumes default `windowSize = 50`.
 The training window should represent **normal, in-control market conditions**. Fitting on data that already contains anomalies will inflate the baseline and reduce sensitivity. If your market opens with a gap or major event, use a calmer historical window from the previous session.
-**`windowSize` guidance** — the number of trades per rolling imbalance step:
+### `detect()` — recent window
+The same rolling logic applies: CUSUM and BOCPD only receive data when `trades ≥ windowSize`. Below that threshold only the Hawkes score contributes, and maximum confidence is `0.4 × hawkesScore ≤ 0.40` — the anomaly flag **cannot fire** at the default threshold of 0.75.
+| Trades in `recent` | Rolling windows¹ | All three detectors active | Notes |
+|--------------------|-----------------|---------------------------|-------|
+| < `windowSize` (< 50) | 0 | **No** | Hawkes-only; `anomaly` cannot fire at default threshold |
+| = `windowSize` (= 50) | 1 | Barely | Minimum for full detection; CUSUM/BOCPD signal is very sparse |
+| 2× `windowSize` (100) | 51 | Yes | **Recommended minimum** for production use |
+| 4× `windowSize` (200) | 151 | Yes | Good — default in code examples |
+| 10× `windowSize` (500) | 451 | Yes | Best accuracy; higher latency |
+**Rule of thumb:** `recent ≥ 2 × windowSize`. On BTC/USDT perpetual (windowSize = 50), 200 trades typically spans 5–30 seconds and is comfortably available from a real-time buffer.
+### `windowSize` guidance
-| `windowSize` | Trades in window | Sensitivity | Lag |
-|-------------|-----------------|-------------|-----|
-| 20 | 20 | Very high | Low |
-| 50 (default) | 50 | Balanced | Moderate |
-| 100 | 100 | Lower | Higher |
-| 200 | 200 | Low | High |
+| `windowSize` | Sensitivity | Lag | Minimum `train()` | Minimum `detect()` for full signal |
+|-------------|-------------|-----|-------------------|-------------------------------------|
+| 20 | Very high | Low | 50 trades (code minimum) | 40 trades |
+| 50 (default) | Balanced | Moderate | 100 trades (recommended) | 100 trades |
+| 100 | Lower | Higher | 200 trades | 200 trades |
+| 200 | Low | High | 400 trades | 400 trades |
 On high-volume pairs (BTC/USDT perpetual), 50 trades may span only 1–2 seconds. On low-volume pairs, 50 trades may span minutes. Calibrate to the effective time scale that matters for your entry.
@@ -641,21 +705,28 @@ async function onCandle(candles: Candle[], recentTrades: IAggregatedTradeData[])
 ## Tests
-**359 tests** across **11 test files**. All passing.
+**735 tests** across **18 test files**. All passing. 100% statement/function/line coverage, 98.72% branch (two unreachable `??` guards).
-| File | Tests | Coverage |
-|------|-------|----------|
+| File | Tests | What is covered |
+|------|-------|-----------------|
 | `hawkes.test.ts` | 20 | Imbalance formula, LL computation, MLE fitting, λ evaluation and decay, anomaly score monotonicity and supercritical clamp |
 | `cusum.test.ts` | 15 | Parameter estimation, state update (pure function), accumulation, alarm + reset, score range, batch detection |
-| `bocpd.test.ts` | 13 | Init state, t increment, probability normalisation, run length growth in stable regime, CP spike on distribution shift, immutability, batch changepoint detection |
-| `detector.test.ts` | 20 | Pre-train guard, isTrained flag, minimum training size, DetectionResult fields, confidence range, empty window, signal score range, functional API determinism |
+| `bocpd.test.ts` | 13 | Init state, update, probability normalisation, run length growth in stable regime, CP spike on distribution shift, immutability, batch |
+| `detector.test.ts` | 20 | Pre-train guard, isTrained flag, minimum training size, DetectionResult fields, confidence range, empty window, signal score range |
 | `detect.test.ts` | 36 | End-to-end anomaly detection, confidence thresholds, signal composition, edge inputs |
 | `seeded.test.ts` | 67 | Deterministic seeded scenarios covering long/short/neutral bursts across parameter space |
-| `predict.test.ts` | 24 | Direction assignment, trained imbalanceThreshold, imbalancePercentile config, trending vs balanced threshold, fallback 0.3 when window > training size |
+| `predict.test.ts` | 24 | Direction assignment, trained imbalanceThreshold, imbalancePercentile config, trending vs balanced threshold |
 | `invariants.test.ts` | 29 | Monotonicity, score bounds, immutability, score weight validation |
 | `adversarial.test.ts` | 58 | Adversarial inputs: NaN propagation, extreme values, Inf timestamps, zero-qty trades |
 | `falsepositive.test.ts` | 18 | Scenarios that must NOT trigger: gradual drift, HFT clusters, trending market, whale trades, overnight gaps |
-| `edgecases.test.ts` | 59 | Boundary conditions, empty arrays, signal threshold exact values, BOCPD pruning, regression for NaN bug |
+| `edgecases.test.ts` | 80 | Boundary conditions, signal threshold exact values (strict >), detect < windowSize bypass, train twice, cusumBatch multiple alarms |
+| `realdata.test.ts` | 23 | Real BTCUSDT-2025-03-01 data: 4 spike windows + 1 calm baseline |
+| `robustness.test.ts` | 66 | Mathematical invariants: range/symmetry/monotonicity for all functions, BOCPD normalisation Σexp(lp) ≤ 1, 100-case property-based detector test |
+| `extreme.test.ts` | 52 | Stuck-at-extremum: hazardLambda edge cases, μ = 0, extreme β, degenerate Nelder-Mead, Welford drift, β₀ = 0 |
+| `newextreme.test.ts` | 58 | NaN propagation in CUSUM/BOCPD, hawkesAnomalyScore extremes, cusumAnomalyScore h = NaN, prevRL = Inf/NaN, kappa0 = 0 |
+| `thirdextreme.test.ts` | 74 | hazardLambda = 0 collapse, β ≤ 0, hawkesFit n = 0/T = 0, hawkesPeakLambda n = 1/β = 0, volumeImbalance NaN qty, cusumFit NaN filter |
+| `fourthextreme.test.ts` | 63 | hawkesAnomalyScore NaN peak + valid params, cusumUpdate NaN params, cusumAnomalyScore NaN state, bocpdUpdate beta0 = Inf, Infinity qty |
+| `perf.test.ts` | 19 | Latency P95 bounds, throughput (detect(200) ≥ 800/s), scaling ratios, stability over 500 sequential calls |
 ```bash
 npm test
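
Reviewer note on the README hunks above: the rolling-window columns in the new tables follow from `max(0, N − windowSize + 1)`, and the overlap warning comes down to cutting one oldest-first array at a single index. The sketch below reproduces both. The helper names `rollingWindowCount` and `splitTrades` are hypothetical and are not exported by `volume-anomaly`; plain numbers stand in for trade objects.

```typescript
// Hypothetical helpers for illustration; not part of the volume-anomaly API.

/** Number of rolling imbalance windows produced by `n` trades. */
function rollingWindowCount(n: number, windowSize = 50): number {
  return Math.max(0, n - windowSize + 1);
}

/** Cut one oldest-first array into non-overlapping historical/recent slices. */
function splitTrades<T>(
  all: readonly T[],
  nDetect: number,
): { historical: T[]; recent: T[] } {
  const cut = Math.max(0, all.length - nDetect);
  return { historical: all.slice(0, cut), recent: all.slice(cut) };
}

// Stand-in data: 1400 "trades" identified by their index.
const trades = Array.from({ length: 1400 }, (_, i) => i);
const { historical, recent } = splitTrades(trades, 200);

console.log(historical.length, recent.length);      // 1200 200
console.log(rollingWindowCount(historical.length)); // 1151
console.log(rollingWindowCount(recent.length));     // 151
console.log(rollingWindowCount(49));                // 0
```

With the default `windowSize = 50`, 200 recent trades yield 151 windows, matching the `4× windowSize` row in the `detect()` table, and anything below 50 trades yields none.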

package/build/index.cjs CHANGED

@@ -108,6 +108,15 @@ function volumeImbalance(trades) {
     const total = buyVol + sellVol;
     if (total === 0)
         return 0;
+    // When total = Infinity (an overflowed qty) the division (buyVol−sellVol)/Infinity
+    // is NaN even for a one-sided burst (Inf/Inf = NaN in IEEE 754).
+    // Compare sides directly to get the correct ±1 / 0 answer.
+    // NaN total (from NaN qty) falls through to the regular division — GIGO.
+    if (total === Infinity) {
+        if (buyVol === sellVol)
+            return 0; // both Infinity — symmetric burst
+        return buyVol > sellVol ? 1 : -1;
+    }
     return (buyVol - sellVol) / total;
 }
 // ─── Log-likelihood (O(n) recursive) ────────────────────────────────────────
@@ -120,6 +129,11 @@ function hawkesLogLikelihood(timestamps, params) {
     const n = timestamps.length;
     if (n === 0)
         return 0;
+    // β ≤ 0: kernel exp(−β·dt) does not decay (diverges or flat).
+    // Compensator = (α/β)·(1−exp(−β·(T−tᵢ))) → Inf·0 = NaN when β=0.
+    // Return −Infinity so the optimizer treats this as an infeasible region.
+    if (beta <= 0)
+        return -Infinity;
     // Use observation window length, not absolute time, so the LL is invariant
     // to timestamp origin (works for both t0=0 and Unix-epoch seconds).
     const t0 = timestamps[0];
@@ -251,8 +265,15 @@ function hawkesAnomalyScore(peakLambda, params, empiricalRate = 0) {
     const meanLambda = params.mu / (1 - branching);
     // sigmoid centred at 2× baseline
     const sig = (ratio) => 1 / (1 + Math.exp(-(ratio - 2) * 2));
-    const intensityScore = sig(peakLambda / meanLambda);
-    const rateScore = empiricalRate > 0 ? sig(empiricalRate / params.mu) : 0;
+    // meanLambda = 0 when mu = 0: ratio = peakLambda / 0 = Infinity (score=1) when
+    // peakLambda > 0, or NaN (0/0) when peakLambda = 0.  Guard the NaN case.
+    // NaN peakLambda (e.g. timestamps contained NaN): treat as "no signal" → 0.
+    const intensityScore = meanLambda > 0
+        ? (Number.isNaN(peakLambda) ? 0 : sig(peakLambda / meanLambda))
+        : peakLambda > 0 ? 1 : 0;
+    const rateScore = empiricalRate > 0
+        ? (params.mu > 0 ? sig(empiricalRate / params.mu) : 1)
+        : 0;
     return Math.max(intensityScore, rateScore);
 }
@@ -278,12 +299,16 @@ function hawkesAnomalyScore(peakLambda, params, empiricalRate = 0) {
  * values — e.g. array of |imbalance| from a calm training window.
  */
 function cusumFit(values, kSigmas = 0.5, hSigmas = 4) {
-    if (values.length === 0) {
+    // Drop non-finite values (NaN, ±Infinity) before computing statistics.
+    // A single NaN in `values` would make mu0 = NaN, which later poisons the
+    // CUSUM accumulator even for valid observations (Math.max(0, x − NaN) = NaN).
+    const clean = values.filter(Number.isFinite);
+    if (clean.length === 0) {
         return { mu0: 0, std0: 1, k: kSigmas, h: hSigmas };
     }
-    const n = values.length;
-    const mu0 = values.reduce((s, x) => s + x, 0) / n;
-    const var0 = values.reduce((s, x) => s + (x - mu0) ** 2, 0) / Math.max(n - 1, 1);
+    const n = clean.length;
+    const mu0 = clean.reduce((s, x) => s + x, 0) / n;
+    const var0 = clean.reduce((s, x) => s + (x - mu0) ** 2, 0) / Math.max(n - 1, 1);
     const std0 = Math.sqrt(var0) || 1e-6;
     return {
         mu0,
@@ -297,7 +322,18 @@ function cusumFit(values, kSigmas = 0.5, hSigmas = 4) {
  * Pure function — does not mutate input.
  */
 function cusumUpdate(state, x, params) {
+    // Non-finite x (NaN, ±Infinity that doesn't trigger alarm) would poison the
+    // accumulators via Math.max(0, NaN) = NaN.  Skip the update entirely for NaN;
+    // ±Infinity is handled naturally (Inf ≥ h → alarm fires and resets state).
+    if (Number.isNaN(x)) {
+        return { alarm: false, preResetState: state, state };
+    }
     const { mu0, k, h } = params;
+    // NaN in mu0 or k also poisons the accumulator (x − NaN = NaN).
+    // Treat corrupt params as a no-op, same semantics as x=NaN.
+    if (!Number.isFinite(mu0) || !Number.isFinite(k)) {
+        return { alarm: false, preResetState: state, state };
+    }
     const sPos = Math.max(0, state.sPos + (x - mu0) - k);
     const sNeg = Math.max(0, state.sNeg - (x - mu0) - k);
     const alarm = sPos >= h || sNeg >= h;
@@ -321,7 +357,12 @@ function cusumInitState() {
  */
 function cusumAnomalyScore(state, params) {
     const s = Math.max(state.sPos, state.sNeg);
-    if (params.h <= 0)
+    // Math.max(NaN, finite) = NaN in JS (unlike some other languages).
+    // A poisoned state must not propagate NaN to the confidence score.
+    if (Number.isNaN(s))
+        return 0;
+    // NaN <= 0 is false in IEEE 754, so guard explicitly against non-finite h.
+    if (params.h <= 0 || !Number.isFinite(params.h))
         return 0;
     return Math.min(s / params.h, 1);
 }
@@ -389,6 +430,7 @@ function bocpdInitState() {
         logProbs: [0], // P(r₀ = 0) = 1  →  log = 0
         suffStats: [ssEmpty()],
         t: 0,
+        minRl: 0,
     };
 }
 /**
@@ -426,12 +468,20 @@ function bocpdUpdate(state, x, prior, hazardLambda = 200) {
     const keep = normLogProbs.map((lp) => lp > PRUNE_THRESH);
     const prunedLogProbs = normLogProbs.filter((_, i) => keep[i]);
     const prunedSuffStats = newSuffStats.filter((_, i) => keep[i]);
+    // Track actual run-length offset after pruning.
+    // normLogProbs[0] → RL 0; normLogProbs[i] (i>0) → RL state.minRl + i.
+    // When H=0 (hazardLambda=∞) the changepoint entry (i=0) gets log-prob −∞ and is
+    // pruned; the first surviving entry then represents RL state.minRl + firstKept, not 0.
+    const firstKept = keep.indexOf(true);
+    const newMinRl = firstKept <= 0 ? 0 : state.minRl + firstKept;
     const newState = {
         logProbs: prunedLogProbs,
         suffStats: prunedSuffStats,
         t: state.t + 1,
+        minRl: newMinRl,
     };
-    // MAP run length (highest probability)
+    // MAP run length: index in normLogProbs → actual run-length.
+    // normLogProbs[0] → RL 0; normLogProbs[r] (r>0) → RL state.minRl + r.
     let mapR = 0;
     let mapLP = -Infinity;
     for (let r = 0; r < normLogProbs.length; r++) {
@@ -440,10 +490,14 @@ function bocpdUpdate(state, x, prior, hazardLambda = 200) {
             mapR = r;
         }
     }
+    const mapRunLength = mapR === 0 ? 0 : state.minRl + mapR;
+    // normLogProbs[0] can be NaN when all log-probs are NaN (e.g. x=NaN, kappa0=0).
+    // `?? -Infinity` only guards undefined/null, not NaN.  Clamp to 0 explicitly.
+    const rawCp = Math.exp(normLogProbs[0] ?? -Infinity);
     return {
         state: newState,
-        mapRunLength: mapR,
-        cpProbability: Math.exp(normLogProbs[0] ?? -Infinity),
+        mapRunLength,
+        cpProbability: Number.isFinite(rawCp) ? rawCp : 0,
     };
 }
 // ─── Score ────────────────────────────────────────────────────────────────────
@@ -474,7 +528,9 @@ function bocpdUpdate(state, x, prior, hazardLambda = 200) {
  * @param prevRunLength  mapRunLength from the previous bocpdUpdate call.
  */
 function bocpdAnomalyScore(result, prevRunLength = 0) {
-    if (prevRunLength <= 0)
+    // NaN <= 0 is false (IEEE 754), and (Infinity - finite) / Infinity = NaN.
+    // Guard both: require prevRunLength to be a finite positive number.
+    if (!Number.isFinite(prevRunLength) || prevRunLength <= 0)
         return 0;
     const drop = Math.max(0, (prevRunLength - result.mapRunLength) / prevRunLength);
     return 1 / (1 + Math.exp(-(drop - 0.5) * 8));
@@ -538,6 +594,12 @@ class VolumeAnomalyDetector {
     constructor(config = {}) {
         this.cfg = { ...DEFAULTS, ...config };
         if (config.scoreWeights) {
+            if (!config.scoreWeights.every(Number.isFinite)) {
+                throw new Error(`scoreWeights must be finite numbers, got ${config.scoreWeights}`);
+            }
+            if (config.scoreWeights.some((w) => w < 0)) {
+                throw new Error(`scoreWeights must be non-negative, got ${config.scoreWeights}`);
+            }
             const sum = config.scoreWeights.reduce((a, b) => a + b, 0);
             if (Math.abs(sum - 1) > 1e-6) {
                 throw new Error(`scoreWeights must sum to 1, got ${sum}`);
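
Reviewer note on the CUSUM hunks above: the new guards exist because `Math.max(0, NaN)` is `NaN`, so one bad observation would otherwise poison the accumulators for every later call. The sketch below mirrors the guard pattern from the `cusumUpdate` hunk in a standalone form. The interface shapes and the name `safeCusumStep` are assumptions for illustration, and the zero-on-alarm reset follows the README's description of `preResetState` rather than code shown in this diff.

```typescript
// Assumed shapes for illustration; the package's own types are not shown in this diff.
interface CusumState { sPos: number; sNeg: number }
interface CusumParams { mu0: number; k: number; h: number }

function safeCusumStep(state: CusumState, x: number, p: CusumParams) {
  // NaN input or corrupt parameters: return the state unchanged, no alarm.
  if (Number.isNaN(x) || !Number.isFinite(p.mu0) || !Number.isFinite(p.k)) {
    return { state, alarm: false, preResetState: state };
  }
  const sPos = Math.max(0, state.sPos + (x - p.mu0) - p.k);
  const sNeg = Math.max(0, state.sNeg - (x - p.mu0) - p.k);
  const alarm = sPos >= p.h || sNeg >= p.h;
  const preResetState: CusumState = { sPos, sNeg };
  // On alarm the live state is zeroed; preResetState keeps the values for scoring.
  return { state: alarm ? { sPos: 0, sNeg: 0 } : preResetState, alarm, preResetState };
}

const params: CusumParams = { mu0: 0.25, k: 0.125, h: 0.5 };
let s: CusumState = { sPos: 0, sNeg: 0 };

console.log(Math.max(0, NaN));            // NaN, the unguarded failure mode
s = safeCusumStep(s, 0.5, params).state;  // sPos becomes 0.125
s = safeCusumStep(s, NaN, params).state;  // skipped, sPos stays 0.125
const last = safeCusumStep(s, 1, params); // 0.125 + 0.75 - 0.125 = 0.75, above h
console.log(last.alarm, last.state.sPos, last.preResetState.sPos); // true 0 0.75
```

The parameter values are chosen to be exactly representable in binary floating point so the printed numbers are exact.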

package/build/index.mjs CHANGED

@@ -106,6 +106,15 @@ function volumeImbalance(trades) {
     const total = buyVol + sellVol;
     if (total === 0)
         return 0;
+    // When total = Infinity (an overflowed qty) the division (buyVol−sellVol)/Infinity
+    // is NaN even for a one-sided burst (Inf/Inf = NaN in IEEE 754).
+    // Compare sides directly to get the correct ±1 / 0 answer.
+    // NaN total (from NaN qty) falls through to the regular division — GIGO.
+    if (total === Infinity) {
+        if (buyVol === sellVol)
+            return 0; // both Infinity — symmetric burst
+        return buyVol > sellVol ? 1 : -1;
+    }
     return (buyVol - sellVol) / total;
 }
 // ─── Log-likelihood (O(n) recursive) ────────────────────────────────────────
@@ -118,6 +127,11 @@ function hawkesLogLikelihood(timestamps, params) {
     const n = timestamps.length;
     if (n === 0)
         return 0;
+    // β ≤ 0: kernel exp(−β·dt) does not decay (diverges or flat).
+    // Compensator = (α/β)·(1−exp(−β·(T−tᵢ))) → Inf·0 = NaN when β=0.
+    // Return −Infinity so the optimizer treats this as an infeasible region.
+    if (beta <= 0)
+        return -Infinity;
     // Use observation window length, not absolute time, so the LL is invariant
     // to timestamp origin (works for both t0=0 and Unix-epoch seconds).
     const t0 = timestamps[0];
@@ -249,8 +263,15 @@ function hawkesAnomalyScore(peakLambda, params, empiricalRate = 0) {
     const meanLambda = params.mu / (1 - branching);
     // sigmoid centred at 2× baseline
     const sig = (ratio) => 1 / (1 + Math.exp(-(ratio - 2) * 2));
-    const intensityScore = sig(peakLambda / meanLambda);
-    const rateScore = empiricalRate > 0 ? sig(empiricalRate / params.mu) : 0;
+    // meanLambda = 0 when mu = 0: ratio = peakLambda / 0 = Infinity (score=1) when
+    // peakLambda > 0, or NaN (0/0) when peakLambda = 0.  Guard the NaN case.
+    // NaN peakLambda (e.g. timestamps contained NaN): treat as "no signal" → 0.
+    const intensityScore = meanLambda > 0
+        ? (Number.isNaN(peakLambda) ? 0 : sig(peakLambda / meanLambda))
+        : peakLambda > 0 ? 1 : 0;
+    const rateScore = empiricalRate > 0
+        ? (params.mu > 0 ? sig(empiricalRate / params.mu) : 1)
+        : 0;
     return Math.max(intensityScore, rateScore);
 }
@@ -276,12 +297,16 @@ function hawkesAnomalyScore(peakLambda, params, empiricalRate = 0) {
  * values — e.g. array of |imbalance| from a calm training window.
  */
 function cusumFit(values, kSigmas = 0.5, hSigmas = 4) {
-    if (values.length === 0) {
+    // Drop non-finite values (NaN, ±Infinity) before computing statistics.
+    // A single NaN in `values` would make mu0 = NaN, which later poisons the
+    // CUSUM accumulator even for valid observations (Math.max(0, x − NaN) = NaN).
+    const clean = values.filter(Number.isFinite);
+    if (clean.length === 0) {
         return { mu0: 0, std0: 1, k: kSigmas, h: hSigmas };
     }
-    const n = values.length;
-    const mu0 = values.reduce((s, x) => s + x, 0) / n;
-    const var0 = values.reduce((s, x) => s + (x - mu0) ** 2, 0) / Math.max(n - 1, 1);
+    const n = clean.length;
+    const mu0 = clean.reduce((s, x) => s + x, 0) / n;
+    const var0 = clean.reduce((s, x) => s + (x - mu0) ** 2, 0) / Math.max(n - 1, 1);
     const std0 = Math.sqrt(var0) || 1e-6;
     return {
         mu0,
@@ -295,7 +320,18 @@ function cusumFit(values, kSigmas = 0.5, hSigmas = 4) {
  * Pure function — does not mutate input.
  */
 function cusumUpdate(state, x, params) {
+    // Non-finite x (NaN, ±Infinity that doesn't trigger alarm) would poison the
+    // accumulators via Math.max(0, NaN) = NaN.  Skip the update entirely for NaN;
+    // ±Infinity is handled naturally (Inf ≥ h → alarm fires and resets state).
+    if (Number.isNaN(x)) {
+        return { alarm: false, preResetState: state, state };
+    }
     const { mu0, k, h } = params;
+    // NaN in mu0 or k also poisons the accumulator (x − NaN = NaN).
+    // Treat corrupt params as a no-op, same semantics as x=NaN.
+    if (!Number.isFinite(mu0) || !Number.isFinite(k)) {
+        return { alarm: false, preResetState: state, state };
+    }
     const sPos = Math.max(0, state.sPos + (x - mu0) - k);
     const sNeg = Math.max(0, state.sNeg - (x - mu0) - k);
     const alarm = sPos >= h || sNeg >= h;
@@ -319,7 +355,12 @@ function cusumInitState() {
  */
 function cusumAnomalyScore(state, params) {
     const s = Math.max(state.sPos, state.sNeg);
-    if (params.h <= 0)
+    // Math.max(NaN, finite) = NaN in JS (unlike some other languages).
+    // A poisoned state must not propagate NaN to the confidence score.
+    if (Number.isNaN(s))
+        return 0;
+    // NaN <= 0 is false in IEEE 754, so guard explicitly against non-finite h.
+    if (params.h <= 0 || !Number.isFinite(params.h))
         return 0;
     return Math.min(s / params.h, 1);
 }
@@ -387,6 +428,7 @@ function bocpdInitState() {
         logProbs: [0], // P(r₀ = 0) = 1  →  log = 0
         suffStats: [ssEmpty()],
         t: 0,
+        minRl: 0,
     };
 }
 /**
@@ -424,12 +466,20 @@ function bocpdUpdate(state, x, prior, hazardLambda = 200) {
     const keep = normLogProbs.map((lp) => lp > PRUNE_THRESH);
     const prunedLogProbs = normLogProbs.filter((_, i) => keep[i]);
     const prunedSuffStats = newSuffStats.filter((_, i) => keep[i]);
+    // Track actual run-length offset after pruning.
+    // normLogProbs[0] → RL 0; normLogProbs[i] (i>0) → RL state.minRl + i.
+    // When H=0 (hazardLambda=∞) the changepoint entry (i=0) gets log-prob −∞ and is
+    // pruned; the first surviving entry then represents RL state.minRl + firstKept, not 0.
+    const firstKept = keep.indexOf(true);
+    const newMinRl = firstKept <= 0 ? 0 : state.minRl + firstKept;
     const newState = {
         logProbs: prunedLogProbs,
         suffStats: prunedSuffStats,
         t: state.t + 1,
+        minRl: newMinRl,
     };
-    // MAP run length (highest probability)
+    // MAP run length: index in normLogProbs → actual run-length.
+    // normLogProbs[0] → RL 0; normLogProbs[r] (r>0) → RL state.minRl + r.
     let mapR = 0;
     let mapLP = -Infinity;
     for (let r = 0; r < normLogProbs.length; r++) {
@@ -438,10 +488,14 @@ function bocpdUpdate(state, x, prior, hazardLambda = 200) {
             mapR = r;
         }
     }
+    const mapRunLength = mapR === 0 ? 0 : state.minRl + mapR;
+    // normLogProbs[0] can be NaN when all log-probs are NaN (e.g. x=NaN, kappa0=0).
+    // `?? -Infinity` only guards undefined/null, not NaN.  Clamp to 0 explicitly.
+    const rawCp = Math.exp(normLogProbs[0] ?? -Infinity);
     return {
         state: newState,
-        mapRunLength: mapR,
-        cpProbability: Math.exp(normLogProbs[0] ?? -Infinity),
+        mapRunLength,
+        cpProbability: Number.isFinite(rawCp) ? rawCp : 0,
     };
 }
 // ─── Score ────────────────────────────────────────────────────────────────────
@@ -472,7 +526,9 @@ function bocpdUpdate(state, x, prior, hazardLambda = 200) {
  * @param prevRunLength  mapRunLength from the previous bocpdUpdate call.
  */
 function bocpdAnomalyScore(result, prevRunLength = 0) {
-    if (prevRunLength <= 0)
+    // NaN <= 0 is false (IEEE 754), and (Infinity - finite) / Infinity = NaN.
+    // Guard both: require prevRunLength to be a finite positive number.
+    if (!Number.isFinite(prevRunLength) || prevRunLength <= 0)
         return 0;
     const drop = Math.max(0, (prevRunLength - result.mapRunLength) / prevRunLength);
     return 1 / (1 + Math.exp(-(drop - 0.5) * 8));
@@ -536,6 +592,12 @@ class VolumeAnomalyDetector {
     constructor(config = {}) {
         this.cfg = { ...DEFAULTS, ...config };
         if (config.scoreWeights) {
+            if (!config.scoreWeights.every(Number.isFinite)) {
+                throw new Error(`scoreWeights must be finite numbers, got ${config.scoreWeights}`);
+            }
+            if (config.scoreWeights.some((w) => w < 0)) {
+                throw new Error(`scoreWeights must be non-negative, got ${config.scoreWeights}`);
+            }
             const sum = config.scoreWeights.reduce((a, b) => a + b, 0);
             if (Math.abs(sum - 1) > 1e-6) {
                 throw new Error(`scoreWeights must sum to 1, got ${sum}`);
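
Reviewer note on the `volumeImbalance` hunk above: the new branch matters because `(Infinity − finite) / Infinity` evaluates to `NaN` under IEEE 754, so an overflowed quantity on one side would otherwise hide a one-sided burst. The sketch below is a standalone restatement of that guard. The `Trade` shape (`qty`, `isBuy`) and the function name `imbalance` are assumptions for illustration, since the fields of `IAggregatedTradeData` are not shown in this diff.

```typescript
// Assumed trade shape, for illustration only.
interface Trade { qty: number; isBuy: boolean }

function imbalance(trades: readonly Trade[]): number {
  let buyVol = 0;
  let sellVol = 0;
  for (const t of trades) {
    if (t.isBuy) buyVol += t.qty;
    else sellVol += t.qty;
  }
  const total = buyVol + sellVol;
  if (total === 0) return 0;
  if (total === Infinity) {
    // Division would give NaN here, so compare the two sides directly.
    if (buyVol === sellVol) return 0;
    return buyVol > sellVol ? 1 : -1;
  }
  return (buyVol - sellVol) / total;
}

console.log((Infinity - 1) / Infinity); // NaN, what the unguarded division returns
console.log(imbalance([{ qty: Infinity, isBuy: true }, { qty: 1, isBuy: false }]));        // 1
console.log(imbalance([{ qty: Infinity, isBuy: true }, { qty: Infinity, isBuy: false }])); // 0
console.log(imbalance([{ qty: 3, isBuy: true }, { qty: 1, isBuy: false }]));               // 0.5
```

As in the diff, a `NaN` quantity is not special-cased here and falls through to the ordinary division.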

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "volume-anomaly",
-  "version": "0.1.0",
-  "description": "Volume anomaly detection via Hawkes, CUSUM, BOCPD",
+  "version": "1.2.3",
+  "description": "Statistical volume anomaly detection for trade streams - Hawkes process, CUSUM, and Bayesian Online Changepoint Detection (BOCPD). Zero dependencies. TypeScript.",
   "type": "module",
   "main": "./build/index.cjs",
   "module": "./build/index.mjs",
@@ -24,6 +24,19 @@
     "test:watch": "vitest",
     "prepublishOnly": "npm run build"
   },
+  "keywords": [
+    "anomaly-detection",
+    "volume",
+    "order-flow",
+    "hawkes-process",
+    "cusum",
+    "bocpd",
+    "changepoint-detection",
+    "trading",
+    "market-microstructure",
+    "typescript",
+    "zero-dependencies"
+  ],
   "devDependencies": {
     "@rollup/plugin-typescript": "^12.3.0",
     "@types/node": "^20.10.0",