PyPI: panelkit 0.2.3 (tar.gz) → 0.2.5 (tar.gz)

Files changed (82)

{panelkit-0.2.3 → panelkit-0.2.5}/Cargo.lock RENAMED

@@ -462,7 +462,7 @@ checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e"
 [[package]]
 name = "panelkit-estimators"
-version = "0.2.3"
+version = "0.2.5"
 dependencies = [
  "criterion",
  "panelkit-linalg",
@@ -471,7 +471,7 @@ dependencies = [
 [[package]]
 name = "panelkit-geo"
-version = "0.2.3"
+version = "0.2.5"
 dependencies = [
  "panelkit-estimators",
  "panelkit-inference",
@@ -482,7 +482,7 @@ dependencies = [
 [[package]]
 name = "panelkit-inference"
-version = "0.2.3"
+version = "0.2.5"
 dependencies = [
  "panelkit-estimators",
  "panelkit-linalg",
@@ -491,7 +491,7 @@ dependencies = [
 [[package]]
 name = "panelkit-linalg"
-version = "0.2.3"
+version = "0.2.5"
 dependencies = [
  "proptest",
  "rayon",
@@ -623,7 +623,7 @@ dependencies = [
 [[package]]
 name = "pypanelkit"
-version = "0.2.3"
+version = "0.2.5"
 dependencies = [
  "numpy",
  "panelkit-estimators",

{panelkit-0.2.3 → panelkit-0.2.5}/Cargo.toml RENAMED

@@ -3,7 +3,7 @@ resolver = "2"
 members = ["crates/linalg", "crates/estimators", "crates/inference", "crates/geo", "crates/pypanelkit"]
 [workspace.package]
-version = "0.2.3"
+version = "0.2.5"
 edition = "2021"
 rust-version = "1.74"
 license = "MIT OR Apache-2.0"

{panelkit-0.2.3 → panelkit-0.2.5}/GUIDE.md RENAMED

@@ -300,10 +300,16 @@ ev.plot_effect_over_time("effect.png")  # pointwise + cumulative over time, w/ C
 ev.lift, ev.cumulative, ev.significant
 ```
-Each estimate gets a confidence interval from a **stationary block bootstrap** of
-its post-period effect path; an **SC in-space placebo** supplies a p-value. The
-ensemble uses the same `weights` choices as `power()` (`"auto"` = inverse-variance
-from each method's bootstrap SE, `"equal"`, or an explicit dict/list). `ev` exposes
+Inference is **in-space placebo** (Abadie): every donor market is refit as if it
+were the treated one, and the spread of *their* post-period effects is the null
+reference — capturing out-of-sample extrapolation error, the real source of
+uncertainty. (A bootstrap of the treated unit's own post-period only sees
+in-sample noise and is wildly anti-conservative — on null data its 90% interval
+falsely flags an effect ~50% of the time; the placebo version sits at/below the
+nominal 10%.) Poorly-fit placebos (pre-period RMSPE > 2× the treated unit's) are
+dropped, per Abadie. The p-value is the placebo rank of the treated effect, and
+`"auto"` ensemble weights are inverse-variance from each method's placebo-null
+spread. `ev` exposes
 `.lift`, `.att`, `.cumulative`, `.significant`, the per-method results in `ev.per`,
 and the ensemble in `ev.ensemble`. Reported numbers: **% lift** (effect ÷
 counterfactual), **per-period ATT**, and **cumulative incremental** over the
@@ -315,13 +321,13 @@ you can see it sits flat (centered on zero) inside the noise band before the tes
 starts (a placebo check) and breaks out after — and the running **cumulative
 incremental**, each as a point estimate with a confidence band. The counterfactual
 is centered on the pre-period, so the gap shows fit quality rather than a level
-offset (SDID matches trends, not levels). The bands come from a **moving-block
-bootstrap** of the pre-period residuals: resampling whole blocks preserves their
-autocorrelation, so the intervals are more conservative than an iid normal
-approximation — the cumulative band in particular widens faster than √k when the
-residuals are positively autocorrelated. Raise `block_len` to capture longer-range
-dependence (wider, more conservative cumulative bands). Pass `exclude=[…]` to drop
-markets from the control pool (e.g. ones you don't trust as donors).
+offset (SDID matches trends, not levels). The bands come from the **in-space
+placebo** distribution: at each horizon, the pointwise band is the spread of the
+donor placebos' per-period effects, and the cumulative band is the spread of their
+cumulative sums (so it fans out with horizon). Placebo inference needs a decent
+donor pool to have power — with only a handful of comparable donors the intervals
+are necessarily wide. Pass `exclude=[…]` to drop markets from the control pool
+(e.g. ones you don't trust as donors).
 ### Choosing a specification — `design.recommend(test_lengths, n_geos_options, target_lift, alphas=…)`
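
Reviewer note: the placebo-rank p-value described in the GUIDE.md hunk above can be sketched roughly as follows. The helper name `placebo_p_value` and the toy numbers are hypothetical; only the 2× pre-period RMSPE filter and the rank formula are taken from the diff. The sketch omits the fallback in `design.py` that reuses all placebos when fewer than five survive the filter.

```python
import numpy as np

def placebo_p_value(treated_att, treated_rmspe, placebo_atts, placebo_rmspes, ratio=2.0):
    """Rank-based p-value of the treated effect among donor placebo effects.

    Hypothetical helper, not part of panelkit. Placebos whose pre-period RMSPE
    exceeds `ratio` times the treated unit's are dropped before ranking.
    """
    atts = np.asarray(placebo_atts, dtype=float)
    rmspes = np.asarray(placebo_rmspes, dtype=float)
    kept = atts[rmspes <= ratio * treated_rmspe]
    # The +1 in numerator and denominator counts the treated unit itself,
    # so the p-value can never be exactly zero.
    n_extreme = int(np.sum(np.abs(kept) >= abs(treated_att)))
    return (1 + n_extreme) / (1 + kept.size), int(kept.size)

p, n_kept = placebo_p_value(
    treated_att=5.0, treated_rmspe=1.0,
    placebo_atts=[0.5, -1.0, 6.0, 0.2, -0.3, 9.0],
    placebo_rmspes=[0.8, 1.5, 1.9, 1.0, 0.5, 4.0],  # the last placebo fails the 2x filter
)
print(p, n_kept)
```

With five surviving placebos the smallest attainable p-value is 1/6, which is why a small donor pool cannot produce a strongly significant result regardless of the effect size.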

{panelkit-0.2.3 → panelkit-0.2.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: panelkit
-Version: 0.2.3
+Version: 0.2.5
 Classifier: Programming Language :: Rust
 Classifier: Programming Language :: Python :: 3
 Classifier: Topic :: Scientific/Engineering
@@ -273,8 +273,8 @@ per-cell MDE/confidence/holdout report and a combined figure:
 **Evaluate a test that ran.** `evaluate(...)` is the measurement counterpart to
 the power analysis: fit SC / ASC / SDID on a test that already happened, blend
 them into a weighted-average **ensemble** estimate, and report each one's lift,
-confidence interval (stationary block bootstrap), and cumulative incremental —
-with an SC in-space placebo p-value:
+confidence interval (in-space placebo), and cumulative incremental —
+with an in-space placebo p-value:
 ![test evaluation](assets/geo_evaluate.png)
@@ -316,7 +316,7 @@ What you get out of the box:
 - **A weighted-average ensemble** of SC + ASC + SDID (combined per placebo window,
   with auto inverse-variance weights) for a steadier estimate than any one method.
 - **Post-test evaluation** — `evaluate()` measures a test that already ran:
-  per-method + ensemble lift, bootstrap CIs, cumulative incremental, and a p-value.
+  per-method + ensemble lift, in-space placebo CIs, cumulative incremental, and a p-value.
 See [`examples/geo_demo.py`](examples/geo_demo.py).

{panelkit-0.2.3 → panelkit-0.2.5}/README.md RENAMED

@@ -243,8 +243,8 @@ per-cell MDE/confidence/holdout report and a combined figure:
 **Evaluate a test that ran.** `evaluate(...)` is the measurement counterpart to
 the power analysis: fit SC / ASC / SDID on a test that already happened, blend
 them into a weighted-average **ensemble** estimate, and report each one's lift,
-confidence interval (stationary block bootstrap), and cumulative incremental —
-with an SC in-space placebo p-value:
+confidence interval (in-space placebo), and cumulative incremental —
+with an in-space placebo p-value:
 ![test evaluation](assets/geo_evaluate.png)
@@ -286,7 +286,7 @@ What you get out of the box:
 - **A weighted-average ensemble** of SC + ASC + SDID (combined per placebo window,
   with auto inverse-variance weights) for a steadier estimate than any one method.
 - **Post-test evaluation** — `evaluate()` measures a test that already ran:
-  per-method + ensemble lift, bootstrap CIs, cumulative incremental, and a p-value.
+  per-method + ensemble lift, in-space placebo CIs, cumulative incremental, and a p-value.
 See [`examples/geo_demo.py`](examples/geo_demo.py).

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/sc/sdid.rs RENAMED

@@ -87,6 +87,10 @@ pub fn fit_at(panel: &Panel, t0: usize, cfg: SdidConfig) -> ScFit {
     let t = panel.n_periods();
     let t_pre = t0;
     let t_post = t - t0;
+    assert!(
+        t_pre >= 1 && t_post >= 1,
+        "SDID needs at least one pre- and one post-period (t0 in 1..n_periods)"
+    );
     let n_tr = treated.len();
     // Treated-average series.

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/opt/simplex.rs RENAMED

@@ -30,17 +30,14 @@ pub fn project_simplex(v: &[f64]) -> Vec<f64> {
     let mut u = v.to_vec();
     u.sort_by(|a, b| b.partial_cmp(a).unwrap()); // descending
     let mut css = 0.0;
-    let mut rho = 0usize;
     let mut theta = 0.0;
     for (j, &uj) in u.iter().enumerate() {
         css += uj;
         let t = (css - 1.0) / (j as f64 + 1.0);
         if uj - t > 0.0 {
-            rho = j + 1;
             theta = t;
         }
     }
-    let _ = rho;
     v.iter().map(|&vi| (vi - theta).max(0.0)).collect()
 }
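
Reviewer note: removing `rho` does not change the result, because only `theta` feeds the final map. As a sanity check, here is a line-by-line Python port of the loop shown in the hunk above (a sketch for illustration, not the crate's code):

```python
def project_simplex(v):
    """Euclidean projection of v onto the simplex {w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)        # descending, as in the Rust version
    css = 0.0                          # running sum of the sorted values
    theta = 0.0
    for j, uj in enumerate(u):
        css += uj
        t = (css - 1.0) / (j + 1)
        if uj - t > 0.0:
            theta = t                  # the last index passing the test sets the threshold
    return [max(vi - theta, 0.0) for vi in v]

w = project_simplex([0.9, 0.6, -0.2])
print(w, sum(w))
```

For this input the threshold settles at 0.25, so the negative coordinate is clipped to zero and the remaining two are shifted down until they sum to one.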

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/src/api_sc.rs RENAMED

@@ -111,13 +111,15 @@ pub fn fit_sdid(
 /// Fit Matrix-Completion NNM (Athey et al. 2021). `max_rank`, when set, uses a
 /// fast randomized truncated SVD inside SoftImpute (big speedup, low-rank cap).
 #[pyfunction]
-#[pyo3(signature = (y, treated, treat_time, lambda=None, max_iter=200, tol=1e-5, seed=0, max_rank=None))]
+// `lambda_` (not `lambda`) so it is usable as a Python keyword argument —
+// `lambda` is a reserved word in Python.
+#[pyo3(signature = (y, treated, treat_time, lambda_=None, max_iter=200, tol=1e-5, seed=0, max_rank=None))]
 #[allow(clippy::too_many_arguments)]
 pub fn fit_mcnnm(
     y: PyReadonlyArray2<f64>,
     treated: Vec<usize>,
     treat_time: usize,
-    lambda: Option<f64>,
+    lambda_: Option<f64>,
     max_iter: usize,
     tol: f64,
     seed: u64,
@@ -125,7 +127,7 @@ pub fn fit_mcnnm(
 ) -> PyResult<PyScResult> {
     let panel = Panel::block(mat_from_numpy(&y), &treated, treat_time);
     let cfg = McnnmConfig {
-        lambda,
+        lambda: lambda_,
         max_iter,
         tol,
         seed,
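
Reviewer note: the rename matters because a call site cannot spell `lambda=` as a keyword argument at all. The snippet below checks this with the parser only; `fit_stub` is a hypothetical stand-in, not the real binding.

```python
def fit_stub(y, lambda_=None):
    """Hypothetical stand-in for a binding that accepts a `lambda_` keyword."""
    return lambda_

def is_valid_call(src):
    """Return True if `src` parses as a Python expression."""
    try:
        compile(src, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(is_valid_call("fit_stub(0, lambda=0.5)"))   # `lambda` is a reserved word
print(is_valid_call("fit_stub(0, lambda_=0.5)"))
print(fit_stub(0, lambda_=0.5))
```

The trailing underscore follows the usual Python convention for names that would otherwise collide with a keyword.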

{panelkit-0.2.3 → panelkit-0.2.5}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "maturin"
 [project]
 name = "panelkit"
-version = "0.2.3"
+version = "0.2.5"
 description = "Fast, from-scratch causal-inference estimators for panel/geo experiments (SC, ASC, SDID, DiD, MC-NNM)."
 readme = "README.md"
 requires-python = ">=3.9"

{panelkit-0.2.3 → panelkit-0.2.5}/python/panelkit/_panelkit.pyi RENAMED

@@ -82,7 +82,7 @@ def fit_mcnnm(
     y: npt.NDArray[np.float64],
     treated: Sequence[int],
     treat_time: int,
-    lambda_: Optional[float] = ...,
+    lambda_: Optional[float] = ...,  # NOTE: matches the Rust binding's `lambda_`
     max_iter: int = ...,
     tol: float = ...,
     seed: int = ...,

{panelkit-0.2.3 → panelkit-0.2.5}/python/panelkit/design.py RENAMED

@@ -42,7 +42,8 @@ def _ensemble_weight_arg(spec):
         raise ValueError(f"unknown ensemble_weights {spec!r} (use 'auto', 'equal', "
                          "a dict, or a 3-list)")
     if isinstance(spec, dict):
-        w = [float(spec.get(m, spec.get(m.lower(), 0.0))) for m in _ENSEMBLE_ORDER]
+        norm = {str(k).upper(): v for k, v in spec.items()}  # case-insensitive keys
+        w = [float(norm.get(m, 0.0)) for m in _ENSEMBLE_ORDER]
     else:
         w = [float(x) for x in spec]
         if len(w) != 3:
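
Reviewer note: the old lookup only tried the exact key and the all-lowercase key, so mixed-case keys such as `"Sc"` silently received weight 0. A minimal comparison, assuming `_ENSEMBLE_ORDER` is `("SC", "ASC", "SDID")` (that tuple is not visible in this diff, and the function names below are hypothetical):

```python
_ORDER = ("SC", "ASC", "SDID")  # assumption: mirrors _ENSEMBLE_ORDER

def weights_old(spec):
    # 0.2.3 lookup: exact key first, then the all-lowercase key
    return [float(spec.get(m, spec.get(m.lower(), 0.0))) for m in _ORDER]

def weights_new(spec):
    # 0.2.5 lookup: upper-case every key before matching
    norm = {str(k).upper(): v for k, v in spec.items()}
    return [float(norm.get(m, 0.0)) for m in _ORDER]

spec = {"Sc": 2, "asc": 1, "SDID": 1}
print(weights_old(spec))  # "Sc" is missed by the old lookup
print(weights_new(spec))
```

One side effect of the normalization: if a dict contains both `"sc"` and `"SC"`, the entry that appears later overwrites the earlier one.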
@@ -52,26 +53,6 @@ def _ensemble_weight_arg(spec):
     return w
-def _placebo_paths(pre_gaps, length, block_len, n_reps, seed):
-    """Moving-block bootstrap of the (centered) pre-period residuals into placebo
-    paths of ``length`` periods. Resampling whole blocks preserves the residual
-    autocorrelation, so the resulting CI bands are more conservative than an iid
-    normal approximation. Returns an ``(n_reps, length)`` array (empty if no
-    pre-period or zero length)."""
-    g = np.asarray(pre_gaps, dtype=float)
-    m = len(g)
-    if m == 0 or length <= 0 or n_reps <= 0:
-        return np.empty((0, max(length, 0)))
-    g = g - g.mean()  # null is "no effect" → center the residuals
-    rng = np.random.default_rng(int(seed))
-    bl = max(1, min(int(block_len), m))
-    n_blocks = int(np.ceil(length / bl))
-    starts = rng.integers(0, m, size=(n_reps, n_blocks))
-    idx = (starts[:, :, None] + np.arange(bl)[None, None, :]) % m  # circular blocks
-    paths = g[idx].reshape(n_reps, n_blocks * bl)[:, :length]
-    return paths
 class _PowerReport:
     """Result of a power analysis across methods, with a report and plots."""
@@ -427,7 +408,7 @@ class GeoDesign:
                              target_power=target_power, recommended=recommended,
                              lookback=lookback, ensemble=ensemble,
                              ensemble_weights=ensemble_weights)
-        idx = self._resolve(treated)
+        idx = list(dict.fromkeys(self._resolve(treated)))  # dedup, preserve order
         names = [self.names[i] for i in idx]
         lifts = list(_DEFAULT_LIFTS if lifts is None else lifts)
         if 0.0 not in lifts:
@@ -463,7 +444,7 @@ class GeoDesign:
             if bad:
                 raise ValueError(f"treated markets were also excluded: {bad}")
             return sub.diagnose(tnames, test_len)
-        idx = self._resolve(treated)
+        idx = list(dict.fromkeys(self._resolve(treated)))  # dedup, preserve order
         names = [self.names[i] for i in idx]
         t0 = self.t - int(test_len)
         diag = _panelkit.geo_diagnostics(self.Y, idx, int(test_len))
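
Reviewer note: `dict.fromkeys` keeps the first occurrence of each key in insertion order, which is what makes it an order-preserving dedup. The toy panel below (hypothetical data) also shows why repeats matter when the treated series is an average over `Y[idx]`, as in `evaluate()`:

```python
import numpy as np

resolved = [3, 1, 3, 2, 1]                 # hypothetical resolved indices with repeats
deduped = list(dict.fromkeys(resolved))    # dict keys keep first-seen order
print(deduped)

Y = np.array([[10.0, 10.0], [20.0, 20.0]])  # toy panel: 2 markets x 2 periods
dup_mean = Y[[0, 0, 1]].mean(axis=0)        # market 0 is counted twice
clean_mean = Y[[0, 1]].mean(axis=0)
print(dup_mean, clean_mean)
```

A plain `set()` would also remove repeats but would not guarantee the caller's original market order.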
@@ -701,8 +682,7 @@ class GeoDesign:
         methods: Sequence[str] = _METHODS,
         weights="auto",
         level: float = 0.90,
-        n_boot: int = 2000,
-        block_len: int = 4,
+        max_placebo: int = 200,
         seed: int = 0,
         exclude=None,
     ) -> "_EvalReport":
@@ -711,9 +691,15 @@ class GeoDesign:
         This is the measurement counterpart to :meth:`power`: given the treated
         markets and the period treatment began (``treat_start``, the first
         post-period column), it fits SC / ASC / SDID, reports each one's effect,
-        and combines them into a weighted-average **ensemble** estimate. Each
-        estimate gets a confidence interval from a stationary block bootstrap of
-        its post-period effect path; an SC in-space placebo supplies a p-value.
+        and combines them into a weighted-average **ensemble** estimate.
+        Inference is **in-space placebo** (Abadie): every donor market is refit as
+        if it were the treated one, and the spread of *their* post-period effects
+        is the null reference. This captures out-of-sample extrapolation error —
+        the dominant source of uncertainty — so the intervals are calibrated
+        (unlike a bootstrap of the treated unit's own post-period, which only sees
+        in-sample noise and is far too narrow). Poorly-fit placebos (pre-period
+        RMSPE > 2× the treated unit's) are dropped, per Abadie.
         Parameters
         ----------
@@ -725,11 +711,13 @@ class GeoDesign:
             Which estimators to fit and blend.
         weights : "auto" | "equal" | dict
             Ensemble weighting. ``"auto"`` is inverse-variance (precision)
-            weighting from each method's bootstrap standard error.
+            weighting from each method's placebo-null spread.
         level : float
             Confidence level for the intervals (e.g. 0.90).
-        n_boot, block_len, seed :
-            Stationary-bootstrap settings for the effect-path CIs.
+        max_placebo : int
+            Cap on the number of donor placebos used (sampled if exceeded).
+        seed : int
+            Seed for placebo sampling when ``max_placebo`` is exceeded.
         Returns
         -------
@@ -745,8 +733,8 @@ class GeoDesign:
             if bad:
                 raise ValueError(f"treated markets were also excluded: {bad}")
             return sub.evaluate(tnames, treat_start, methods=methods, weights=weights,
-                                level=level, n_boot=n_boot, block_len=block_len, seed=seed)
-        idx = self._resolve(treated)
+                                level=level, max_placebo=max_placebo, seed=seed)
+        idx = list(dict.fromkeys(self._resolve(treated)))  # dedup, preserve order
         names = [self.names[i] for i in idx]
         t0 = int(treat_start)
         if not (1 <= t0 < self.t):
@@ -757,27 +745,28 @@ class GeoDesign:
         if unknown:
             raise ValueError(f"unknown methods {unknown}; choose from {_METHODS}")
-        fitters = {
-            "SC": lambda: _panelkit.fit_sc(self.Y, idx, t0, 0.0, False, level),
-            "ASC": lambda: _panelkit.fit_asc(self.Y, idx, t0, 0.0, None),
-            "SDID": lambda: _panelkit.fit_sdid(self.Y, idx, t0, 1.0),
-        }
+        def _fit(method, tr):
+            if method == "SC":
+                return _panelkit.fit_sc(self.Y, tr, t0, 0.0, False, level)
+            if method == "ASC":
+                return _panelkit.fit_asc(self.Y, tr, t0, 0.0, None)
+            return _panelkit.fit_sdid(self.Y, tr, t0, 1.0)
         treated_series = self.Y[idx].mean(axis=0)
+        post_len = self.t - t0
+        order = methods
+        # --- point estimates on the treated set ---
         per = {}
         for m in methods:
-            fit = fitters[m]()
+            fit = _fit(m, idx)
             att_path = np.asarray(fit.att_path, dtype=float)
             cf = np.asarray(fit.counterfactual, dtype=float)
             att = float(fit.att)
             cf_mean = float(np.mean(cf)) if cf.size else float("nan")
-            se, lo, hi = _panelkit.bootstrap_mean(
-                att_path.tolist(), "stationary", int(block_len), int(n_boot),
-                int(seed), float(level))
-            # Full-timeline counterfactual via donor weights (exact for SC; the
-            # dominant term for ASC/SDID). Center on the pre-period so the gap
-            # reflects FIT, not a level offset — SDID is level-agnostic (matches
-            # trends, not levels), so its donor-weighted series sits at a constant
-            # offset that would otherwise look like a non-zero pre-period.
+            # Full-timeline counterfactual via donor weights, centered on the
+            # pre-period so the gap reflects FIT, not a level offset (SDID matches
+            # trends, not levels).
             dids = np.asarray(fit.donor_ids, dtype=int)
             ws = np.asarray(fit.weights, dtype=float)
             if dids.size:
@@ -787,25 +776,40 @@ class GeoDesign:
                 full_cf = np.full(self.t, np.nan)
             per[m] = {
                 "att": att, "att_path": att_path, "counterfactual": cf,
-                "full_cf": full_cf,
-                "cf_mean": cf_mean, "lift": att / cf_mean if cf_mean else float("nan"),
-                "se": se, "att_lo": lo, "att_hi": hi,
-                "lift_lo": lo / cf_mean if cf_mean else float("nan"),
-                "lift_hi": hi / cf_mean if cf_mean else float("nan"),
+                "full_cf": full_cf, "cf_mean": cf_mean,
+                "lift": att / cf_mean if cf_mean else float("nan"),
                 "cumulative": float(att_path.sum()) * n_treated,
                 "pre_rmspe": float(fit.pre_rmspe),
             }
-        # Ensemble: weight-average the post-period effect paths, then summarize.
-        order = methods
+        # --- in-space placebo: refit each donor as if it were treated ---
+        treated_set = set(idx)
+        donors = [u for u in range(self.n) if u not in treated_set]
+        if len(donors) > int(max_placebo):
+            rng = np.random.default_rng(int(seed))
+            donors = sorted(int(j) for j in rng.choice(donors, int(max_placebo), replace=False))
+        pb = {m: [] for m in methods}        # per method: list of (att_path, pre_rmspe)
+        for j in donors:
+            for m in methods:
+                fj = _fit(m, [j])
+                pb[m].append((np.asarray(fj.att_path, dtype=float), float(fj.pre_rmspe)))
+        # --- ensemble weights ---
+        def _placebo_att_sd(m):
+            if not pb[m]:
+                return 1.0
+            vals = np.array([p.mean() for (p, _) in pb[m]])
+            return float(np.std(vals)) if len(vals) > 1 else 1.0
         if isinstance(weights, str) and weights.lower() == "equal":
             wv = [1.0 / len(order)] * len(order)
         elif isinstance(weights, str) and weights.lower() == "auto":
-            prec = [1.0 / max(per[m]["se"] ** 2, 1e-300) for m in order]
+            # inverse-variance from each method's placebo-null spread (precision)
+            prec = [1.0 / max(_placebo_att_sd(m) ** 2, 1e-300) for m in order]
             s = sum(prec)
             wv = [p / s for p in prec] if s > 0 else [1.0 / len(order)] * len(order)
         elif isinstance(weights, dict):
-            raw = [float(weights.get(m, weights.get(m.lower(), 0.0))) for m in order]
+            norm = {str(k).upper(): v for k, v in weights.items()}  # case-insensitive
+            raw = [float(norm.get(m, 0.0)) for m in order]
             s = sum(raw)
             if s <= 0:
                 raise ValueError("ensemble weights must sum to > 0")
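
Reviewer note: the `"auto"` branch above turns each method's placebo spread into a precision and normalizes. A self-contained sketch of that arithmetic, where the helper name and the toy inputs are hypothetical:

```python
import numpy as np

def inverse_variance_weights(placebo_atts_by_method):
    """Normalize 1/sd^2 of each method's placebo effects into weights.

    Hypothetical helper. Falls back to sd = 1.0 with fewer than two placebos,
    mirroring the fallback in the diff.
    """
    prec = {}
    for m, vals in placebo_atts_by_method.items():
        vals = np.asarray(vals, dtype=float)
        sd = float(np.std(vals)) if vals.size > 1 else 1.0
        prec[m] = 1.0 / max(sd ** 2, 1e-300)   # guard against a zero spread
    total = sum(prec.values())
    return {m: p / total for m, p in prec.items()}

w = inverse_variance_weights({
    "SC": [-1.0, 1.0],     # sd = 1, precision 1
    "SDID": [-2.0, 2.0],   # sd = 2, precision 0.25
})
print(w)
```

A method whose placebo effects scatter twice as widely gets a quarter of the precision, so here the weights come out at 0.8 and 0.2.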
@@ -817,71 +821,103 @@ class GeoDesign:
             s = sum(raw)
             wv = [r / s for r in raw]
         wmap = dict(zip(order, wv))
+        a = (1.0 - float(level)) / 2.0
+        def _ci(point, null_samples):
+            """Pivot CI: point estimate ± the placebo null spread (null ≈ 0).
+            Returns NaN when there are too few placebos to form an interval —
+            never a fake zero-width CI."""
+            if len(null_samples) >= 2:
+                return point + float(np.quantile(null_samples, a)), \
+                    point + float(np.quantile(null_samples, 1.0 - a))
+            return float("nan"), float("nan")
+        def _kept_att(samples, treated_pre_m):
+            """Placebo att-means after the Abadie 2x pre-fit filter (fallback to
+            all placebos if too few comparable ones survive)."""
+            keep = [p.mean() for (p, pre) in samples
+                    if treated_pre_m <= 0 or pre <= 2.0 * treated_pre_m]
+            if len(keep) < 5 and samples:
+                keep = [p.mean() for (p, _) in samples]
+            return np.array(keep)
+        # --- per-method point CIs from each method's placebo att spread (same
+        #     2x pre-fit filter as the ensemble, for internal consistency) ---
+        for m in order:
+            mp = _kept_att(pb[m], per[m]["pre_rmspe"])
+            lo, hi = _ci(per[m]["att"], mp)
+            cfm = per[m]["cf_mean"]
+            per[m]["att_lo"], per[m]["att_hi"] = lo, hi
+            per[m]["lift_lo"] = lo / cfm if cfm else float("nan")
+            per[m]["lift_hi"] = hi / cfm if cfm else float("nan")
+        # --- ensemble estimate + ensemble placebo paths (Abadie pre-fit filter) ---
         ens_path = sum(wmap[m] * per[m]["att_path"] for m in order)
         ens_cf_mean = float(sum(wmap[m] * per[m]["cf_mean"] for m in order))
         ens_att = float(ens_path.mean())
-        se, lo, hi = _panelkit.bootstrap_mean(
-            ens_path.tolist(), "stationary", int(block_len), int(n_boot),
-            int(seed), float(level))
+        treated_pre = sum(wmap[m] * per[m]["pre_rmspe"] for m in order)
+        ens_pb = []  # (path, pre_rmspe)
+        for di in range(len(donors)):
+            path = sum(wmap[m] * pb[m][di][0] for m in order)
+            pre = sum(wmap[m] * pb[m][di][1] for m in order)
+            ens_pb.append((path, pre))
+        kept = [p for (p, pre) in ens_pb if treated_pre <= 0 or pre <= 2.0 * treated_pre]
+        if len(kept) < 5:                      # too few comparable placebos → use all
+            kept = [p for (p, _) in ens_pb]
+        pb_mat = np.array(kept) if kept else np.zeros((0, post_len))
+        n_pb = pb_mat.shape[0]
+        # pointwise + cumulative + mean CIs, all from the placebo null
+        if n_pb >= 2:
+            point_lo = ens_path + np.quantile(pb_mat, a, axis=0)
+            point_hi = ens_path + np.quantile(pb_mat, 1.0 - a, axis=0)
+            point_hw = float(np.quantile(np.abs(pb_mat), float(level)))
+            cum_pb = np.cumsum(pb_mat, axis=1)
+            run = np.cumsum(ens_path)
+            cum_lo_band = np.quantile(cum_pb, a, axis=0)
+            cum_hi_band = np.quantile(cum_pb, 1.0 - a, axis=0)
+            pb_att = pb_mat.mean(axis=1)
+            p_value = float((1.0 + np.sum(np.abs(pb_att) >= abs(ens_att))) / (1.0 + n_pb))
+        else:
+            # too few comparable placebos → inference undefined (no fake band)
+            run = np.cumsum(ens_path)
+            point_lo = np.full(post_len, np.nan)
+            point_hi = np.full(post_len, np.nan)
+            point_hw = 0.0
+            cum_lo_band = cum_hi_band = np.full(post_len, np.nan)
+            pb_att = np.array([])
+            p_value = None
+        att_lo, att_hi = _ci(ens_att, pb_att)
+        cum_curve = run * n_treated
         ensemble = {
-            "att": ens_att, "att_path": ens_path, "se": se,
-            "att_lo": lo, "att_hi": hi,
+            "att": ens_att, "att_path": ens_path,
+            "att_lo": att_lo, "att_hi": att_hi,
             "lift": ens_att / ens_cf_mean if ens_cf_mean else float("nan"),
-            "lift_lo": lo / ens_cf_mean if ens_cf_mean else float("nan"),
-            "lift_hi": hi / ens_cf_mean if ens_cf_mean else float("nan"),
+            "lift_lo": att_lo / ens_cf_mean if ens_cf_mean else float("nan"),
+            "lift_hi": att_hi / ens_cf_mean if ens_cf_mean else float("nan"),
             "cumulative": float(ens_path.sum()) * n_treated,
-            "weights": wmap,
+            "weights": wmap, "n_placebo": n_pb,
+            "low_power": n_pb < 8,   # too few placebos for reliable inference
         }
-        # Significance: SC in-space placebo p-value.
-        sc = _panelkit.fit_sc(self.Y, idx, t0, 0.0, True, level)
-        p_value = sc.p_value
-        # Full-timeline ensemble counterfactual + gap path (pre-period shows fit,
-        # post-period uses the exact ensemble effect).
+        # full-timeline counterfactual + gap path (pre shows fit; post = effect)
         ens_full_cf = sum(wmap[m] * per[m]["full_cf"] for m in order)
         full_gap = treated_series - ens_full_cf
-        full_gap[t0:] = ens_path                       # exact ensemble post effect
-        counterfactual = treated_series - full_gap     # consistent everywhere
-        pre_gaps = full_gap[:t0]
-        sigma_pre = float(np.std(pre_gaps, ddof=1)) if t0 > 1 else float(np.std(pre_gaps))
-        # CI bands from a MOVING-BLOCK BOOTSTRAP of the pre-period residuals.
-        # Blocks preserve autocorrelation, so the bands are more conservative than
-        # an iid normal approximation — especially the cumulative band, whose
-        # spread grows faster than sqrt(k) under positive autocorrelation.
-        post_len = self.t - t0
-        a = (1.0 - float(level)) / 2.0
-        paths = _placebo_paths(pre_gaps, post_len, int(block_len), int(n_boot), int(seed))
-        if paths.size:
-            point_lo = np.quantile(paths, a, axis=0)
-            point_hi = np.quantile(paths, 1.0 - a, axis=0)
-            point_hw = float(np.quantile(np.abs(paths), float(level)))  # symmetric, full-timeline
-            cum_paths = np.cumsum(paths, axis=1)
-            cum_band_lo = np.quantile(cum_paths, a, axis=0)
-            cum_band_hi = np.quantile(cum_paths, 1.0 - a, axis=0)
-        else:
-            point_lo = point_hi = np.zeros(post_len)
-            point_hw = 0.0
-            cum_band_lo = cum_band_hi = np.zeros(post_len)
-        ens_post = ens_path
-        run = np.cumsum(ens_post)
-        cum_curve = run * n_treated
-        cum_lo_curve = (run + cum_band_lo) * n_treated
-        cum_hi_curve = (run + cum_band_hi) * n_treated
-        ensemble["sigma_pre"] = sigma_pre
+        full_gap[t0:] = ens_path
+        counterfactual = treated_series - full_gap
         ensemble["full_gap"] = full_gap
-        ensemble["point_hw"] = point_hw                       # constant pointwise half-width
-        ensemble["point_lo"] = ens_post + point_lo            # per-period CI on the effect
-        ensemble["point_hi"] = ens_post + point_hi
-        ensemble["cum_curve"] = cum_curve                     # cumulative incremental path
-        ensemble["cum_lo_curve"] = cum_lo_curve
-        ensemble["cum_hi_curve"] = cum_hi_curve
-        ensemble["cum_lo"] = float(cum_lo_curve[-1]) if post_len else float("nan")
-        ensemble["cum_hi"] = float(cum_hi_curve[-1]) if post_len else float("nan")
+        ensemble["sigma_pre"] = (float(np.std(full_gap[:t0], ddof=1)) if t0 > 1
+                                 else float(np.std(full_gap[:t0])))
+        ensemble["point_hw"] = point_hw
+        ensemble["point_lo"] = point_lo
+        ensemble["point_hi"] = point_hi
+        ensemble["cum_curve"] = cum_curve
+        ensemble["cum_lo_curve"] = (run + cum_lo_band) * n_treated
+        ensemble["cum_hi_curve"] = (run + cum_hi_band) * n_treated
+        ensemble["cum_lo"] = float(ensemble["cum_lo_curve"][-1]) if post_len else float("nan")
+        ensemble["cum_hi"] = float(ensemble["cum_hi_curve"][-1]) if post_len else float("nan")
         return _EvalReport(names, t0, n_treated, per, ensemble, p_value, level,
                            treated_series, counterfactual)
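
Reviewer note: the band construction in this hunk shifts placebo-null quantiles by the point estimate, per period and on cumulative sums. The sketch below reproduces that idea on synthetic data; `placebo_bands` and the toy null model (a persistent per-donor offset plus noise) are assumptions made for illustration.

```python
import numpy as np

def placebo_bands(effect_path, placebo_paths, level=0.90):
    """Pointwise and cumulative bands: estimate plus placebo-null quantiles.

    Hypothetical helper. `placebo_paths` has shape (n_placebos, post_len).
    """
    a = (1.0 - level) / 2.0
    path = np.asarray(effect_path, dtype=float)
    pb = np.asarray(placebo_paths, dtype=float)
    point_lo = path + np.quantile(pb, a, axis=0)
    point_hi = path + np.quantile(pb, 1.0 - a, axis=0)
    cum_pb = np.cumsum(pb, axis=1)            # each placebo's running total
    run = np.cumsum(path)                     # running total of the estimate
    cum_lo = run + np.quantile(cum_pb, a, axis=0)
    cum_hi = run + np.quantile(cum_pb, 1.0 - a, axis=0)
    return point_lo, point_hi, cum_lo, cum_hi

rng = np.random.default_rng(0)
# Toy null: each donor carries a persistent error, so cumulative sums drift apart.
offsets = rng.normal(0.0, 1.0, size=(200, 1))
placebos = offsets + rng.normal(0.0, 0.2, size=(200, 6))
lo, hi, clo, chi = placebo_bands(np.full(6, 3.0), placebos)
print((hi - lo).round(2))     # pointwise band widths
print((chi - clo).round(2))   # cumulative band widths
```

At the first horizon the cumulative band coincides with the pointwise one; after that it widens as the placebo running totals spread out, which is the fan-out the GUIDE text describes.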
@@ -981,11 +1017,14 @@ class _MultiCellReport:
                      f"({', '.join(map(str, self.cells))})")
         lines.append(f"Test duration     : {self.test_len} periods")
         lines.append(f"Shared donor pool : {len(self.donor_names)} markets")
-        lines.append(f"Combined holdout  : {100*self.pooled_holdout:.1f}% of total volume")
+        lines.append(f"Combined holdout  : {100*self.pooled_holdout:.1f}% of total volume "
+                     f"(all cells together)")
         lines.append(f"Powered at {int(100*self.target_power)}% power, "
                      f"{int(100*(1-self.alpha))}% confidence "
                      f"(each cell vs. the shared pool).")
         lines.append("")
+        # Per-cell 'Holdout' is that cell's share of its OWN sub-panel (cell +
+        # shared donors); the Combined holdout above is over the full panel.
         lines.append(f"{'Cell':<14}{'Markets':<28}{'MDE':>8}{'Conf':>7}{'Holdout':>9}")
         lines.append("-" * 64)
         for label, rep in self.cells.items():
@@ -1050,8 +1089,11 @@ class _EvalReport:
     @property
     def significant(self):
-        """True if the ensemble CI excludes zero (effect detected)."""
+        """True if the ensemble CI is well-defined and excludes zero. Returns
+        False when inference is undefined (too few placebos → NaN interval)."""
         lo, hi = self.ensemble["att_lo"], self.ensemble["att_hi"]
+        if not (np.isfinite(lo) and np.isfinite(hi)):
+            return False
         return (lo > 0) or (hi < 0)
     def summary(self) -> str:
@@ -1073,17 +1115,25 @@ class _EvalReport:
         lines.append(f"   ensemble weights: {wstr}")
         lines.append("")
         if self.p_value is not None:
-            lines.append(f"SC in-space placebo p-value : {self.p_value:.3f}")
-        verdict = ("✓ Significant lift — the ensemble interval excludes zero."
-                   if self.significant else
-                   "~ Not distinguishable from zero at this level — the ensemble "
-                   "interval includes zero.")
+            lines.append(f"In-space placebo p-value    : {self.p_value:.3f}  "
+                         f"(ensemble, {e.get('n_placebo', 0)} donors)")
+        if e.get("low_power"):
+            lines.append("⚠ Few comparable donors — inference is low-powered; treat "
+                         "intervals/p-value with caution.")
+        if self.significant:
+            verdict = "✓ Significant lift — the ensemble interval excludes zero."
+        elif not (np.isfinite(e["att_lo"]) and np.isfinite(e["att_hi"])):
+            verdict = ("? Inference undefined — too few comparable donor placebos "
+                       "to form an interval.")
+        else:
+            verdict = ("~ Not distinguishable from zero at this level — the ensemble "
+                       "interval includes zero.")
         lines.append(f"Headline (ensemble)         : {100*e['lift']:+.2f}% lift, "
                      f"{e['cumulative']:,.0f} cumulative incremental")
         if "cum_lo" in e:
             lines.append(f"Cumulative {cl}% CI          : "
                          f"[{e['cum_lo']:,.0f}, {e['cum_hi']:,.0f}]  "
-                         f"(moving-block bootstrap, block_len-aware)")
+                         f"(in-space placebo, {e.get('n_placebo', 0)} donors)")
         lines.append(verdict)
         lines.append("=" * 66)
         return "\n".join(lines)
@@ -1569,7 +1619,7 @@ def _plot_eval(rep: "_EvalReport", path):
     axc.set_title("Lift by method", fontweight="bold")
     axc.grid(True, axis="x", alpha=0.25)
-    pv = f"  ·  SC placebo p={rep.p_value:.3f}" if rep.p_value is not None else ""
+    pv = f"  ·  placebo p={rep.p_value:.3f}" if rep.p_value is not None else ""
     verdict = "significant" if rep.significant else "not significant"
     fig.suptitle(f"panelkit · test evaluation — ensemble lift "
                  f"{100*rep.ensemble['lift']:+.2f}% ({verdict}){pv}",
@@ -1582,10 +1632,10 @@ def _plot_eval(rep: "_EvalReport", path):
 def _plot_eval_timeline(rep: "_EvalReport", path):
     """Pointwise + cumulative effect over the full timeline, with CI bands.
-    Bands come from a moving-block bootstrap of the pre-period residuals (so they
-    capture autocorrelation): the pointwise band is the per-period placebo spread
-    around the estimate; the cumulative band grows with horizon as the bootstrap
-    placebo cumulative-sums spread out."""
+    Bands come from the in-space placebo distribution (every donor refit as if
+    treated): the pointwise band is the per-period placebo spread around the
+    estimate; the cumulative band grows with horizon as the placebo
+    cumulative-sums spread out."""
     _, plt = _require_mpl()
     import numpy as _np
     from matplotlib.gridspec import GridSpec
@@ -1632,7 +1682,7 @@ def _plot_eval_timeline(rep: "_EvalReport", path):
     cum = e["cum_curve"]
     axc.axvspan(-0.5, t0 - 0.5, color="#f3f4f6", alpha=0.8)
     axc.fill_between(seg, e["cum_lo_curve"], e["cum_hi_curve"], color=_PK_GREEN,
-                     alpha=0.15, label=f"{cl}% band (block bootstrap)")
+                     alpha=0.15, label=f"{cl}% band (in-space placebo)")
     axc.plot(seg, cum, color=_PK_GREEN, lw=2.4, label="cumulative incremental")
     axc.axhline(0, color="#111827", lw=1.0)
     axc.axvline(t0 - 0.5, color="#374151", lw=1.2, ls=":")
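
Note on the hunks above: the reworked `significant` property and the three-way verdict in `summary()` both depend on checking that the interval bounds are finite before comparing them with zero. The following is a minimal standalone sketch of that branching, not the package's code. The function names `is_significant` and `verdict` are hypothetical, and `math.isfinite` stands in for the `np.isfinite` call used in the diff.

```python
import math


def is_significant(lo: float, hi: float) -> bool:
    """Return True only if [lo, hi] is finite and excludes zero.

    Hypothetical helper mirroring the guarded check in the diff above.
    """
    # A NaN or infinite bound means the interval is undefined, so report False.
    if not (math.isfinite(lo) and math.isfinite(hi)):
        return False
    return lo > 0 or hi < 0


def verdict(lo: float, hi: float) -> str:
    """Three-way outcome: significant, undefined, or not significant."""
    if is_significant(lo, hi):
        return "significant"
    if not (math.isfinite(lo) and math.isfinite(hi)):
        return "undefined"
    return "not significant"


if __name__ == "__main__":
    nan = float("nan")
    print(verdict(0.1, 0.5))    # interval above zero
    print(verdict(-0.2, 0.3))   # interval straddles zero
    print(verdict(nan, -1.0))   # one bound undefined
```

The finiteness guard matters most when only one bound is undefined: with `lo = nan` and `hi = -1.0`, an unguarded `(lo > 0) or (hi < 0)` would evaluate to True, while the guarded version reports the interval as undefined.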

{panelkit-0.2.3 → panelkit-0.2.5}/BENCHMARKS.md RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/LICENSE-APACHE RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/LICENSE-MIT RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/Cargo.toml RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/benches/estimators.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/did/bacon.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/did/callaway.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/did/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/did/sunab.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/did/twfe.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/fe/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/fe/within.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/lib.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/mcnnm/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/mcnnm/softimpute.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/panel.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/result.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/sc/augmented.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/sc/cpasc.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/sc/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/src/sc/synthetic.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/tests/cpasc.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/tests/did.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/tests/sc.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/estimators/tests/sc_family.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/Cargo.toml RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/src/diagnostics.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/src/lib.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/src/power.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/src/selection.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/src/types.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/geo/tests/geo.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/Cargo.toml RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/src/batch.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/src/bootstrap.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/src/ci.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/src/lib.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/src/parallel.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/src/placebo.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/inference/tests/inference.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/Cargo.toml RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/error.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/cholesky.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/eig_sym.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/qr.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/randomized.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/svd.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/factor/svd_gram.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/lib.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/matrix.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/ops/matmul.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/ops/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/ops/norms.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/ops/transform.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/opt/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/opt/softthresh.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/rng.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/solve/lstsq.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/solve/mod.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/src/solve/spd.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/linalg/tests/numerics.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/Cargo.toml RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/src/api_did.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/src/api_geo.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/src/convert.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/src/lib.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/crates/pypanelkit/src/results.rs RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/python/panelkit/__init__.py RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/python/panelkit/estimators.py RENAMED Viewed

File without changes

{panelkit-0.2.3 → panelkit-0.2.5}/python/panelkit/py.typed RENAMED Viewed

File without changes
