PyPI - proscore - Versions diffs - 0.2.1__tar.gz → 0.2.2__tar.gz - Mend

proscore 0.2.1tar.gz → 0.2.2tar.gz

Files changed (73)

{proscore-0.2.1/src/proscore.egg-info → proscore-0.2.2}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: proscore
-Version: 0.2.1
+Version: 0.2.2
 Summary: Production-grade scorecard development toolkit
 Author: Liqiwei
 License-Expression: MIT
@@ -45,10 +45,47 @@ Dynamic: license-file
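 **Production-grade scorecard development toolkit**
 An end-to-end, deterministic scorecard modeling pipeline designed for credit scorecard modeling at banks and financial institutions, meeting their requirements for interpretability, compliance, and stability.
+## Why ProScore
+ProScore is not a general-purpose machine learning framework; it is an **engineering toolkit** for putting financial scorecards into production.
+The goal is to move from "can build a model" to "reviewable, reproducible, deployable, and monitorable".
+It fits the following scenarios:
+- Bank, consumer finance, and internet finance teams developing and iterating on credit scorecards
+- Developers and business analysts who need to model collaboratively through Python + Excel
+- Teams that must produce regulatory/review materials and establish a post-deployment monitoring loop
+## Key Highlights
+1. **Engineered monotonicity (the key differentiator)**
+   - Per-variable monotonic direction configuration (increasing/decreasing/u/inverted_u/none)
+   - Automatic monotonic adjustment, reducing repeated manual re-binning
+   - Monotonicity configuration can be templated and reused, keeping projects consistent
+2. **End-to-end deterministic workflow**
+   - `detect -> prefilter -> bin -> refine -> transform -> select -> fit -> evaluate -> diagnose -> report -> monitor`
+   - The same input yields the same output, which eases auditing, post-mortems, and team collaboration
+3. **Three usage modes with consistent definitions**
+   - Modular API (flexible)
+   - Chained API (efficient)
+   - Excel configuration-driven (zero code)
+   - All three entry points share the same modeling logic, reducing "inconsistent definitions"
+4. **Integrated diagnostics and reporting**
+   - `diagnose()` provides 4 layers of structured diagnostics (discriminatory power/overfitting/stability/variable quality)
+   - Custom thresholds are supported (`thresholds=...`)
+   - `ReportBuilder` automatically includes the diagnostics chapter, speeding up review
+5. **Post-deployment monitoring loop**
+   - Supports PSI, KS decay, rule-based alerts, and per-period tracking
+   - Helps establish a continuous "deploy, monitor, retrain" operating cycle
 ---
 ## Table of Contents
+- [Getting Started Tutorials (Notebook)](#入门教程notebook)
 - [Three Usage Modes](#三种使用方式)
 - [Core Feature Overview](#核心功能概览)
 - [Installation](#安装)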
@@ -57,6 +94,17 @@ Dynamic: license-file
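 ---
+## Getting Started Tutorials (Notebook)
+We recommend reading in the order below, getting things running first and going deeper afterwards:
+| Notebook | Audience | What you get |
+|----------|--------|--------------|
+| [**ProScore快速开始** (Quick Start)](notebooks/ProScore快速开始.ipynb) | First-time users | A 5–10 minute single chained path covering only KS/AUC/PSI, the selected model variables, and the diagnostics summary |
+| [**ProScore完整建模流程** (Full Modeling Workflow)](notebooks/ProScore完整建模流程.ipynb) | Teams preparing for production | Modular vs. chained comparison, CFG as the single source of truth for parameters, rule mining, monitoring, reporting, diagnostics |
+> The quick start is deliberately minimal (it omits optional steps such as rule mining); the full version is the authoritative example, with `[主线]` (main track) / `[可选]` (optional) chapter navigation and consistency assertions.
 ## Three Usage Modes
 ProScore offers three progressive usage modes, from zero code to fully custom; choose as needed.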
@@ -106,9 +154,11 @@ p = (
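 > `train` is required; `test` and `oot` are optional. Binning/WOE is fitted on train only; stepwise regression uses test to monitor overfitting; OOT is used only for final evaluation.
 >
-> For the full tutorial, see [notebooks/ProScore完整建模流程.ipynb](notebooks/ProScore完整建模流程.ipynb)
+> For notebook tutorials, see [Getting Started Tutorials](#入门教程notebook) above.
+>
+> **Enhanced diagnostics** (v0.2+): `.evaluate().diagnose()` generates a 4-layer structured health report (including root-cause variables) and supports custom thresholds via `thresholds=...`.
 >
-> **Enhanced diagnostics** (v0.2+): `.evaluate().diagnose()` generates a 4-layer structured health report (including root-cause variables) and supports custom thresholds via `thresholds=...`, adapting to the risk preferences of different institutions/products.
+> **Single source of truth for parameters (recommended)**: `CFG` + `PipelineSpec` (`apply(spec)`) ensure that the modular and chained APIs produce the same results from the same parameters; see [pipeline-spec.md](docs/使用指南/pipeline-spec.md) for details.
 ### C. Excel configuration-driven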
@@ -140,6 +190,7 @@ proscore run my_project/pipeline_template.xlsx --output-script run.py
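 |------------|-----------------------------------------------|---------------------------------------|
 | Data exploration | IV/AUC/KS metrics + PSI time-series stability + correlation/VIF | Quickly screen for strong variables and identify distribution drift risk |
 | Binning | 4 monotonic trends + 5 binning methods + two-stage trend validation | Ensure WOE trends match business logic and satisfy regulators |
+| Rule mining | Single-variable/cross rules + joint Lift/Precision/Recall filtering | Produce interpretable strategy rules, mutually exclusive with scorecard variables |
 | Stepwise regression | Bidirectional selection + five constraints (p-value/sign/VIF/correlation/source) | Rigorous multicollinearity control and dimension attribution management |
 | Model monitoring | Score/Feature PSI + rule-engine alerts + JSON persistence | Continuous post-deployment validation with automatic risk alerts |
 | Report generation | Automatic 7-chapter Markdown report (with charts) | One-click generation of CBIRC compliance documentation |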

{proscore-0.2.1 → proscore-0.2.2}/README.md RENAMED Viewed

@@ -7,10 +7,47 @@
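 **Production-grade scorecard development toolkit**
 An end-to-end, deterministic scorecard modeling pipeline designed for credit scorecard modeling at banks and financial institutions, meeting their requirements for interpretability, compliance, and stability.
+## Why ProScore
+ProScore is not a general-purpose machine learning framework; it is an **engineering toolkit** for putting financial scorecards into production.
+The goal is to move from "can build a model" to "reviewable, reproducible, deployable, and monitorable".
+It fits the following scenarios:
+- Bank, consumer finance, and internet finance teams developing and iterating on credit scorecards
+- Developers and business analysts who need to model collaboratively through Python + Excel
+- Teams that must produce regulatory/review materials and establish a post-deployment monitoring loop
+## Key Highlights
+1. **Engineered monotonicity (the key differentiator)**
+   - Per-variable monotonic direction configuration (increasing/decreasing/u/inverted_u/none)
+   - Automatic monotonic adjustment, reducing repeated manual re-binning
+   - Monotonicity configuration can be templated and reused, keeping projects consistent
+2. **End-to-end deterministic workflow**
+   - `detect -> prefilter -> bin -> refine -> transform -> select -> fit -> evaluate -> diagnose -> report -> monitor`
+   - The same input yields the same output, which eases auditing, post-mortems, and team collaboration
+3. **Three usage modes with consistent definitions**
+   - Modular API (flexible)
+   - Chained API (efficient)
+   - Excel configuration-driven (zero code)
+   - All three entry points share the same modeling logic, reducing "inconsistent definitions"
+4. **Integrated diagnostics and reporting**
+   - `diagnose()` provides 4 layers of structured diagnostics (discriminatory power/overfitting/stability/variable quality)
+   - Custom thresholds are supported (`thresholds=...`)
+   - `ReportBuilder` automatically includes the diagnostics chapter, speeding up review
+5. **Post-deployment monitoring loop**
+   - Supports PSI, KS decay, rule-based alerts, and per-period tracking
+   - Helps establish a continuous "deploy, monitor, retrain" operating cycle
 ---
 ## Table of Contents
+- [Getting Started Tutorials (Notebook)](#入门教程notebook)
 - [Three Usage Modes](#三种使用方式)
 - [Core Feature Overview](#核心功能概览)
 - [Installation](#安装)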
@@ -19,6 +56,17 @@
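 ---
+## Getting Started Tutorials (Notebook)
+We recommend reading in the order below, getting things running first and going deeper afterwards:
+| Notebook | Audience | What you get |
+|----------|--------|--------------|
+| [**ProScore快速开始** (Quick Start)](notebooks/ProScore快速开始.ipynb) | First-time users | A 5–10 minute single chained path covering only KS/AUC/PSI, the selected model variables, and the diagnostics summary |
+| [**ProScore完整建模流程** (Full Modeling Workflow)](notebooks/ProScore完整建模流程.ipynb) | Teams preparing for production | Modular vs. chained comparison, CFG as the single source of truth for parameters, rule mining, monitoring, reporting, diagnostics |
+> The quick start is deliberately minimal (it omits optional steps such as rule mining); the full version is the authoritative example, with `[主线]` (main track) / `[可选]` (optional) chapter navigation and consistency assertions.
 ## Three Usage Modes
 ProScore offers three progressive usage modes, from zero code to fully custom; choose as needed.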
@@ -68,9 +116,11 @@ p = (
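 > `train` is required; `test` and `oot` are optional. Binning/WOE is fitted on train only; stepwise regression uses test to monitor overfitting; OOT is used only for final evaluation.
 >
-> For the full tutorial, see [notebooks/ProScore完整建模流程.ipynb](notebooks/ProScore完整建模流程.ipynb)
+> For notebook tutorials, see [Getting Started Tutorials](#入门教程notebook) above.
+>
+> **Enhanced diagnostics** (v0.2+): `.evaluate().diagnose()` generates a 4-layer structured health report (including root-cause variables) and supports custom thresholds via `thresholds=...`.
 >
-> **Enhanced diagnostics** (v0.2+): `.evaluate().diagnose()` generates a 4-layer structured health report (including root-cause variables) and supports custom thresholds via `thresholds=...`, adapting to the risk preferences of different institutions/products.
+> **Single source of truth for parameters (recommended)**: `CFG` + `PipelineSpec` (`apply(spec)`) ensure that the modular and chained APIs produce the same results from the same parameters; see [pipeline-spec.md](docs/使用指南/pipeline-spec.md) for details.
 ### C. Excel configuration-driven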
@@ -102,6 +152,7 @@ proscore run my_project/pipeline_template.xlsx --output-script run.py
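 |------------|-----------------------------------------------|---------------------------------------|
 | Data exploration | IV/AUC/KS metrics + PSI time-series stability + correlation/VIF | Quickly screen for strong variables and identify distribution drift risk |
 | Binning | 4 monotonic trends + 5 binning methods + two-stage trend validation | Ensure WOE trends match business logic and satisfy regulators |
+| Rule mining | Single-variable/cross rules + joint Lift/Precision/Recall filtering | Produce interpretable strategy rules, mutually exclusive with scorecard variables |
 | Stepwise regression | Bidirectional selection + five constraints (p-value/sign/VIF/correlation/source) | Rigorous multicollinearity control and dimension attribution management |
 | Model monitoring | Score/Feature PSI + rule-engine alerts + JSON persistence | Continuous post-deployment validation with automatic risk alerts |
 | Report generation | Automatic 7-chapter Markdown report (with charts) | One-click generation of CBIRC compliance documentation |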

{proscore-0.2.1 → proscore-0.2.2}/pyproject.toml RENAMED Viewed

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "proscore"
-version = "0.2.1"
+version = "0.2.2"
 description = "Production-grade scorecard development toolkit"
 readme = "README.md"
 license = "MIT"

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/__init__.py RENAMED Viewed

@@ -19,7 +19,7 @@ from proscore.rules import RuleMiner
 from proscore.selection import Filter, StepwiseSelector, assess_screen
 from proscore.transform import WOETransformer
-__version__ = "0.2.1"
+__version__ = "0.2.2"
 class ProScore:

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/_pipeline_config.py RENAMED Viewed

@@ -157,6 +157,10 @@ _PARAM_SPEC = {
                           "决策树最大深度（tree 模式）", "rules"),
     "rm_min_lift": (3.0, None, "float", 1.0, 10.0,
                     "最小 Lift（precision / 整体坏账率）", "rules"),
+    "rm_min_precision": (None, None, "float", 0.0, 1.0,
+                         "最小 Precision（留空表示不启用）", "rules"),
+    "rm_min_recall": (None, None, "float", 0.0, 1.0,
+                      "最小 Recall（留空表示不启用）", "rules"),
     "rm_min_hit_rate": (0.02, None, "float", 0.001, 0.5,
                         "最小命中率（覆盖样本占比）", "rules"),
     "rm_max_hit_rate": (0.20, None, "float", 0.01, 0.8,
@@ -364,7 +368,8 @@ class PipelineConfig:
             # ── Rules section: bare Excel keys → rm_ prefixed _PARAM_SPEC keys ──
             if section == "rules":
                 valid = ("method", "max_depth", "max_tree_depth",
-                         "min_lift", "min_hit_rate", "max_hit_rate",
+                         "min_lift", "min_precision", "min_recall",
+                         "min_hit_rate", "max_hit_rate",
                          "max_rules", "random_state", "export_csv")
                 # Accept both bare keys and legacy rm_ prefixed keys
                 bare_key = key.removeprefix("rm_")
@@ -872,7 +877,8 @@ class PipelineConfig:
         kw: dict[str, Any] = {}
         cfg = self.rules_cfg
         for key in ("method", "max_depth", "max_tree_depth", "min_lift",
-                     "min_hit_rate", "max_hit_rate", "max_rules", "random_state"):
+                     "min_precision", "min_recall", "min_hit_rate",
+                     "max_hit_rate", "max_rules", "random_state"):
             if key in cfg:
                 kw[key] = cfg[key]
         return kw
@@ -1188,7 +1194,8 @@ def generate_template(out_dir: str = ".") -> str:
         # ── Rules ───────────────────────────────────────────────────────────
         _write_params_sheet(writer, "Rules",
-                            ["method", "max_depth", "max_tree_depth", "min_lift", "min_hit_rate",
+                            ["method", "max_depth", "max_tree_depth", "min_lift",
+                             "min_precision", "min_recall", "min_hit_rate",
                              "max_hit_rate", "max_rules", "random_state", "export_csv"],
                             section="rules")

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/evaluate/_diagnose.py RENAMED Viewed

@@ -381,8 +381,8 @@ def _oot_decay_suggestion(
     period_eval: pd.DataFrame | None,
 ) -> str:
     parts = []
-    if stability is not None and "stability" in stability.columns:
-        unstable = stability[stability["stability"].isin(["unstable", "trending_down"])]
+    if stability is not None and "psi_flag" in stability.columns:
+        unstable = stability[stability["psi_flag"] == "unstable"]
         if len(unstable) > 0:
             u_vars = unstable["variable"].unique()[:3]
             parts.append(f"不稳定变量: {', '.join(u_vars)}")
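The hunk above switches the suggestion logic from a merged `stability` label to the PSI-only `psi_flag` column. As a rough, pandas-free sketch of the same selection, assuming the stability result is available as a list of row dictionaries (the function name `unstable_variables` and the sample rows are made up for illustration):

```python
def unstable_variables(stability_rows, limit=3):
    """Return up to `limit` distinct variables that have an 'unstable' psi_flag.

    Order of first appearance is kept, mirroring unique()[:3] in the diff.
    """
    seen = []
    for row in stability_rows:
        if row.get("psi_flag") == "unstable" and row["variable"] not in seen:
            seen.append(row["variable"])
    return seen[:limit]


# Hypothetical rows shaped like the long-form stability output.
rows = [
    {"variable": "income", "time_period": "2024-02", "psi_flag": "unstable"},
    {"variable": "income", "time_period": "2024-03", "psi_flag": "unstable"},
    {"variable": "debt_ratio", "time_period": "2024-02", "psi_flag": "stable"},
    {"variable": "age", "time_period": "2024-03", "psi_flag": "unstable"},
]

parts = []
flagged = unstable_variables(rows)
if flagged:
    parts.append(f"Unstable variables: {', '.join(flagged)}")
print(parts)
```

Rows without a `psi_flag` key are ignored, which matches the guard on the column being present before filtering.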

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/inspect/__init__.py RENAMED Viewed

@@ -1,9 +1,9 @@
 from proscore.inspect._correlation import correlation, vif
 from proscore.inspect._detect import detect
 from proscore.inspect._quality import list_supported_estimators, quality
-from proscore.inspect._stability import stability, stability_summary
+from proscore.inspect._stability import period_bad_rate, stability, stability_summary
 __all__ = [
     "correlation", "detect", "list_supported_estimators",
-    "quality", "stability", "stability_summary", "vif",
+    "quality", "stability", "stability_summary", "period_bad_rate", "vif",
 ]

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/inspect/_stability.py RENAMED Viewed

@@ -1,4 +1,8 @@
-"""Time-series variable stability analysis: bad_rate trend, PSI drift."""
+"""Time-series stability analysis.
+- ``stability``: variable-level distribution stability (PSI only)
+- ``period_bad_rate``: portfolio-level target bad-rate trend by period
+"""
 from __future__ import annotations
@@ -15,20 +19,15 @@ def stability(
     time_col: str,
     features: list[str] | None = None,
     n_bins: int = 5,
-    bad_rate_trend_threshold: float = 0.5,
     psi_warn_threshold: float = 0.1,
 ) -> pd.DataFrame:
     """
-    Time-series stability analysis for each feature.
-    For each time period, computes sample count, bad rate, distribution PSI
-    (vs first period and vs previous period), and **two independent flags**:
+    Variable-level time-series stability analysis.
-    - ``psi_flag``: distribution drift vs the first period (PSI).
-    - ``bad_rate_flag``: bad-rate trend vs the first period (relative change).
-    ``bad_rate_change`` is the relative change from the first period:
-    ``(bad_rate[t] - bad_rate[0]) / bad_rate[0]``.
+    This function focuses on **feature distribution drift only**:
+    PSI (vs first period / previous period) and PSI-based stability flag.
+    Portfolio-level target bad-rate trend is intentionally separated into
+    :func:`period_bad_rate`.
     Parameters
     ----------
@@ -45,19 +44,14 @@ def stability(
         (excludes *target* and *time_col*).
     n_bins : int
         Number of equal-frequency bins for continuous PSI calculation.
-    bad_rate_trend_threshold : float
-        Relative change in bad rate that triggers a ``"trending"`` flag on
-        ``bad_rate_flag``.  For example, 0.5 means a 50% increase or decrease
-        from the first period is flagged.
     psi_warn_threshold : float
         PSI threshold above which ``psi_flag`` is set to ``"unstable"``.
     Returns
     -------
     pd.DataFrame
-        Columns: ``variable | time_period | n | bad_rate |
-        bad_rate_change | psi_vs_first | psi_vs_prev | mean | std |
-        psi_flag | bad_rate_flag``.
+        Columns: ``variable | time_period | n | psi_vs_first | psi_vs_prev |
+        mean | std | psi_flag``.
     """
     if target not in df.columns:
         raise KeyError(f"target column {target!r} not in DataFrame")
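The docstring above says `psi_flag` becomes `"unstable"` when PSI versus the first period exceeds `psi_warn_threshold`. The sketch below shows the standard PSI formula on two binned distributions and a threshold flag. It is not the package's `_psi_flag` helper, whose body is not shown in this diff; the `eps` clipping and the function names are assumptions.

```python
import math


def psi(base_dist, cur_dist, eps=1e-6):
    """Population Stability Index between two binned distributions.

    Both inputs are bin proportions over the same bins. Proportions are
    clipped at `eps` (an assumption here) to avoid log(0) and division by zero.
    """
    total = 0.0
    for b, c in zip(base_dist, cur_dist):
        b = max(b, eps)
        c = max(c, eps)
        total += (c - b) * math.log(c / b)
    return total


def psi_flag(psi_value, warn_threshold=0.1):
    """Flag a non-base period: 'unstable' above the threshold, else 'stable'."""
    return "unstable" if psi_value > warn_threshold else "stable"


base = [0.2, 0.2, 0.2, 0.2, 0.2]          # first-period bin proportions
shifted = [0.1, 0.15, 0.2, 0.25, 0.3]     # a later period that has drifted

value = psi(base, shifted)
print(round(value, 4), psi_flag(value))
```

With the default threshold of 0.1, the drifted distribution above lands at roughly 0.135 and is flagged `unstable`, while an identical distribution gives a PSI of exactly zero.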
@@ -89,28 +83,13 @@ def stability(
         base_bins = _distribution_bins(base_data, cat, n_bins, all_cats)
         base_dist = _distribution(base_data, base_bins, cat)
-        first_bad_rate = None
         prev_dist = None
         for p_idx, period in enumerate(periods):
             mask = df[time_col] == period
-            # Exclude rows where target is NaN for bad-rate calculation
-            sub = df.loc[mask, [target, col]].dropna(subset=[target])
-            sub_target = sub[target]
             sub_data = series[mask].dropna()  # full data for PSI/distribution
-            n = len(sub_target)
-            bad = sub_target.sum()
-            bad_rate = bad / n if n > 0 else np.nan
-            if p_idx == 0:
-                first_bad_rate = bad_rate
-            # Bad rate change vs first period
-            if first_bad_rate is not None and first_bad_rate > 0 and not np.isnan(bad_rate):
-                br_change = (bad_rate - first_bad_rate) / first_bad_rate
-            else:
-                br_change = np.nan
+            n = len(sub_data)
             # PSI
             cur_dist = _distribution(sub_data, base_bins, cat)
@@ -129,16 +108,11 @@ def stability(
                 "variable": col,
                 "time_period": period,
                 "n": int(n),
-                "bad_rate": round(bad_rate, 6) if not np.isnan(bad_rate) else np.nan,
-                "bad_rate_change": round(br_change, 4) if not np.isnan(br_change) else np.nan,
                 "psi_vs_first": round(psi_first, 6),
                 "psi_vs_prev": round(psi_prev, 6) if not np.isnan(psi_prev) else np.nan,
                 "mean": round(float(mean_val), 4) if not np.isnan(mean_val) else np.nan,
                 "std": round(float(std_val), 4) if not np.isnan(std_val) else np.nan,
                 "psi_flag": _psi_flag(p_idx, psi_first, psi_warn_threshold),
-                "bad_rate_flag": _bad_rate_flag(
-                    p_idx, br_change, bad_rate_trend_threshold,
-                ),
             })
             prev_dist = cur_dist
@@ -153,7 +127,7 @@ def stability(
 def stability_summary(
     stability_result: pd.DataFrame,
     *,
-    metric: str = "bad_rate",
+    metric: str = "psi_vs_first",
 ) -> pd.DataFrame:
     """
     Pivot long-form :func:`stability` output to one row per variable.
@@ -163,14 +137,14 @@ def stability_summary(
     stability_result : pd.DataFrame
         Output of :func:`stability`.
     metric : str
-        Column to pivot: ``"bad_rate"``, ``"psi_vs_first"``, or
-        ``"bad_rate_change"``.
+        Column to pivot: ``"psi_vs_first"``, ``"psi_vs_prev"``,
+        ``"mean"``, ``"std"``, or ``"n"``.
     Returns
     -------
     pd.DataFrame
-        Index ``variable``; columns are time periods; extra columns
-        ``latest_psi_flag`` and ``latest_bad_rate_flag`` from the last period.
+        Index ``variable``; columns are time periods; extra column
+        ``latest_psi_flag`` from the last period.
     """
     if metric not in stability_result.columns:
         raise KeyError(f"metric {metric!r} not in stability result columns")
@@ -189,10 +163,78 @@ def stability_summary(
         .last()
     )
     wide["latest_psi_flag"] = latest["psi_flag"].reindex(wide.index)
-    wide["latest_bad_rate_flag"] = latest["bad_rate_flag"].reindex(wide.index)
     return wide.reset_index()
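After this change `stability_summary` pivots `psi_vs_first` by default and carries only `latest_psi_flag`. A dictionary-based sketch of that pivot is shown below, assuming long-form rows as plain dictionaries. The function name `summarize`, the sample rows, and the flag shown for the base period are illustrative, not taken from the package.

```python
def summarize(rows, metric="psi_vs_first"):
    """Pivot long-form stability rows to one entry per variable.

    Each entry maps every time period to the chosen metric and adds
    'latest_psi_flag' taken from the last period seen for that variable.
    """
    if not rows or metric not in rows[0]:
        raise KeyError(f"metric {metric!r} not in stability result columns")
    periods = sorted({row["time_period"] for row in rows})
    wide = {}
    for row in sorted(rows, key=lambda r: r["time_period"]):
        entry = wide.setdefault(row["variable"], dict.fromkeys(periods))
        entry[row["time_period"]] = row[metric]
        # Overwritten on each later period, so the final value is the latest.
        entry["latest_psi_flag"] = row["psi_flag"]
    return wide


# Made-up sample data; the base-period flag value is an assumption.
rows = [
    {"variable": "income", "time_period": "2024-01", "psi_vs_first": 0.0, "psi_flag": "stable"},
    {"variable": "income", "time_period": "2024-02", "psi_vs_first": 0.14, "psi_flag": "unstable"},
    {"variable": "debt_ratio", "time_period": "2024-01", "psi_vs_first": 0.0, "psi_flag": "stable"},
    {"variable": "debt_ratio", "time_period": "2024-02", "psi_vs_first": 0.03, "psi_flag": "stable"},
]

wide = summarize(rows)
print(wide["income"])
```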
+def period_bad_rate(
+    df: pd.DataFrame,
+    target: str,
+    time_col: str,
+    *,
+    bad_rate_trend_threshold: float = 0.5,
+) -> pd.DataFrame:
+    """Portfolio-level target bad-rate trend by time period.
+    Parameters
+    ----------
+    df : pd.DataFrame
+        Input data containing *target* and *time_col*.
+    target : str
+        Binary target (1=bad).
+    time_col : str
+        Time period column.
+    bad_rate_trend_threshold : float
+        Relative change vs first period that triggers trend flags.
+    Returns
+    -------
+    pd.DataFrame
+        Columns: ``time_period | n | bad | bad_rate | bad_rate_change |
+        bad_rate_flag``.
+    """
+    if target not in df.columns:
+        raise KeyError(f"target column {target!r} not in DataFrame")
+    if time_col not in df.columns:
+        raise KeyError(f"time_col {time_col!r} not in DataFrame")
+    periods = sorted(df[time_col].dropna().unique())
+    if len(periods) < 2:
+        raise ValueError(f"Need at least 2 distinct time periods; got {len(periods)}")
+    rows: list[dict] = []
+    first_bad_rate = None
+    for p_idx, period in enumerate(periods):
+        sub = df.loc[df[time_col] == period, [target]].dropna(subset=[target])
+        n = len(sub)
+        bad = int(sub[target].sum()) if n > 0 else 0
+        bad_rate = (bad / n) if n > 0 else np.nan
+        if p_idx == 0:
+            first_bad_rate = bad_rate
+        if first_bad_rate is not None and first_bad_rate > 0 and not np.isnan(bad_rate):
+            br_change = (bad_rate - first_bad_rate) / first_bad_rate
+        else:
+            br_change = np.nan
+        rows.append(
+            {
+                "time_period": period,
+                "n": int(n),
+                "bad": int(bad),
+                "bad_rate": round(bad_rate, 6) if not np.isnan(bad_rate) else np.nan,
+                "bad_rate_change": round(br_change, 4) if not np.isnan(br_change) else np.nan,
+                "bad_rate_flag": _bad_rate_flag(p_idx, br_change, bad_rate_trend_threshold),
+            }
+        )
+    result = pd.DataFrame(rows)
+    result["time_period"] = pd.Categorical(
+        result["time_period"], categories=periods, ordered=True
+    )
+    return result.sort_values("time_period").reset_index(drop=True)
 # ── internal helpers ───────────────────────────────────────────────────────
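The new `period_bad_rate` computes, per period, the bad rate and its relative change from the first period, `(bad_rate[t] - bad_rate[0]) / bad_rate[0]`, then flags the trend. A standard-library sketch of that arithmetic follows. The `bad_rate_flag` function here is a hypothetical stand-in for the private `_bad_rate_flag` helper, whose body is not part of this diff; its threshold semantics and the flag given to the first period are assumptions based on the flag values the tests accept.

```python
import math


def bad_rate_flag(period_index, change, threshold):
    """Hypothetical stand-in for _bad_rate_flag (its real body is not shown)."""
    if period_index == 0 or math.isnan(change):
        return "stable"  # assumption: base period and undefined changes are stable
    if change > threshold:
        return "trending_up"
    if change < -threshold:
        return "trending_down"
    return "stable"


def bad_rate_trend(counts, threshold=0.5):
    """counts: list of (period, n, bad) tuples, already sorted by period."""
    rows = []
    first_rate = None
    for idx, (period, n, bad) in enumerate(counts):
        rate = bad / n if n > 0 else math.nan
        if idx == 0:
            first_rate = rate
        if first_rate is not None and first_rate > 0 and not math.isnan(rate):
            change = (rate - first_rate) / first_rate
        else:
            change = math.nan
        rows.append({
            "time_period": period,
            "n": n,
            "bad": bad,
            "bad_rate": rate if math.isnan(rate) else round(rate, 6),
            "bad_rate_change": change if math.isnan(change) else round(change, 4),
            "bad_rate_flag": bad_rate_flag(idx, change, threshold),
        })
    return rows


trend = bad_rate_trend([("2024-01", 1000, 50), ("2024-02", 1000, 60), ("2024-03", 1000, 80)])
for row in trend:
    print(row)
```

In this made-up example the bad rate rises from 5% to 8%, a 60% relative increase, which crosses the default threshold of 0.5 only in the last period.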

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/rules/_miner.py RENAMED Viewed

@@ -45,6 +45,10 @@ class RuleMiner:
         Maximum depth of the decision tree (tree mode).
     min_lift : float
         Minimum Lift (precision / overall_bad_rate).
+    min_precision : float or None
+        Minimum precision threshold. ``None`` disables this filter.
+    min_recall : float or None
+        Minimum recall threshold. ``None`` disables this filter.
     min_hit_rate : float
         Minimum fraction of total samples a rule must cover.
     max_hit_rate : float
@@ -61,6 +65,8 @@ class RuleMiner:
         max_depth: int = 3,
         max_tree_depth: int = 4,
         min_lift: float = 3.0,
+        min_precision: float | None = None,
+        min_recall: float | None = None,
         min_hit_rate: float = 0.01,
         max_hit_rate: float = 0.20,
         max_rules: int = 20,
@@ -69,10 +75,16 @@ class RuleMiner:
         _valid = {"exhaustive", "tree", "apriori"}
         if method not in _valid:
             raise ValueError(f"Unknown method: {method!r}. Valid: {sorted(_valid)}")
+        if min_precision is not None and not (0.0 <= min_precision <= 1.0):
+            raise ValueError("min_precision must be within [0, 1] or None")
+        if min_recall is not None and not (0.0 <= min_recall <= 1.0):
+            raise ValueError("min_recall must be within [0, 1] or None")
         self.method = method
         self.max_depth = max_depth
         self.max_tree_depth = max_tree_depth
         self.min_lift = min_lift
+        self.min_precision = min_precision
+        self.min_recall = min_recall
         self.min_hit_rate = min_hit_rate
         self.max_hit_rate = max_hit_rate
         self.max_rules = max_rules
@@ -314,6 +326,10 @@ class RuleMiner:
         if lift < self.min_lift:
             return None
+        if self.min_precision is not None and precision < self.min_precision:
+            return None
+        if self.min_recall is not None and recall < self.min_recall:
+            return None
         rstr = self._format_rule(feat_bounds)
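The hunks above add optional `min_precision` and `min_recall` filters next to the existing `min_lift` check, where Lift is precision divided by the overall bad rate. The sketch below works through those metrics for a single rule on toy data. The function names and data are illustrative and are not `RuleMiner` internals; recall is taken in its usual sense of bad samples hit divided by all bad samples, which the diff does not spell out.

```python
def check_threshold(name, value):
    """Mirror the constructor check: value must be None or within [0, 1]."""
    if value is not None and not (0.0 <= value <= 1.0):
        raise ValueError(f"{name} must be within [0, 1] or None")


def rule_metrics(hits, y):
    """hits: booleans (rule fired per sample); y: 1 = bad, 0 = good."""
    n = len(y)
    n_hit = sum(hits)
    total_bad = sum(y)
    bad_hit = sum(1 for h, t in zip(hits, y) if h and t == 1)
    precision = bad_hit / n_hit if n_hit else 0.0
    recall = bad_hit / total_bad if total_bad else 0.0
    overall_bad_rate = total_bad / n if n else 0.0
    lift = precision / overall_bad_rate if overall_bad_rate else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "lift": lift,
        "hit_rate": n_hit / n if n else 0.0,
    }


def keep_rule(metrics, min_lift=3.0, min_precision=None, min_recall=None):
    """Apply the lift filter, then the optional precision/recall filters."""
    check_threshold("min_precision", min_precision)
    check_threshold("min_recall", min_recall)
    if metrics["lift"] < min_lift:
        return False
    if min_precision is not None and metrics["precision"] < min_precision:
        return False
    if min_recall is not None and metrics["recall"] < min_recall:
        return False
    return True


y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
hits = [True, False, True, False, False, False, False, False, False, False]
metrics = rule_metrics(hits, y)
print(metrics, keep_rule(metrics, min_lift=2.0, min_precision=0.6))
```

Leaving a threshold as `None` skips that filter entirely, so existing configurations that only set `min_lift` behave as before.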

{proscore-0.2.1 → proscore-0.2.2/src/proscore.egg-info}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: proscore
-Version: 0.2.1
+Version: 0.2.2
 Summary: Production-grade scorecard development toolkit
 Author: Liqiwei
 License-Expression: MIT
@@ -45,10 +45,47 @@ Dynamic: license-file
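 **Production-grade scorecard development toolkit**
 An end-to-end, deterministic scorecard modeling pipeline designed for credit scorecard modeling at banks and financial institutions, meeting their requirements for interpretability, compliance, and stability.
+## Why ProScore
+ProScore is not a general-purpose machine learning framework; it is an **engineering toolkit** for putting financial scorecards into production.
+The goal is to move from "can build a model" to "reviewable, reproducible, deployable, and monitorable".
+It fits the following scenarios:
+- Bank, consumer finance, and internet finance teams developing and iterating on credit scorecards
+- Developers and business analysts who need to model collaboratively through Python + Excel
+- Teams that must produce regulatory/review materials and establish a post-deployment monitoring loop
+## Key Highlights
+1. **Engineered monotonicity (the key differentiator)**
+   - Per-variable monotonic direction configuration (increasing/decreasing/u/inverted_u/none)
+   - Automatic monotonic adjustment, reducing repeated manual re-binning
+   - Monotonicity configuration can be templated and reused, keeping projects consistent
+2. **End-to-end deterministic workflow**
+   - `detect -> prefilter -> bin -> refine -> transform -> select -> fit -> evaluate -> diagnose -> report -> monitor`
+   - The same input yields the same output, which eases auditing, post-mortems, and team collaboration
+3. **Three usage modes with consistent definitions**
+   - Modular API (flexible)
+   - Chained API (efficient)
+   - Excel configuration-driven (zero code)
+   - All three entry points share the same modeling logic, reducing "inconsistent definitions"
+4. **Integrated diagnostics and reporting**
+   - `diagnose()` provides 4 layers of structured diagnostics (discriminatory power/overfitting/stability/variable quality)
+   - Custom thresholds are supported (`thresholds=...`)
+   - `ReportBuilder` automatically includes the diagnostics chapter, speeding up review
+5. **Post-deployment monitoring loop**
+   - Supports PSI, KS decay, rule-based alerts, and per-period tracking
+   - Helps establish a continuous "deploy, monitor, retrain" operating cycle
 ---
 ## Table of Contents
+- [Getting Started Tutorials (Notebook)](#入门教程notebook)
 - [Three Usage Modes](#三种使用方式)
 - [Core Feature Overview](#核心功能概览)
 - [Installation](#安装)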
@@ -57,6 +94,17 @@ Dynamic: license-file
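 ---
+## Getting Started Tutorials (Notebook)
+We recommend reading in the order below, getting things running first and going deeper afterwards:
+| Notebook | Audience | What you get |
+|----------|--------|--------------|
+| [**ProScore快速开始** (Quick Start)](notebooks/ProScore快速开始.ipynb) | First-time users | A 5–10 minute single chained path covering only KS/AUC/PSI, the selected model variables, and the diagnostics summary |
+| [**ProScore完整建模流程** (Full Modeling Workflow)](notebooks/ProScore完整建模流程.ipynb) | Teams preparing for production | Modular vs. chained comparison, CFG as the single source of truth for parameters, rule mining, monitoring, reporting, diagnostics |
+> The quick start is deliberately minimal (it omits optional steps such as rule mining); the full version is the authoritative example, with `[主线]` (main track) / `[可选]` (optional) chapter navigation and consistency assertions.
 ## Three Usage Modes
 ProScore offers three progressive usage modes, from zero code to fully custom; choose as needed.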
@@ -106,9 +154,11 @@ p = (
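 > `train` is required; `test` and `oot` are optional. Binning/WOE is fitted on train only; stepwise regression uses test to monitor overfitting; OOT is used only for final evaluation.
 >
-> For the full tutorial, see [notebooks/ProScore完整建模流程.ipynb](notebooks/ProScore完整建模流程.ipynb)
+> For notebook tutorials, see [Getting Started Tutorials](#入门教程notebook) above.
+>
+> **Enhanced diagnostics** (v0.2+): `.evaluate().diagnose()` generates a 4-layer structured health report (including root-cause variables) and supports custom thresholds via `thresholds=...`.
 >
-> **Enhanced diagnostics** (v0.2+): `.evaluate().diagnose()` generates a 4-layer structured health report (including root-cause variables) and supports custom thresholds via `thresholds=...`, adapting to the risk preferences of different institutions/products.
+> **Single source of truth for parameters (recommended)**: `CFG` + `PipelineSpec` (`apply(spec)`) ensure that the modular and chained APIs produce the same results from the same parameters; see [pipeline-spec.md](docs/使用指南/pipeline-spec.md) for details.
 ### C. Excel configuration-driven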
@@ -140,6 +190,7 @@ proscore run my_project/pipeline_template.xlsx --output-script run.py
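 |------------|-----------------------------------------------|---------------------------------------|
 | Data exploration | IV/AUC/KS metrics + PSI time-series stability + correlation/VIF | Quickly screen for strong variables and identify distribution drift risk |
 | Binning | 4 monotonic trends + 5 binning methods + two-stage trend validation | Ensure WOE trends match business logic and satisfy regulators |
+| Rule mining | Single-variable/cross rules + joint Lift/Precision/Recall filtering | Produce interpretable strategy rules, mutually exclusive with scorecard variables |
 | Stepwise regression | Bidirectional selection + five constraints (p-value/sign/VIF/correlation/source) | Rigorous multicollinearity control and dimension attribution management |
 | Model monitoring | Score/Feature PSI + rule-engine alerts + JSON persistence | Continuous post-deployment validation with automatic risk alerts |
 | Report generation | Automatic 7-chapter Markdown report (with charts) | One-click generation of CBIRC compliance documentation |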

{proscore-0.2.1 → proscore-0.2.2}/tests/test_inspect.py RENAMED Viewed

@@ -5,7 +5,7 @@ from __future__ import annotations
 import pandas as pd
 import pytest
-from proscore.inspect import correlation, detect, quality, stability, vif
+from proscore.inspect import correlation, detect, period_bad_rate, quality, stability, vif
 class TestDetect:
@@ -84,17 +84,16 @@ class TestStability:
         has_psi_col = "psi" in result.columns or "psi_vs_first" in result.columns
         assert has_psi_col
-    def test_has_separate_stability_flags(self, full_df):
+    def test_has_psi_flag_only(self, full_df):
         result = stability(
             full_df, target="bad_flag", time_col="apply_date",
             features=["income"],
         )
         assert "psi_flag" in result.columns
-        assert "bad_rate_flag" in result.columns
-        assert "stability" not in result.columns
+        assert "bad_rate_flag" not in result.columns
+        assert "bad_rate" not in result.columns
-    def test_psi_and_bad_rate_flags_independent(self, full_df):
-        """PSI unstable does not force bad_rate trending, and vice versa."""
+    def test_psi_flag_values(self, full_df):
         result = stability(
             full_df, target="bad_flag", time_col="apply_date",
             features=["income", "debt_ratio"],
@@ -102,11 +101,7 @@ class TestStability:
         non_base = result[result["time_period"] != result["time_period"].min()]
         if len(non_base) == 0:
             return
-        # Columns are evaluated separately — no merged label
         assert set(non_base["psi_flag"].unique()).issubset({"stable", "unstable"})
-        assert set(non_base["bad_rate_flag"].unique()).issubset(
-            {"stable", "trending_up", "trending_down"}
-        )
     def test_multiple_periods(self, full_df):
         result = stability(
@@ -118,6 +113,27 @@ class TestStability:
         assert result[time_col].nunique() >= n_periods - 1
+class TestPeriodBadRate:
+    def test_returns_dataframe(self, full_df):
+        result = period_bad_rate(full_df, target="bad_flag", time_col="apply_date")
+        assert isinstance(result, pd.DataFrame)
+        assert not result.empty
+    def test_has_expected_columns(self, full_df):
+        result = period_bad_rate(full_df, target="bad_flag", time_col="apply_date")
+        expected = {"time_period", "n", "bad", "bad_rate", "bad_rate_change", "bad_rate_flag"}
+        assert expected.issubset(set(result.columns))
+    def test_flag_values(self, full_df):
+        result = period_bad_rate(full_df, target="bad_flag", time_col="apply_date")
+        non_base = result[result["time_period"] != result["time_period"].min()]
+        if len(non_base) == 0:
+            return
+        assert set(non_base["bad_rate_flag"].unique()).issubset(
+            {"stable", "trending_up", "trending_down"}
+        )
 class TestVIF:
     def test_returns_dataframe(self, sample_df):
         num_cols = ["x1", "x2", "x4"]

{proscore-0.2.1 → proscore-0.2.2}/tests/test_pipeline_rules.py RENAMED Viewed

@@ -44,9 +44,13 @@ class TestRulesExcel:
         cfg = PipelineConfig.from_excel(str(template_xlsx))
         cfg.rules_cfg["method"] = "apriori"
         cfg.rules_cfg["max_rules"] = 5
+        cfg.rules_cfg["min_precision"] = 0.12
+        cfg.rules_cfg["min_recall"] = 0.03
         kw = cfg._build_rules_kw()
         assert kw["method"] == "apriori"
         assert kw["max_rules"] == 5
+        assert kw["min_precision"] == pytest.approx(0.12)
+        assert kw["min_recall"] == pytest.approx(0.03)
         assert all(not k.startswith("rm_") for k in kw)
     def test_mine_rules_requires_refine(self, template_xlsx: Path, tmp_path: Path) -> None:

{proscore-0.2.1 → proscore-0.2.2}/tests/test_rules.py RENAMED Viewed

@@ -104,6 +104,36 @@ class TestRuleMiner:
         with pytest.raises(ValueError):
             RuleMiner(method="invalid")
+    def test_min_precision_filter(self, rule_data, bin_table):
+        df, y = rule_data
+        baseline = RuleMiner(method="exhaustive", min_lift=1.0, max_rules=100)
+        baseline.fit(df, y, bin_table=bin_table)
+        assert len(baseline.rules_table_) > 0
+        strict = RuleMiner(
+            method="exhaustive",
+            min_lift=1.0,
+            min_precision=0.95,
+            max_rules=100,
+        )
+        strict.fit(df, y, bin_table=bin_table)
+        if len(strict.rules_table_) > 0:
+            assert (strict.rules_table_["precision"] >= 0.95).all()
+        assert len(strict.rules_table_) <= len(baseline.rules_table_)
+    def test_min_recall_filter(self, rule_data, bin_table):
+        df, y = rule_data
+        rm = RuleMiner(method="exhaustive", min_lift=1.0, min_recall=0.1, max_rules=100)
+        rm.fit(df, y, bin_table=bin_table)
+        if len(rm.rules_table_) > 0:
+            assert (rm.rules_table_["recall"] >= 0.1).all()
+    def test_invalid_precision_recall_threshold_raises(self):
+        with pytest.raises(ValueError, match="min_precision"):
+            RuleMiner(min_precision=1.2)
+        with pytest.raises(ValueError, match="min_recall"):
+            RuleMiner(min_recall=-0.1)
     def test_empty_rules_table(self, rule_data, bin_table):
         df, y = rule_data
         rm = RuleMiner(method="exhaustive", min_lift=100.0)

{proscore-0.2.1 → proscore-0.2.2}/LICENSE RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/setup.cfg RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/__main__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/_data/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/_spec.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_adjust.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_base.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_binning.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_categorical.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_chi.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_distance.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_frequency.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_tree.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/binning/_woe.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/evaluate/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/evaluate/_metrics.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/inspect/_correlation.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/inspect/_detect.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/inspect/_quality.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/modeling/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/modeling/_scorecard.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/monitor/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/monitor/_monitor.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/report/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/report/_builder.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/rules/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/selection/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/selection/_filter.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/selection/_screen.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/selection/_stepwise.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/transform/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/transform/_woe.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/utils/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/utils/_config.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/utils/_exceptions.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/utils/_presets.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/utils/_psi.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/viz/__init__.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore/viz/_plots.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore.egg-info/SOURCES.txt RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore.egg-info/dependency_links.txt RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore.egg-info/entry_points.txt RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore.egg-info/requires.txt RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/src/proscore.egg-info/top_level.txt RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_binning.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_diagnose.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_docs_examples.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_evaluate.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_evaluate_period.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_filter.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_pipeline.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_presets.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_report.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_scorecard.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_screen.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_spec.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_stepwise.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_transform.py RENAMED Viewed

File without changes

{proscore-0.2.1 → proscore-0.2.2}/tests/test_woe.py RENAMED Viewed

File without changes
