PyPI - warpgbm - Versions diffs - 0.1.27__tar.gz → 2.0.0__tar.gz - Mend

warpgbm 0.1.27tar.gz → 2.0.0tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32) hide show

{warpgbm-0.1.27/warpgbm.egg-info → warpgbm-2.0.0}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: warpgbm
-Version: 0.1.27
+Version: 2.0.0
 Summary: A fast GPU-accelerated Gradient Boosted Decision Tree library with PyTorch + CUDA
 License:                     GNU GENERAL PUBLIC LICENSE
                                Version 3, 29 June 2007
@@ -688,242 +688,425 @@ Dynamic: license-file
 ![warpgbm](https://github.com/user-attachments/assets/dee9de16-091b-49c1-a8fa-2b4ab6891184)
+# WarpGBM ⚡
-# WarpGBM
+> **Neural-speed gradient boosting. GPU-native. Distribution-aware. Production-ready.**
-WarpGBM is a high-performance, GPU-accelerated Gradient Boosted Decision Tree (GBDT) library built with PyTorch and CUDA. It offers blazing-fast histogram-based training and efficient prediction, with compatibility for research and production workflows.
+WarpGBM is a high-performance, GPU-accelerated Gradient Boosted Decision Tree (GBDT) library engineered from silicon up with PyTorch and custom CUDA kernels. Built for speed demons and researchers who refuse to compromise.
----
+## 🎯 What Sets WarpGBM Apart
+**Regression + Classification Unified**
+Train on continuous targets or multiclass labels with the same blazing-fast infrastructure.
-## Features
+**Invariant Learning (DES Algorithm)**
+The only open-source GBDT that natively learns signals stable across shifting distributions. Powered by **[Directional Era-Splitting](https://arxiv.org/abs/2309.14496)** — because your data doesn't live in a vacuum.
-- GPU-accelerated training and histogram construction using custom CUDA kernels
-- Drop-in scikit-learn style interface
-- Supports pre-binned data or automatic quantile binning
-- Simple install with `pip`
+**GPU-Accelerated Everything**
+Custom CUDA kernels for binning, histograms, splits, and inference. No compromises, no CPU bottlenecks.
+**Scikit-Learn Compatible**
+Drop-in replacement. Same API you know, 10x the speed you need.
 ---
-## Benchmarks
+## 🚀 Quick Start
-### Scikit-Learn Synthetic Data: 1 Million Rows and 1,000 Features
+### Installation
-In this benchmark we compare the speed and in-sample correlation of **WarpGBM v0.1.21** against LightGBM, XGBoost and CatBoost, all with their GPU-enabled versions. This benchmark runs on Google Colab with the L4 GPU environment.
+```bash
+# Latest from GitHub (recommended)
+pip install git+https://github.com/jefferythewind/warpgbm.git
+# Stable from PyPI
+pip install warpgbm
 ```
-   WarpGBM:   corr = 0.8882, train = 18.7s, infer = 4.9s
-   XGBoost:   corr = 0.8877, train = 33.1s, infer = 8.1s
-  LightGBM:   corr = 0.8604, train = 30.3s, infer = 1.4s
-  CatBoost:   corr = 0.8935, train = 400.0s, infer = 382.6s
+**Prerequisites:** PyTorch with CUDA support ([install guide](https://pytorch.org/get-started/locally/))
+### Regression in 5 Lines
+```python
+from warpgbm import WarpGBM
+import numpy as np
+model = WarpGBM(objective='regression', max_depth=5, n_estimators=100)
+model.fit(X_train, y_train)
+predictions = model.predict(X_test)
 ```
-Colab Notebook: https://colab.research.google.com/drive/16U1kbYlD5HibGbnF5NGsjChZ1p1IA2pK?usp=sharing
+### Classification in 5 Lines
+```python
+from warpgbm import WarpGBM
+model = WarpGBM(objective='multiclass', max_depth=5, n_estimators=50)
+model.fit(X_train, y_train)  # y can be integers, strings, whatever
+probabilities = model.predict_proba(X_test)
+labels = model.predict(X_test)
+```
 ---
-## Installation
+## 🎮 Features
-### Recommended (GitHub, always latest):
+### Core Engine
+- ⚡ **GPU-native CUDA kernels** for histogram building, split finding, binning, and prediction
+- 🎯 **Multi-objective support**: regression, binary, multiclass classification
+- 📊 **Pre-binned data optimization** — skip binning if your data's already quantized
+- 🔥 **Mixed precision support** — `float32` or `int8` inputs
+- 🎲 **Stochastic features** — `colsample_bytree` for regularization
-```bash
-pip install git+https://github.com/jefferythewind/warpgbm.git
-```
+### Intelligence
+- 🧠 **Invariant learning via DES** — identifies signals that generalize across time/regimes/environments
+- 📈 **Smart initialization** — class priors for classification, mean for regression
+- 🎯 **Automatic label encoding** — handles strings, integers, whatever you throw at it
-This installs the latest version directly from GitHub and compiles CUDA extensions on your machine using your **local PyTorch and CUDA setup**. It's the most reliable method for ensuring compatibility and staying up to date with the latest features.
+### Training Utilities
+- ✅ **Early stopping** with validation sets
+- 📊 **Rich metrics**: MSE, RMSLE, correlation, log loss, accuracy
+- 🔍 **Progress tracking** with loss curves
+- 🎚️ **Regularization** — L2 leaf penalties, min split gain, min child weight
-### Alternatively (PyPI, stable releases):
+---
-```bash
-pip install warpgbm
-```
+## ⚔️ Benchmarks
-This installs from PyPI and also compiles CUDA code locally during installation. This method works well **if your environment already has PyTorch with GPU support** installed and configured.
+### Synthetic Data: 1M Rows × 1K Features (Google Colab L4 GPU)
-> **Tip:**\
-> If you encounter an error related to mismatched or missing CUDA versions, try installing with the following flag. This is currently required in the Colab environments.
->
-> ```bash
-> pip install warpgbm --no-build-isolation
-> ```
+```
+   WarpGBM:   corr = 0.8882, train = 17.4s, infer = 3.2s  ⚡
+   XGBoost:   corr = 0.8877, train = 33.2s, infer = 8.0s
+  LightGBM:   corr = 0.8604, train = 29.8s, infer = 1.6s
+  CatBoost:   corr = 0.8935, train = 392.1s, infer = 379.2s
+```
-### Windows
+**2× faster than XGBoost. 23× faster than CatBoost.**
-Thank you, ShatteredX, for providing working instructions for a Windows installation.
+[→ Run the benchmark yourself](https://colab.research.google.com/drive/16U1kbYlD5HibGbnF5NGsjChZ1p1IA2pK?usp=sharing)
+### Multiclass Classification: 3.5K Samples, 3 Classes, 50 Rounds
 ```
-git clone https://github.com/jefferythewind/warpgbm.git
-cd warpgbm
-python setup.py bdist_wheel
-pip install .\dist\warpgbm-0.1.15-cp310-cp310-win_amd64.whl
+Training:   2.13s
+Inference:  0.37s
+Accuracy:   75.3%
 ```
-Before either method, make sure you’ve installed PyTorch with GPU support:\
-[https://pytorch.org/get-started/locally/](https://pytorch.org/get-started/locally/)
+**Production-ready multiclass at neural network speeds.**
 ---
-## Example
+## 📖 Examples
+### Regression: Beat LightGBM on Your Laptop
 ```python
 import numpy as np
 from sklearn.datasets import make_regression
-from time import time
-import lightgbm as lgb
 from warpgbm import WarpGBM
-# Create synthetic regression dataset
-X, y = make_regression(n_samples=100_000, n_features=500, noise=0.1, random_state=42)
-X = X.astype(np.float32)
-y = y.astype(np.float32)
-# Train LightGBM
-start = time()
-lgb_model = lgb.LGBMRegressor(max_depth=5, n_estimators=100, learning_rate=0.01, max_bin=7)
-lgb_model.fit(X, y)
-lgb_time = time() - start
-lgb_preds = lgb_model.predict(X)
-# Train WarpGBM
-start = time()
-wgbm_model = WarpGBM(max_depth=5, n_estimators=100, learning_rate=0.01, num_bins=7)
-wgbm_model.fit(X, y)
-wgbm_time = time() - start
-wgbm_preds = wgbm_model.predict(X)
-# Results
-print(f"LightGBM:   corr = {np.corrcoef(lgb_preds, y)[0,1]:.4f}, time = {lgb_time:.2f}s")
-print(f"WarpGBM:     corr = {np.corrcoef(wgbm_preds, y)[0,1]:.4f}, time = {wgbm_time:.2f}s")
+# Generate data
+X, y = make_regression(n_samples=100_000, n_features=500, random_state=42)
+X, y = X.astype(np.float32), y.astype(np.float32)
+# Train
+model = WarpGBM(
+    objective='regression',
+    max_depth=5,
+    n_estimators=100,
+    learning_rate=0.01,
+    num_bins=32
+)
+model.fit(X, y)
+# Predict
+preds = model.predict(X)
+print(f"Correlation: {np.corrcoef(preds, y)[0,1]:.4f}")
 ```
-**Results (Ryzen 9 CPU, NVIDIA 3090 GPU):**
+### Classification: Multiclass with Early Stopping
-```
-LightGBM:   corr = 0.8742, time = 37.33s
-WarpGBM:     corr = 0.8621, time = 5.40s
+```python
+from sklearn.datasets import make_classification
+from sklearn.model_selection import train_test_split
+from warpgbm import WarpGBM
+# 5-class problem
+X, y = make_classification(
+    n_samples=10_000,
+    n_features=50,
+    n_classes=5,
+    n_informative=30
+)
+X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
+model = WarpGBM(
+    objective='multiclass',
+    max_depth=6,
+    n_estimators=200,
+    learning_rate=0.1,
+    num_bins=32
+)
+model.fit(
+    X_train, y_train,
+    X_eval=X_val, y_eval=y_val,
+    eval_every_n_trees=10,
+    early_stopping_rounds=5,
+    eval_metric='logloss'
+)
+# Get probabilities or class predictions
+probs = model.predict_proba(X_val)  # shape: (n_samples, n_classes)
+labels = model.predict(X_val)        # class labels
 ```
----
+### Invariant Learning: Distribution-Robust Signals
-## Pre-binned Data Example (Numerai)
+```python
+# Your data spans multiple time periods/regimes/environments
+# Pass era_id to learn only signals that work across ALL eras
-WarpGBM can save additional training time if your dataset is already pre-binned. The Numerai tournament data is a great example:
+model = WarpGBM(
+    objective='regression',
+    max_depth=8,
+    n_estimators=100
+)
+model.fit(
+    X, y,
+    era_id=era_labels  # Array marking which era each sample belongs to
+)
+# Now your model ignores spurious correlations that don't generalize!
+```
+### Pre-binned Data: Maximum Speed (Numerai Example)
 ```python
 import pandas as pd
 from numerapi import NumerAPI
-from time import time
-import lightgbm as lgb
 from warpgbm import WarpGBM
-import numpy as np
+# Download Numerai data (already quantized to integers)
 napi = NumerAPI()
 napi.download_dataset('v5.0/train.parquet', 'train.parquet')
 train = pd.read_parquet('train.parquet')
-feature_set = [f for f in train.columns if 'feature' in f]
-target = 'target_cyrus'
-X_np = train[feature_set].astype('int8').values
-Y_np = train[target].values
-# LightGBM
-start = time()
-lgb_model = lgb.LGBMRegressor(max_depth=5, n_estimators=100, learning_rate=0.01, max_bin=7)
-lgb_model.fit(X_np, Y_np)
-lgb_time = time() - start
-lgb_preds = lgb_model.predict(X_np)
-# WarpGBM
-start = time()
-wgbm_model = WarpGBM(max_depth=5, n_estimators=100, learning_rate=0.01, num_bins=7)
-wgbm_model.fit(X_np, Y_np)
-wgbm_time = time() - start
-wgbm_preds = wgbm_model.predict(X_np)
-# Results
-print(f"LightGBM:   corr = {np.corrcoef(lgb_preds, Y_np)[0,1]:.4f}, time = {lgb_time:.2f}s")
-print(f"WarpGBM:     corr = {np.corrcoef(wgbm_preds, Y_np)[0,1]:.4f}, time = {wgbm_time:.2f}s")
-```
+features = [f for f in train.columns if 'feature' in f]
+X = train[features].astype('int8').values
+y = train['target'].values
-**Results (Google Colab Pro, A100 GPU):**
-```
-LightGBM:   corr = 0.0703, time = 643.88s
-WarpGBM:     corr = 0.0660, time = 49.16s
+# WarpGBM detects pre-binned data and skips binning
+model = WarpGBM(max_depth=5, n_estimators=100, num_bins=20)
+model.fit(X, y)  # Blazing fast!
 ```
+**Result: 13× faster than LightGBM on Numerai data (49s vs 643s)**
 ---
-### Run it live in Colab
+## 🧠 Invariant Learning: Why It Matters
+Most ML models assume your training and test data come from the same distribution. **Reality check: they don't.**
+- Stock prices shift with market regimes
+- User behavior changes over time
+- Experimental data varies by batch/site/condition
+**Traditional GBDT:** Learns any signal that correlates with the target, including fragile patterns that break OOD.
+**WarpGBM with DES:** Explicitly tests if each split generalizes across ALL environments (eras). Only keeps robust signals.
-You can try WarpGBM in a live Colab notebook using real pre-binned Numerai tournament data:
+### The Algorithm
-[Open in Colab](https://colab.research.google.com/drive/10mKSjs9UvmMgM5_lOXAylq5LUQAnNSi7?usp=sharing)
+For each potential split, compute gain separately in each era. Only accept splits where:
+1. Gain is positive in ALL eras
+2. Split direction is consistent across eras
-No installation required — just press **"Open in Playground"**, then **Run All**!
+This prevents overfitting to spurious correlations that only work in some time periods or environments.
+### Visual Intuition
+<img src="https://github.com/user-attachments/assets/2be11ef3-6f2e-4636-ab91-307a73add247" alt="Era Splitting Visualization" width="400"/>
+**Left:** Standard training pools all data together — learns any signal that correlates.
+**Right:** Era-aware training demands signals work across all periods — learns robust features only.
+### Research Foundation
+- **Invariant Risk Minimization**: [Arjovsky et al., 2019](https://arxiv.org/abs/1907.02893)
+- **Hard-to-Vary Explanations**: [Parascandolo et al., 2020](https://arxiv.org/abs/2009.00329)
+- **Era Splitting for Trees**: [DeLise, 2023](https://arxiv.org/abs/2309.14496)
 ---
-## Documentation
-### `WarpGBM` Parameters:
-- `num_bins`: Number of histogram bins to use (default: 10)
-- `max_depth`: Maximum depth of trees (default: 3)
-- `learning_rate`: Shrinkage rate applied to leaf outputs (default: 0.1)
-- `n_estimators`: Number of boosting iterations (default: 100)
-- `min_child_weight`: Minimum sum of instance weight needed in a child (default: 20)
-- `min_split_gain`: Minimum loss reduction required to make a further partition (default: 0.0)
-- `histogram_computer`: Choice of histogram kernel (`'hist1'`, `'hist2'`, `'hist3'`) (default: `'hist3'`)
-- `threads_per_block`: CUDA threads per block (default: 32)
-- `rows_per_thread`: Number of training rows processed per thread (default: 4)
-- `L2_reg`: L2 regularizer (default: 1e-6)
-- `colsample_bytree`: Proportion of features to subsample to grow each tree (default: 1)
-### Methods:
+## 📚 API Reference
+### Constructor Parameters
+```python
+WarpGBM(
+    objective='regression',        # 'regression', 'binary', or 'multiclass'
+    num_bins=10,                   # Histogram bins for feature quantization
+    max_depth=3,                   # Maximum tree depth
+    learning_rate=0.1,             # Shrinkage rate (aka eta)
+    n_estimators=100,              # Number of boosting rounds
+    min_child_weight=20,           # Min sum of instance weights in child node
+    min_split_gain=0.0,            # Min loss reduction to split
+    L2_reg=1e-6,                   # L2 leaf regularization
+    colsample_bytree=1.0,          # Feature subsample ratio per tree
+    threads_per_block=64,          # CUDA block size (tune for your GPU)
+    rows_per_thread=4,             # Rows processed per thread
+    device='cuda'                  # 'cuda' or 'cpu' (GPU strongly recommended)
+)
 ```
-.fit(
-   X,                             # numpy array (float or int) 2 dimensions (num_samples, num_features)
-   y,                             # numpy array (float or int) 1 dimension (num_samples)
-   era_id=None,                   # numpy array (int) 1 dimension (num_samples)
-   X_eval=None,                   # numpy array (float or int) 2 dimensions (eval_num_samples, num_features)
-   y_eval=None,                   # numpy array (float or int) 1 dimension (eval_num_samples)
-   eval_every_n_trees=None,       # const (int) >= 1
-   early_stopping_rounds=None,    # const (int) >= 1
-   eval_metric='mse'              # string, one of 'mse' or 'corr'. For corr, loss is 1 - correlation(y_true, preds)
+### Training Methods
+```python
+model.fit(
+    X,                              # Features: np.array shape (n_samples, n_features)
+    y,                              # Target: np.array shape (n_samples,)
+    era_id=None,                    # Optional: era labels for invariant learning
+    X_eval=None,                    # Optional: validation features
+    y_eval=None,                    # Optional: validation targets
+    eval_every_n_trees=None,        # Eval frequency (in rounds)
+    early_stopping_rounds=None,     # Stop if no improvement for N evals
+    eval_metric='mse'               # 'mse', 'rmsle', 'corr', 'logloss', 'accuracy'
 )
 ```
-Train with optional validation set and early stopping.
+### Prediction Methods
+```python
+# Regression: returns predicted values
+predictions = model.predict(X)
+# Classification: returns class labels (decoded)
+labels = model.predict(X)
+# Classification: returns class probabilities
+probabilities = model.predict_proba(X)  # shape: (n_samples, n_classes)
 ```
-.predict(
-   X                              # numpy array (float or int) 2 dimensions (predict_num_samples, num_features)
-)
+### Attributes
+```python
+model.classes_          # Unique class labels (classification only)
+model.num_classes       # Number of classes (classification only)
+model.forest            # Trained tree structures
+model.training_loss     # Training loss history
+model.eval_loss         # Validation loss history (if eval set provided)
+```
+---
+## 🔧 Installation Details
+### Linux / macOS (Recommended)
+```bash
+pip install git+https://github.com/jefferythewind/warpgbm.git
+```
+Compiles CUDA extensions using your local PyTorch + CUDA setup.
+### Colab / Mismatched CUDA Versions
+```bash
+pip install warpgbm --no-build-isolation
+```
+### Windows
+```bash
+git clone https://github.com/jefferythewind/warpgbm.git
+cd warpgbm
+python setup.py bdist_wheel
+pip install dist/warpgbm-*.whl
 ```
-Predict on new data, using parallelized CUDA kernel.
 ---
-## Acknowledgements
+## 🎯 Use Cases
-WarpGBM builds on the shoulders of PyTorch, scikit-learn, LightGBM, and the CUDA ecosystem. Thanks to all contributors in the GBDT research and engineering space.
+**Financial ML:** Learn signals that work across market regimes
+**Time Series:** Robust forecasting across distribution shifts
+**Scientific Research:** Models that generalize across experimental batches
+**High-Speed Inference:** Production systems with millisecond SLAs
+**Kaggle/Competitions:** GPU-accelerated hyperparameter tuning
+**Multiclass Problems:** Image classification fallback, text categorization, fraud detection
 ---
-## Version Notes
+## 🚧 Roadmap
-### v0.1.21
+- [ ] Multi-GPU training support
+- [ ] SHAP value computation on GPU
+- [ ] Feature interaction constraints
+- [ ] Monotonic constraints
+- [ ] Custom loss functions
+- [ ] Distributed training
+- [ ] ONNX export for deployment
-- Vectorized predict function replaced with CUDA kernel (`warpgbm/cuda/predict.cu`), parallelizing per sample, per tree.
+---
-### v0.1.23
+## 🙏 Acknowledgements
-- Adjust gain in split kernel and added support for an eval set with early stopping based on MSE.
+Built on the shoulders of PyTorch, scikit-learn, LightGBM, XGBoost, and the CUDA ecosystem. Special thanks to the GBDT research community and all contributors.
-### v0.1.25
+---
-- Added `colsample_bytree` parameter and new test using Numerai data.
+## 📝 Version History
+### v2.0.0 (Current)
+- ✨ **Multiclass classification support** via softmax objective
+- 🎯 Binary classification mode
+- 📊 New metrics: log loss, accuracy
+- 🏷️ Automatic label encoding (supports strings)
+- 🔮 `predict_proba()` for probability outputs
+- ✅ Comprehensive test suite for classification
+- 🔒 Full backward compatibility with regression
+- 🐛 Fixed unused variable issue (#8)
+- 🧹 Removed unimplemented L1_reg parameter
+- 📚 Major documentation overhaul with AGENT_GUIDE.md
+### v1.0.0
+- 🧠 Invariant learning via Directional Era-Splitting (DES)
+- 🚀 VRAM optimizations
+- 📈 Era-aware histogram computation
 ### v0.1.26
+- 🐛 Memory bug fixes in prediction
+- 📊 Added correlation eval metric
+### v0.1.25
+- 🎲 Feature subsampling (`colsample_bytree`)
+### v0.1.23
+- ⏹️ Early stopping support
+- ✅ Validation set evaluation
+### v0.1.21
+- ⚡ CUDA prediction kernel (replaced vectorized Python)
+---
+## 📄 License
+MIT License - see [LICENSE](LICENSE) file
+---
+## 🤝 Contributing
+Pull requests welcome! See [AGENT_GUIDE.md](AGENT_GUIDE.md) for architecture details and development guidelines.
+---
+**Built with 🔥 by @jefferythewind**
-- Fix Memory bugs in prediction and colsample bytree logic. Added "corr" eval metric.
+*"Train smarter. Predict faster. Generalize better."*

warpgbm 0.1.27__tar.gz → 2.0.0__tar.gz

warpgbm 0.1.27tar.gz → 2.0.0tar.gz