warpgbm 0.1.10__tar.gz → 0.1.12__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: warpgbm
- Version: 0.1.10
+ Version: 0.1.12
  License-File: LICENSE
  Requires-Dist: torch
  Requires-Dist: numpy
@@ -0,0 +1,167 @@
+ # WarpGBM
+
+ WarpGBM is a high-performance, GPU-accelerated Gradient Boosted Decision Tree (GBDT) library built with PyTorch and CUDA. It offers blazing-fast histogram-based training and efficient prediction, and fits into both research and production workflows.
+
+ ---
+
+ ## Features
+
+ - GPU-accelerated training and histogram construction using custom CUDA kernels
+ - Drop-in scikit-learn style interface
+ - Supports pre-binned data or automatic quantile binning
+ - Fully differentiable prediction path
+ - Simple install with `pip`
+
+ ---
+
+ ## Performance Note
+
+ In our initial tests on an NVIDIA 3090 (local) and an A100 (Google Colab Pro), WarpGBM trains **14x to 20x faster** than LightGBM under default configurations, while consuming **significantly less RAM and CPU**. These are early results; more thorough benchmarking is planned.
+
+ ---
+
+ ## Installation
+
+ ### 🔧 Recommended (GitHub, always latest):
+
+ ```bash
+ pip install git+https://github.com/jefferythewind/warpgbm.git
+ ```
+
+ This installs the latest version directly from GitHub and compiles the CUDA extensions on your machine against your **local PyTorch and CUDA setup**. It's the most reliable method for ensuring compatibility and staying up to date with the latest features.
+
+ ### 📦 Alternatively (PyPI, stable releases):
+
+ ```bash
+ pip install warpgbm
+ ```
+
+ This installs from PyPI and also compiles the CUDA code locally during installation. It works well **if your environment already has PyTorch with GPU support** installed and configured.
+
+ > 💡 **Tip:**\
+ > If you encounter an error related to mismatched or missing CUDA versions, try installing with the following flag:
+ >
+ > ```bash
+ > pip install warpgbm --no-build-isolation
+ > ```
+
+ Before either method, make sure you’ve installed PyTorch with GPU support:\
+ 👉 [https://pytorch.org/get-started/locally/](https://pytorch.org/get-started/locally/)
+
+ ---
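A quick way to confirm the environment either install method will compile against is to check PyTorch's view of CUDA first. A minimal sanity check (plain PyTorch, not part of WarpGBM):

```python
import torch

# WarpGBM's CUDA extensions are compiled against this toolchain, so a
# GPU-enabled build should be reported before installing.
print(torch.__version__)          # e.g. a "+cuXXX" build string
print(torch.version.cuda)         # CUDA version PyTorch was built with
print(torch.cuda.is_available())  # should be True on a working GPU setup
```

If `is_available()` prints `False`, fix the PyTorch install before troubleshooting WarpGBM itself.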
+
+ ## Example
+
+ ```python
+ import numpy as np
+ from sklearn.datasets import make_regression
+ from time import time
+ import lightgbm as lgb
+ from warpgbm import WarpGBM
+
+ # Create a synthetic regression dataset
+ X, y = make_regression(n_samples=100_000, n_features=500, noise=0.1, random_state=42)
+ X = X.astype(np.float32)
+ y = y.astype(np.float32)
+
+ # Train LightGBM
+ start = time()
+ lgb_model = lgb.LGBMRegressor(max_depth=5, n_estimators=100, learning_rate=0.01, max_bin=7)
+ lgb_model.fit(X, y)
+ lgb_time = time() - start
+ lgb_preds = lgb_model.predict(X)
+
+ # Train WarpGBM
+ start = time()
+ wgbm_model = WarpGBM(max_depth=5, n_estimators=100, learning_rate=0.01, num_bins=7)
+ wgbm_model.fit(X, y)
+ wgbm_time = time() - start
+ wgbm_preds = wgbm_model.predict(X)
+
+ # Results
+ print(f"LightGBM: corr = {np.corrcoef(lgb_preds, y)[0,1]:.4f}, time = {lgb_time:.2f}s")
+ print(f"WarpGBM: corr = {np.corrcoef(wgbm_preds, y)[0,1]:.4f}, time = {wgbm_time:.2f}s")
+ ```
+
+ **🧪 Results (Ryzen 9 CPU, NVIDIA 3090 GPU):**
+
+ ```
+ LightGBM: corr = 0.8742, time = 37.33s
+ WarpGBM: corr = 0.8621, time = 5.40s
+ ```
+
+ ---
+
+ ## Pre-binned Data Example (Numerai)
+
+ WarpGBM can save additional training time if your dataset is already pre-binned. The Numerai tournament data is a great example:
+
+ ```python
+ import pandas as pd
+ from numerapi import NumerAPI
+ from time import time
+ import lightgbm as lgb
+ from warpgbm import WarpGBM
+ import numpy as np
+
+ napi = NumerAPI()
+ napi.download_dataset('v5.0/train.parquet', 'train.parquet')
+ train = pd.read_parquet('train.parquet')
+
+ feature_set = [f for f in train.columns if 'feature' in f]
+ target = 'target_cyrus'
+
+ X_np = train[feature_set].astype('int8').values
+ Y_np = train[target].values
+
+ # LightGBM
+ start = time()
+ lgb_model = lgb.LGBMRegressor(max_depth=5, n_estimators=100, learning_rate=0.01, max_bin=7)
+ lgb_model.fit(X_np, Y_np)
+ lgb_time = time() - start
+ lgb_preds = lgb_model.predict(X_np)
+
+ # WarpGBM
+ start = time()
+ wgbm_model = WarpGBM(max_depth=5, n_estimators=100, learning_rate=0.01, num_bins=7)
+ wgbm_model.fit(X_np, Y_np)
+ wgbm_time = time() - start
+ wgbm_preds = wgbm_model.predict(X_np)
+
+ # Results
+ print(f"LightGBM: corr = {np.corrcoef(lgb_preds, Y_np)[0,1]:.4f}, time = {lgb_time:.2f}s")
+ print(f"WarpGBM: corr = {np.corrcoef(wgbm_preds, Y_np)[0,1]:.4f}, time = {wgbm_time:.2f}s")
+ ```
+
+ ---
+
+ ## Documentation
+
+ ### `WarpGBM` Parameters:
+ - `num_bins`: Number of histogram bins to use (default: 10)
+ - `max_depth`: Maximum depth of trees (default: 3)
+ - `learning_rate`: Shrinkage rate applied to leaf outputs (default: 0.1)
+ - `n_estimators`: Number of boosting iterations (default: 100)
+ - `min_child_weight`: Minimum sum of instance weight (hessian) needed in a child (default: 20)
+ - `min_split_gain`: Minimum loss reduction required to make a further partition (default: 0.0); see the gain sketch after this list
+ - `verbosity`: Whether to print training logs (default: True)
+ - `histogram_computer`: Choice of histogram kernel (`'hist1'`, `'hist2'`, `'hist3'`) (default: `'hist3'`)
+ - `threads_per_block`: CUDA threads per block (default: 32)
+ - `rows_per_thread`: Number of training rows processed per thread (default: 4)
+ - `device`: Device to train on (`'cuda'` or `'cpu'`, default: `'cuda'`)
+ - `split_type`: Algorithm used to choose the best split (`'v1'` = CUDA kernel, `'v2'` = torch-based) (default: `'v2'`)
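The gain sketch referenced at `min_split_gain`: histogram GBDTs typically score a candidate split with the standard second-order gain below, where `λ` plays the role of the `L2_reg` argument added in this release (default `1e-6`; see the `__init__` hunk further down, and note it is missing from the parameter list above). That WarpGBM's kernels compute exactly this form is an assumption based on the parameter names:

```latex
% lambda = L2_reg; G_*, H_* are the summed gradients and hessians in each child.
\mathrm{Gain} = \frac{1}{2}\left[
    \frac{G_L^2}{H_L + \lambda}
  + \frac{G_R^2}{H_R + \lambda}
  - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda}
\right]
```

A split is kept only when this gain exceeds `min_split_gain` and each child's hessian sum is at least `min_child_weight` (for squared error, the hessian sum is simply the child's row count).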
+
+ ### Methods:
+ - `.fit(X, y, era_id=None)`: Train the model. `X` can be raw floats or pre-binned `int8` data. `era_id` is optional and used internally.
+ - `.predict(X)`: Predict on new raw float or pre-binned data.
+ - `.predict_data(bin_indices)`: Predict from binned data directly (NumPy `int8` matrix); see the usage sketch below.
+ - `.grow_forest()`: Manually triggers the tree construction loop (usually not needed).
+
+ ---
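`.predict_data` is documented above only by its signature. A minimal usage sketch, under the assumption that it accepts the same `int8` bin-matrix layout used for pre-binned training:

```python
import numpy as np
from warpgbm import WarpGBM

# Hypothetical pre-binned input: 1000 rows, 20 features, bin indices in [0, 7).
bin_indices = np.random.randint(0, 7, size=(1000, 20)).astype(np.int8)
y = np.random.randn(1000).astype(np.float32)

model = WarpGBM(num_bins=7, max_depth=3, n_estimators=10)
model.fit(bin_indices, y)                # int8 input is treated as pre-binned
preds = model.predict_data(bin_indices)  # skips the binning step entirely
```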
+
+ ## Acknowledgements
+
+ WarpGBM builds on the shoulders of PyTorch, scikit-learn, LightGBM, and the CUDA ecosystem. Thanks to all contributors in the GBDT research and engineering space.
+
+ ---
+
@@ -0,0 +1,68 @@
+ import numpy as np
+ from warpgbm import WarpGBM
+
+ def test_fit_predict_correlation():
+     np.random.seed(42)
+     N = 500
+     F = 5
+     X = np.random.randn(N, F).astype(np.float32)
+     true_weights = np.array([0.5, -1.0, 2.0, 0.0, 1.0])
+     noise = 0.1 * np.random.randn(N)
+     y = (X @ true_weights + noise).astype(np.float32)
+     era = np.zeros(N, dtype=np.int32)
+     corrs = []
+
+     model = WarpGBM(
+         max_depth=10,
+         num_bins=10,
+         n_estimators=10,
+         learning_rate=1,
+         verbosity=False,
+         histogram_computer='hist1',
+         threads_per_block=32,
+         rows_per_thread=4
+     )
+
+     model.fit(X, y, era_id=era)
+     preds = model.predict(X)
+
+     # Pearson correlation in-sample
+     corr = np.corrcoef(preds, y)[0, 1]
+     corrs.append(corr)
+
+     model = WarpGBM(
+         max_depth=10,
+         num_bins=10,
+         n_estimators=10,
+         learning_rate=1,
+         verbosity=False,
+         histogram_computer='hist2',
+         threads_per_block=32,
+         rows_per_thread=4
+     )
+
+     model.fit(X, y, era_id=era)
+     preds = model.predict(X)
+
+     # Pearson correlation in-sample
+     corr = np.corrcoef(preds, y)[0, 1]
+     corrs.append(corr)
+
+     model = WarpGBM(
+         max_depth=10,
+         num_bins=10,
+         n_estimators=10,
+         learning_rate=1,
+         verbosity=False,
+         histogram_computer='hist3',
+         threads_per_block=32,
+         rows_per_thread=4
+     )
+
+     model.fit(X, y, era_id=era)
+     preds = model.predict(X)
+
+     # Pearson correlation in-sample
+     corr = np.corrcoef(preds, y)[0, 1]
+     corrs.append(corr)
+     assert (np.array(corrs) > 0.95).all(), f"In-sample correlations too low: {corrs}"
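The three blocks in this test differ only in `histogram_computer`. A more idiomatic pytest version would parametrize the kernel name; a sketch of that refactor (not what ships in the package):

```python
import numpy as np
import pytest
from warpgbm import WarpGBM

@pytest.mark.parametrize("kernel", ["hist1", "hist2", "hist3"])
def test_fit_predict_correlation(kernel):
    np.random.seed(42)
    X = np.random.randn(500, 5).astype(np.float32)
    true_weights = np.array([0.5, -1.0, 2.0, 0.0, 1.0])
    y = (X @ true_weights + 0.1 * np.random.randn(500)).astype(np.float32)
    era = np.zeros(500, dtype=np.int32)

    model = WarpGBM(
        max_depth=10, num_bins=10, n_estimators=10, learning_rate=1,
        verbosity=False, histogram_computer=kernel,
        threads_per_block=32, rows_per_thread=4,
    )
    model.fit(X, y, era_id=era)
    corr = np.corrcoef(model.predict(X), y)[0, 1]
    assert corr > 0.95, f"{kernel}: in-sample correlation too low: {corr:.4f}"
```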
@@ -0,0 +1 @@
+ 0.1.12
@@ -20,9 +20,10 @@ class WarpGBM(BaseEstimator, RegressorMixin):
         min_child_weight=20,
         min_split_gain=0.0,
         verbosity=True,
-        histogram_computer='hist1',
-        threads_per_block=256,
-        rows_per_thread=1,
+        histogram_computer='hist3',
+        threads_per_block=64,
+        rows_per_thread=4,
+        L2_reg=1e-6,
         device='cuda'
     ):
         self.num_bins = num_bins
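This hunk changes the constructor defaults (`'hist1'` to `'hist3'`, 256 to 64 threads per block, 1 to 4 rows per thread) and introduces `L2_reg`. Note that the README above lists `threads_per_block` as defaulting to 32, which disagrees with the 64 shown here. For callers who want to keep the 0.1.10 behavior on 0.1.12, a hypothetical sketch, assuming the parameters behave as their names suggest:

```python
from warpgbm import WarpGBM

# Pin the old 0.1.10 defaults explicitly (hypothetical usage sketch).
model = WarpGBM(
    histogram_computer='hist1',  # 0.1.12 default: 'hist3'
    threads_per_block=256,       # 0.1.12 default: 64
    rows_per_thread=1,           # 0.1.12 default: 4
)
```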
@@ -52,6 +53,7 @@ class WarpGBM(BaseEstimator, RegressorMixin):
         self.compute_histogram = histogram_kernels[histogram_computer]
         self.threads_per_block = threads_per_block
         self.rows_per_thread = rows_per_thread
+        self.L2_reg = L2_reg


     def fit(self, X, y, era_id=None):
@@ -124,9 +126,9 @@ class WarpGBM(BaseEstimator, RegressorMixin):
             hessian_histogram.contiguous(),
             self.num_features,
             self.num_bins,
-            0.0,   # L2 reg
-            1.0,   # L1 reg
-            1e-6,  # hess cap
+            self.min_split_gain,
+            self.min_child_weight,
+            self.L2_reg,
             self.out_feature,
             self.out_bin
         )
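This hunk replaces three hardcoded kernel arguments (formerly commented as L2 reg, L1 reg, and a hessian cap) with `self.min_split_gain`, `self.min_child_weight`, and `self.L2_reg`, wiring the constructor parameters through to the split search. As an illustration of the technique only (not the actual CUDA kernel), a NumPy sketch of a histogram-based best-split search using these three knobs:

```python
import numpy as np

def best_split(grad_hist, hess_hist, min_split_gain=0.0,
               min_child_weight=20.0, l2_reg=1e-6):
    """Pick the (feature, bin) split maximizing second-order gain.

    grad_hist, hess_hist: (num_features, num_bins) arrays holding the
    per-bin gradient and hessian sums for one tree node.
    """
    G = grad_hist.sum(axis=1, keepdims=True)   # node totals (same for every feature)
    H = hess_hist.sum(axis=1, keepdims=True)
    GL = np.cumsum(grad_hist, axis=1)[:, :-1]  # left-child sums per candidate threshold
    HL = np.cumsum(hess_hist, axis=1)[:, :-1]
    GR, HR = G - GL, H - HL

    gain = 0.5 * (GL**2 / (HL + l2_reg) + GR**2 / (HR + l2_reg)
                  - G**2 / (H + l2_reg))

    # Reject splits whose children carry too little hessian mass.
    gain[(HL < min_child_weight) | (HR < min_child_weight)] = -np.inf

    f, b = np.unravel_index(np.argmax(gain), gain.shape)
    return (f, b, gain[f, b]) if gain[f, b] > min_split_gain else None
```

Here `l2_reg` is the same `λ` as in the gain formula shown after the parameter list above.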
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: warpgbm
- Version: 0.1.10
+ Version: 0.1.12
  License-File: LICENSE
  Requires-Dist: torch
  Requires-Dist: numpy
warpgbm-0.1.10/README.md DELETED
@@ -1,60 +0,0 @@
- # WarpGBM
-
- WarpGBM is a high-performance, GPU-accelerated Gradient Boosted Decision Tree (GBDT) library built with PyTorch and CUDA. It offers blazing-fast histogram-based training and efficient prediction, with compatibility for research and production workflows.
-
- ---
-
- ## Features
-
- - GPU-accelerated training and histogram construction using custom CUDA kernels
- - Drop-in scikit-learn style interface
- - Supports pre-binned data or automatic quantile binning
- - Fully differentiable prediction path
- - Simple install with `pip`
-
- ---
-
- ## Performance Note
-
- In our initial tests on an NVIDIA 3090 (local) and A100 (Google Colab Pro), WarpGBM achieves **14x to 20x faster training times** compared to LightGBM using default configurations. It also consumes **significantly less RAM and CPU**. These early results hint at more thorough benchmarking to come.
-
- ---
-
- ## Installation
-
- First, install PyTorch for your system with GPU support:
- https://pytorch.org/get-started/locally/
-
- Then:
-
- ```bash
- pip install warpgbm
- ```
-
- Note: WarpGBM will compile custom CUDA extensions at install time using your installed PyTorch.
-
- ---
-
- ## Example
-
- ```python
- import numpy as np
- from warpgbm import WarpGBM
-
- # Generate a simple regression dataset
- X = np.random.randn(100, 5).astype(np.float32)
- w = np.array([0.5, -1.0, 2.0, 0.0, 1.0])
- y = (X @ w + 0.1 * np.random.randn(100)).astype(np.float32)
-
- model = WarpGBM(max_depth=3, n_estimators=10)
- model.fit(X, y)  # era_id is optional
- preds = model.predict(X)
- ```
-
- ---
-
- ## Acknowledgements
-
- WarpGBM builds on the shoulders of PyTorch, scikit-learn, LightGBM, and the CUDA ecosystem. Thanks to all contributors in the GBDT research and engineering space.
-
- ---
@@ -1,29 +0,0 @@
- import numpy as np
- from warpgbm import WarpGBM
-
- def test_fit_predict_correlation():
-     np.random.seed(42)
-
-     N = 200
-     F = 5
-     X = np.random.randn(N, F).astype(np.float32)
-     true_weights = np.array([0.5, -1.0, 2.0, 0.0, 1.0])
-     noise = 0.1 * np.random.randn(N)
-     y = (X @ true_weights + noise).astype(np.float32)
-     era = np.zeros(N, dtype=np.int32)
-
-     model = WarpGBM(
-         num_bins=16,
-         max_depth=3,
-         n_estimators=10,
-         learning_rate=0.2,
-         verbosity=False,
-         device='cuda'
-     )
-
-     model.fit(X, y, era_id=era)
-     preds = model.predict(X)
-
-     # Pearson correlation in-sample
-     corr = np.corrcoef(preds, y)[0, 1]
-     assert corr > 0.95, f"In-sample correlation too low: {corr:.4f}"
@@ -1 +0,0 @@
- 0.1.10