@wlearn/lightgbm 0.1.0 → 0.2.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,14 @@
  # Changelog

+ ## 0.2.0
+
+ - Depend on the CommonJS `@wlearn/core` release
+ - Add package homepage and GitHub issue metadata
+
+ - Wrap LGBModel with `createModelClass` for unified task detection
+ - Add `task` parameter: `'classification'` or `'regression'`, auto-detected from labels if omitted
+ - When both `task` and `objective` are set, `objective` takes precedence
+
  ## 0.1.0

  - Initial release
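
The `task` entry above maps a unified task onto a LightGBM objective; the README change in this release states the rule as `binary` for 2 unique labels and `multiclass` for 3 or more. A minimal sketch of that rule follows. `objectiveForTask` is a hypothetical helper written for illustration, not a function exported by `@wlearn/lightgbm`, and the error thrown for fewer than two distinct labels is an assumption. It deliberately leaves out label-based auto-detection and the case where both `task` and `objective` are set, since the diff does not spell out those rules.

```javascript
// Hypothetical helper, not part of @wlearn/lightgbm: maps a unified task plus
// training labels to LightGBM objective parameters.
function objectiveForTask(task, y) {
  if (task === 'regression') {
    return { objective: 'regression' }
  }
  if (task !== 'classification') {
    throw new Error(`unknown task: ${task}`)
  }
  // Count distinct labels; works for number[] and typed arrays alike.
  const numClass = new Set(y).size
  if (numClass < 2) {
    // Assumption: a single-class label set is rejected in this sketch.
    throw new Error('classification needs at least 2 distinct labels')
  }
  return numClass === 2
    ? { objective: 'binary' }
    : { objective: 'multiclass', num_class: numClass }
}

console.log(objectiveForTask('classification', [0, 0, 1, 1]))
console.log(objectiveForTask('classification', [0, 1, 2, 1]))
console.log(objectiveForTask('regression', [0.5, 1.25, 3]))
```

The returned object could be merged into the parameters passed to `LGBModel.create`, though how the package wires this internally is not shown in the diff.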
package/LICENSE CHANGED
@@ -1,6 +1,7 @@
- MIT License
+ The MIT License (MIT)

- Copyright (c) 2025 StatSim
+ Copyright (c) Microsoft Corporation (LightGBM)
+ Copyright (c) 2026 Anton Zemlyansky (WASM port)

  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
package/README.md CHANGED
@@ -1,41 +1,138 @@
  # @wlearn/lightgbm

- LightGBM WASM port for wlearn. Gradient boosting for classification and regression,
- running in browser and Node.js via WebAssembly.
+ LightGBM v4.6.0 compiled to WebAssembly. Gradient boosting for classification and regression in browsers and Node.js.

- ## Installation
+ Part of [wlearn](https://wlearn.org) ([GitHub](https://github.com/wlearn-org), [all packages](https://github.com/wlearn-org/wlearn#repository-structure)). Based on [LightGBM v4.6.0](https://github.com/microsoft/LightGBM) (MIT). Zero dependencies. CommonJS.
+
+ ## Install

  ```bash
  npm install @wlearn/lightgbm
  ```

- ## Usage
+ ## Quick start
 
- ```javascript
- import { LGBModel } from '@wlearn/lightgbm'
+ ```js
+ const { LGBModel } = require('@wlearn/lightgbm')

- // Create and train
  const model = await LGBModel.create({
-   objective: 'binary',
+   task: 'classification', // or 'regression'; auto-detected from labels if omitted
    learning_rate: 0.05,
    num_leaves: 31,
    numRound: 100
  })
- model.fit(X, y)
+
+ // Train -- accepts number[][] or { data: Float32Array, rows, cols }
+ model.fit(
+   [[1, 2], [3, 4], [5, 6], [7, 8]],
+   [0, 0, 1, 1]
+ )

  // Predict
- const predictions = model.predict(X_test)
- const probabilities = model.predictProba(X_test)
- const accuracy = model.score(X_test, y_test)
+ const preds = model.predict([[2, 3], [6, 7]]) // Float64Array
+
+ // Probabilities
+ const probs = model.predictProba([[2, 3], [6, 7]]) // Float64Array (nrow * nclass)
+
+ // Score
+ const accuracy = model.score([[2, 3], [6, 7]], [0, 1])

- // Save and load
- const bundle = model.save()
- const loaded = await LGBModel.load(bundle)
+ // Save / load
+ const buf = model.save() // Uint8Array (WLRN bundle)
+ const model2 = await LGBModel.load(buf)

- // Clean up
+ // Clean up -- required, WASM memory is not garbage collected
  model.dispose()
+ model2.dispose()
+ ```
+
+ ## Typed matrix input
+
+ For best performance, pass pre-formatted typed arrays instead of `number[][]`:
+
+ ```js
+ const X = {
+   data: new Float32Array([1, 2, 3, 4, 5, 6, 7, 8]),
+   rows: 4,
+   cols: 2
+ }
+ model.fit(X, [0, 0, 1, 1])
  ```
 
+ Note: LightGBM uses Float32 internally. If you pass Float64Array it will be converted.
+
+ ## Task parameter
+
+ Instead of specifying LightGBM objective strings directly, you can use the unified `task` parameter:
+
+ ```js
+ // These are equivalent:
+ await LGBModel.create({ task: 'classification' }) // auto-selects 'binary' or 'multiclass'
+ await LGBModel.create({ objective: 'binary' })
+
+ await LGBModel.create({ task: 'regression' })
+ await LGBModel.create({ objective: 'regression' })
+ ```
+
+ When `task: 'classification'` is set, the objective is chosen automatically based on the number of unique labels in y: `binary` for 2 classes, `multiclass` for 3+.
+
+ Setting both `task` and `objective` throws an error.
+
+ ## API
+
+ ### `LGBModel.create(params?)`
+
+ Async factory. Loads WASM module, returns a ready-to-use model.
+
+ Parameters:
+ - `objective` -- LightGBM objective string (default: `'regression'`)
+ - `task` -- `'classification'` or `'regression'` (alternative to `objective`)
+ - `learning_rate` -- step size shrinkage (default: `0.1`)
+ - `num_leaves` -- max leaves per tree (default: `31`)
+ - `numRound` -- number of boosting rounds (default: `100`)
+ - `max_depth` -- max tree depth, -1 for no limit (default: `-1`)
+ - `subsample` -- row subsampling ratio (default: `1.0`)
+ - `colsample_bytree` -- column subsampling ratio (default: `1.0`)
+ - `min_child_weight` -- minimum sum of instance weight in a child (default: `1e-3`)
+ - `reg_lambda` -- L2 regularization (default: `0.0`)
+ - `reg_alpha` -- L1 regularization (default: `0.0`)
+ - `num_class` -- number of classes for multiclass (auto-set when using `task`)
+ - `verbosity` -- -1 = fatal, 0 = error, 1 = info (default: `-1`)
+
+ ### `model.fit(X, y)`
+
+ Train on data. Returns `this`.
+ - `X` -- `number[][]` or `{ data: Float32Array, rows, cols }`
+ - `y` -- `number[]` or typed array
+
+ ### `model.predict(X)`
+
+ Returns `Float64Array` of predicted labels (classification) or values (regression).
+
+ ### `model.predictProba(X)`
+
+ Returns `Float64Array` of shape `nrow * nclass` (row-major probabilities). Available for `binary`, `multiclass`, and `multiclassova` objectives.
+
+ ### `model.score(X, y)`
+
+ Returns accuracy (classification) or R-squared (regression).
+
+ ### `model.save()` / `LGBModel.load(buffer)`
+
+ Save to / load from `Uint8Array` (WLRN bundle with LightGBM text model blob).
+
+ ### `model.dispose()`
+
+ Free WASM memory. Required. Idempotent.
+
+ ### `model.getParams()` / `model.setParams(p)`
+
+ Get/set hyperparameters. Enables AutoML grid search and cloning.
+
+ ### `LGBModel.defaultSearchSpace()`
+
+ Returns default hyperparameter search space for AutoML.
+
  ## Supported objectives

  - `binary` -- binary classification
@@ -44,6 +141,73 @@ model.dispose()
  - `cross_entropy` -- cross-entropy classification
  - `regression` -- regression (default)

+ All standard LightGBM objectives should work. These are tested in CI.
+
+ ## Low-level API
+
+ For direct access to LightGBM's C API, use the lower-level `Dataset` and `Booster` classes:
+
+ ```js
+ const { loadLGB, Dataset, Booster } = require('@wlearn/lightgbm')
+
+ await loadLGB()
+
+ const data = new Float32Array([1, 2, 3, 4, 5, 6, 7, 8])
+ const ds = new Dataset(data, 4, 2, 'objective=binary verbosity=-1')
+ ds.setLabel(new Float32Array([0, 0, 1, 1]))
+
+ const booster = new Booster(ds.handle, 'objective=binary verbosity=-1')
+ for (let i = 0; i < 100; i++) {
+   booster.update()
+ }
+
+ const preds = booster.predict(data, 4, 2) // Float64Array
+ const modelBytes = booster.saveModel() // Uint8Array (text format)
+
+ booster.dispose()
+ ds.dispose()
+ ```
+
+ ### `Dataset(data, nrow, ncol, params?)`
+
+ - `data` -- `Float32Array` (row-major)
+ - `params` -- LightGBM parameter string (`"key1=value1 key2=value2"`)
+ - `.setLabel(labels)` -- set target labels (`Float32Array`)
+ - `.dispose()` -- free WASM memory
+
+ ### `Booster(trainDataHandle, paramsStr)`
+
+ - `.update()` -- run one training round, returns `true` if training finished
+ - `.predict(data, nrow, ncol, opts?)` -- predict, returns `Float64Array`
+ - `.saveModel()` -- returns `Uint8Array` (LightGBM text format)
+ - `.getNumClasses()` -- number of classes
+ - `.dispose()` -- free WASM memory
+
+ ### `Booster.loadModel(buffer)`
+
+ Load from `Uint8Array`. Returns a `Booster`.
+
+ ## Resource management
+
+ WASM heap memory is not garbage collected. Call `.dispose()` on every `Dataset`, `Booster`, and `LGBModel` when done. A `FinalizationRegistry` safety net warns if you forget, but do not rely on it.
+
+ ## Build from source
+
+ Requires [Emscripten](https://emscripten.org/) (emsdk) activated.
+
+ ```bash
+ git clone --recurse-submodules https://github.com/wlearn-org/lightgbm-wasm
+ cd lightgbm-wasm
+ bash scripts/build-wasm.sh
+ node --test test/
+ ```
+
+ If you already cloned without `--recurse-submodules`:
+
+ ```bash
+ git submodule update --init --recursive
+ ```
+
  ## License

- MIT (upstream LightGBM is MIT-licensed)
+ MIT (same as upstream LightGBM)
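
The README added in this diff documents two row-major layouts: the `{ data: Float32Array, rows, cols }` input shape and the flat `nrow * nclass` array returned by `predictProba`. The sketch below shows one way to convert to and from those layouts in plain JavaScript. `toTypedMatrix` and `splitRows` are hypothetical helpers, not exports of `@wlearn/lightgbm`, and the probability values are stand-ins rather than real model output.

```javascript
// Hypothetical helpers, not part of @wlearn/lightgbm.

// Flatten number[][] into the { data, rows, cols } shape the README documents.
function toTypedMatrix(rowsIn) {
  const rows = rowsIn.length
  const cols = rows > 0 ? rowsIn[0].length : 0
  const data = new Float32Array(rows * cols)
  for (let i = 0; i < rows; i++) {
    if (rowsIn[i].length !== cols) {
      throw new Error(`row ${i} has ${rowsIn[i].length} columns, expected ${cols}`)
    }
    // Row i occupies indices [i * cols, (i + 1) * cols) in row-major order.
    data.set(rowsIn[i], i * cols)
  }
  return { data, rows, cols }
}

// Split a flat row-major typed array into one view per row (no copying).
function splitRows(flat, ncol) {
  if (ncol <= 0 || flat.length % ncol !== 0) {
    throw new Error('flat length must be a positive multiple of ncol')
  }
  const out = []
  for (let start = 0; start < flat.length; start += ncol) {
    out.push(flat.subarray(start, start + ncol))
  }
  return out
}

const X = toTypedMatrix([[1, 2], [3, 4], [5, 6], [7, 8]])
console.log(X.rows, X.cols, Array.from(X.data))

// Stand-in values shaped like a 2-row, 2-class predictProba result.
const fakeProbs = new Float64Array([0.9, 0.1, 0.2, 0.8])
console.log(splitRows(fakeProbs, 2).map((row) => Array.from(row)))
```

Because `subarray` returns views over the same buffer, the per-row arrays stay valid only as long as the flat array does; copy them with `Array.from` or `slice` if they need to outlive it.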