sparsepixels 0.2.2__tar.gz → 0.3.0__tar.gz

@@ -0,0 +1,188 @@
1
+ Metadata-Version: 2.4
2
+ Name: sparsepixels
3
+ Version: 0.3.0
4
+ Summary: Efficient convolution for sparse data on FPGAs
5
+ Home-page: https://github.com/hftsoi/sparse-pixels
6
+ Author: Ho Fung Tsoi
7
+ Author-email: ho.fung.tsoi@cern.ch
8
+ License: MIT
9
+ Classifier: Programming Language :: Python :: 3
10
+ Classifier: License :: OSI Approved :: MIT License
11
+ Requires-Python: >=3.10
12
+ Description-Content-Type: text/markdown
13
+ License-File: LICENSE
14
+ Requires-Dist: tensorflow
15
+ Requires-Dist: keras>=3.0
16
+ Requires-Dist: HGQ2>=0.1.8
17
+ Requires-Dist: matplotlib
18
+ Dynamic: license-file
19
+
20
+ <p align="center">
21
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/logo.png" width="300" />
22
+ </p>
23
+
24
+ <p align="center">
25
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/sparsepixels.png" width="900"/>
26
+ </p>
27
+
28
+ <p align="center">
29
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_standard.gif" width="400" />
30
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_sparse.gif" width="400" />
31
+ </p>
32
+
33
+ # SparsePixels: Efficient convolution for sparse data on FPGAs
34
+
35
+ [![arXiv](https://img.shields.io/badge/arXiv-2512.06208-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2512.06208)
36
+ [![PyPI - Version](https://img.shields.io/pypi/v/sparsepixels?color=orange&style=flat-square)](https://pypi.org/project/sparsepixels)
37
+
38
+ SparsePixels is a Keras 3 library to build, train, and deploy sparse convolutional neural networks on FPGAs. In many detectors, especially in high-energy physics experiments, the images are almost empty: only a handful of pixels carry a signal (the hits), yet a standard CNN still spends compute on every pixel. A sparse CNN convolves only over the active pixels, so its cost scales with the number of hits rather than the image size, which is what makes low-latency, real-time inference (for example in a trigger) feasible on an FPGA. This library builds quantization-aware (via [HGQ2](https://github.com/calad0i/HGQ2)) sparse CNNs in which the pixel budget and the activity threshold can be learned from data, with a hardware-aware penalty that drives the budget toward the fewest pixels the task tolerates. Trained models convert to FPGA firmware through the [hls4ml](https://github.com/fastmachinelearning/hls4ml) integration, with control over the parallelization of the sparse layers to trade latency against resource usage.
39
+
40
+ ## Installation
41
+
42
+ With Python >= 3.10:
43
+
44
+ ```
45
+ pip install sparsepixels
46
+ ```
47
+
48
+ ## Getting Started
49
+
50
+ Import the sparse layers, the quantization library (HGQ2), and the training utilities:
51
+
52
+ ```python
53
+ import keras
54
+ from keras.layers import Flatten, Activation
55
+ from hgq.layers import QConv2D, QDense
56
+ from hgq.config import QuantizerConfigScope, LayerConfigScope
57
+ from hgq.quantizer.config import QuantizerConfig
58
+ from sparsepixels.layers import InputReduce, QConv2DSparse, AveragePooling2DSparse, MaxPooling2DSparse
59
+ from sparsepixels.utils import (
60
+ active_pixels_vs_threshold, plot_reduced_examples,
61
+ set_sparse_ebops_factor, cosine_lr,
62
+ SparseTrainingMonitor, plot_history,
63
+ print_quantization, plot_quantization,
64
+ )
65
+ ```
66
+
67
+ First, study the data to pick an activity threshold and an initial pixel budget `n`: check how many
68
+ pixels stay active as the threshold rises, and what a candidate `(n, threshold)` pair keeps on a few images.
69
+
70
+ ```python
71
+ active_pixels_vs_threshold(x_train)
72
+ plot_reduced_examples(x_train, n=20, threshold=0.1, n_examples=4)
73
+ ```
74
+
75
+ Build an example sparse CNN within HGQ2 quantization scopes. A custom input quantizer config with
76
+ higher initial fractional bits (`f0=8`) prevents the default (`f0=2`) from zeroing out sparse signals
77
+ in early training epochs. `InputReduce` keeps the first `n` active pixels (pixels whose first channel is above
78
+ `threshold`); by default `n` and `threshold` are trainable hyperparameters, and a penalty of weight `beta_n`
79
+ nudges the budget smaller, trading a little accuracy for lower FPGA latency and resources.
80
+
81
+ ```python
82
+ iq_conf = QuantizerConfig(place='datalane', q_type='kif', i0=4, f0=8, overflow_mode='WRAP')
83
+
84
+ with (
85
+ QuantizerConfigScope(place='all', default_q_type='kbi', overflow_mode='SAT_SYM'),
86
+ QuantizerConfigScope(place='datalane', default_q_type='kif', overflow_mode='WRAP'),
87
+ LayerConfigScope(enable_ebops=True, enable_iq=True, beta0=1e-5),
88
+ ):
89
+ x_in = keras.Input(shape=(28, 28, 1), name='x_in')
90
+
91
+ # Sparse input reduction
92
+ x, keep_mask = InputReduce(
93
+ n=30, # initial pixel budget
94
+ threshold=0.1, # initial activity threshold
95
+ beta_n=1e-5, # weight of the pixel budget penalty
96
+ learn_n=True, # trainable pixel budget
97
+ learn_threshold=True, # trainable threshold
98
+ name='input_reduce',
99
+ )(x_in)
100
+
101
+ # Sparse convolution
102
+ x = QConv2DSparse(filters=3, kernel_size=3, name='conv1', padding='same', strides=1,
103
+ activation='relu', iq_conf=iq_conf)([x, keep_mask])
104
+
105
+ # Sparse pooling
106
+ x, keep_mask = AveragePooling2DSparse(2, name='pool1')([x, keep_mask])
107
+
108
+ x = Flatten(name='flatten')(x)
109
+ x = QDense(10, name='dense1', activation='relu', iq_conf=iq_conf)(x)
110
+ x = Activation('softmax', name='softmax')(x)
111
+
112
+ model = keras.Model(x_in, x)
113
+ ```
114
+
115
+ Train the model, then read out the learned sparsity to deploy. `set_sparse_ebops_factor` makes the
116
+ EBOPS (a proxy for the quantized hardware cost) reflect the sparse compute rather than a dense one; a
117
+ cosine-decayed learning rate together with `restore_best_weights` keeps the learned budget from
118
+ over-compressing near the end of training. `plot_history` shows the loss breakdown, the learned
119
+ budget/threshold, and the EBOPS in one figure; the values to deploy are the `InputReduce` layer's
120
+ `n_max_pixels` and `threshold`.
121
+
122
+ ```python
123
+ set_sparse_ebops_factor(model)
124
+
125
+ steps_per_epoch = len(x_train) // 128
126
+ early_stop = keras.callbacks.EarlyStopping(monitor='val_accuracy', mode='max', patience=20, restore_best_weights=True)
127
+ model.compile(
128
+ optimizer=keras.optimizers.Adam(cosine_lr(1e-3, epochs=100, steps_per_epoch=steps_per_epoch)),
129
+ loss='categorical_crossentropy', metrics=['accuracy'],
130
+ )
131
+ history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
132
+ epochs=100, batch_size=128, callbacks=[early_stop, SparseTrainingMonitor()])
133
+
134
+ plot_history(history, early_stopping=early_stop) # loss breakdown, budget, threshold, EBOPS
135
+ print_quantization(model) # per-layer bit-width distribution and EBOPS
136
+ plot_quantization(model)
137
+
138
+ ir = model.get_layer('input_reduce')
139
+ print(f"deploy with n_max_pixels={ir.n_max_pixels}, threshold={ir.threshold:.3f}")
140
+ ```
141
+
142
+ ## Converting a trained model to HLS with hls4ml
143
+
144
+ > **Note:** A [PR](https://github.com/fastmachinelearning/hls4ml/pull/1468) adding `sparsepixels` support to the official [hls4ml](https://github.com/fastmachinelearning/hls4ml) repo has been submitted but is not yet merged. In the meantime you can install hls4ml from the PR branch on this fork to try the converter:
145
+ >
146
+ > ```bash
147
+ > pip install "git+https://github.com/hftsoi/hls4ml.git@sparsepixels"
148
+ > ```
149
+
150
+ Once installed, convert a trained sparsepixels model to HLS with the usual hls4ml workflow:
151
+
152
+ ```python
153
+ import hls4ml
154
+
155
+ hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')
156
+ hls_config.setdefault('Model', {})['PipelineStyle'] = 'dataflow' # use "#pragma HLS DATAFLOW" (instead of the default "#pragma HLS PIPELINE" for io_parallel)
157
+
158
+ hls_model = hls4ml.converters.convert_from_keras_model(
159
+ model,
160
+ hls_config=hls_config,
161
+ output_dir='hls_proj/my_sparse_cnn',
162
+ backend='Vitis',
163
+ io_type='io_parallel',
164
+ )
165
+ hls_model.write()
166
+ hls_model.compile()
167
+ y_hls = hls_model.predict(x_test)
168
+ ```
169
+
170
+ ## Documentation
171
+
172
+ Coming soon!
173
+
174
+ ## Citation
175
+
176
+ If you find this useful in your research, please consider citing:
177
+
178
+ ```
179
+ @article{Tsoi:2025nvg,
180
+ author = "Tsoi, Ho Fung and Rankin, Dylan and Loncar, Vladimir and Harris, Philip",
181
+ title = "{SparsePixels: Efficient Convolution for Sparse Data on FPGAs}",
182
+ eprint = "2512.06208",
183
+ archivePrefix = "arXiv",
184
+ primaryClass = "cs.AR",
185
+ month = "12",
186
+ year = "2025"
187
+ }
188
+ ```
@@ -0,0 +1,169 @@
1
+ <p align="center">
2
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/logo.png" width="300" />
3
+ </p>
4
+
5
+ <p align="center">
6
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/sparsepixels.png" width="900"/>
7
+ </p>
8
+
9
+ <p align="center">
10
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_standard.gif" width="400" />
11
+ <img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_sparse.gif" width="400" />
12
+ </p>
13
+
14
+ # SparsePixels: Efficient convolution for sparse data on FPGAs
15
+
16
+ [![arXiv](https://img.shields.io/badge/arXiv-2512.06208-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2512.06208)
17
+ [![PyPI - Version](https://img.shields.io/pypi/v/sparsepixels?color=orange&style=flat-square)](https://pypi.org/project/sparsepixels)
18
+
19
+ SparsePixels is a Keras 3 library to build, train, and deploy sparse convolutional neural networks on FPGAs. In many detectors, especially in high-energy physics experiments, the images are almost empty: only a handful of pixels carry a signal (the hits), yet a standard CNN still spends compute on every pixel. A sparse CNN convolves only over the active pixels, so its cost scales with the number of hits rather than the image size, which is what makes low-latency, real-time inference (for example in a trigger) feasible on an FPGA. This library builds quantization-aware (via [HGQ2](https://github.com/calad0i/HGQ2)) sparse CNNs in which the pixel budget and the activity threshold can be learned from data, with a hardware-aware penalty that drives the budget toward the fewest pixels the task tolerates. Trained models convert to FPGA firmware through the [hls4ml](https://github.com/fastmachinelearning/hls4ml) integration, with control over the parallelization of the sparse layers to trade latency against resource usage.
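To make the scaling argument concrete, here is a back-of-the-envelope sketch that counts multiply-accumulates (MACs) for a single convolution layer. It is not part of the library: `dense_conv_macs` and `sparse_conv_macs` are hypothetical helpers, and the count assumes stride 1, `same` padding, and no overhead for locating the active pixels.

```python
def dense_conv_macs(height, width, kernel_size, c_in, c_out):
    """MACs for a stride-1, 'same'-padded convolution evaluated at every pixel."""
    return height * width * kernel_size * kernel_size * c_in * c_out


def sparse_conv_macs(n_active, kernel_size, c_in, c_out):
    """MACs when the kernel is evaluated only at the n_active kept pixels."""
    return n_active * kernel_size * kernel_size * c_in * c_out


# A 28x28 single-channel image, 3x3 kernel, 3 filters, pixel budget of 30.
dense = dense_conv_macs(28, 28, kernel_size=3, c_in=1, c_out=3)
sparse = sparse_conv_macs(30, kernel_size=3, c_in=1, c_out=3)
print(dense, sparse, round(dense / sparse, 1))
```

Under these assumptions the dense layer costs 21168 MACs against 810 for the sparse one, because the cost follows the pixel budget (30) rather than the image area (784).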
20
+
21
+ ## Installation
22
+
23
+ With Python >= 3.10:
24
+
25
+ ```
26
+ pip install sparsepixels
27
+ ```
28
+
29
+ ## Getting Started
30
+
31
+ Import the sparse layers, the quantization library (HGQ2), and the training utilities:
32
+
33
+ ```python
34
+ import keras
35
+ from keras.layers import Flatten, Activation
36
+ from hgq.layers import QConv2D, QDense
37
+ from hgq.config import QuantizerConfigScope, LayerConfigScope
38
+ from hgq.quantizer.config import QuantizerConfig
39
+ from sparsepixels.layers import InputReduce, QConv2DSparse, AveragePooling2DSparse, MaxPooling2DSparse
40
+ from sparsepixels.utils import (
41
+ active_pixels_vs_threshold, plot_reduced_examples,
42
+ set_sparse_ebops_factor, cosine_lr,
43
+ SparseTrainingMonitor, plot_history,
44
+ print_quantization, plot_quantization,
45
+ )
46
+ ```
47
+
48
+ First, study the data to pick an activity threshold and an initial pixel budget `n`: check how many
49
+ pixels stay active as the threshold rises, and what a candidate `(n, threshold)` pair keeps on a few images.
50
+
51
+ ```python
52
+ active_pixels_vs_threshold(x_train)
53
+ plot_reduced_examples(x_train, n=20, threshold=0.1, n_examples=4)
54
+ ```
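The quantity behind this kind of study can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the implementation of `active_pixels_vs_threshold`: the helpers `count_active` and `mean_active_pixels` are hypothetical, and "active" is assumed to mean strictly above the threshold.

```python
def count_active(image, threshold):
    """Number of pixels in a 2D image (list of rows) strictly above threshold."""
    return sum(1 for row in image for value in row if value > threshold)


def mean_active_pixels(images, thresholds):
    """Average active-pixel count over the images, for each candidate threshold."""
    return {t: sum(count_active(img, t) for img in images) / len(images)
            for t in thresholds}


# Two toy 3x3 "detector images" with a few hits each.
images = [
    [[0.0, 0.9, 0.0], [0.0, 0.05, 0.3], [0.0, 0.0, 0.0]],
    [[0.2, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.6, 0.15]],
]
curve = mean_active_pixels(images, thresholds=[0.0, 0.1, 0.5])
print(curve)
```

A curve like this suggests a starting budget: choose `n` somewhat above the typical active-pixel count at the chosen threshold, then let training shrink it.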
55
+
56
+ Build an example sparse CNN within HGQ2 quantization scopes. A custom input quantizer config with
57
+ higher initial fractional bits (`f0=8`) prevents the default (`f0=2`) from zeroing out sparse signals
58
+ in early training epochs. `InputReduce` keeps the first `n` active pixels (pixels whose first channel is above
59
+ `threshold`); by default `n` and `threshold` are trainable hyperparameters, and a penalty of weight `beta_n`
60
+ nudges the budget smaller, trading a little accuracy for lower FPGA latency and resources.
61
+
62
+ ```python
63
+ iq_conf = QuantizerConfig(place='datalane', q_type='kif', i0=4, f0=8, overflow_mode='WRAP')
64
+
65
+ with (
66
+ QuantizerConfigScope(place='all', default_q_type='kbi', overflow_mode='SAT_SYM'),
67
+ QuantizerConfigScope(place='datalane', default_q_type='kif', overflow_mode='WRAP'),
68
+ LayerConfigScope(enable_ebops=True, enable_iq=True, beta0=1e-5),
69
+ ):
70
+ x_in = keras.Input(shape=(28, 28, 1), name='x_in')
71
+
72
+ # Sparse input reduction
73
+ x, keep_mask = InputReduce(
74
+ n=30, # initial pixel budget
75
+ threshold=0.1, # initial activity threshold
76
+ beta_n=1e-5, # weight of the pixel budget penalty
77
+ learn_n=True, # trainable pixel budget
78
+ learn_threshold=True, # trainable threshold
79
+ name='input_reduce',
80
+ )(x_in)
81
+
82
+ # Sparse convolution
83
+ x = QConv2DSparse(filters=3, kernel_size=3, name='conv1', padding='same', strides=1,
84
+ activation='relu', iq_conf=iq_conf)([x, keep_mask])
85
+
86
+ # Sparse pooling
87
+ x, keep_mask = AveragePooling2DSparse(2, name='pool1')([x, keep_mask])
88
+
89
+ x = Flatten(name='flatten')(x)
90
+ x = QDense(10, name='dense1', activation='relu', iq_conf=iq_conf)(x)
91
+ x = Activation('softmax', name='softmax')(x)
92
+
93
+ model = keras.Model(x_in, x)
94
+ ```
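As a rough mental model of the reduction step, the sketch below keeps at most `n` above-threshold pixels of a single-channel image and returns the reduced image together with a keep mask. It is an assumption-laden toy, not the `InputReduce` layer: `reduce_input` is a hypothetical helper, "first" is assumed to mean row-by-row scan order, and batching, extra channels, and the trainable `n`/`threshold` are all left out.

```python
def reduce_input(image, n, threshold):
    """Keep at most n pixels above threshold, scanning row by row.

    Returns (reduced_image, keep_mask); dropped pixels are set to 0.0.
    """
    reduced = [[0.0] * len(row) for row in image]
    keep_mask = [[0] * len(row) for row in image]
    kept = 0
    for i, row in enumerate(image):
        for j, value in enumerate(row):
            if kept < n and value > threshold:
                reduced[i][j] = value
                keep_mask[i][j] = 1
                kept += 1
    return reduced, keep_mask


# Four pixels exceed the threshold, but the budget only allows two.
image = [[0.0, 0.9, 0.0], [0.05, 0.3, 0.0], [0.0, 0.6, 0.2]]
reduced, keep_mask = reduce_input(image, n=2, threshold=0.1)
print(keep_mask)
```

In this toy the mask plays the same role as `keep_mask` in the model above: it tells the following sparse layers which positions still carry data.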
95
+
96
+ Train the model, then read out the learned sparsity to deploy. `set_sparse_ebops_factor` makes the
97
+ EBOPS (a proxy for the quantized hardware cost) reflect the sparse compute rather than a dense one; a
98
+ cosine-decayed learning rate together with `restore_best_weights` keeps the learned budget from
99
+ over-compressing near the end of training. `plot_history` shows the loss breakdown, the learned
100
+ budget/threshold, and the EBOPS in one figure; the values to deploy are the `InputReduce` layer's
101
+ `n_max_pixels` and `threshold`.
102
+
103
+ ```python
104
+ set_sparse_ebops_factor(model)
105
+
106
+ steps_per_epoch = len(x_train) // 128
107
+ early_stop = keras.callbacks.EarlyStopping(monitor='val_accuracy', mode='max', patience=20, restore_best_weights=True)
108
+ model.compile(
109
+ optimizer=keras.optimizers.Adam(cosine_lr(1e-3, epochs=100, steps_per_epoch=steps_per_epoch)),
110
+ loss='categorical_crossentropy', metrics=['accuracy'],
111
+ )
112
+ history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
113
+ epochs=100, batch_size=128, callbacks=[early_stop, SparseTrainingMonitor()])
114
+
115
+ plot_history(history, early_stopping=early_stop) # loss breakdown, budget, threshold, EBOPS
116
+ print_quantization(model) # per-layer bit-width distribution and EBOPS
117
+ plot_quantization(model)
118
+
119
+ ir = model.get_layer('input_reduce')
120
+ print(f"deploy with n_max_pixels={ir.n_max_pixels}, threshold={ir.threshold:.3f}")
121
+ ```
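For intuition about the learning-rate schedule, the following sketch evaluates a textbook cosine decay. It assumes `cosine_lr` anneals from the initial rate to zero over `epochs * steps_per_epoch` steps with no warmup, which this README does not confirm; `cosine_decay` is a hypothetical stand-in, and the 468 steps per epoch are an illustrative value only.

```python
import math


def cosine_decay(step, initial_lr, total_steps, final_lr=0.0):
    """Learning rate at `step` under cosine annealing from initial_lr to final_lr."""
    progress = min(step, total_steps) / total_steps
    return final_lr + 0.5 * (initial_lr - final_lr) * (1.0 + math.cos(math.pi * progress))


total_steps = 100 * 468  # epochs * steps_per_epoch (illustrative values)
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_decay(step, 1e-3, total_steps))
```

The rate falls slowly at first, fastest around the midpoint, and flattens toward zero at the end, so late updates to the learned budget become progressively smaller.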
122
+
123
+ ## Converting a trained model to HLS with hls4ml
124
+
125
+ > **Note:** A [PR](https://github.com/fastmachinelearning/hls4ml/pull/1468) adding `sparsepixels` support to the official [hls4ml](https://github.com/fastmachinelearning/hls4ml) repo has been submitted but is not yet merged. In the meantime you can install hls4ml from the PR branch on this fork to try the converter:
126
+ >
127
+ > ```bash
128
+ > pip install "git+https://github.com/hftsoi/hls4ml.git@sparsepixels"
129
+ > ```
130
+
131
+ Once installed, convert a trained sparsepixels model to HLS with the usual hls4ml workflow:
132
+
133
+ ```python
134
+ import hls4ml
135
+
136
+ hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')
137
+ hls_config.setdefault('Model', {})['PipelineStyle'] = 'dataflow' # use "#pragma HLS DATAFLOW" (instead of the default "#pragma HLS PIPELINE" for io_parallel)
138
+
139
+ hls_model = hls4ml.converters.convert_from_keras_model(
140
+ model,
141
+ hls_config=hls_config,
142
+ output_dir='hls_proj/my_sparse_cnn',
143
+ backend='Vitis',
144
+ io_type='io_parallel',
145
+ )
146
+ hls_model.write()
147
+ hls_model.compile()
148
+ y_hls = hls_model.predict(x_test)
149
+ ```
150
+
151
+ ## Documentation
152
+
153
+ Coming soon!
154
+
155
+ ## Citation
156
+
157
+ If you find this useful in your research, please consider citing:
158
+
159
+ ```
160
+ @article{Tsoi:2025nvg,
161
+ author = "Tsoi, Ho Fung and Rankin, Dylan and Loncar, Vladimir and Harris, Philip",
162
+ title = "{SparsePixels: Efficient Convolution for Sparse Data on FPGAs}",
163
+ eprint = "2512.06208",
164
+ archivePrefix = "arXiv",
165
+ primaryClass = "cs.AR",
166
+ month = "12",
167
+ year = "2025"
168
+ }
169
+ ```
@@ -1,6 +1,6 @@
1
1
  [metadata]
2
2
  name = sparsepixels
3
- version = 0.2.2
3
+ version = 0.3.0
4
4
  description = Efficient convolution for sparse data on FPGAs
5
5
  author = Ho Fung Tsoi
6
6
  author_email = ho.fung.tsoi@cern.ch
@@ -19,7 +19,8 @@ python_requires = >=3.10
19
19
  install_requires =
20
20
  tensorflow
21
21
  keras>=3.0
22
- HGQ2
22
+ HGQ2>=0.1.8
23
+ matplotlib
23
24
  include_package_data = True
24
25
 
25
26
  [options.package_data]