sparsepixels 0.2.2__tar.gz → 0.3.0__tar.gz
- sparsepixels-0.3.0/PKG-INFO +188 -0
- sparsepixels-0.3.0/README.md +169 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/setup.cfg +3 -2
- sparsepixels-0.3.0/sparsepixels/layers.py +325 -0
- sparsepixels-0.3.0/sparsepixels/utils.py +488 -0
- sparsepixels-0.3.0/sparsepixels.egg-info/PKG-INFO +188 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/sparsepixels.egg-info/SOURCES.txt +1 -0
- sparsepixels-0.3.0/sparsepixels.egg-info/requires.txt +4 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/sparsepixels.egg-info/top_level.txt +0 -2
- sparsepixels-0.3.0/tests/test_model.py +93 -0
- sparsepixels-0.2.2/PKG-INFO +0 -108
- sparsepixels-0.2.2/README.md +0 -90
- sparsepixels-0.2.2/sparsepixels/layers.py +0 -162
- sparsepixels-0.2.2/sparsepixels.egg-info/PKG-INFO +0 -108
- sparsepixels-0.2.2/sparsepixels.egg-info/requires.txt +0 -3
- sparsepixels-0.2.2/tests/test_model.py +0 -55
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/LICENSE +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/notebook/utils.py +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/pyproject.toml +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/setup.py +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/sparsepixels/__init__.py +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/sparsepixels/img/logo.png +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/sparsepixels.egg-info/dependency_links.txt +0 -0
- {sparsepixels-0.2.2 → sparsepixels-0.3.0}/tests/__init__.py +0 -0
@@ -0,0 +1,188 @@
+Metadata-Version: 2.4
+Name: sparsepixels
+Version: 0.3.0
+Summary: Efficient convolution for sparse data on FPGAs
+Home-page: https://github.com/hftsoi/sparse-pixels
+Author: Ho Fung Tsoi
+Author-email: ho.fung.tsoi@cern.ch
+License: MIT
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: tensorflow
+Requires-Dist: keras>=3.0
+Requires-Dist: HGQ2>=0.1.8
+Requires-Dist: matplotlib
+Dynamic: license-file
+
+<p align="center">
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/logo.png" width="300" />
+</p>
+
+<p align="center">
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/sparsepixels.png" width="900"/>
+</p>
+
+<p align="center">
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_standard.gif" width="400" />
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_sparse.gif" width="400" />
+</p>
+
+# SparsePixels: Efficient convolution for sparse data on FPGAs
+
+[](https://arxiv.org/abs/2512.06208)
+[](https://pypi.org/project/sparsepixels)
+
+SparsePixels is a Keras 3 library to build, train, and deploy sparse convolutional neural networks on FPGAs. In many detectors, especially in high-energy physics experiments, the images are almost empty: only a handful of pixels carry a signal (the hits), yet a standard CNN still spends compute on every pixel. A sparse CNN convolves only over the active pixels, so its cost scales with the number of hits rather than the image size, which is what makes low-latency, real-time inference (for example in a trigger) feasible on an FPGA. This library builds quantization-aware (via [HGQ2](https://github.com/calad0i/HGQ2)) sparse CNNs in which the pixel budget and the activity threshold can be learned from data, with a hardware-aware penalty that drives the budget toward the fewest pixels the task tolerates. Trained models convert to FPGA firmware through the [hls4ml](https://github.com/fastmachinelearning/hls4ml) integration, with control over the parallelization of the sparse layers to trade latency against resource usage.
+
+## Installation
+
+With Python >= 3.10:
+
+```
+pip install sparsepixels
+```
+
+## Getting Started
+
+Import the sparse layers, the quantization library (HGQ2), and the training utilities:
+
+```python
+import keras
+from keras.layers import Flatten, Activation
+from hgq.layers import QConv2D, QDense
+from hgq.config import QuantizerConfigScope, LayerConfigScope
+from hgq.quantizer.config import QuantizerConfig
+from sparsepixels.layers import InputReduce, QConv2DSparse, AveragePooling2DSparse, MaxPooling2DSparse
+from sparsepixels.utils import (
+    active_pixels_vs_threshold, plot_reduced_examples,
+    set_sparse_ebops_factor, cosine_lr,
+    SparseTrainingMonitor, plot_history,
+    print_quantization, plot_quantization,
+)
+```
+
+First, study the data to pick a threshold and an initial pixel budget `n`: how many pixels stay
+active as the threshold rises, and what a candidate `(n, threshold)` keeps on a few images.
+
+```python
+active_pixels_vs_threshold(x_train)
+plot_reduced_examples(x_train, n=20, threshold=0.1, n_examples=4)
+```
+
+Build an example sparse CNN within HGQ2 quantization scopes. A custom input quantizer config with
+higher initial fractional bits (`f0=8`) prevents the default (`f0=2`) from zeroing out sparse signals
+in early training epochs. `InputReduce` keeps the first `n` active pixels (first channel above
+`threshold`); by default `n` and `threshold` are trainable hyperparameters, and a penalty of weight `beta_n`
+nudges the budget smaller, trading a little accuracy for lower FPGA latency and resources.
+
+```python
+iq_conf = QuantizerConfig(place='datalane', q_type='kif', i0=4, f0=8, overflow_mode='WRAP')
+
+with (
+    QuantizerConfigScope(place='all', default_q_type='kbi', overflow_mode='SAT_SYM'),
+    QuantizerConfigScope(place='datalane', default_q_type='kif', overflow_mode='WRAP'),
+    LayerConfigScope(enable_ebops=True, enable_iq=True, beta0=1e-5),
+):
+    x_in = keras.Input(shape=(28, 28, 1), name='x_in')
+
+    # Sparse input reduction
+    x, keep_mask = InputReduce(
+        n=30, # initial pixel budget
+        threshold=0.1, # initial activity threshold
+        beta_n=1e-5, # weight of the pixel budget penalty
+        learn_n=True, # trainable pixel budget
+        learn_threshold=True, # trainable threshold
+        name='input_reduce',
+    )(x_in)
+
+    # Sparse convolution
+    x = QConv2DSparse(filters=3, kernel_size=3, name='conv1', padding='same', strides=1,
+                      activation='relu', iq_conf=iq_conf)([x, keep_mask])
+
+    # Sparse pooling
+    x, keep_mask = AveragePooling2DSparse(2, name='pool1')([x, keep_mask])
+
+    x = Flatten(name='flatten')(x)
+    x = QDense(10, name='dense1', activation='relu', iq_conf=iq_conf)(x)
+    x = Activation('softmax', name='softmax')(x)
+
+model = keras.Model(x_in, x)
+```
+
+Train the model, then read out the learned sparsity to deploy. `set_sparse_ebops_factor` makes the
+EBOPS (a proxy for the quantized hardware cost) reflect the sparse compute rather than a dense one; a
+cosine-decayed learning rate together with `restore_best_weights` keeps the learned budget from
+over-compressing near the end of training. `plot_history` shows the loss breakdown, the learned
+budget/threshold and the EBOPS in one figure, and the values to deploy are `layer.n_max_pixels` and
+`layer.threshold`.
+
+```python
+set_sparse_ebops_factor(model)
+
+steps_per_epoch = len(x_train) // 128
+early_stop = keras.callbacks.EarlyStopping(monitor='val_accuracy', mode='max', patience=20, restore_best_weights=True)
+model.compile(
+    optimizer=keras.optimizers.Adam(cosine_lr(1e-3, epochs=100, steps_per_epoch=steps_per_epoch)),
+    loss='categorical_crossentropy', metrics=['accuracy'],
+)
+history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
+                    epochs=100, batch_size=128, callbacks=[early_stop, SparseTrainingMonitor()])
+
+plot_history(history, early_stopping=early_stop) # loss breakdown, budget, threshold, EBOPS
+print_quantization(model) # per-layer bit-width distribution and EBOPS
+plot_quantization(model)
+
+ir = model.get_layer('input_reduce')
+print(f"deploy with n_max_pixels={ir.n_max_pixels}, threshold={ir.threshold:.3f}")
+```
+
+## Converting a trained model to HLS with hls4ml
+
+> **Note:** A [PR](https://github.com/fastmachinelearning/hls4ml/pull/1468) adding `sparsepixels` support to the official [hls4ml](https://github.com/fastmachinelearning/hls4ml) repo has been submitted but is not yet merged. In the meantime you can install hls4ml from the PR branch on this fork to try the converter:
+>
+> ```bash
+> pip install "git+https://github.com/hftsoi/hls4ml.git@sparsepixels"
+> ```
+
+Once installed, converting a trained sparsepixels model to HLS is as usual:
+
+```python
+import hls4ml
+
+hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')
+hls_config.setdefault('Model', {})['PipelineStyle'] = 'dataflow' # use "#pragma HLS DATAFLOW" (instead of the default "#pragma HLS PIPELINE" for io_parallel)
+
+hls_model = hls4ml.converters.convert_from_keras_model(
+    model,
+    hls_config=hls_config,
+    output_dir='hls_proj/my_sparse_cnn',
+    backend='Vitis',
+    io_type='io_parallel',
+)
+hls_model.write()
+hls_model.compile()
+y_hls = hls_model.predict(x_test)
+```
+
+## Documentation
+
+Coming soon!
+
+## Citation
+
+If you find this useful in your research, please consider citing:
+
+```
+@article{Tsoi:2025nvg,
+    author = "Tsoi, Ho Fung and Rankin, Dylan and Loncar, Vladimir and Harris, Philip",
+    title = "{SparsePixels: Efficient Convolution for Sparse Data on FPGAs}",
+    eprint = "2512.06208",
+    archivePrefix = "arXiv",
+    primaryClass = "cs.AR",
+    month = "12",
+    year = "2025"
+}
+```
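The README text in the hunk above describes `InputReduce` as keeping the first `n` active pixels whose first channel is above `threshold`. The selection rule can be sketched in pure Python as follows, assuming a single-channel image and a row-major scan order. The scan order and the helper name `reduce_input` are assumptions for illustration and are not taken from `sparsepixels/layers.py`.

```python
def reduce_input(image, n, threshold):
    """Keep at most n pixels with value above threshold, scanning row by row.

    Hypothetical helper: the scan order is an assumption, not the library's code.
    Returns the reduced image and a boolean mask of the kept pixels.
    """
    height, width = len(image), len(image[0])
    reduced = [[0.0] * width for _ in range(height)]
    keep_mask = [[False] * width for _ in range(height)]
    kept = 0
    for r in range(height):
        for c in range(width):
            # A pixel is "active" when it exceeds the threshold; stop keeping
            # pixels once the budget n is used up.
            if kept < n and image[r][c] > threshold:
                reduced[r][c] = image[r][c]
                keep_mask[r][c] = True
                kept += 1
    return reduced, keep_mask


image = [
    [0.0, 0.5, 0.0],
    [0.05, 0.0, 0.9],
    [0.3, 0.0, 0.2],
]
# Four pixels exceed 0.1, but the budget of 2 keeps only the first two found.
reduced, keep_mask = reduce_input(image, n=2, threshold=0.1)
```

With a fixed budget, later layers only ever see at most `n` nonzero positions, which is why the README ties the budget to FPGA latency and resources.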
@@ -0,0 +1,169 @@
+<p align="center">
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/logo.png" width="300" />
+</p>
+
+<p align="center">
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/sparsepixels.png" width="900"/>
+</p>
+
+<p align="center">
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_standard.gif" width="400" />
+<img src="https://raw.githubusercontent.com/hftsoi/sparse-pixels/main/docs/figs/cnn_sparse.gif" width="400" />
+</p>
+
+# SparsePixels: Efficient convolution for sparse data on FPGAs
+
+[](https://arxiv.org/abs/2512.06208)
+[](https://pypi.org/project/sparsepixels)
+
+SparsePixels is a Keras 3 library to build, train, and deploy sparse convolutional neural networks on FPGAs. In many detectors, especially in high-energy physics experiments, the images are almost empty: only a handful of pixels carry a signal (the hits), yet a standard CNN still spends compute on every pixel. A sparse CNN convolves only over the active pixels, so its cost scales with the number of hits rather than the image size, which is what makes low-latency, real-time inference (for example in a trigger) feasible on an FPGA. This library builds quantization-aware (via [HGQ2](https://github.com/calad0i/HGQ2)) sparse CNNs in which the pixel budget and the activity threshold can be learned from data, with a hardware-aware penalty that drives the budget toward the fewest pixels the task tolerates. Trained models convert to FPGA firmware through the [hls4ml](https://github.com/fastmachinelearning/hls4ml) integration, with control over the parallelization of the sparse layers to trade latency against resource usage.
+
+## Installation
+
+With Python >= 3.10:
+
+```
+pip install sparsepixels
+```
+
+## Getting Started
+
+Import the sparse layers, the quantization library (HGQ2), and the training utilities:
+
+```python
+import keras
+from keras.layers import Flatten, Activation
+from hgq.layers import QConv2D, QDense
+from hgq.config import QuantizerConfigScope, LayerConfigScope
+from hgq.quantizer.config import QuantizerConfig
+from sparsepixels.layers import InputReduce, QConv2DSparse, AveragePooling2DSparse, MaxPooling2DSparse
+from sparsepixels.utils import (
+    active_pixels_vs_threshold, plot_reduced_examples,
+    set_sparse_ebops_factor, cosine_lr,
+    SparseTrainingMonitor, plot_history,
+    print_quantization, plot_quantization,
+)
+```
+
+First, study the data to pick a threshold and an initial pixel budget `n`: how many pixels stay
+active as the threshold rises, and what a candidate `(n, threshold)` keeps on a few images.
+
+```python
+active_pixels_vs_threshold(x_train)
+plot_reduced_examples(x_train, n=20, threshold=0.1, n_examples=4)
+```
+
+Build an example sparse CNN within HGQ2 quantization scopes. A custom input quantizer config with
+higher initial fractional bits (`f0=8`) prevents the default (`f0=2`) from zeroing out sparse signals
+in early training epochs. `InputReduce` keeps the first `n` active pixels (first channel above
+`threshold`); by default `n` and `threshold` are trainable hyperparameters, and a penalty of weight `beta_n`
+nudges the budget smaller, trading a little accuracy for lower FPGA latency and resources.
+
+```python
+iq_conf = QuantizerConfig(place='datalane', q_type='kif', i0=4, f0=8, overflow_mode='WRAP')
+
+with (
+    QuantizerConfigScope(place='all', default_q_type='kbi', overflow_mode='SAT_SYM'),
+    QuantizerConfigScope(place='datalane', default_q_type='kif', overflow_mode='WRAP'),
+    LayerConfigScope(enable_ebops=True, enable_iq=True, beta0=1e-5),
+):
+    x_in = keras.Input(shape=(28, 28, 1), name='x_in')
+
+    # Sparse input reduction
+    x, keep_mask = InputReduce(
+        n=30, # initial pixel budget
+        threshold=0.1, # initial activity threshold
+        beta_n=1e-5, # weight of the pixel budget penalty
+        learn_n=True, # trainable pixel budget
+        learn_threshold=True, # trainable threshold
+        name='input_reduce',
+    )(x_in)
+
+    # Sparse convolution
+    x = QConv2DSparse(filters=3, kernel_size=3, name='conv1', padding='same', strides=1,
+                      activation='relu', iq_conf=iq_conf)([x, keep_mask])
+
+    # Sparse pooling
+    x, keep_mask = AveragePooling2DSparse(2, name='pool1')([x, keep_mask])
+
+    x = Flatten(name='flatten')(x)
+    x = QDense(10, name='dense1', activation='relu', iq_conf=iq_conf)(x)
+    x = Activation('softmax', name='softmax')(x)
+
+model = keras.Model(x_in, x)
+```
+
+Train the model, then read out the learned sparsity to deploy. `set_sparse_ebops_factor` makes the
+EBOPS (a proxy for the quantized hardware cost) reflect the sparse compute rather than a dense one; a
+cosine-decayed learning rate together with `restore_best_weights` keeps the learned budget from
+over-compressing near the end of training. `plot_history` shows the loss breakdown, the learned
+budget/threshold and the EBOPS in one figure, and the values to deploy are `layer.n_max_pixels` and
+`layer.threshold`.
+
+```python
+set_sparse_ebops_factor(model)
+
+steps_per_epoch = len(x_train) // 128
+early_stop = keras.callbacks.EarlyStopping(monitor='val_accuracy', mode='max', patience=20, restore_best_weights=True)
+model.compile(
+    optimizer=keras.optimizers.Adam(cosine_lr(1e-3, epochs=100, steps_per_epoch=steps_per_epoch)),
+    loss='categorical_crossentropy', metrics=['accuracy'],
+)
+history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
+                    epochs=100, batch_size=128, callbacks=[early_stop, SparseTrainingMonitor()])
+
+plot_history(history, early_stopping=early_stop) # loss breakdown, budget, threshold, EBOPS
+print_quantization(model) # per-layer bit-width distribution and EBOPS
+plot_quantization(model)
+
+ir = model.get_layer('input_reduce')
+print(f"deploy with n_max_pixels={ir.n_max_pixels}, threshold={ir.threshold:.3f}")
+```
+
+## Converting a trained model to HLS with hls4ml
+
+> **Note:** A [PR](https://github.com/fastmachinelearning/hls4ml/pull/1468) adding `sparsepixels` support to the official [hls4ml](https://github.com/fastmachinelearning/hls4ml) repo has been submitted but is not yet merged. In the meantime you can install hls4ml from the PR branch on this fork to try the converter:
+>
+> ```bash
+> pip install "git+https://github.com/hftsoi/hls4ml.git@sparsepixels"
+> ```
+
+Once installed, converting a trained sparsepixels model to HLS is as usual:
+
+```python
+import hls4ml
+
+hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')
+hls_config.setdefault('Model', {})['PipelineStyle'] = 'dataflow' # use "#pragma HLS DATAFLOW" (instead of the default "#pragma HLS PIPELINE" for io_parallel)
+
+hls_model = hls4ml.converters.convert_from_keras_model(
+    model,
+    hls_config=hls_config,
+    output_dir='hls_proj/my_sparse_cnn',
+    backend='Vitis',
+    io_type='io_parallel',
+)
+hls_model.write()
+hls_model.compile()
+y_hls = hls_model.predict(x_test)
+```
+
+## Documentation
+
+Coming soon!
+
+## Citation
+
+If you find this useful in your research, please consider citing:
+
+```
+@article{Tsoi:2025nvg,
+    author = "Tsoi, Ho Fung and Rankin, Dylan and Loncar, Vladimir and Harris, Philip",
+    title = "{SparsePixels: Efficient Convolution for Sparse Data on FPGAs}",
+    eprint = "2512.06208",
+    archivePrefix = "arXiv",
+    primaryClass = "cs.AR",
+    month = "12",
+    year = "2025"
+}
+```
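The training snippet in the README above passes `cosine_lr(1e-3, epochs=100, steps_per_epoch=steps_per_epoch)` to Adam, but the implementation of `cosine_lr` is not shown in this diff. The sketch below shows a plain cosine decay to zero over `epochs * steps_per_epoch` steps as one plausible reading. The helper `cosine_decay` is hypothetical, and the real `cosine_lr` may add warmup or a nonzero floor.

```python
import math


def cosine_decay(initial_lr, step, total_steps):
    """Cosine-decayed learning rate at a given optimizer step.

    Hypothetical helper for illustration; not the sparsepixels implementation.
    """
    # Clamp so that steps past the schedule end stay at the final value.
    progress = min(step, total_steps) / total_steps
    return initial_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


epochs, steps_per_epoch = 100, 10
total_steps = epochs * steps_per_epoch
# Learning rate at the start of each epoch.
schedule = [cosine_decay(1e-3, epoch * steps_per_epoch, total_steps) for epoch in range(epochs)]
```

The rate falls slowly at first and again near the end, so the last epochs make only small updates. That is consistent with the README's remark that the schedule keeps the learned budget from over-compressing late in training.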
@@ -1,6 +1,6 @@
 [metadata]
 name = sparsepixels
-version = 0.
+version = 0.3.0
 description = Efficient convolution for sparse data on FPGAs
 author = Ho Fung Tsoi
 author_email = ho.fung.tsoi@cern.ch
@@ -19,7 +19,8 @@ python_requires = >=3.10
 install_requires =
     tensorflow
     keras>=3.0
-    HGQ2
+    HGQ2>=0.1.8
+    matplotlib
 include_package_data = True
 
 [options.package_data]
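The `setup.cfg` change above tightens the dependency from `HGQ2` to `HGQ2>=0.1.8` and adds `matplotlib`. As a rough sketch of what a minimum-version constraint means, the snippet below compares dotted release numbers as integer tuples. The helper `meets_minimum` is hypothetical and handles only plain numeric releases; real installers also handle pre-release and local version suffixes.

```python
def parse_release(version):
    """Turn a plain dotted release such as '0.1.8' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Return True when the installed release is at least the minimum.

    Hypothetical helper: numeric releases only, no pre-release handling.
    """
    return parse_release(installed) >= parse_release(minimum)


# Comparing as integers matters: as strings, '0.1.10' would sort before '0.1.8'.
candidates = ["0.1.7", "0.1.8", "0.1.10"]
accepted = [v for v in candidates if meets_minimum(v, "0.1.8")]
```

Here `accepted` contains the releases that would satisfy `HGQ2>=0.1.8` under this simplified comparison.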