crunch-synth 0.6.0__tar.gz
- crunch_synth-0.6.0/PKG-INFO +334 -0
- crunch_synth-0.6.0/README.md +301 -0
- crunch_synth-0.6.0/crunch_synth/__init__.py +0 -0
- crunch_synth-0.6.0/crunch_synth/__version__.py +9 -0
- crunch_synth-0.6.0/crunch_synth/constants.py +100 -0
- crunch_synth-0.6.0/crunch_synth/examples/__init__.py +0 -0
- crunch_synth-0.6.0/crunch_synth/examples/exampletracker.py +209 -0
- crunch_synth-0.6.0/crunch_synth/price_provider.py +46 -0
- crunch_synth-0.6.0/crunch_synth/prices.py +130 -0
- crunch_synth-0.6.0/crunch_synth/quarantine.py +128 -0
- crunch_synth-0.6.0/crunch_synth/tracker.py +133 -0
- crunch_synth-0.6.0/crunch_synth/tracker_evaluator.py +284 -0
- crunch_synth-0.6.0/crunch_synth/utils/__init__.py +1 -0
- crunch_synth-0.6.0/crunch_synth/utils/data.py +154 -0
- crunch_synth-0.6.0/crunch_synth/utils/densitytosimulations.py +379 -0
- crunch_synth-0.6.0/crunch_synth/utils/distributions.py +95 -0
- crunch_synth-0.6.0/crunch_synth/utils/evaluation_utils.py +11 -0
- crunch_synth-0.6.0/crunch_synth/utils/plots.py +254 -0
- crunch_synth-0.6.0/crunch_synth/utils/tracker_analysis.py +138 -0
- crunch_synth-0.6.0/crunch_synth.egg-info/PKG-INFO +334 -0
- crunch_synth-0.6.0/crunch_synth.egg-info/SOURCES.txt +29 -0
- crunch_synth-0.6.0/crunch_synth.egg-info/dependency_links.txt +1 -0
- crunch_synth-0.6.0/crunch_synth.egg-info/not-zip-safe +1 -0
- crunch_synth-0.6.0/crunch_synth.egg-info/requires.txt +10 -0
- crunch_synth-0.6.0/crunch_synth.egg-info/top_level.txt +1 -0
- crunch_synth-0.6.0/setup.cfg +4 -0
- crunch_synth-0.6.0/setup.py +49 -0
- crunch_synth-0.6.0/tests/test_examples.py +71 -0
- crunch_synth-0.6.0/tests/test_price_provider.py +17 -0
- crunch_synth-0.6.0/tests/test_price_store.py +183 -0
- crunch_synth-0.6.0/tests/test_quarantine_functionality.py +90 -0
|
@@ -0,0 +1,334 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: crunch-synth
|
|
3
|
+
Version: 0.6.0
|
|
4
|
+
Summary: A package for participating in the Crunch-Synth Game on CrunchDAO
|
|
5
|
+
Home-page: https://github.com/crunchdao/crunch-synth
|
|
6
|
+
Author: Abdennour BOUTRIG, Alexis Gassmann
|
|
7
|
+
Author-email: abdennour.boutrig@crunchdao.com
|
|
8
|
+
Keywords: crunchdao,crunch,crunch-synth
|
|
9
|
+
Classifier: Intended Audience :: Developers
|
|
10
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
11
|
+
Requires-Python: >=3.12
|
|
12
|
+
Description-Content-Type: text/markdown
|
|
13
|
+
Requires-Dist: pandas
|
|
14
|
+
Requires-Dist: requests
|
|
15
|
+
Requires-Dist: numpy
|
|
16
|
+
Requires-Dist: tqdm
|
|
17
|
+
Requires-Dist: densitypdf
|
|
18
|
+
Requires-Dist: plotly
|
|
19
|
+
Requires-Dist: sortedcontainers
|
|
20
|
+
Provides-Extra: test
|
|
21
|
+
Requires-Dist: pytest; extra == "test"
|
|
22
|
+
Dynamic: author
|
|
23
|
+
Dynamic: author-email
|
|
24
|
+
Dynamic: classifier
|
|
25
|
+
Dynamic: description
|
|
26
|
+
Dynamic: description-content-type
|
|
27
|
+
Dynamic: home-page
|
|
28
|
+
Dynamic: keywords
|
|
29
|
+
Dynamic: provides-extra
|
|
30
|
+
Dynamic: requires-dist
|
|
31
|
+
Dynamic: requires-python
|
|
32
|
+
Dynamic: summary
|
|
33
|
+
|
|
34
|
+
# Crunch-Synth Game
|
|
35
|
+
|
|
36
|
+
Crunch-Synth is a real-time probabilistic forecasting challenge hosted by CrunchDAO at [crunchdao.com](https://crunchdao.com).
|
|
37
|
+
|
|
38
|
+
The goal is to anticipate how asset prices will evolve by providing not a single forecasted value, but a full probability distribution over the future price change at multiple forecast horizons and steps.
|
|
39
|
+
|
|
40
|
+
**The current crypto assets to model are:**
|
|
41
|
+
- **Bitcoin (BTC)**
|
|
42
|
+
- **Ethereum (ETH)**
|
|
43
|
+
- **Solana (SOL)**
|
|
44
|
+
- **Tether Gold (XAUT)**
|
|
45
|
+
- **SP500 tokenized ETF (SPYX)**
|
|
46
|
+
- **NVIDIA tokenized stock (NVDAX)**
|
|
47
|
+
- **Tesla tokenized stock (TSLAX)**
|
|
48
|
+
- **Apple tokenized stock (AAPLX)**
|
|
49
|
+
- **Alphabet tokenized stock (GOOGLX)**
|
|
50
|
+
|
|
51
|
+
## Install
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
pip install crunch-synth
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
## What You Must Predict
|
|
58
|
+
|
|
59
|
+
Trackers must predict the **probability distribution of price changes**, defined as:
|
|
60
|
+
|
|
61
|
+
$$
|
|
62
|
+
r_{t,k} = P_t - P_{t-k}
|
|
63
|
+
$$
|
|
64
|
+
|
|
65
|
+
For each defined step **$k$** (e.g., 5 minutes, 1 hour, …), your tracker must return a full **probability density function (PDF)** over the future price change **$r_{t,k}$**.
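
To make the definition concrete, here is a small sketch that computes $r_{t,k}$ from an evenly spaced price series. The helper name `price_changes` and the sample prices are illustrative assumptions and are not part of the package.

```python
def price_changes(prices, k):
    """Return r_{t,k} = P_t - P_{t-k} for every t where both prices exist.

    `prices` is assumed to be evenly spaced and `k` is the step expressed
    as a number of samples (hypothetical helper, not part of crunch-synth).
    """
    if k <= 0:
        raise ValueError("k must be a positive number of samples")
    return [prices[t] - prices[t - k] for t in range(k, len(prices))]


# Made-up prices sampled every 5 minutes
prices = [100.0, 100.5, 100.25, 101.0, 100.75]

one_step = price_changes(prices, 1)  # 5-minute changes
two_step = price_changes(prices, 2)  # 10-minute changes
print(one_step)
print(two_step)
```

Note that these are absolute price differences, so their scale depends on the price level of each asset.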
|
|
66
|
+
|
|
67
|
+
|
|
68
|
+
## Visualize the challenge
|
|
69
|
+
|
|
70
|
+
The Crunch-Synth game is evaluated on **incremental return predictions**, not raw prices.
|
|
71
|
+
Incremental returns capture the change in price over each step (the difference $P_t - P_{t-k}$ defined above, not a percentage change), which gives a series that is closer to stationary than raw prices and therefore easier to model.
|
|
72
|
+
|
|
73
|
+
Below is an example of a **density forecast over incremental returns for the next 24h at 5-minute intervals**:
|
|
74
|
+
|
|
75
|
+

|
|
76
|
+
|
|
77
|
+
Below is a minimal example showing what your tracker might return:
|
|
78
|
+
```python
# Illustrative shape of the value returned by
# model.predict(asset="SOL", horizon=86400, step=300)
horizon, step = 86400, 300
[
    {
        "step": (k + 1) * step,
        "prediction": {
            "type": "builtin",
            "name": "norm",
            "params": {
                "loc": -0.01,  # mean return
                "scale": 0.4   # standard deviation of return
            }
        }
    }
    for k in range(0, horizon // step)
]
```
|
|
95
|
+
|
|
96
|
+
Here is the **return forecast mapped into price space**:
|
|
97
|
+
|
|
98
|
+

|
|
99
|
+
|
|
100
|
+
|
|
101
|
+
## Create your Tracker
|
|
102
|
+
|
|
103
|
+
A **tracker** is a model that processes real-time asset data to **predict future price changes**. It uses past prices to generate a probabilistic forecast of incremental returns. You can use the data provided by the challenge or any other datasets to improve your predictions.
|
|
104
|
+
|
|
105
|
+
It operates incrementally: prices are pushed to the tracker as they arrive and predictions are requested at specific times by the framework.
|
|
106
|
+
|
|
107
|
+
**To create your tracker, you need to define a class that implements the `TrackerBase` interface, which already handles:**
|
|
108
|
+
|
|
109
|
+
- price storage and alignment via `PriceStore`
|
|
110
|
+
- multi-resolution forecasting through `predict_all()`
|
|
111
|
+
|
|
112
|
+
As a participant, you only need to implement **one method**: `predict()`.
|
|
113
|
+
|
|
114
|
+
1. **Price data handling (already provided)**
|
|
115
|
+
|
|
116
|
+
Each tracker instance contains a `PriceStore` (`self.prices`) that:
|
|
117
|
+
- stores recent historical prices per asset
|
|
118
|
+
- maintains a rolling time window (30 days)
|
|
119
|
+
- provides convenient accessors such as:
|
|
120
|
+
- `get_last_price()`
|
|
121
|
+
- `get_prices(asset, days, resolution)`
|
|
122
|
+
- `get_closest_price(asset, timestamp)`
|
|
123
|
+
|
|
124
|
+
The framework automatically updates the `PriceStore` by calling `tick(self, data: PriceData)` before any prediction request.
|
|
125
|
+
|
|
126
|
+
Data example:
|
|
127
|
+
```python
|
|
128
|
+
data = {
|
|
129
|
+
"BTC": [(timestamp1, price1), (timestamp2, price2)],
|
|
130
|
+
"SOL": [(timestamp1, price1)],
|
|
131
|
+
}
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
When it's called:
|
|
135
|
+
- Typically every minute or when new data is available
|
|
136
|
+
- Before any prediction request
|
|
137
|
+
- Can be called multiple times before a single `predict()` call
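
To illustrate the data flow described above, here is a toy stand-in for a rolling price store. It is only a sketch: the class `TinyPriceStore`, its method signatures, and its trimming rule are assumptions for illustration and do not reproduce the package's `PriceStore`.

```python
class TinyPriceStore:
    """Toy rolling store keyed by asset (illustration only)."""

    def __init__(self, window_seconds=30 * 24 * 3600):
        self.window_seconds = window_seconds
        self._data = {}  # asset -> sorted list of (timestamp, price)

    def add(self, data):
        """Merge a tick payload shaped like {"BTC": [(ts, price), ...]}."""
        for asset, points in data.items():
            if not points:
                continue
            series = sorted(self._data.get(asset, []) + list(points))
            # Keep only points inside the rolling window ending at the latest tick
            cutoff = series[-1][0] - self.window_seconds
            self._data[asset] = [p for p in series if p[0] >= cutoff]

    def get_last_price(self, asset):
        series = self._data.get(asset)
        return series[-1][1] if series else None


store = TinyPriceStore(window_seconds=600)
store.add({"BTC": [(0, 1.0), (300, 2.0)], "SOL": [(0, 5.0)]})
store.add({"BTC": [(900, 3.0)]})  # the point at t=0 now falls out of the window
print(store.get_last_price("BTC"), store.get_last_price("SOL"))
```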
|
|
138
|
+
|
|
139
|
+
2. **Required method: `predict(self, asset: str, horizon: int, step: int)`**
|
|
140
|
+
This is the **only method you must implement.**
|
|
141
|
+
|
|
142
|
+
It must return a sequence of **predictive density distributions** for the **incremental price change** of an asset:
|
|
143
|
+
|
|
144
|
+
- Forecast horizon: `horizon` seconds into the future
- Temporal resolution: one density every `step` seconds
- Output length: `horizon // step`
|
|
147
|
+
|
|
148
|
+
Each density prediction must comply with the [density_pdf](https://github.com/microprediction/densitypdf/blob/main/densitypdf/__init__.py) specification.
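
Before submitting, it can help to check the shape of what `predict()` returns. The sketch below assumes each entry carries a `"step"` offset in seconds, as in the example tracker; the helper `check_forecast_shape` is hypothetical and does not validate the density specification itself.

```python
def check_forecast_shape(forecast, horizon, step):
    """Return True when `forecast` has one entry per step with the expected offsets."""
    expected = horizon // step
    if len(forecast) != expected:
        return False
    # Entry k (0-based) should describe the offset (k + 1) * step seconds
    return all(
        entry.get("step") == (k + 1) * step
        for k, entry in enumerate(forecast)
    )


# Placeholder forecast for a 24-hour horizon at 5-minute resolution
forecast = [{"step": (k + 1) * 300} for k in range(86400 // 300)]
print(check_forecast_shape(forecast, horizon=86400, step=300))
```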
|
|
149
|
+
|
|
150
|
+
3. **Multi-step forecasts (handled automatically)**
|
|
151
|
+
|
|
152
|
+
You **do not** need to implement multi-step logic.
|
|
153
|
+
|
|
154
|
+
The framework will automatically call your `predict()` method multiple times via `predict_all(asset, horizon, steps)` to construct forecasts at different temporal resolutions.
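
Conceptually, the fan-out can be pictured as one `predict()` call per requested step size. The sketch below is an assumption about the idea only: `SketchTrackerBase`, `FlatTracker`, and the dictionary return shape are hypothetical and may differ from what `TrackerBase.predict_all()` actually does.

```python
class SketchTrackerBase:
    """Hypothetical stand-in for TrackerBase, showing only the fan-out idea."""

    def predict(self, asset, horizon, step):
        raise NotImplementedError

    def predict_all(self, asset, horizon, steps):
        # One predict() call per requested step size
        return {step: self.predict(asset, horizon, step) for step in steps}


class FlatTracker(SketchTrackerBase):
    """Returns a placeholder entry per segment so the output shape is visible."""

    def predict(self, asset, horizon, step):
        return [{"step": k * step} for k in range(1, horizon // step + 1)]


forecasts = FlatTracker().predict_all(
    "SOL", horizon=86400, steps=[300, 3600, 21600, 86400]
)
for step, densities in forecasts.items():
    print(step, len(densities))
```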
|
|
155
|
+
|
|
156
|
+
You can refer to the [Tracker examples](crunch_synth/examples) for guidance.
|
|
157
|
+
|
|
158
|
+
```python
import numpy as np

# TrackerBase is assumed to be importable from crunch_synth/tracker.py
from crunch_synth.tracker import TrackerBase


class GaussianStepTracker(TrackerBase):
    """
    An example tracker that models *future incremental returns* as Gaussian-distributed.

    For each forecast step, the tracker returns a normal distribution
    r_{t,step} ~ N(a · mu, √a · sigma) where:
      - mu    = mean historical return
      - sigma = std historical return
      - a = (step / 300) represents the ratio of the forecast step duration
        to the historical 5-minute return interval.

    Multi-resolution forecasts (5min, 1h, 6h, 24h, ...)
    are automatically handled by `TrackerBase.predict_all()`,
    which calls the `predict()` method once per step size.

    /!/ This is not a price-distribution; it is a distribution over
    incremental returns between consecutive steps /!/
    """
    def __init__(self):
        super().__init__()

    def predict(self, asset: str, horizon: int, step: int):

        # Retrieve recent historical prices sampled at 5-minute resolution
        resolution = 300
        pairs = self.prices.get_prices(asset, days=5, resolution=resolution)
        if not pairs:
            return []

        _, past_prices = zip(*pairs)

        if len(past_prices) < 3:
            return []

        # Compute historical incremental returns (price differences)
        returns = np.diff(past_prices)

        # Estimate drift (mean return) and volatility (std dev of returns)
        mu = float(np.mean(returns))
        sigma = float(np.std(returns))

        if sigma <= 0:
            return []

        num_segments = horizon // step

        # Construct one predictive distribution per future time step.
        # Each distribution models the incremental return over a `step`-second interval.
        #
        # IMPORTANT:
        # - The returned objects must strictly follow the `density_pdf` specification.
        # - Each entry corresponds to the return between t + (k−1)·step and t + k·step.
        #
        # We use a single-component Gaussian mixture for simplicity:
        #   r_{t,k} ~ N( (step / 300) · μ , sqrt(step / 300) · σ )
        #
        # where μ and σ are estimated from historical 5-minute returns.
        distributions = []
        for k in range(1, num_segments + 1):
            distributions.append({
                "step": k * step,  # Time offset (in seconds) from forecast origin
                "type": "mixture",
                "components": [{
                    "density": {
                        "type": "builtin",  # Note: use 'builtin' instead of 'scipy' for speed
                        "name": "norm",
                        "params": {
                            "loc": (step / resolution) * mu,
                            "scale": np.sqrt(step / resolution) * sigma}
                    },
                    "weight": 1  # Mixture weight; multiple densities with different weights can be combined
                }]
            })

        return distributions
```
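
The scaling used in the example, N(a · mu, √a · sigma), follows if the 5-minute increments are treated as independent and identically distributed: summing `a` of them multiplies the mean by `a` and the variance by `a`. The short check below illustrates this with the standard library for an integer `a`; the helper name and the numbers are made up, and real returns need not satisfy the independence assumption.

```python
import math
from functools import reduce
from statistics import NormalDist


def aggregate_increments(mu, sigma, a):
    """Distribution of the sum of `a` independent N(mu, sigma) increments."""
    if a < 1:
        raise ValueError("a must be a positive integer")
    parts = [NormalDist(mu, sigma) for _ in range(a)]
    # NormalDist addition treats the two variables as independent
    return reduce(lambda left, right: left + right, parts)


mu, sigma = -0.01, 0.4   # made-up 5-minute drift and volatility
a = 3600 // 300          # a 1-hour step spans 12 five-minute intervals
hourly = aggregate_increments(mu, sigma, a)

print(hourly.mean, a * mu)
print(hourly.stdev, math.sqrt(a) * sigma)
```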
|
|
234
|
+
|
|
235
|
+
## Prediction Phase
|
|
236
|
+
|
|
237
|
+
In each prediction round, players must submit **a set of density forecasts.**
|
|
238
|
+
|
|
239
|
+
A **prediction round** is defined by **one asset, one forecast horizon** and **one or more step resolutions.**
|
|
240
|
+
- A **24-hour horizon** forecast
|
|
241
|
+
- Triggered **hourly** for each asset
|
|
242
|
+
- Step resolutions: {5-minute, 1-hour, 6-hour, 24-hour}
|
|
243
|
+
- Supported assets:
|
|
244
|
+
```["BTC", "SOL", "ETH", "XAUT", "SPYX", "NVDAX", "TSLAX", "AAPLX", "GOOGLX"]```
|
|
245
|
+
- A **1-hour horizon** forecast
|
|
246
|
+
- Triggered **every 12 minutes** for each asset
|
|
247
|
+
- Step resolutions: {1-minute, 5-minute, 15-minute, 30-minute, 1-hour}
|
|
248
|
+
- Supported assets:
|
|
249
|
+
```["BTC", "SOL", "ETH", "XAUT"]```
|
|
250
|
+
|
|
251
|
+
All required forecasts for a prediction round must be generated within **40 seconds.**
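
To get a feel for the workload inside that time budget, the following sketch counts how many densities one round requires, using `horizon // step` for each step resolution listed above. The names `ROUNDS` and `densities_per_round` are illustrative only.

```python
# Horizons and step resolutions (in seconds) taken from the two round types above
ROUNDS = {
    "24h": {"horizon": 24 * 3600, "steps": [300, 3600, 6 * 3600, 24 * 3600]},
    "1h": {"horizon": 3600, "steps": [60, 300, 900, 1800, 3600]},
}


def densities_per_round(horizon, steps):
    """Number of densities expected for each step resolution."""
    return {step: horizon // step for step in steps}


totals = {}
for name, spec in ROUNDS.items():
    counts = densities_per_round(spec["horizon"], spec["steps"])
    totals[name] = sum(counts.values())
    print(name, counts, totals[name])
```

These counts are per asset and per trigger, so the total work scales with the number of supported assets.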
|
|
252
|
+
|
|
253
|
+
## Scoring
|
|
254
|
+
- Once the full horizon has passed, each prediction is scored using a **CRPS scoring function**.
|
|
255
|
+
- A lower **CRPS score** reflects more accurate predictions.
|
|
256
|
+
- Leaderboard ranking is based on a **7-day rolling average** of CRPS scores across **all assets and horizons**, evaluated **relative to other participants**:
|
|
257
|
+
- for each prediction round, the **best CRPS score receives a normalized score of 1**
|
|
258
|
+
- the **worst 5% of CRPS scores receive a score of 0**
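
For intuition about CRPS, a Gaussian forecast has a well-known closed form: with $z = (y - \mu)/\sigma$, the score is $\sigma\,[\,z(2\Phi(z) - 1) + 2\varphi(z) - 1/\sqrt{\pi}\,]$. The sketch below implements that textbook formula with the standard library; it is not the game's scoring code, which may evaluate arbitrary densities differently, and the function name is an assumption.

```python
import math


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma) forecast against the outcome y."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal cdf
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Made-up comparison: same realized change, forecasts of different sharpness
sharp = crps_gaussian(mu=0.0, sigma=0.5, y=0.0)
wide = crps_gaussian(mu=0.0, sigma=1.0, y=0.0)
off_target = crps_gaussian(mu=0.0, sigma=0.5, y=3.0)
print(sharp, wide, off_target)
```

A sharp forecast centered on the outcome scores lowest, while a sharp forecast that misses the outcome is penalized more than a wider one would be.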
|
|
259
|
+
|
|
260
|
+
## Check your Tracker performance
|
|
261
|
+
|
|
262
|
+
`TrackerEvaluator` lets you track your model's performance locally over time before participating in the live game. It maintains:
|
|
263
|
+
|
|
264
|
+
- Overall CRPS score
|
|
265
|
+
- Recent CRPS score
|
|
266
|
+
- Quarantine predictions (predictions stored and evaluated at a later time)
|
|
267
|
+
|
|
268
|
+
**A lower CRPS score reflects more accurate predictions.**
|
|
269
|
+
|
|
270
|
+
```python
from crunch_synth.tracker_evaluator import TrackerEvaluator
# exampletracker.py is the example module listed in this release
from crunch_synth.examples.exampletracker import GaussianStepTracker  # Your custom tracker

# Initialize the tracker evaluator with your custom GaussianStepTracker
tracker_evaluator = TrackerEvaluator(GaussianStepTracker())
# Feed a new price tick for SOL (`ts` and `price` stand for your own timestamp and price)
tracker_evaluator.tick({"SOL": [(ts, price)]})
# You will generate predictive densities for SOL over a 24-hour period (86400s)
# at multiple step resolutions: 5 minutes, 1 hour, 6 hours and 24 hours
predictions = tracker_evaluator.predict("SOL", horizon=3600*24,
                                        steps=[300, 3600, 3600*6, 3600*24])

print(f"My overall normalized CRPS score: {tracker_evaluator.overall_score('SOL'):.4f}")
```
|
|
285
|
+
|
|
286
|
+
|
|
287
|
+
## Tracker examples
|
|
288
|
+
See [Tracker examples](crunch_synth/examples). There are:
|
|
289
|
+
|
|
290
|
+
- Quickstarter Notebooks
|
|
291
|
+
- Self-contained examples
|
|
292
|
+
|
|
293
|
+
|
|
294
|
+
## General Crunch-Synth Game Advice
|
|
295
|
+
|
|
296
|
+
The Crunch-Synth game challenges you to predict future asset price changes using probabilistic forecasting.
|
|
297
|
+
|
|
298
|
+
### Probabilistic Forecasting
|
|
299
|
+
|
|
300
|
+
Probabilistic forecasting provides **a distribution of possible future values** rather than a single point estimate, allowing for uncertainty quantification. Instead of predicting only the most likely outcome, it estimates a range of potential outcomes along with their probabilities by outputting a **probability distribution**.
|
|
301
|
+
|
|
302
|
+
A probabilistic forecast models the conditional probability distribution of a future value $(Y_t)$ given past observations $(\mathcal{H}_{t-1})$. This can be expressed as:
|
|
303
|
+
|
|
304
|
+
$$P(Y_t \mid \mathcal{H}_{t-1})$$
|
|
305
|
+
|
|
306
|
+
where $(\mathcal{H}_{t-1})$ represents the historical data up to time $(t-1)$. Instead of a single prediction $(\hat{Y}_t)$, the model estimates a full probability distribution $(f(Y_t \mid \mathcal{H}_{t-1}))$, which can take different parametric forms, such as a Gaussian:
|
|
307
|
+
|
|
308
|
+
$$Y_t \mid \mathcal{H}_{t-1} \sim \mathcal{N}(\mu_t, \sigma_t^2)$$
|
|
309
|
+
|
|
310
|
+
where $(\mu_t)$ is the predicted mean and $(\sigma_t^2)$ is the predicted variance, which quantifies the uncertainty in the forecast.
|
|
311
|
+
|
|
312
|
+
Probabilistic forecasting can be handled through various approaches, including **variance forecasters**, **quantile forecasters**, **interval forecasters** or **distribution forecasters**, each capturing uncertainty differently.
|
|
313
|
+
|
|
314
|
+
For example, you can forecast the price change with a Gaussian density function (or a mixture of several), in which case each component of the model output takes the form:
|
|
315
|
+
|
|
316
|
+
```python
{
    "density": {
        "type": "builtin",
        "name": "normal",
        "params": {"loc": y_mean, "scale": y_std}  # scale is a standard deviation, not a variance
    },
    "weight": weight
}
```
|
|
326
|
+
|
|
327
|
+
A **mixture density**, such as the Gaussian mixture $\sum_{i=1}^{K} w_i \mathcal{N}(Y_t | \mu_i, \sigma_i^2)$, can capture multi-modal distributions and approximate more complex distributions.
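
As a small illustration of the formula, the sketch below evaluates a Gaussian mixture density with the standard library. The function `mixture_pdf`, the `(weight, mu, sigma)` tuple layout, and the example components are assumptions made for this sketch; weights are normalized so that they sum to 1.

```python
import math
from statistics import NormalDist


def mixture_pdf(y, components):
    """Evaluate sum_i w_i * N(y | mu_i, sigma_i^2) for (weight, mu, sigma) tuples."""
    total_weight = sum(weight for weight, _, _ in components)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(
        (weight / total_weight) * NormalDist(mu, sigma).pdf(y)
        for weight, mu, sigma in components
    )


# Made-up bimodal forecast: the price change is expected near -2 or near +2
bimodal = [(0.5, -2.0, 0.5), (0.5, 2.0, 0.5)]
for y in (-2.0, 0.0, 2.0):
    print(y, mixture_pdf(y, bimodal))
```

A single Gaussian cannot place low density at 0 while keeping high density on both sides, which is the kind of shape a mixture can express.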
|
|
328
|
+
|
|
329
|
+

|
|
330
|
+
|
|
331
|
+
### Additional Resources
|
|
332
|
+
|
|
333
|
+
- [Literature](LITERATURE.md)
|
|
334
|
+
- Useful Python [packages](PACKAGES.md)
|
|
File without changes
|
|
@@ -0,0 +1,9 @@
|
|
|
1
|
+
__title__ = 'crunch-synth'
|
|
2
|
+
__description__ = 'A package for participating in the Crunch-Synth Game on CrunchDAO'
|
|
3
|
+
__version__ = '0.6.0'
|
|
4
|
+
__author__ = [
|
|
5
|
+
"Abdennour BOUTRIG",
|
|
6
|
+
"Alexis Gassmann",
|
|
7
|
+
]
|
|
8
|
+
__author_email__ = 'abdennour.boutrig@crunchdao.com'
|
|
9
|
+
__url__ = 'https://github.com/crunchdao/crunch-synth'
|