mockcraft 0.1.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- mockcraft-0.1.1/LICENSE +22 -0
- mockcraft-0.1.1/PKG-INFO +392 -0
- mockcraft-0.1.1/README.md +325 -0
- mockcraft-0.1.1/mockcraft/__init__.py +3 -0
- mockcraft-0.1.1/mockcraft/catalogue.py +471 -0
- mockcraft-0.1.1/mockcraft/source_generator.py +584 -0
- mockcraft-0.1.1/mockcraft.egg-info/PKG-INFO +392 -0
- mockcraft-0.1.1/mockcraft.egg-info/SOURCES.txt +14 -0
- mockcraft-0.1.1/mockcraft.egg-info/dependency_links.txt +1 -0
- mockcraft-0.1.1/mockcraft.egg-info/requires.txt +49 -0
- mockcraft-0.1.1/mockcraft.egg-info/top_level.txt +1 -0
- mockcraft-0.1.1/pyproject.toml +181 -0
- mockcraft-0.1.1/setup.cfg +4 -0
- mockcraft-0.1.1/test/test_catalogue.py +84 -0
- mockcraft-0.1.1/test/test_registry.py +50 -0
- mockcraft-0.1.1/test/test_source_generator.py +102 -0
mockcraft-0.1.1/LICENSE
ADDED
@@ -0,0 +1,22 @@
MIT License

Copyright (c) 2026 Bibiana Terres Stumpf, Joao Victor Motta da Silva,
Pedro Jann Luna, Anne Laure Mealier, Julien Zoubian

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
mockcraft-0.1.1/PKG-INFO
ADDED
@@ -0,0 +1,392 @@
Metadata-Version: 2.4
Name: mockcraft
Version: 0.1.1
Summary: Reusable Python package for multimodal astronomical source generation using AION
Author: Bibiana Terres Stumpf, Joao Victor Motta da Silva, Pedro Jann Luna, Anne Laure Mealier, Julien Zoubian
License-Expression: MIT
Project-URL: Repository, https://github.com/CentraleDigitaleLab/mockcraft
Project-URL: Issues, https://github.com/CentraleDigitaleLab/mockcraft/issues
Project-URL: Documentation, https://github.com/CentraleDigitaleLab/mockcraft/tree/main/docs
Keywords: astronomy,astrophysics,machine-learning,generative-ai,multimodal,synthetic-catalogues,AION
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Astronomy
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: <3.13,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<2.0,>=1.26
Requires-Dist: pandas<3.0,>=2.2
Requires-Dist: scipy<1.16,>=1.11
Requires-Dist: scikit-learn<1.8,>=1.5
Requires-Dist: astropy<7.0,>=6.1
Requires-Dist: astroquery<0.5,>=0.4.7
Requires-Dist: matplotlib<4,>=3.8
Requires-Dist: torch<3,>=2.4
Requires-Dist: einops<1,>=0.8
Requires-Dist: jaxtyping<0.4,>=0.2.28
Requires-Dist: huggingface-hub<1.0,>=0.23
Requires-Dist: tokenizers<1,>=0.19
Requires-Dist: transformers<5,>=4.40
Requires-Dist: datasets<4,>=2.18
Requires-Dist: safetensors<0.6,>=0.4.3
Requires-Dist: polymathic-aion==0.0.2
Requires-Dist: filelock<4,>=3.13
Requires-Dist: tqdm<5,>=4.66
Requires-Dist: typing-extensions<5,>=4.9
Provides-Extra: viz
Requires-Dist: matplotlib<4,>=3.8; extra == "viz"
Requires-Dist: umap-learn<0.6,>=0.5.5; extra == "viz"
Provides-Extra: notebooks
Requires-Dist: ipython<9,>=8.20; extra == "notebooks"
Requires-Dist: ipykernel<7,>=6.29; extra == "notebooks"
Requires-Dist: jupyter<2,>=1.0; extra == "notebooks"
Requires-Dist: ipywidgets<9,>=8.1; extra == "notebooks"
Requires-Dist: matplotlib<4,>=3.8; extra == "notebooks"
Requires-Dist: umap-learn<0.6,>=0.5.5; extra == "notebooks"
Requires-Dist: tqdm<5,>=4.66; extra == "notebooks"
Provides-Extra: data
Requires-Dist: pyarrow<20,>=15.0; extra == "data"
Requires-Dist: datasets<4,>=2.18; extra == "data"
Provides-Extra: docs
Requires-Dist: Sphinx<9,>=7.2; extra == "docs"
Requires-Dist: docutils<0.22,>=0.20; extra == "docs"
Provides-Extra: network
Requires-Dist: redis<8,>=5.0; extra == "network"
Requires-Dist: urllib3<3,>=2.0; extra == "network"
Requires-Dist: requests<3,>=2.31; extra == "network"
Provides-Extra: security
Requires-Dist: cryptography<47,>=42.0; extra == "security"
Requires-Dist: pyOpenSSL<26,>=24.0; extra == "security"
Dynamic: license-file

# mockcraft

`mockcraft` is a Python package for generating synthetic astronomical catalogues using the [AION](https://github.com/PolymathicAI/aion) foundation model. It provides a simple interface for cross-modal generation across astronomical data types and includes a catalogue utility for fetching real sources from Gaia DR3, DESI DR1, and Legacy Survey DR8.

---

## What is MockCraft?

MockCraft is a pipeline for generating synthetic astronomical mock catalogues using foundation models. Given a set of real astronomical observations as input (redshifts, photometric fluxes, or images), MockCraft can generate realistic synthetic counterparts: spectra, multi-band images, and morphological parameters.

[AION](https://github.com/PolymathicAI/aion) is a multimodal foundation model trained on one of the largest astronomical datasets ever assembled (see the [Multimodal Universe paper](https://arxiv.org/pdf/2412.02527)). It learns joint representations across spectra, images, and physical parameters, enabling cross-modal generation: given any subset of observables, it can generate any other. MockCraft uses AION as its generative backbone.

---

## Installation

Requires Python 3.10–3.12.

```bash
pip install mockcraft
```

**Optional extras:**

| Extra | When to use |
|-------|-------------|
| `viz` | Visualization helpers (`plot_xp_spectrum`, `plot_image`) and UMAP projections |
| `notebooks` | Jupyter notebook workflows |
| `data` | Local Parquet files or HuggingFace datasets |

Quote the requirement so that shells such as zsh do not expand the square brackets:

```bash
pip install "mockcraft[viz]"
pip install "mockcraft[notebooks]"
pip install "mockcraft[viz,notebooks]"  # both
```

**NVIDIA GPU (CUDA 12.4, Linux), install PyTorch before the package:**

```bash
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install mockcraft
```

**From source:**

```bash
git clone https://github.com/CentraleDigitaleLab/mockcraft.git
cd mockcraft
pip install -e ".[viz]"
```

---

## Quick Start

```python
from mockcraft import SourceGenerator

gen = SourceGenerator(model="aion", seed=42)

# Redshift → spectrum + image
result = gen.generate(
    inputs={"redshift": 0.3},
    outputs=["spectrum", "image"],
)

print(result.spectrum.shape)  # (8704,)
print(result.image.shape)     # (3, 128, 128)
```

---

## API Reference

### `SourceGenerator`

```python
from mockcraft import SourceGenerator

gen = SourceGenerator(model="aion", seed=42)
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | `str` | Model identifier. Only `"aion"` is currently supported. |
| `seed` | `int` or `None` | Fixed random seed for reproducibility. |
| `device` | `str` or `None` | PyTorch device: `"cuda"`, `"mps"`, or `"cpu"`. Auto-detected if not set. |
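
The `device` auto-detection can be pictured with the sketch below. It shows one common PyTorch pattern and is an assumption about the general approach, not mockcraft's actual code: `pick_device` is a hypothetical helper, and the order of preference (CUDA, then MPS, then CPU) is a guess.

```python
def pick_device(requested=None):
    """Return the requested device, or the best one PyTorch reports.

    Hypothetical helper for illustration; not part of mockcraft.
    """
    if requested is not None:
        return requested
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to probe, so fall back to CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend only exists in PyTorch builds that support it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())       # one of "cuda", "mps", "cpu"
print(pick_device("cpu"))  # an explicit choice is returned unchanged
```

Passing `device="cpu"` explicitly is a reasonable way to rule out accelerator-specific differences when debugging.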

---

### `generate(inputs, outputs, cfg=None, type=None, compute_embeddings=False)`

The primary generation method. Accepts any combination of input and output modalities.

```python
result = gen.generate(
    inputs={"redshift": 0.3},
    outputs=["spectrum", "image"],
)
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `inputs` | `dict` | required | Modality name → value. Floats for scalars, numpy arrays for spectra/images. |
| `outputs` | `list[str]` | required | List of modality names to generate. |
| `cfg` | `float` or `None` | `None` | Classifier-free guidance override. Uses per-modality defaults if not set. |
| `type` | `str` or `None` | `None` | Morphological type prior: `"elliptical"`, `"spiral"`, or `"irregular"`. |
| `compute_embeddings` | `bool` | `False` | If `True`, computes AION latent embeddings `(768,)` for generated spectra and images. |

Returns a `GeneratedSource` object (see below).

---

### `star(temperature, logg=None, metallicity=None)`

Generate a synthetic stellar XP spectrum conditioned on effective temperature.

```python
result = gen.star(temperature=5778.0)

print(result.xp_bp.shape)  # (55,)
print(result.xp_rp.shape)  # (55,)
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `temperature` | `float` | Effective temperature in Kelvin. |
| `logg` | `float` or `None` | Surface gravity (accepted for API compatibility, not yet used as conditioning). |
| `metallicity` | `float` or `None` | Metallicity [Fe/H] (accepted for API compatibility, not yet used as conditioning). |

Temperature is converted to approximate Gaia G/BP/RP fluxes via bolometric scaling relative to a solar reference (T☉ = 5778 K). Returns a `GeneratedSource` with `xp_bp` and `xp_rp` outputs.
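
The README does not spell out the scaling relation, so the following is only a sketch of what "bolometric scaling relative to a solar reference" could look like. It assumes the Stefan-Boltzmann dependence at fixed stellar radius, (T / T☉)⁴. The reference fluxes in `SOLAR_REFERENCE` are placeholders, and both function names are hypothetical.

```python
T_SUN = 5778.0  # solar effective temperature in kelvin, as stated above

# Placeholder reference fluxes for a Sun-like star (arbitrary units).
# These numbers are illustrative only and are not taken from mockcraft.
SOLAR_REFERENCE = {"gaia_flux_g": 1.0, "gaia_flux_bp": 0.5, "gaia_flux_rp": 0.6}


def bolometric_scale(temperature):
    """Luminosity ratio to the Sun at fixed radius: (T / T_sun) ** 4."""
    if temperature <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return (temperature / T_SUN) ** 4


def approximate_gaia_fluxes(temperature):
    """Scale every reference band by the same bolometric factor."""
    scale = bolometric_scale(temperature)
    return {band: flux * scale for band, flux in SOLAR_REFERENCE.items()}


print(approximate_gaia_fluxes(5778.0))  # equals SOLAR_REFERENCE at T = T_SUN
```

Scaling all three bands by a single factor ignores the change in colour with temperature, which is one reason the README describes the resulting fluxes as approximate.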

---

## Supported Modality Keys

| Key | Type | Description |
|-----|------|-------------|
| `redshift` | scalar | Spectroscopic redshift |
| `parallax` | scalar | Gaia parallax (mas) |
| `flux_g`, `flux_r`, `flux_i`, `flux_z` | scalar | Legacy Survey photometric fluxes (nanomaggies) |
| `gaia_flux_g`, `gaia_flux_bp`, `gaia_flux_rp` | scalar | Gaia G / BP / RP fluxes |
| `shape_r`, `shape_e1`, `shape_e2` | scalar | Legacy Survey morphology parameters |
| `spectrum` | array `(8704,)` | DESI spectrum (flux) |
| `xp_bp`, `xp_rp` | array `(55,)` | Gaia XP coefficient arrays |
| `image` | array `(3, 128, 128)` | Legacy Survey 3-band image (g, r, z) |

Any of the above can be used as inputs or outputs in `generate()`.

---

## Return Type

`generate()` and `star()` return a `GeneratedSource` object.

| Field | Description |
|-------|-------------|
| `.outputs` | Dictionary of generated modalities: key → numpy array |
| `.<key>` | Attribute-style access, e.g. `.spectrum`, `.image`, `.redshift` |
| `.embedding_spectrum` | AION latent embedding of the generated spectrum `(768,)`; `None` if `compute_embeddings=False` |
| `.embedding_image` | AION latent embedding of the generated image `(768,)`; `None` if `compute_embeddings=False` |
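
Attribute-style access over a dictionary of outputs is commonly built on `__getattr__`. The sketch below illustrates that pattern with a hypothetical `SketchSource` class. It is not the package's `GeneratedSource`, whose implementation is not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchSource:
    """Hypothetical stand-in for GeneratedSource (not the package's class)."""

    outputs: Dict[str, Any] = field(default_factory=dict)
    embedding_spectrum: Optional[Any] = None
    embedding_image: Optional[Any] = None

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so the declared
        # fields above are never shadowed by keys in `outputs`.
        outputs = self.__dict__.get("outputs", {})
        if name in outputs:
            return outputs[name]
        raise AttributeError(name)


source = SketchSource(outputs={"redshift": 0.3})
print(source.redshift)            # looked up in source.outputs
print(source.embedding_spectrum)  # declared field, None by default
```

With this design, requesting a modality that was not generated raises `AttributeError` instead of silently returning `None`.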

---

## Generation Examples

### Redshift → spectrum + image

```python
result = gen.generate(
    inputs={"redshift": 0.3},
    outputs=["spectrum", "image"],
)
```

Runs a two-step chained pipeline: redshift → DESI spectrum (CFG=1.0), then spectrum → Legacy Survey image (CFG=2.0).

### Morphological type conditioning

```python
result = gen.generate(
    inputs={"redshift": 0.3},
    outputs=["spectrum", "image"],
    type="elliptical",  # or "spiral", "irregular"
)
```

Internally injects median `shape_r`, `shape_e1`, `shape_e2` values from DESI DR1 as additional conditioning inputs.
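
The injection step can be pictured as a dictionary merge. In this sketch the numbers in `TYPE_PRIORS` are invented placeholders, because the medians the package uses are not listed here. `with_type_prior` is a hypothetical helper, and letting explicit user inputs override the prior is an assumption.

```python
# Placeholder medians per morphological type. The numbers are invented for
# illustration; the values mockcraft uses are not shown in this README.
TYPE_PRIORS = {
    "elliptical": {"shape_r": 1.0, "shape_e1": 0.0, "shape_e2": 0.0},
    "spiral": {"shape_r": 2.0, "shape_e1": 0.1, "shape_e2": 0.1},
    "irregular": {"shape_r": 1.5, "shape_e1": 0.2, "shape_e2": -0.1},
}


def with_type_prior(inputs, morph_type=None):
    """Return a copy of `inputs` with shape priors added for `morph_type`."""
    if morph_type is None:
        return dict(inputs)
    if morph_type not in TYPE_PRIORS:
        raise ValueError(
            f"unknown type {morph_type!r}; expected one of {sorted(TYPE_PRIORS)}"
        )
    merged = dict(TYPE_PRIORS[morph_type])
    merged.update(inputs)  # assumption: explicit user inputs win over the prior
    return merged


print(with_type_prior({"redshift": 0.3}, "elliptical"))
```

Returning a new dictionary keeps the caller's `inputs` unchanged, which matters when the same inputs are reused across several types.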

### Real sources from catalogue → spectrum (with embeddings)

```python
from mockcraft.catalogue import fetch_sources

sources = fetch_sources(
    surveys=["desi"],
    region="cosmos",
    columns=["redshift", "flux_g", "flux_r", "flux_z"],
    max_sources=10,
)

for _, row in sources.iterrows():
    result = gen.generate(
        inputs={"redshift": float(row["redshift"]), "flux_g": float(row["flux_g"]),
                "flux_r": float(row["flux_r"]), "flux_z": float(row["flux_z"])},
        outputs=["spectrum"],
        compute_embeddings=True,
    )
    print(result.spectrum.shape)            # (8704,)
    print(result.embedding_spectrum.shape)  # (768,)
```

---

## Catalogue Utility

```python
from mockcraft.catalogue import fetch_sources

sources = fetch_sources(
    surveys=["gaia", "desi"],
    region="cosmos",
    columns=["ra", "dec", "magnitude", "redshift"],
    max_sources=100,
)
```

`region` accepts either a named field or an explicit `(RA, Dec, radius_deg)` tuple:

```python
# Named region
fetch_sources(surveys=["desi"], region="cosmos", ...)

# Explicit coordinates
fetch_sources(surveys=["gaia"], region=(150.1, 2.18, 0.18), ...)
```
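
Geometrically, an `(RA, Dec, radius_deg)` tuple describes a cone on the sky. The sketch below shows the membership test that such a region implies, using the haversine formula for angular separation. The helper names are hypothetical. The package delegates the real query to remote catalogue services, so this only illustrates the geometry.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    half_ddec = (dec2 - dec1) / 2.0
    half_dra = (ra2 - ra1) / 2.0
    a = (math.sin(half_ddec) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(half_dra) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def in_region(ra, dec, region):
    """True if (ra, dec) lies inside the (RA, Dec, radius_deg) cone."""
    centre_ra, centre_dec, radius_deg = region
    return angular_separation_deg(ra, dec, centre_ra, centre_dec) <= radius_deg


cone = (150.1, 2.18, 0.18)  # same tuple as the explicit-coordinates example
print(in_region(150.2, 2.20, cone))  # inside the cone
print(in_region(151.0, 2.18, cone))  # well outside the cone
```

The `cos(dec)` factors matter away from the equator, where a fixed difference in RA corresponds to a smaller angle on the sky.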

### Supported surveys and columns

| Survey | Supported columns |
|--------|-------------------|
| `"gaia"` | `ra`, `dec`, `magnitude`, `gaia_flux_g`, `gaia_flux_bp`, `gaia_flux_rp`, `gaia_parallax` |
| `"desi"` | `ra`, `dec`, `redshift`, `flux_g`, `flux_r`, `flux_z`, `targetid`, `otype` |
| `"legacy"` | `ra`, `dec`, `redshift`, `type` |

When combining multiple surveys, columns not available in a given survey are filled with `NaN`. DESI results are automatically filtered to `ZWARN == 0` (good redshift quality only).
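
The combination rule can be sketched in plain Python. The rows are invented for illustration, `combine_surveys` is a hypothetical helper, and the `survey` and `zwarn` keys are assumptions about how the intermediate data might be labelled. The package itself returns a pandas DataFrame.

```python
import math


def combine_surveys(rows_by_survey, columns):
    """Stack per-survey rows, filling columns a survey lacks with NaN."""
    combined = []
    for survey, rows in rows_by_survey.items():
        for row in rows:
            # Keep only DESI rows with ZWARN == 0 (good redshift quality).
            if survey == "desi" and row.get("zwarn", 0) != 0:
                continue
            record = {"survey": survey}
            for column in columns:
                record[column] = row.get(column, math.nan)
            combined.append(record)
    return combined


# Invented example rows; not real catalogue entries.
rows = {
    "gaia": [{"ra": 150.10, "dec": 2.18, "magnitude": 17.2}],
    "desi": [
        {"ra": 150.11, "dec": 2.19, "redshift": 0.31, "zwarn": 0},
        {"ra": 150.12, "dec": 2.20, "redshift": 1.90, "zwarn": 4},
    ],
}
table = combine_surveys(rows, ["ra", "dec", "magnitude", "redshift"])
for record in table:
    print(record)
```

Because `NaN` compares unequal to itself, downstream code should test for missing values with `math.isnan` (or `pandas.isna`) and not with `==`.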

---

## Visualization

Requires the `viz` extra (`pip install mockcraft[viz]`).

```python
from mockcraft import plot_xp_spectrum, plot_image

# Plot a predicted Gaia XP spectrum
result = gen.star(temperature=5778.0)
plot_xp_spectrum(result)

# Plot a predicted Legacy Survey image
result = gen.generate(inputs={"redshift": 0.3}, outputs=["image"])
plot_image(result)
```

---

## Model and Hyperparameters

The package loads `polymathic-ai/aion-base` automatically on first use.

| Parameter | Value | Role |
|-----------|-------|------|
| `CFG_SPEC` | `1.0` | Guidance scale for anything → spectrum |
| `CFG_GALAXY` | `2.0` | Guidance scale for spectrum → image |
| `MASKGIT_STEPS` | `8` | Number of iterative decoding steps |
| `TEMPERATURE` | `0.8` | Sampling temperature |
| `N_ROAR_DRAWS` | `50` | Posterior samples for redshift estimation (higher = more accurate, slower) |
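
For readers unfamiliar with the guidance scale and the sampling temperature, the sketch below shows their textbook definitions on toy logits. How AION applies them internally is not described here, so this is the standard formulation and not a description of the model's code. The logits are invented.

```python
import math


def guided_logits(conditional, unconditional, cfg):
    """Classifier-free guidance: unconditional + cfg * (conditional - unconditional)."""
    return [u + cfg * (c - u) for c, u in zip(conditional, unconditional)]


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature; lower values sharpen the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]


# Toy logits over three tokens (invented values).
cond = [2.0, 1.0, 0.0]
uncond = [1.0, 1.0, 1.0]

probs_spec = softmax_with_temperature(guided_logits(cond, uncond, cfg=1.0), 0.8)
probs_galaxy = softmax_with_temperature(guided_logits(cond, uncond, cfg=2.0), 0.8)
print(probs_spec)
print(probs_galaxy)
```

With a guidance scale of `1.0` the guided logits equal the conditional ones, so guidance has no effect. A scale of `2.0` pushes the distribution further toward what the conditioning favours.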

---

## Embeddings and Validation

Setting `compute_embeddings=True` re-encodes generated spectra and images through AION's encoder to produce latent vectors of shape `(768,)`. These can be used for:

- comparing generated vs real source distributions in embedding space (cosine similarity, MMD)
- UMAP projection to inspect coverage of the latent space
- detecting mode collapse or out-of-distribution generation

Embeddings are disabled by default because they add a second forward pass per modality.
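
As an illustration of the first use case, the sketch below computes cosine similarity between embedding vectors with the standard library only. The function names are hypothetical and short toy vectors stand in for the `(768,)` embeddings. MMD and UMAP are not covered.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(generated, real):
    """Average cosine similarity over every generated/real pair."""
    scores = [cosine_similarity(g, r) for g in generated for r in real]
    return sum(scores) / len(scores)


# Toy 2-D vectors standing in for embeddings (invented values).
generated = [[1.0, 0.0]]
real = [[1.0, 0.0], [0.0, 1.0]]
print(mean_pairwise_similarity(generated, real))
```

For real embeddings, each vector would come from `.embedding_spectrum` or `.embedding_image`, converted with `.tolist()` or handled directly with numpy.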

---

## Repository Structure

```
mockcraft/
├── mockcraft/
│   ├── __init__.py          # Public API exports
│   ├── source_generator.py  # SourceGenerator, GeneratedSource, plot helpers
│   └── catalogue.py         # fetch_sources: query Gaia, DESI, Legacy via VizieR
├── pyproject.toml
├── README.md
└── LICENSE
```

---

## License

MIT; see [LICENSE](LICENSE).

---

## Citation

If you use MockCraft in your research, please cite the Multimodal Universe paper, which describes the dataset referenced above for AION's training:

```bibtex
@article{multimodal_universe_2024,
  title = {Multimodal Universe: Enabling Large-Scale Machine Learning with 100TB of Astronomical Scientific Data},
  year = {2024},
  url = {https://arxiv.org/pdf/2412.02527}
}
```