euler-inference 2.0.1__tar.gz

@@ -0,0 +1,13 @@
Metadata-Version: 2.4
Name: euler-inference
Version: 2.0.1
Summary: Modality-agnostic inference pipeline using euler-loading
Author-email: Daniel Rothenpieler <rothenpielerdaniel@gmail.com>
Requires-Python: >=3.10
Requires-Dist: torch>=2.0.0
Requires-Dist: numpy
Requires-Dist: Pillow
Requires-Dist: euler-loading
Requires-Dist: tqdm
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
@@ -0,0 +1,471 @@
# euler-inference

A modality-agnostic inference pipeline for running models against [euler-loading](https://github.com/d-rothen/euler-loading) datasets. Your model receives all loaded modalities as a dict and returns predictions — the pipeline handles data loading, source-aware output writing, and dataset indexing.

## Install

```sh
# editable install from a local checkout
pip install -e .

# or install straight from GitHub
uv pip install "euler-inference @ git+https://github.com/d-rothen/euler-inference"
```

## How it works

1. You point the pipeline at a **model** (a `.py` file with a `Model` class) and a **dataset** (modality paths indexed by [ds-crawler](https://github.com/d-rothen/ds-crawler))
2. [euler-loading](https://github.com/d-rothen/euler-loading) loads each sample, auto-resolving loaders from each dataset's `output.json` metadata
3. All loaded modalities are passed to your model's `predict()` as a flat dict
4. Predictions either mirror an input modality via `euler-loading` writers, or fall back to the legacy serializer when no source modality is available
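
The four steps can be sketched as a plain loop. This is purely illustrative — `load_sample` and `write_output` are hypothetical stand-ins, not the actual euler-inference API:

```python
def run(model, sample_ids, load_sample, write_output):
    """Sketch of the pipeline loop: load, predict, write per output key."""
    for sample_id in sample_ids:
        inputs = load_sample(sample_id)       # step 2: dict of modality -> data
        preds = model.predict(inputs)         # step 3: flat dict in, dict out
        for key, value in preds.items():      # step 4: one writer per output key
            write_output(sample_id, key, value)

# Toy model that echoes its rgb input as a "depth" prediction
class EchoModel:
    def predict(self, inputs):
        return {"depth": inputs["rgb"]}

written = []
run(EchoModel(), ["00001"],
    lambda s: {"rgb": [0.5]},
    lambda s, k, v: written.append((s, k, v)))
```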

## Table of Contents

- [Quick Start](#quick-start)
- [Model Contract](#model-contract)
- [Model Cards](#model-cards)
- [Configuration](#configuration)
  - [JSON Config](#json-config)
  - [CLI Flags](#cli-flags)
  - [Python API](#python-api)
- [Output Behaviour](#output-behaviour)
- [SLURM / HPC Usage](#slurm--hpc-usage)
- [Testing](#testing)
- [Troubleshooting](#troubleshooting)

## Quick Start

### With a model card (recommended)

```bash
euler-inference \
  --model-card model_card.json \
  --set weights=/path/to/checkpoint.pt \
  --data rgb=/data/vkitti2/rgb \
  -o /output/predictions
```

### With a JSON config

```bash
euler-inference -c config.json
```

### From Python

```python
from euler_inference.api import infer

infer(
    model_path="/path/to/model.py",
    output_base_path="/output/predictions",
    dataset_modalities={"rgb": "/data/vkitti2/rgb"},
)
```

## Model Contract

Your model file must define a class named `Model` with the following interface. The file is loaded in-process via `importlib` — no subprocesses, no serialization.

### Required

```python
class Model:
    def __init__(self, config: dict, device: str | None = None):
        """Called once when the pipeline starts."""
        ...

    def predict(self, inputs: dict) -> dict:
        """Called once per sample. Receives all loaded modalities, returns predictions."""
        ...
```

**`__init__`** receives:
- `config` — the `model_config` dict from your config (empty `{}` if omitted)
- `device` — device string (`"cuda"`, `"cpu"`, `"mps"`) or `None` for auto-detect

**`predict`** receives a dict of **all loaded modalities by name**. The keys match the modality names from the dataset config. For example, with `{"rgb": "/data/rgb", "depth": "/data/depth"}` and hierarchical `{"textgt": "/data/textgt"}`:

```python
inputs = {
    "rgb": <loaded rgb data>,        # numpy array, tensor, etc.
    "depth": <loaded depth data>,
    "textgt": {"intrinsics": ...},   # hierarchical: dict of file_id -> data
}
```

The exact types depend on the loaders configured in each dataset's `output.json` (resolved automatically by euler-loading). If euler-loading has access to torch/CUDA, it will use GPU loaders where available.

**`predict`** must return a dict whose keys match the `key` fields in your `outputs` config. Each value should be an `np.ndarray`.

### Optional metadata

Models can declare class-level attributes so pipelines don't need to know model internals:

```python
class Model:
    OUTPUTS = [
        {"key": "depth", "type": "npy"},
        {"key": "confidence", "type": "png"},
    ]

    DEFAULT_CONFIG = {
        "backbone": "resnet50",
        "num_scales": 4,
    }

    def __init__(self, config, device=None): ...
    def predict(self, inputs): ...
```

**`OUTPUTS`** — When the config omits `outputs`, the pipeline reads `OUTPUTS` from the Model class. If both omit it, the pipeline default (`[{"key": "depth", "type": "npy"}]`) is used.

**`DEFAULT_CONFIG`** — Merged under user-provided `model_config` (user values win). For example, `DEFAULT_CONFIG = {"backbone": "resnet50", "num_scales": 4}` with user config `{"backbone": "efficientnet"}` produces `{"backbone": "efficientnet", "num_scales": 4}`.
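
The merge semantics described above amount to a plain dict union with user values overriding defaults (a sketch of the rule, not the pipeline's actual code):

```python
DEFAULT_CONFIG = {"backbone": "resnet50", "num_scales": 4}
user_config = {"backbone": "efficientnet"}

# Defaults first, user values second, so user values win on conflicts
merged = {**DEFAULT_CONFIG, **user_config}
```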

### Rules and gotchas

- The class **must** be named `Model` (case-sensitive)
- Relative imports don't work (loaded via importlib). Add your model's directory to `sys.path`:
  ```python
  import sys
  from pathlib import Path
  sys.path.insert(0, str(Path(__file__).parent))
  ```
- Your model runs **in the same process** — all dependencies must be importable in the active environment
- `predict` is called **once per sample** (no batching by the pipeline)
- Lazy-loading weights inside `predict` on first call is a recommended pattern

A full template is at [`examples/model_template.py`](examples/model_template.py).

## Model Cards

Model cards separate the model's self-description from runtime configuration. The model author ships a `model_card.json` alongside `model.py`; the pipeline operator provides runtime values via placeholder bindings.

### Placeholder syntax

```
{{type:name}}
```

- `type` is informational (used by external UIs for typed pickers): `checkpoint`, `modality`, `hierarchical_modality`, `simple_path`, etc.
- `name` is the binding key used for resolution
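
Resolution can be pictured as a simple substitution over `{{type:name}}` tokens. This is a hypothetical sketch of the idea only — the real resolver's implementation may differ; only the placeholder syntax comes from this document:

```python
import re

# Matches {{type:name}}; the type hint is informational, the name keys the binding
PLACEHOLDER = re.compile(r"\{\{(\w+):(\w+)\}\}")

def resolve(value: str, bindings: dict) -> str:
    """Replace every {{type:name}} token with the binding registered under name."""
    def substitute(match):
        _type_hint, name = match.groups()
        return bindings[name]
    return PLACEHOLDER.sub(substitute, value)
```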

### Example card

```json
{
  "model": "./model.py",
  "checkpoint": "{{checkpoint:weights}}",
  "config": {
    "backbone": "resnet50"
  },
  "inputs": {
    "rgb": "{{modality:rgb}}"
  },
  "hierarchical_inputs": {
    "textgt": "{{hierarchical_modality:textgt}}"
  },
  "outputs": [
    {"key": "depth", "type": "npy"}
  ]
}
```

### CLI usage

```bash
euler-inference \
  --model-card model_card.json \
  --set weights=/path/to/checkpoint.pt \
  --data rgb=/data/vkitti2/rgb \
  --hierarchical-data textgt=/data/vkitti2/textgt \
  -o /output/predictions
```

### Python usage

```python
from euler_inference.api import infer

# From a card file
infer(
    model_card="model_card.json",
    bindings={"weights": "/path/to/checkpoint.pt"},
    data={"rgb": "/data/rgb"},
    output_base_path="/output",
)

# From an already-resolved dict (e.g. from a server)
infer(
    model_card={"model": "/abs/path/model.py", "inputs": {"rgb": "/data/rgb"}, ...},
    output_base_path="/output",
)
```

### Card fields

| Field | Required | Description |
|-------|----------|-------------|
| `model` | Yes | Relative path to `model.py` (resolved relative to the card's directory) |
| `checkpoint` | No | Checkpoint path (injected into model_config as `config["checkpoint"]`) |
| `config` | No | Model-specific config dict (merged with checkpoint) |
| `inputs` | Yes | Modality name -> path mapping |
| `hierarchical_inputs` | No | Hierarchical modality name -> path mapping |
| `outputs` | No | Output configuration (falls back to model `OUTPUTS` or pipeline default) |
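
Per the table, `checkpoint` and `config` combine into the dict passed to `Model.__init__`. The helper below is a hypothetical sketch of that combination, not the actual euler-inference internals:

```python
def build_model_config(card: dict, bindings: dict) -> dict:
    """Merge the card's config with its checkpoint: the resolved
    checkpoint path is injected as config["checkpoint"]."""
    config = dict(card.get("config", {}))
    checkpoint = card.get("checkpoint")
    if checkpoint is not None:
        # Resolve a {{checkpoint:name}} placeholder from the bindings
        for name, value in bindings.items():
            checkpoint = checkpoint.replace("{{checkpoint:%s}}" % name, value)
        config["checkpoint"] = checkpoint
    return config
```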

## Configuration

### JSON Config

For simpler setups without placeholders, use a monolithic JSON config:

```json
{
  "external_model": {
    "model_path": "/absolute/path/to/model.py",
    "model_config": {
      "checkpoint": "/path/to/weights.pt"
    }
  },
  "dataset": {
    "modalities": {
      "rgb": "/data/vkitti2/rgb"
    },
    "hierarchical_modalities": {
      "textgt": "/data/vkitti2/textgt"
    }
  },
  "outputs": [
    {"key": "depth", "type": "npy"},
    {"key": "confidence", "type": "png", "suffix": ""}
  ],
  "output_base_path": "/output/predictions",
  "device": "cuda",
  "max_samples": null,
  "zip": false,
  "strict": true
}
```

#### Field reference

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `external_model.model_path` | string | Yes | **Absolute** path to model `.py` file |
| `external_model.model_config` | object/string | No | Model-specific config (dict or path to JSON). Passed as `config` to `Model.__init__`. Defaults to `{}`. |
| `dataset.modalities` | object | Yes | Map of modality names to their root paths |
| `dataset.hierarchical_modalities` | object | No | Map of hierarchical modality names to paths |
| `outputs` | list | No | What to save from the model's output dict. Defaults to `[{"key": "depth", "type": "npy"}]`. |
| `outputs[].key` | string | Yes | Key in the dict returned by `Model.predict()` |
| `outputs[].type` | string | Yes | Legacy file format: `npy`, `png`, `jpg`, `jpeg`, `exr`. Used when no source-backed writer is available. |
| `outputs[].suffix` | string | No | Legacy filename suffix before extension. Defaults to `_<key>`. Ignored for source-backed outputs. |
| `outputs[].source_modality` | string | No | Regular input modality whose `euler-loading` writer/path should be mirrored under `output_base_path/<key>/`. Defaults to `key` when it matches a regular input modality. |
| `outputs[].writer` | object | No | Override ds-crawler writer metadata for legacy outputs (see [Writer metadata](#writer-metadata)) |
| `output_base_path` | string | Yes* | Base directory for predictions. Can be omitted if supplied via `-o`. |
| `device` | string | No | `"cuda"`, `"cpu"`, `"mps"`. Auto-detected if omitted. |
| `max_samples` | int | No | Limit samples to process. `null` for all. |
| `zip` | bool | No | Write outputs as `.zip` archives instead of directories. Source-backed outputs preserve source filenames and extensions inside the archive. Default `false`. |
| `strict` | bool | No | Enforce writer metadata for known modality types. Default `true`. |

### CLI Flags

Runtime overrides so configs don't need to embed pipeline-specific values:

```bash
euler-inference -c config.json \
  -o /scratch/predictions \
  -d cuda \
  -n 1000 \
  --zip \
  --no-strict
```

| Flag | Description |
|------|-------------|
| `-c`, `--config` | Path to JSON config file |
| `--model-card` | Path to model card JSON (mutually exclusive with `-c`) |
| `--set KEY=VALUE` | Set a placeholder binding |
| `--data KEY=VALUE` | Set an input modality path binding |
| `--hierarchical-data KEY=VALUE` | Set a hierarchical input path binding |
| `-o`, `--output-base-path` | Override output directory |
| `-d`, `--device` | Override device |
| `-n`, `--max-samples` | Override max samples |
| `--zip` | Write outputs as zip archives |
| `--no-strict` | Disable strict writer metadata validation |
| `-v`, `--verbose` | Enable verbose logging |

### Python API

```python
from euler_inference.api import infer

infer(
    model_path="/path/to/model.py",
    output_base_path="/output",
    dataset_modalities={"rgb": "/data/rgb"},
    dataset_hierarchical_modalities={"textgt": "/data/textgt"},
    model_config={"checkpoint": "/path/to/weights.pt"},
    outputs=[{"key": "depth", "type": "npy"}],
    device="cuda",
    max_samples=100,
    zip=False,
    strict=True,
    verbose=True,
)
```

When `outputs` is omitted, the pipeline resolves outputs from the model's `OUTPUTS` attribute, then falls back to the pipeline default.
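
That precedence — explicit config, then the Model class's `OUTPUTS`, then the default — can be sketched as (illustrative only, not the pipeline's code):

```python
PIPELINE_DEFAULT = [{"key": "depth", "type": "npy"}]

def resolve_outputs(config_outputs, model_cls):
    """Explicit config wins; otherwise the Model class's OUTPUTS
    attribute; otherwise the documented pipeline default."""
    if config_outputs:
        return config_outputs
    return getattr(model_cls, "OUTPUTS", None) or PIPELINE_DEFAULT
```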

## Output Behaviour

### Source-backed outputs

When `outputs[].source_modality` is set, or the output key matches a regular input modality, the pipeline uses `MultiModalDataset.write_sample()` and the source modality's `euler-loading` writer. Files are written under `output_base_path/<output_key>/` using the source modality's relative path, basename, and extension:

```
output_base_path/
└── rgb/
    ├── .ds-crawler/output.json      # ds-crawler index
    └── Scene01/clone/Camera_0/
        ├── 00001.png
        ├── 00002.png
        └── ...
```

This keeps the output format aligned with the read-in format. The output dataset metadata is initialized from the source modality index, so the mirrored output stays loadable by `euler-loading`.

With `--zip`, the same source-backed output is written into one archive per output key:

```
output_base_path/
└── rgb.zip
    ├── .ds-crawler/output.json
    └── Scene01/clone/Camera_0/
        ├── 00001.png
        ├── 00002.png
        └── ...
```

### Legacy outputs

If no source modality can be resolved for an output, the pipeline falls back to the legacy serializer and writes files as `{sample_id}{suffix}.{type}`:

```
output_base_path/
└── depth/
    ├── .ds-crawler/output.json
    └── Scene01/clone/Camera_0/
        ├── 00001_depth.npy
        └── ...
```

### Writer metadata

Source-backed outputs use ds-crawler writer backends configured from the source modality's `output.json`, so their dataset metadata mirrors the read-in modality.

Legacy outputs are written via ds-crawler's `DatasetWriter`/`ZipDatasetWriter`, which generates an `output.json` index for the output dataset. The writer receives these fields:

| Field | Default | Override via |
|-------|---------|-------------|
| `name` | Input dataset's name (from its `output.json`), or the output key if unavailable | `outputs[].writer.name` |
| `type` | Output key (e.g. `"depth"`) | `outputs[].writer.type` |
| `euler_train` | `{"used_as": "target", "modality_type": "<key>"}` | `outputs[].writer.euler_train` |
| `euler_loading` | *(omitted)* | `outputs[].writer.euler_loading` |
| `separator` | `null` | — |
| `meta` | *(omitted)* | `outputs[].writer.meta` |

In `--no-strict` mode, `modality_type` defaults to `"other"` instead of the output key, which bypasses ds-crawler's metadata validation for known types (depth, rgb, etc.).

#### `euler_loading`

When the output dataset should be loadable by [euler-loading](https://github.com/d-rothen/euler-loading) without an explicit loader, set `euler_loading` with the `loader` (module name) and `function` (callable name) that euler-loading should use to auto-resolve a loader from the output's `output.json`. Available loaders: `vkitti2`, `real_drive_sim`, `generic_dense_depth`.

You can also include `used_as`, `modality_type`, `slot`, and `task` — these are consulted by euler-loading's `describe_for_runlog()` for experiment metadata.

Example with explicit writer overrides:

```json
{
  "outputs": [
    {
      "key": "depth",
      "type": "npy",
      "writer": {
        "name": "my_dataset",
        "euler_train": {"used_as": "target", "modality_type": "depth"},
        "euler_loading": {"loader": "generic_dense_depth", "function": "depth"},
        "meta": {"radial_depth": false, "scale_to_meters": 1.0, "range": [0, 100]}
      }
    }
  ]
}
```

### Supported output formats

| Format | Extension | Notes |
|--------|-----------|-------|
| NumPy | `npy` | Preserves full float precision |
| PNG | `png` | Float arrays clipped to [0,1] and scaled to uint8. Supports grayscale, RGB, RGBA. |
| JPEG | `jpg`/`jpeg` | Same as PNG but lossy. |
| OpenEXR | `exr` | Full float precision. Requires `pip install OpenEXR`. Not supported in zip mode. |

### Image output conversion

When saving to `png`/`jpg`:
- `float32`/`float64` arrays are clipped to [0, 1] and scaled to uint8
- Other dtypes are cast to `uint8`
- 2D arrays are saved as grayscale, (H, W, 3) as RGB, (H, W, 4) as RGBA
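
The dtype rules above can be sketched as a small helper (illustrative, not the pipeline's serializer):

```python
import numpy as np

def to_uint8(arr: np.ndarray) -> np.ndarray:
    """Apply the documented png/jpg conversion: floats are clipped
    to [0, 1] and scaled to 255; other dtypes are cast to uint8."""
    if arr.dtype in (np.float32, np.float64):
        return (np.clip(arr, 0.0, 1.0) * 255).astype(np.uint8)
    return arr.astype(np.uint8)
```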

## SLURM / HPC Usage

```bash
#!/bin/bash
#SBATCH --job-name=inference
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G

source /path/to/venv/bin/activate

euler-inference \
  --model-card /path/to/model_card.json \
  --set weights=/path/to/checkpoint.pt \
  --data rgb=/data/vkitti2/rgb \
  -o /scratch/$SLURM_JOB_ID/output
```

Key points:
- Your model runs in-process, so the active environment must have all dependencies (both euler-inference's and your model's)
- GPU allocation from SLURM (`CUDA_VISIBLE_DEVICES`) is available directly
- euler-loading will automatically use GPU loaders when torch/CUDA is available

## Testing

```bash
pytest tests/ -v
```

## Troubleshooting

### "model_path must be absolute"

Use absolute paths in your config:

```json
"model_path": "/hpc/users/me/models/model.py"
```

### ImportError when loading your model

Your model runs in the same process. All of your model's dependencies must be installed in the active Python environment.

### Relative imports don't work in model.py

Add your model's directory to `sys.path`:

```python
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
```

### "Model did not return '<key>' key"

Your `predict()` return dict is missing a key that the `outputs` config expects. Make sure the keys match.

### "meta is required for modality_type='depth'"

In strict mode (default), ds-crawler requires metadata for known modality types. Either:
- Add a `writer.meta` dict to your output config with the required fields
- Use `--no-strict` to bypass validation
@@ -0,0 +1,8 @@
"""Model inference pipeline using euler-loading."""

# Note: We don't import submodules here to avoid RuntimeWarning when running
# `python -m euler_inference`. Import directly from submodules instead:
# from euler_inference.config import InferenceConfig
# from euler_inference.inference import run_inference

__all__ = ["config", "inference", "models"]
@@ -0,0 +1,5 @@
"""Allow running as ``python -m euler_inference``."""

from euler_inference.inference import main

main()