tdseries 0.1.0__tar.gz

tdseries-0.1.0/.envrc ADDED
@@ -0,0 +1,3 @@
+ if has nix; then
+   use flake .
+ fi
@@ -0,0 +1,36 @@
+ name: Checks
+
+ on:
+   workflow_call:
+
+ jobs:
+   lint-typecheck-test:
+     runs-on: ubuntu-latest
+     env:
+       UV_CACHE_DIR: /tmp/.uv-cache
+     steps:
+       - name: Checkout
+         uses: actions/checkout@v7
+
+       - name: Install Nix
+         uses: DeterminateSystems/determinate-nix-action@v3
+
+       - name: Restore uv cache
+         uses: actions/cache@v4
+         with:
+           path: /tmp/.uv-cache
+           key: uv-${{ runner.os }}-${{ hashFiles('uv.lock') }}
+           restore-keys: |
+             uv-${{ runner.os }}-
+
+       - name: ruff + ruff-format
+         run: nix flake check -L
+
+       - name: basedpyright
+         run: nix develop --command basedpyright
+
+       - name: Tests
+         run: nix develop --command uv run pytest -q
+
+       - name: Minimize uv cache
+         run: nix develop --command uv cache prune --ci
@@ -0,0 +1,10 @@
+ name: CI
+
+ on:
+   push:
+     branches: [master]
+   pull_request:
+
+ jobs:
+   checks:
+     uses: ./.github/workflows/checks.yml
@@ -0,0 +1,38 @@
+ name: Publish to PyPI
+
+ on:
+   push:
+     tags:
+       - "v*"
+
+ jobs:
+   checks:
+     uses: ./.github/workflows/checks.yml
+
+   publish:
+     needs: checks
+     runs-on: ubuntu-latest
+     environment:
+       name: pypi
+       url: https://pypi.org/project/tdseries/
+     permissions:
+       id-token: write
+       contents: read
+     steps:
+       - name: Checkout
+         uses: actions/checkout@v7
+
+       - name: Install uv
+         uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
+         with:
+           enable-cache: true
+           cache-dependency-glob: "uv.lock"
+
+       - name: Install Python
+         run: uv python install 3.12
+
+       - name: Build
+         run: uv build
+
+       - name: Publish
+         run: uv publish
@@ -0,0 +1,12 @@
1
+ __pycache__/
2
+ *.py[cod]
3
+ *.egg-info/
4
+ .venv/
5
+ .direnv/
6
+ .pytest_cache/
7
+ .hypothesis/
8
+ .ruff_cache/
9
+ dist/
10
+ result
11
+ # generated by git-hooks.nix shellHook; bakes in local /nix/store paths
12
+ .pre-commit-config.yaml
@@ -0,0 +1,45 @@
1
+ # tdseries — agent guide
2
+
3
+ Read [`DESIGN.md`](./DESIGN.md) before touching library code: it is the
4
+ binding API contract (dimension/index model, Series and Frame semantics,
5
+ exactness invariants). [`README.md`](./README.md) is the human-facing
6
+ summary; [`ROADMAP.md`](./ROADMAP.md) records deliberately deferred
7
+ directions (CoordIndex, framed transforms over event series, lazy sources,
8
+ element-wise operator layer) — check it before proposing "new" features.
9
+
10
+ ## Layout
11
+
12
+ | Path | Role |
13
+ |---|---|
14
+ | `src/tdseries/_ticks.py` | seconds↔int64-tick conversion boundary; `TICKS_PER_SECOND` |
15
+ | `src/tdseries/_array.py` | numpy/torch backend dispatch (`take`, `concat`, `array_equal`, ...) |
16
+ | `src/tdseries/errors.py` | `DomainError`, `IncompatibleError`, `DimensionError` |
17
+ | `src/tdseries/indexes.py` | the index hierarchy: `RangeIndex`, `LabelIndex` (ordinary dims); `GridIndex`, `StampIndex`, `SpanIndex` (time). All exact-tick cut/seam arithmetic lives here. |
18
+ | `src/tdseries/series.py` | `Series` leaf (data + dims + indexes) and factories `uniform`/`events`/`spans`/`wrap` |
19
+ | `src/tdseries/frame.py` | `Frame` pytree node: relative-anchor storage, hull, recursion of time/dim ops |
20
+ | `tests/` | hypothesis property tests; `strategies.py` draws exact int64 tick anchors (small and Unix-magnitude) |
21
+
22
+ ## Load-bearing invariants (do not weaken)
23
+
24
+ 1. `x.ticks[a:b].concat(x.ticks[b:c]) == x.ticks[a:c]` — exact slice-concat
25
+ identity for every index kind and for frames; property-tested.
26
+ 2. All time arithmetic is int64 ticks; the only floats are `GridIndex.phase`
27
+ (bounded by one sample) and the non-integer-sr fallback (`INDEX_EPSILON`).
28
+ 3. `shift` is O(1) everywhere: content is stored anchor-relative, only the
29
+ scalar anchor moves. Frames store children relative to the frame anchor;
30
+ extraction re-anchors a copy, never mutates.
31
+ 4. No `# type: ignore` / `# noqa` — restructure until ruff+basedpyright pass
32
+ (a PostToolUse hook may enforce this on every write).
33
+
34
+ ## Workflow
35
+
36
+ - `uv sync` then `uv run pytest` (hypothesis suite, runs in tens of seconds).
37
+ - `nix develop` for the hooked dev shell (ruff, ruff-format, basedpyright).
38
+ - torch is an optional extra for runtime users (lazy import inside
39
+ `_array.py` only, `src/tdseries` stays importable without it); it's a
40
+ `dev` dependency-group member so basedpyright/tests can see it locally
41
+ and in CI. `[tool.uv.sources]` pins it to the CPU-only wheel index so
42
+ `uv sync` doesn't pull the CUDA/nvidia stack.
43
+ - CI (`.github/workflows/checks.yml`, reused by `ci.yml` and `publish.yml`)
44
+ runs ruff + ruff-format (`nix flake check`), basedpyright, and pytest via
45
+ `nix develop`; publishing to PyPI on tag push is gated on all three passing.
@@ -0,0 +1 @@
1
+ AGENTS.md
@@ -0,0 +1,324 @@
1
+ # tdseries — design contract
2
+
3
+ Immutable pytree frames of tensors with **named, indexed dimensions**; time is
4
+ a first-class dimension indexed in exact int64 **ticks** (`TICKS_PER_SECOND =
5
+ 1e9`). Successor of `utils.data` (TimeSeries/TimeFrame) in the
6
+ harmonic-noise-suppression project; the tick/phase arithmetic is ported from
7
+ there verbatim (see that repo's `src/utils/data/` for reference semantics and
8
+ `tests/utils/data/` for reference property tests).
9
+
10
+ ## Core model
11
+
12
+ A **dimension** is a name plus a *domain*. Each tensor axis carrying that name
13
+ has an **index**: a monotone map `positions 0..n-1 → domain coordinates`.
14
+ Slices are expressed in domain coordinates and translated per-tensor into
15
+ positional selections. This is what lets tensors of *different lengths* share
16
+ a dimension (audio at 44.1 kHz and RPS events at ~100 Hz both have dim
17
+ `"time"`).
18
+
19
+ Index families (implemented in `tdseries/indexes.py`, already written):
20
+
21
+ | Index | Domain | Sizes across a frame | Selection |
22
+ |---|---|---|---|
23
+ | `RangeIndex` (implicit default) | positions | must be equal | positional |
24
+ | `LabelIndex` | hashable labels | may differ | by label (`.sel`) or positional |
25
+ | `GridIndex` | ticks (uniform grid + sub-sample phase) | may differ | `.time` / `.ticks` |
26
+ | `StampIndex` | ticks (sorted point events) | may differ | `.time` / `.ticks` |
27
+ | `SpanIndex` | ticks (half-open intervals + identity ids) | may differ | `.time` / `.ticks` |
28
+
29
+ The dim name `"time"` is reserved: it must carry a `TimeIndex`
30
+ (`GridIndex | StampIndex | SpanIndex`), and a `TimeIndex` may only sit on
31
+ `"time"`. At most one `"time"` dim per tensor.
32
+
33
+ `TimeIndex` methods return `(new_index, positional_selection)` so the caller
34
+ (Series) applies the selection to its data axis. All three uphold, exactly:
35
+
36
+     slice(a, b) ⊕ slice(b, c) == slice(a, c) for t_start ≤ a ≤ b ≤ c ≤ t_end
37
+
38
+ `shift` is O(1) everywhere (only the scalar anchor moves; stored content is
39
+ anchor-relative).
40
+
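A minimal sketch of why anchor-relative storage makes `shift` O(1) while the slice/concat identity still holds exactly, using a toy stamp-style index. `Stamps` and its fields are hypothetical stand-ins rather than the library's `StampIndex`, and this toy only concatenates pieces whose domains are already adjacent.

```python
from bisect import bisect_left
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stamps:
    """Toy stand-in for a stamp-style time index (hypothetical)."""

    anchor: int            # absolute start of the declared domain, in ticks
    duration: int          # domain is the half-open window [anchor, anchor + duration)
    rel: tuple[int, ...]   # sorted event offsets, stored relative to the anchor

    def shift(self, dt: int) -> "Stamps":
        # O(1): only the scalar anchor moves; the stored offsets are reused as-is.
        return replace(self, anchor=self.anchor + dt)

    def slice(self, a: int, b: int) -> "Stamps":
        # Cut the half-open absolute window [a, b); it must lie inside the domain.
        if not (self.anchor <= a <= b <= self.anchor + self.duration):
            raise ValueError("window outside domain")
        offset = a - self.anchor
        lo = bisect_left(self.rel, offset)
        hi = bisect_left(self.rel, b - self.anchor)
        return Stamps(a, b - a, tuple(r - offset for r in self.rel[lo:hi]))

    def concat(self, other: "Stamps") -> "Stamps":
        # Glue `other` right after `self`; the seam must match to the tick.
        if other.anchor != self.anchor + self.duration:
            raise ValueError("domains are not adjacent")
        rel = self.rel + tuple(r + self.duration for r in other.rel)
        return Stamps(self.anchor, self.duration + other.duration, rel)
```

Because every quantity is an integer, `x.slice(a, b).concat(x.slice(b, c)) == x.slice(a, c)` can be checked with plain equality, with no tolerance involved.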
41
+ ## `Series` — the leaf (`tdseries/series.py`)
42
+
43
+ Frozen dataclass, `eq=False`:
44
+
+ ```python
+ Series(
+     data,        # np.ndarray | torch.Tensor | None (None ⇒ index-only, dims must be ("time",))
+     dims,        # tuple[str | None, ...], len == data.ndim; None = anonymous axis
+     indexes={},  # Mapping[str, DimIndex | TimeIndex]; keys ⊆ named dims
+ )
+ ```
52
+
53
+ Validation: dims unique among named; `"time"` ⇒ `indexes["time"]` is a
54
+ `TimeIndex` with `n ==` axis size; non-time named dims may carry a
55
+ `RangeIndex`/`LabelIndex` with matching `n`; anonymous (`None`) axes carry no
56
+ index. Use `tdseries._array.is_tensor/take/concat/array_equal/to_numpy_f64` for
57
+ all data manipulation (they dispatch over the numpy and torch backends).
58
+
59
+ Properties: `shape`, `ndim`, `has_time`, `time_axis`, `tindex` (raises
60
+ `ValueError` if atemporal), `t_start/t_end/duration` (float seconds) and
61
+ `t_start_ticks/t_end_ticks/duration_ticks` (exact; raise `ValueError` if
62
+ atemporal), `dim_index(name)` (explicit or default `RangeIndex(size)`),
63
+ `dim_size(name)`.
64
+
65
+ Accessors (each a tiny helper object with `__getitem__`):
66
+
67
+ - `s.time[a:b]` — **seconds**: floats quantised once via `secs_to_ticks`; an
68
+ `int` here means whole seconds (`seconds_to_ticks_exact`); `None` bounds =
69
+ domain bounds; slice step forbidden (`ValueError`). Delegates to
70
+ `slice_ticks`.
71
+ - `s.ticks[a:b]` — **ticks**: ints only (`TypeError` for floats); `None`
72
+ bounds = domain bounds.
73
+ - `s.slice[dim, sel]` — positional selection along a named non-time dim;
74
+ `sel: int | slice | list[int] | np.ndarray(int|bool)`. An `int` **drops**
75
+ the dim (like numpy). Unknown dim on a Series → `DimensionError`. Using
76
+ `"time"` here → `DimensionError` pointing at `.time`/`.ticks`.
77
+ - `s.sel[dim, label_or_list]` — label selection; requires `LabelIndex` on the
78
+ dim (`DimensionError` otherwise). Scalar label drops the dim; list keeps it
79
+ (absent labels are skipped per `LabelIndex.locate`).
80
+
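The seconds/ticks split described for the accessors can be sketched as follows. The helper names `seconds_to_ticks` and `require_ticks` are hypothetical (they are not the library's `secs_to_ticks` / `seconds_to_ticks_exact`), and the rounding mode for floats is an assumption; the point is only that a float is quantised exactly once and that the tick accessor refuses floats.

```python
TICKS_PER_SECOND = 10**9  # one tick is one nanosecond


def seconds_to_ticks(value):
    """Hypothetical helper: turn a seconds bound into integer ticks."""
    if isinstance(value, bool):
        raise TypeError("bool is not a time value")
    if isinstance(value, int):
        return value * TICKS_PER_SECOND          # whole seconds, exact
    if isinstance(value, float):
        return round(value * TICKS_PER_SECOND)   # quantised once (rounding mode assumed)
    raise TypeError(f"unsupported time value: {value!r}")


def require_ticks(value):
    """Hypothetical helper: tick bounds must already be integers."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("tick bounds must be int")
    return value
```

Once a bound has been converted, all further arithmetic stays in integers, so repeated slicing cannot accumulate floating-point error.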
81
+ Methods:
82
+
83
+ - `slice_ticks(a, b) -> Series` — time-slice: `tindex.slice(a, b)` → apply
84
+ positional selection along the time axis, replace `indexes["time"]`.
85
+ - `shift(dt) -> Series` — `to_ticks` coercion (float = seconds, int = ticks);
86
+ O(1); atemporal series raise `ValueError`.
87
+ - `concat(other) -> Series` — glue along time. Requires
88
+ identical `dims`, matching data-presence, equal non-time indexes
89
+ (`IncompatibleError` otherwise; `RangeIndex` defaults compared by size).
90
+ Time logic delegates to `tindex.concat(other.tindex)` → apply
91
+ `left_sel`/`right_sel` (None = take all) along the time axis, then
92
+ `concat` the data.
93
+ - `take_dim(name, sel) -> Series` — positional selection along a named
94
+ non-time dim; co-updates that dim's stored index via `.take(sel)` (dropped
95
+ index for int sel); int sel removes the dim from `dims`.
96
+ - `interpolate(times, kind="linear", fill="clamp") -> np.ndarray` — ported
97
+ semantics from `UniformSeries.interpolate`/`EventSeries.interpolate`:
98
+ linear only; fill ∈ {"clamp","nan","error"}; query times float seconds or
99
+ int ticks; works for `GridIndex` (grid = `sample_times()`) and `StampIndex`
100
+ (grid = `abs_stamps`); `SpanIndex` → `TypeError`. Generalised to any time
101
+ axis position via `np.moveaxis` (time axis of the result stays where it is
102
+ in `dims`). Data converted with `to_numpy_f64`.
103
+ - `resample(new_sr, kind="linear") -> Series` — ported from
104
+ `UniformSeries.resample` / `EventSeries.interpolate_uniform`: evaluate on a
105
+ fresh phase-0 grid over the same declared domain, result gets a `GridIndex`.
106
+ - `equal(other) -> bool` (exact: dims, indexes `.equal`, data
107
+ `array_equal`), `__eq__` delegating (NotImplemented for non-Series).
108
+ - `with_data(new_data) -> Series` — same shape check; `map_data(fn)` sugar.
109
+
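A small sketch of linear interpolation with the three fill modes named for `interpolate`. It works on plain Python lists of tick positions and values; `interpolate_at` is a hypothetical name, and treating "outside" as outside the first and last grid points (rather than outside the declared domain) is an assumption.

```python
import math
from bisect import bisect_right


def interpolate_at(grid, values, t, fill="clamp"):
    """Linearly interpolate (grid, values) at query tick t; grid is strictly increasing."""
    if not grid or len(grid) != len(values):
        raise ValueError("grid and values must be non-empty and equally long")
    if t < grid[0] or t > grid[-1]:
        if fill == "clamp":
            return float(values[0] if t < grid[0] else values[-1])
        if fill == "nan":
            return math.nan
        if fill == "error":
            raise ValueError(f"query {t} outside [{grid[0]}, {grid[-1]}]")
        raise ValueError(f"unknown fill mode: {fill!r}")
    i = bisect_right(grid, t) - 1       # index of the last grid point <= t
    if i == len(grid) - 1:
        return float(values[i])         # t sits exactly on the last grid point
    t0, t1 = grid[i], grid[i + 1]
    w = (t - t0) / (t1 - t0)            # fractional position between the neighbours
    return values[i] * (1.0 - w) + values[i + 1] * w
```

The same routine covers a uniform grid and irregular stamps, since it only needs the sorted sample positions.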
110
+ Factory functions (also in `series.py`, re-exported at top level):
111
+
+ ```python
+ uniform(data, sr, *, dims=None, t_start=0.0, phase=0.0) # dims default (None,…,"time")
+ events(timestamps, values=None, *, dims=None, t_start=None, t_end=None)
+ spans(starts, ends, values=None, *, ids=None, t_start=None, t_end=None, dims=None)
+ wrap(data, dims=None, indexes=None) # atemporal; dims default all-None
+ ```
118
+
119
+ `events`/`spans` with `values=None` build index-only Series (`data=None`,
120
+ `dims=("time",)`). Timestamp/bound arrays: float = seconds (quantised once),
121
+ int = ticks — matching the old `from_events`/`from_segments`.
122
+
123
+ ## `Frame` — the pytree node (`tdseries/frame.py`)
124
+
125
+ Frozen dataclass `Frame(entries, *, t_start=None, t_end=None)` with
126
+ `entries: Mapping[str, Series | Frame | Any]`.
127
+
128
+ Entry kinds:
129
+
130
+ - **temporal** — `Series` with a time dim, or a nested `Frame` that
131
+ (recursively) contains temporal content; a nested frame holding only
132
+ invariant entries (a pure metadata bundle) is itself invariant;
133
+ - **invariant** — atemporal `Series` (e.g. `mic_pos`), or any other Python
134
+ scalar/object (`recording_id`, …). Raw numpy/torch tensors passed in are
135
+ auto-wrapped: `wrap(x)` (all-anonymous dims).
136
+
137
+ Time anchoring (ported from `TimeFrame`): the frame owns the single absolute
138
+ anchor `t_start_ticks` + `dur_ticks` (hull of temporal children, inferred when
139
+ not given; validated to cover them; `(0, 0)` if none). Temporal children are
140
+ stored **relative** (constructor re-bases incoming absolute children by
141
+ `shift(-t_start_ticks)`); `frame[key]` hands back the child re-anchored to
142
+ absolute time (O(1)). Extracting then mutating never touches the parent —
143
+ everything is immutable. Internal `_from_local` classmethod bypasses
144
+ re-basing/validation.
145
+
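The anchoring scheme above can be illustrated with plain tuples. This is a sketch under simplifying assumptions: each child is reduced to its `(t_start_ticks, t_end_ticks)` pair, and `build_frame`, `extract` and `shift_frame` are hypothetical names, not the `Frame` API.

```python
def build_frame(children):
    """Infer the hull and re-base children.

    `children` maps a key to absolute (t_start_ticks, t_end_ticks).
    Returns (anchor, duration, relative_children).
    """
    if not children:
        return 0, 0, {}
    anchor = min(start for start, _ in children.values())
    end = max(stop for _, stop in children.values())
    relative = {k: (s - anchor, e - anchor) for k, (s, e) in children.items()}
    return anchor, end - anchor, relative


def extract(frame, key):
    """Return one child re-anchored to absolute time; the frame is not modified."""
    anchor, _, relative = frame
    start, stop = relative[key]
    return start + anchor, stop + anchor


def shift_frame(frame, dt):
    """O(1) shift: only the frame anchor changes, the stored children are reused."""
    anchor, duration, relative = frame
    return anchor + dt, duration, relative
```

Shifting a frame and then extracting a child gives the same result as extracting and then shifting the child, which is what makes the relative storage invisible to callers.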
146
+ **Tree-wide dim validation** (recursively over all Series leaves, `"time"`
147
+ excluded): for each dim name, all `RangeIndex` occurrences must agree in size;
148
+ mixing `RangeIndex` and `LabelIndex` occurrences of one dim is an error
149
+ (`DimensionError`); `LabelIndex` occurrences may differ in size and labels.
150
+
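A sketch of the tree-wide check just described, assuming a simplified model in which a default range index is represented by its size (an `int`) and a label index by a tuple of labels. `validate_dims` and the local `DimensionError` class are stand-ins written for this illustration.

```python
class DimensionError(ValueError):
    """Stand-in for the library's DimensionError (base class assumed)."""


def validate_dims(leaves):
    """Check non-time dims across all leaves.

    `leaves` maps a leaf path to {dim_name: index}; an index is an int
    (range index size) or a tuple of labels (label index).
    """
    first_seen = {}  # dim name -> first index encountered for that name
    for path, dims in leaves.items():
        for name, index in dims.items():
            if name == "time":
                continue  # the time dim is excluded from this check
            first = first_seen.setdefault(name, index)
            if isinstance(first, int) != isinstance(index, int):
                raise DimensionError(f"{path}: dim {name!r} mixes range and label indexes")
            if isinstance(index, int) and index != first:
                raise DimensionError(f"{path}: dim {name!r} has size {index}, expected {first}")
```

Label-indexed occurrences are deliberately not compared with each other, matching the rule that they may differ in size and labels.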
151
+ API:
152
+
153
+ - dict-like: `frame[key]`, `keys/values/items`, `in`, `len`, iteration.
154
+ - column ops: `select(keys)`, `drop(keys)`, `with_entry(name, value)`
155
+ (hull expands as needed), `merge(other, overwrite=False)` (key collisions
156
+ error unless overwrite; hull = union).
157
+ - time ops: `frame.time[a:b]`, `frame.ticks[a:b]`, `slice_ticks(a, b)`,
158
+ `shift(dt)`, `concat(other)`.
159
+ - slice: window must lie inside the frame domain (`DomainError`); each
160
+ temporal child is clipped to its overlap with the window and dropped if
161
+ disjoint; invariant entries pass through.
162
+ - concat: `other` is glued so its domain starts at `self.t_end_ticks`;
163
+ union of keys; temporal∧temporal → child concat (shift other by
164
+ `self.dur_ticks - other.t_start_ticks` relative to frames' anchors, as in
165
+ `TimeFrame.concat`); invariant∧invariant → must be `.equal`/`==`
166
+ (`IncompatibleError` on conflict), keep one; one-sided entries carried
167
+ over (temporal ones shifted); temporal∧invariant → `IncompatibleError`.
168
+ - dim ops (recursive over the tree, non-Series entries untouched, Series
169
+ lacking the dim untouched):
170
+ - `frame.slice[dim, sel]` — positional; `DimensionError` if no leaf has the
171
+ dim, or if occurrences have unequal sizes (→ use `.sel`).
172
+ - `frame.sel[dim, labels]` — all occurrences must carry `LabelIndex`.
173
+ - `"time"` in either → `DimensionError`.
174
+ - `equal(other)` exact (anchors, key sets, children via `.equal`/`==`),
175
+ `__eq__` delegating.
176
+ - `leaves() -> Iterator[tuple[str, Series]]` (dotted paths), `map_data(fn)`
177
+ (apply to every Series leaf's data, e.g. numpy→torch).
178
+
179
+ Frame invariant (property-tested, composed across all leaf kinds):
180
+
181
+     frame.ticks[a:b].concat(frame.ticks[b:c]) == frame.ticks[a:c]
182
+
183
+ ## Operators (decided 2026-07-02)
184
+
185
+ Arithmetic dunders (`+`, `*`, ...) are **reserved for aligned element-wise
186
+ operations** (mixing, gain), matching numpy/xarray intuition — a video-editor
187
+ mix should read `0.5 * a + 0.3 * b`. Concatenation is the named `.concat()`
188
+ method only; `__add__` is NOT concat. Until the align/apply layer lands
189
+ (below), the arithmetic dunders are simply absent.
190
+
191
+ ## Exact rational sample rates (decided 2026-07-02)
192
+
193
+ `GridIndex` stores its rate as a normalized integer fraction
194
+ `(sr_num, sr_den)` — samples per second — not a float. Motivation: framed
195
+ transforms (STFT with hop `h`) produce rates `sr / h` that are non-integer
196
+ but exactly rational; a float rate would demote them to an epsilon-tolerance
197
+ path. With rational rates every grid computation is exact integer
198
+ arithmetic (`divmod` against `TICKS_PER_SECOND * sr_den`), the epsilon
199
+ fallback is deleted outright, and **isomorphism roundtrips preserve the index
200
+ exactly**:
201
+
202
+     inverse_spec(forward_spec(idx)).equal(idx) # wav -> STFT -> iSTFT -> wav
203
+
204
+ Constructors accept `int`, integral `float`, `fractions.Fraction`, or a
205
+ `(num, den)` tuple; non-integral floats are rejected (pass the exact
206
+ fraction instead).
207
+
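To make the exact-arithmetic claim concrete, here is a sketch using `fractions.Fraction` from the standard library. The function names are hypothetical, and the grid is assumed to be anchored at tick 0 with zero phase; the real `GridIndex` also carries an anchor and a phase.

```python
from fractions import Fraction

TICKS_PER_SECOND = 10**9


def normalize_rate(sr):
    """Hypothetical constructor rule: int, integral float, Fraction, or (num, den)."""
    if isinstance(sr, tuple):
        return Fraction(sr[0], sr[1])
    if isinstance(sr, float):
        if not sr.is_integer():
            raise ValueError("non-integral float rate; pass the exact fraction instead")
        return Fraction(int(sr))
    return Fraction(sr)


def first_sample_at_or_after(rate, t):
    """Smallest k whose sample time is >= t ticks.

    Sample k sits at k * TICKS_PER_SECOND * den / num ticks, so the answer is
    ceil(t * num / (TICKS_PER_SECOND * den)), computed with integer floor division.
    """
    num, den = rate.numerator, rate.denominator
    return -((-t * num) // (TICKS_PER_SECOND * den))


def samples_in(rate, a, b):
    """Sample positions that fall in the half-open tick window [a, b)."""
    return range(first_sample_at_or_after(rate, a), first_sample_at_or_after(rate, b))
```

For example, an STFT with hop 256 over 44.1 kHz audio has frame rate `normalize_rate(44100) / 256`, which `Fraction` reduces to 11025/64, and adjacent windows partition the samples with no tolerance needed.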
208
+ ## Transforms, alignment, streaming — v1.1 design (agreed, not yet implemented)
209
+
210
+ Regularizer domains used to pressure-test these decisions: the drone project
211
+ (audio+telemetry), video-editor backend, finance (ticks/bars/as-of joins),
212
+ physiological monitoring (multi-rate biosignals + categorical stages),
213
+ football analytics (LabelIndex joins, ragged player dims), and server
214
+ observability (logs/traces as stamps/spans, windowed aggregation).
215
+
216
+ ### Unary framed transforms
217
+
218
+ The library owns no DSP — only the *time bookkeeping* of transforms. All
219
+ practically relevant grid-warping transforms (STFT, conv/pool stacks,
220
+ resamplers) share one structure: output step `k` depends on the input span
221
+ `[k*hop - left, k*hop - left + win)`. This is captured by a declarative
222
+ spec:
223
+
+ ```python
+ spec = Framed(win=1024, hop=256, center=True)  # centered; a causal spec would use left=win-1, right=0
+ latents = audio.transform(stft_fn, time=spec, dims=("mic", "freq", "time"))
+ ```
228
+
229
+ `transform` runs the user's arbitrary `data -> data` function; the spec is
230
+ used for exactly two derived quantities:
231
+
232
+ 1. **forward map** — `GridIndex(rate, anchor, phase) ->
233
+ GridIndex(rate/hop, anchor', phase')`; frame-center alignment is
234
+ expressed through the existing `phase` mechanism (center=True gives
235
+ output phase -0.5).
236
+ 2. **pullback** — `spec.required_input(a, b)`: the input tick window needed
237
+ to compute output window `[a, b)` (receptive-field arithmetic). This
238
+ powers patch inference and streaming.
239
+
240
+ Specs compose (`spec_a >> spec_b`, standard stride/receptive-field
241
+ composition), so a deep encoder's time semantics are derived, never
242
+ hand-computed. Invertible pairs (STFT/iSTFT) must satisfy the roundtrip
243
+ identity above exactly. New non-time dims (`"freq"`) arrive as ordinary
244
+ named dims (bare `RangeIndex` in v1; the numeric `CoordIndex` is deferred —
245
+ see `ROADMAP.md`).
246
+
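The pullback and the `>>` composition follow directly from the rule that output step `k` reads input `[k*hop - left, k*hop - left + win)`. The sketch below works in sample positions rather than ticks, and `FramedSketch` is a hypothetical class written for this illustration, not the planned `Framed` spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FramedSketch:
    """Toy receptive-field spec, in sample positions (hypothetical)."""

    win: int   # input samples read per output step
    hop: int   # input samples advanced per output step
    left: int  # how far the window starts before k * hop

    def required_input(self, a, b):
        # Output steps [a, b) read input [a*hop - left, (b-1)*hop - left + win).
        if b <= a:
            raise ValueError("empty output window")
        return a * self.hop - self.left, (b - 1) * self.hop - self.left + self.win

    def __rshift__(self, other):
        # Apply `self` first, then `other`: stride / receptive-field composition.
        return FramedSketch(
            win=self.win + (other.win - 1) * self.hop,
            hop=self.hop * other.hop,
            left=self.left + other.left * self.hop,
        )
```

Composing two specs and asking for the required input gives the same window as pulling back through each stage in turn, which is the property that lets a deep stack be described by one derived spec.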
247
+ **v1 scope (decided 2026-07-02): `transform` accepts `GridIndex` inputs
248
+ only** (uniform -> uniform). Sliding-window operators over non-uniform
249
+ event series (moving averages over ticks, event-rate counters) go through
250
+ `resample_on(grid)` first; that detour is semantically complete but
251
+ inefficient for sparse events, so a native `FramedEvents` generalization is
252
+ on the roadmap (`ROADMAP.md` § Framed transforms over non-uniform series).
253
+
254
+ ### N-ary operations = align, then broadcast by name, then ufunc
255
+
256
+ Three orthogonal primitives instead of an n-ary op zoo:
257
+
258
+ 1. `align(*series, domain="intersection"|"union", to=<series|index>,
259
+ fill=...)` — the ONLY place where domains/grids are reconciled.
260
+ Union+fill=0 is the video-editor clip mix; `resample_on(other,
261
+ kind="previous")` (zero-order hold) is simultaneously the DAW automation
262
+ curve and the finance as-of join. `interpolate` gains
263
+ `kind="previous"|"nearest"` alongside `"linear"`.
264
+ 2. **Named-dim broadcasting**: dims are matched by NAME, not position —
265
+ `(mic, time) * (time,)` broadcasts, `(mic, time) * (rotor, time)`
266
+ outer-broadcasts to `(mic, rotor, time)`.
267
+ 3. **Strict element-wise application**: `tdseries.apply(fn, *series)` and the
268
+ arithmetic dunders require *identical* time indexes and raise
269
+ `IncompatibleError` otherwise. No silent xarray-style intersection —
270
+ the finance regularizer forbids implicit data loss; magic never crosses
271
+ domain boundaries, only explicit `align` does.
272
+
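Broadcasting by name can be planned without touching any data. The sketch below computes the output dims and, for each operand, the shape it would be reshaped to (after permuting its axes into output order) so that ordinary positional broadcasting lines up. The ordering rule used here (non-time dims in first-appearance order, `"time"` last) is an assumption chosen to reproduce the two examples in the list, and `broadcast_dims` is a hypothetical name.

```python
def broadcast_dims(*operands):
    """Plan a by-name broadcast.

    Each operand is (dims, shape) with unique dim names.
    Returns (out_dims, out_shape, views), one view shape per operand.
    """
    sizes = {}  # dim name -> size, in first-appearance order
    for dims, shape in operands:
        for name, size in zip(dims, shape):
            if sizes.setdefault(name, size) != size:
                raise ValueError(f"dim {name!r}: size {size} != {sizes[name]}")
    # Assumed ordering rule: non-time dims first, then "time" last.
    out_dims = tuple(n for n in sizes if n != "time")
    if "time" in sizes:
        out_dims += ("time",)
    out_shape = tuple(sizes[n] for n in out_dims)
    # A dim an operand lacks becomes a length-1 axis in its view.
    views = [tuple(sizes[n] if n in dims else 1 for n in out_dims) for dims, _ in operands]
    return out_dims, out_shape, views
```

With `(mic, time)` of shape `(8, 100)` and `(rotor, time)` of shape `(4, 100)`, the plan is output dims `(mic, rotor, time)` with views `(8, 1, 100)` and `(1, 4, 100)`.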
273
+ ### Streaming
274
+
275
+ Streaming introduces no new ontology; the exactness algebra is the
276
+ correctness proof for chunking:
277
+
278
+     f(x).ticks[a:b] == f(x.ticks[a-left : b+right]).ticks[a:b]
279
+
280
+ with `left`/`right` from the spec pullback — property-testable. Three thin
281
+ pieces make it real:
282
+
283
+ 1. **Lazy leaf protocol** (v1 scope: protocol only): `Series.data` requires
284
+ only `shape`, `dtype`, `ndim`, and `__getitem__` returning an ndarray.
285
+ numpy views already make slicing zero-copy and `np.memmap` works today;
286
+ video decoders (PyAV/decord) plug in later behind a small adapter.
287
+ 2. **Stateless pull execution**: `frame.stream(start, chunk)` yields
288
+ successive `frame.ticks[t : t+chunk]`; a transform in the pipeline pulls
289
+ its `required_input` window per chunk. Stateful streaming (RNN carry)
290
+ is explicitly out of scope — that state belongs to the model.
291
+ 3. **Streaming mixing** is per-chunk `align(union, fill=0)` — no special
292
+ case.
293
+
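The chunking identity can be checked on a toy transform. The sketch below uses a causal moving sum (so `left = win - 1` and `right = 0`) over plain Python lists indexed by sample position; `causal_sum`, `window` and `stream` are hypothetical helpers, not part of the library.

```python
def causal_sum(start, samples, win):
    """Causal moving sum: output k covers inputs [k - win + 1, k].

    The input occupies positions [start, start + len(samples)); the output is
    defined only where a full window is available, so it begins at
    start + win - 1. Returns (out_start, out_samples).
    """
    out = [sum(samples[i : i + win]) for i in range(len(samples) - win + 1)]
    return start + win - 1, out


def window(start, samples, a, b):
    """Cut positions [a, b) out of a (start, samples) signal."""
    if a < start or b > start + len(samples):
        raise ValueError("window outside signal")
    return a, samples[a - start : b - start]


def stream(samples, win, chunk):
    """Compute the moving sum chunk by chunk, pulling win - 1 extra inputs on the left."""
    left = win - 1
    out = []
    for a in range(left, len(samples), chunk):   # `left` is the first defined output
        b = min(a + chunk, len(samples))
        in_start, in_chunk = window(0, samples, a - left, b)        # pullback window
        out_start, out_chunk = causal_sum(in_start, in_chunk, win)  # transform the chunk
        out.extend(window(out_start, out_chunk, a, b)[1])
    return out
```

Each chunk is computed from its own input window with no carried state, and the concatenated chunks equal the output of one pass over the whole signal.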
294
+ ## The motivating example
295
+
+ ```python
+ frame = Frame({
+     "audio": uniform(audio_8xT, sr=44100, dims=("mic", "time")),
+     "mic_pos": wrap(pos_8x3, dims=("mic", None)),
+     "rps": events(ts, rps_4xM, dims=("rotor", "time")),
+     "rotor_pos": wrap(rpos_4x3, dims=("rotor", None)),
+     "vad": spans(starts, ends),
+     "meta": Frame({"recording_id": "FLY124"}),
+ })
+
+ sub = frame.slice["mic", 0] # audio (T,), mic_pos (3,): same mic 0
+ clip = frame.time[1.0:4.5] # all temporal leaves cut, invariants kept
+ one = frame["rps"] # absolute-time Series; frame unchanged
+ ```
310
+
311
+ ## Code standards
312
+
313
+ - ruff + pyright (basic) must pass — a PostToolUse hook enforces this on
314
+ every write. **No `# type: ignore`**: if pyright complains, change the
315
+ design (make fields required, hoist optionality into constructors, narrow
316
+ with `isinstance`) rather than suppressing.
317
+ - Frozen dataclasses, exact int64 tick arithmetic, no float tolerances
318
+ outside the documented `phase`/`INDEX_EPSILON` cases.
319
+
320
+ ## Non-goals (v1)
321
+
322
+ - Lazy / disk-backed storage beyond the array protocol; datetime64 interop;
323
+ value-merging of coincident events; pytree registration with torch/jax.
324
+ Deferred directions live in [`ROADMAP.md`](./ROADMAP.md).
@@ -0,0 +1,86 @@
1
+ Metadata-Version: 2.4
2
+ Name: tdseries
3
+ Version: 0.1.0
4
+ Summary: Immutable pytree frames of tensors with named, indexed dimensions — time as a first-class tick-exact dimension
5
+ Author-email: Dmitrii Mukhutdinov <flyingleafe@gmail.com>
6
+ License: MIT
7
+ Requires-Python: >=3.11
8
+ Requires-Dist: numpy>=1.24
9
+ Provides-Extra: torch
10
+ Requires-Dist: torch>=2.0; extra == 'torch'
11
+ Description-Content-Type: text/markdown
12
+
13
+ # tdseries
14
+
15
+ Immutable pytree frames of tensors with **named, indexed dimensions**, where
16
+ **time** is a first-class dimension stored in exact int64 ticks (nanoseconds).
17
+
18
+ Built for bundling media signals with co-recorded telemetry *and* the static
19
+ geometry that shares their non-time dimensions — e.g. multichannel audio
20
+ `(mic, time)` together with microphone positions `(mic, 3)`, rotor speeds
21
+ `(rotor, time)` together with rotor positions `(rotor, 3)`:
22
+
+ ```python
+ import tdseries as td
+
+ frame = td.Frame({
+     "audio": td.uniform(audio_8xT, sr=44100, dims=("mic", "time")),
+     "mic_pos": td.wrap(pos_8x3, dims=("mic", None)),
+     "rps": td.events(stamps, rps_4xM, dims=("rotor", "time")),
+     "rotor_pos": td.wrap(rpos_4x3, dims=("rotor", None)),
+     "vad": td.spans(starts, ends),
+     "recording_id": "FLY124",
+ })
+
+ sub = frame.slice["mic", 0] # audio -> (T,), mic_pos -> (3,): same mic
+ clip = frame.time[1.0:4.5] # seconds; all temporal leaves cut exactly
+ raw = frame.ticks[a:b] # exact int64 tick window
+ rps = frame["rps"] # self-contained absolute-time Series
+ ```
40
+
41
+ ## The model
42
+
43
+ A *dimension* is a name plus a domain; every tensor axis carrying the name has
44
+ an **index** mapping positions into that domain. Slices are expressed in
45
+ domain coordinates and translated per-tensor — which is how tensors of
46
+ *different lengths* share one dimension:
47
+
48
+ | Index | Domain | Example |
49
+ |---|---|---|
50
+ | `RangeIndex` (default) | positions | `mic`, `rotor` |
51
+ | `LabelIndex` | hashable labels | mic serial numbers |
52
+ | `GridIndex` | int64 ticks, uniform grid + sub-sample phase | audio |
53
+ | `StampIndex` | int64 ticks, sorted point events | RPS, IMU |
54
+ | `SpanIndex` | int64 ticks, half-open intervals with identity tags | VAD |
55
+
56
+ Time is exact: all anchors and cut points are int64 tick counts, `shift` is
57
+ O(1), and the algebra satisfies, at the sample level and without tolerances,
58
+
59
+ ```
60
+ x.ticks[a:b].concat(x.ticks[b:c]) == x.ticks[a:c]
61
+ ```
62
+
63
+ for any tick cut points — including sub-sample cuts through audio and cuts
64
+ through the middle of a VAD span (split spans re-merge via identity tags).
65
+
66
+ Everything is immutable. Extracting an entry from a frame hands back a
67
+ self-contained series re-anchored to absolute time; the frame is untouched.
68
+
69
+ See [`DESIGN.md`](./DESIGN.md) for the full contract.
70
+
71
+ ## Development
72
+
73
+ ```sh
74
+ nix develop # or: uv sync
75
+ uv run pytest
76
+ ```
77
+
78
+ `nix develop` provides Python + uv and pre-commit hooks (ruff, ruff-format,
79
+ pyright). Data leaves may be numpy arrays or torch tensors (torch optional:
80
+ `uv sync --extra torch`).
81
+
82
+ ## Lineage
83
+
84
+ Extracted from the `utils.data` module of the harmonic-noise-suppression
85
+ research project; the tick/phase arithmetic and the slice/concat/shift
86
+ invariants are ported from there, while the named-dimension frame model is new.
@@ -0,0 +1,74 @@
1
+ # tdseries
2
+
3
+ Immutable pytree frames of tensors with **named, indexed dimensions**, where
4
+ **time** is a first-class dimension stored in exact int64 ticks (nanoseconds).
5
+
6
+ Built for bundling media signals with co-recorded telemetry *and* the static
7
+ geometry that shares their non-time dimensions — e.g. multichannel audio
8
+ `(mic, time)` together with microphone positions `(mic, 3)`, rotor speeds
9
+ `(rotor, time)` together with rotor positions `(rotor, 3)`:
10
+
+ ```python
+ import tdseries as td
+
+ frame = td.Frame({
+     "audio": td.uniform(audio_8xT, sr=44100, dims=("mic", "time")),
+     "mic_pos": td.wrap(pos_8x3, dims=("mic", None)),
+     "rps": td.events(stamps, rps_4xM, dims=("rotor", "time")),
+     "rotor_pos": td.wrap(rpos_4x3, dims=("rotor", None)),
+     "vad": td.spans(starts, ends),
+     "recording_id": "FLY124",
+ })
+
+ sub = frame.slice["mic", 0] # audio -> (T,), mic_pos -> (3,): same mic
+ clip = frame.time[1.0:4.5] # seconds; all temporal leaves cut exactly
+ raw = frame.ticks[a:b] # exact int64 tick window
+ rps = frame["rps"] # self-contained absolute-time Series
+ ```
28
+
29
+ ## The model
30
+
31
+ A *dimension* is a name plus a domain; every tensor axis carrying the name has
32
+ an **index** mapping positions into that domain. Slices are expressed in
33
+ domain coordinates and translated per-tensor — which is how tensors of
34
+ *different lengths* share one dimension:
35
+
36
+ | Index | Domain | Example |
37
+ |---|---|---|
38
+ | `RangeIndex` (default) | positions | `mic`, `rotor` |
39
+ | `LabelIndex` | hashable labels | mic serial numbers |
40
+ | `GridIndex` | int64 ticks, uniform grid + sub-sample phase | audio |
41
+ | `StampIndex` | int64 ticks, sorted point events | RPS, IMU |
42
+ | `SpanIndex` | int64 ticks, half-open intervals with identity tags | VAD |
43
+
44
+ Time is exact: all anchors and cut points are int64 tick counts, `shift` is
45
+ O(1), and the algebra satisfies, at the sample level and without tolerances,
46
+
47
+ ```
48
+ x.ticks[a:b].concat(x.ticks[b:c]) == x.ticks[a:c]
49
+ ```
50
+
51
+ for any tick cut points — including sub-sample cuts through audio and cuts
52
+ through the middle of a VAD span (split spans re-merge via identity tags).
53
+
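How a span cut in half can re-merge is easiest to see on plain tuples. In this sketch a span is `(start, end, id)` in ticks, and `slice_spans` / `concat_spans` are hypothetical helpers; the real `SpanIndex` stores more than this.

```python
def slice_spans(spans, a, b):
    """Clip (start, end, id) spans to the half-open tick window [a, b)."""
    out = []
    for start, end, ident in spans:
        lo, hi = max(start, a), min(end, b)
        if lo < hi:  # keep only non-empty overlaps
            out.append((lo, hi, ident))
    return out


def concat_spans(left, right):
    """Glue two span lists, re-merging a span that was split at the seam."""
    out = list(left)
    rest = list(right)
    if out and rest and out[-1][2] == rest[0][2] and out[-1][1] == rest[0][0]:
        # Same identity meeting exactly at the seam: two halves of one span.
        out[-1] = (out[-1][0], rest[0][1], out[-1][2])
        rest = rest[1:]
    return out + rest
```

The identity tag is what distinguishes one span that was cut from two distinct spans that merely touch: the former re-merge on concat, the latter stay separate.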
54
+ Everything is immutable. Extracting an entry from a frame hands back a
55
+ self-contained series re-anchored to absolute time; the frame is untouched.
56
+
57
+ See [`DESIGN.md`](./DESIGN.md) for the full contract.
58
+
59
+ ## Development
60
+
61
+ ```sh
62
+ nix develop # or: uv sync
63
+ uv run pytest
64
+ ```
65
+
66
+ `nix develop` provides Python + uv and pre-commit hooks (ruff, ruff-format,
67
+ pyright). Data leaves may be numpy arrays or torch tensors (torch optional:
68
+ `uv sync --extra torch`).
69
+
70
+ ## Lineage
71
+
72
+ Extracted from the `utils.data` module of the harmonic-noise-suppression
73
+ research project; the tick/phase arithmetic and the slice/concat/shift
74
+ invariants are ported from there, while the named-dimension frame model is new.