tdseries-0.1.0.tar.gz
- tdseries-0.1.0/.envrc +3 -0
- tdseries-0.1.0/.github/workflows/checks.yml +36 -0
- tdseries-0.1.0/.github/workflows/ci.yml +10 -0
- tdseries-0.1.0/.github/workflows/publish.yml +38 -0
- tdseries-0.1.0/.gitignore +12 -0
- tdseries-0.1.0/AGENTS.md +45 -0
- tdseries-0.1.0/CLAUDE.md +1 -0
- tdseries-0.1.0/DESIGN.md +324 -0
- tdseries-0.1.0/PKG-INFO +86 -0
- tdseries-0.1.0/README.md +74 -0
- tdseries-0.1.0/ROADMAP.md +86 -0
- tdseries-0.1.0/flake.lock +113 -0
- tdseries-0.1.0/flake.nix +81 -0
- tdseries-0.1.0/pyproject.toml +58 -0
- tdseries-0.1.0/src/tdseries/__init__.py +63 -0
- tdseries-0.1.0/src/tdseries/_array.py +60 -0
- tdseries-0.1.0/src/tdseries/_ticks.py +63 -0
- tdseries-0.1.0/src/tdseries/errors.py +17 -0
- tdseries-0.1.0/src/tdseries/frame.py +529 -0
- tdseries-0.1.0/src/tdseries/indexes.py +729 -0
- tdseries-0.1.0/src/tdseries/series.py +540 -0
- tdseries-0.1.0/tests/__init__.py +1 -0
- tdseries-0.1.0/tests/strategies.py +255 -0
- tdseries-0.1.0/tests/test_dims.py +364 -0
- tdseries-0.1.0/tests/test_frame.py +481 -0
- tdseries-0.1.0/tests/test_grid.py +360 -0
- tdseries-0.1.0/tests/test_span.py +256 -0
- tdseries-0.1.0/tests/test_stamp.py +224 -0
- tdseries-0.1.0/tests/test_ticks.py +143 -0
- tdseries-0.1.0/uv.lock +513 -0
tdseries-0.1.0/.envrc
ADDED
tdseries-0.1.0/.github/workflows/checks.yml
ADDED
|
@@ -0,0 +1,36 @@
+name: Checks
+
+on:
+  workflow_call:
+
+jobs:
+  lint-typecheck-test:
+    runs-on: ubuntu-latest
+    env:
+      UV_CACHE_DIR: /tmp/.uv-cache
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v7
+
+      - name: Install Nix
+        uses: DeterminateSystems/determinate-nix-action@v3
+
+      - name: Restore uv cache
+        uses: actions/cache@v4
+        with:
+          path: /tmp/.uv-cache
+          key: uv-${{ runner.os }}-${{ hashFiles('uv.lock') }}
+          restore-keys: |
+            uv-${{ runner.os }}-
+
+      - name: ruff + ruff-format
+        run: nix flake check -L
+
+      - name: basedpyright
+        run: nix develop --command basedpyright
+
+      - name: Tests
+        run: nix develop --command uv run pytest -q
+
+      - name: Minimize uv cache
+        run: nix develop --command uv cache prune --ci
|
tdseries-0.1.0/.github/workflows/publish.yml
ADDED
|
@@ -0,0 +1,38 @@
+name: Publish to PyPI
+
+on:
+  push:
+    tags:
+      - "v*"
+
+jobs:
+  checks:
+    uses: ./.github/workflows/checks.yml
+
+  publish:
+    needs: checks
+    runs-on: ubuntu-latest
+    environment:
+      name: pypi
+      url: https://pypi.org/project/tdseries/
+    permissions:
+      id-token: write
+      contents: read
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v7
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
+        with:
+          enable-cache: true
+          cache-dependency-glob: "uv.lock"
+
+      - name: Install Python
+        run: uv python install 3.12
+
+      - name: Build
+        run: uv build
+
+      - name: Publish
+        run: uv publish
|
tdseries-0.1.0/AGENTS.md
ADDED
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
# tdseries — agent guide
|
|
2
|
+
|
|
3
|
+
Read [`DESIGN.md`](./DESIGN.md) before touching library code: it is the
|
|
4
|
+
binding API contract (dimension/index model, Series and Frame semantics,
|
|
5
|
+
exactness invariants). [`README.md`](./README.md) is the human-facing
|
|
6
|
+
summary; [`ROADMAP.md`](./ROADMAP.md) records deliberately deferred
|
|
7
|
+
directions (CoordIndex, framed transforms over event series, lazy sources,
|
|
8
|
+
element-wise operator layer) — check it before proposing "new" features.
|
|
9
|
+
|
|
10
|
+
## Layout
|
|
11
|
+
|
|
12
|
+
| Path | Role |
|
|
13
|
+
|---|---|
|
|
14
|
+
| `src/tdseries/_ticks.py` | seconds↔int64-tick conversion boundary; `TICKS_PER_SECOND` |
|
|
15
|
+
| `src/tdseries/_array.py` | numpy/torch backend dispatch (`take`, `concat`, `array_equal`, ...) |
|
|
16
|
+
| `src/tdseries/errors.py` | `DomainError`, `IncompatibleError`, `DimensionError` |
|
|
17
|
+
| `src/tdseries/indexes.py` | the index hierarchy: `RangeIndex`, `LabelIndex` (ordinary dims); `GridIndex`, `StampIndex`, `SpanIndex` (time). All exact-tick cut/seam arithmetic lives here. |
|
|
18
|
+
| `src/tdseries/series.py` | `Series` leaf (data + dims + indexes) and factories `uniform`/`events`/`spans`/`wrap` |
|
|
19
|
+
| `src/tdseries/frame.py` | `Frame` pytree node: relative-anchor storage, hull, recursion of time/dim ops |
|
|
20
|
+
| `tests/` | hypothesis property tests; `strategies.py` draws exact int64 tick anchors (small and Unix-magnitude) |
|
|
21
|
+
|
|
22
|
+
## Load-bearing invariants (do not weaken)
|
|
23
|
+
|
|
24
|
+
1. `x.ticks[a:b].concat(x.ticks[b:c]) == x.ticks[a:c]` — exact slice-concat
|
|
25
|
+
identity for every index kind and for frames; property-tested.
|
|
26
|
+
2. All time arithmetic is int64 ticks; the only floats are `GridIndex.phase`
|
|
27
|
+
(bounded by one sample) and the non-integer-sr fallback (`INDEX_EPSILON`).
|
|
28
|
+
3. `shift` is O(1) everywhere: content is stored anchor-relative, only the
|
|
29
|
+
scalar anchor moves. Frames store children relative to the frame anchor;
|
|
30
|
+
extraction re-anchors a copy, never mutates.
|
|
31
|
+
4. No `# type: ignore` / `# noqa` — restructure until ruff+basedpyright pass
|
|
32
|
+
(a PostToolUse hook may enforce this on every write).
|
|
33
|
+
|
|
34
|
+
## Workflow
|
|
35
|
+
|
|
36
|
+
- `uv sync` then `uv run pytest` (hypothesis suite, runs in tens of seconds).
|
|
37
|
+
- `nix develop` for the hooked dev shell (ruff, ruff-format, basedpyright).
|
|
38
|
+
- torch is an optional extra for runtime users (lazy import inside
|
|
39
|
+
`_array.py` only, `src/tdseries` stays importable without it); it's a
|
|
40
|
+
`dev` dependency-group member so basedpyright/tests can see it locally
|
|
41
|
+
and in CI. `[tool.uv.sources]` pins it to the CPU-only wheel index so
|
|
42
|
+
`uv sync` doesn't pull the CUDA/nvidia stack.
|
|
43
|
+
- CI (`.github/workflows/checks.yml`, reused by `ci.yml` and `publish.yml`)
|
|
44
|
+
runs ruff + ruff-format (`nix flake check`), basedpyright, and pytest via
|
|
45
|
+
`nix develop`; publishing to PyPI on tag push is gated on all three passing.
|
tdseries-0.1.0/CLAUDE.md
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
AGENTS.md
|
tdseries-0.1.0/DESIGN.md
ADDED
|
@@ -0,0 +1,324 @@
|
|
|
1
|
+
# tdseries — design contract
|
|
2
|
+
|
|
3
|
+
Immutable pytree frames of tensors with **named, indexed dimensions**; time is
|
|
4
|
+
a first-class dimension indexed in exact int64 **ticks** (`TICKS_PER_SECOND =
|
|
5
|
+
1e9`). Successor of `utils.data` (TimeSeries/TimeFrame) in the
|
|
6
|
+
harmonic-noise-suppression project; the tick/phase arithmetic is ported from
|
|
7
|
+
there verbatim (see that repo's `src/utils/data/` for reference semantics and
|
|
8
|
+
`tests/utils/data/` for reference property tests).
|
|
9
|
+
|
|
10
|
+
## Core model
|
|
11
|
+
|
|
12
|
+
A **dimension** is a name plus a *domain*. Each tensor axis carrying that name
|
|
13
|
+
has an **index**: a monotone map `positions 0..n-1 → domain coordinates`.
|
|
14
|
+
Slices are expressed in domain coordinates and translated per-tensor into
|
|
15
|
+
positional selections. This is what lets tensors of *different lengths* share
|
|
16
|
+
a dimension (audio at 44.1 kHz and RPS events at ~100 Hz both have dim
|
|
17
|
+
`"time"`).
|
|
18
|
+
|
|
19
|
+
Index families (implemented in `tdseries/indexes.py`, already written):
|
|
20
|
+
|
|
21
|
+
| Index | Domain | Sizes across a frame | Selection |
|
|
22
|
+
|---|---|---|---|
|
|
23
|
+
| `RangeIndex` (implicit default) | positions | must be equal | positional |
|
|
24
|
+
| `LabelIndex` | hashable labels | may differ | by label (`.sel`) or positional |
|
|
25
|
+
| `GridIndex` | ticks (uniform grid + sub-sample phase) | may differ | `.time` / `.ticks` |
|
|
26
|
+
| `StampIndex` | ticks (sorted point events) | may differ | `.time` / `.ticks` |
|
|
27
|
+
| `SpanIndex` | ticks (half-open intervals + identity ids) | may differ | `.time` / `.ticks` |
|
|
28
|
+
|
|
29
|
+
The dim name `"time"` is reserved: it must carry a `TimeIndex`
|
|
30
|
+
(`GridIndex | StampIndex | SpanIndex`), and a `TimeIndex` may only sit on
|
|
31
|
+
`"time"`. At most one `"time"` dim per tensor.
|
|
32
|
+
|
|
33
|
+
`TimeIndex` methods return `(new_index, positional_selection)` so the caller
|
|
34
|
+
(Series) applies the selection to its data axis. All three uphold, exactly:
|
|
35
|
+
|
|
36
|
+
slice(a, b) ⊕ slice(b, c) == slice(a, c) for t_start ≤ a ≤ b ≤ c ≤ t_end
|
|
37
|
+
|
|
38
|
+
`shift` is O(1) everywhere (only the scalar anchor moves; stored content is
|
|
39
|
+
anchor-relative).
|
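The slice/concat identity and the O(1) `shift` can be illustrated with a toy sorted-event index. This is only a sketch under stated assumptions: `Stamps` is a hypothetical stand-in, not the library's `StampIndex`, and it ignores data selection entirely. It stores event ticks relative to a scalar anchor, so shifting never touches the stored offsets.

```python
from __future__ import annotations

from bisect import bisect_left
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stamps:
    """Hypothetical stand-in for a sorted point-event index."""

    anchor: int            # absolute domain start, in ticks
    dur: int               # declared domain length, in ticks
    rel: tuple[int, ...]   # event ticks relative to the anchor, sorted

    def shift(self, dt: int) -> "Stamps":
        # O(1): only the scalar anchor moves; the offsets tuple is reused as-is.
        return replace(self, anchor=self.anchor + dt)

    def slice(self, a: int, b: int) -> "Stamps":
        # Half-open window [a, b) in absolute ticks, inside the declared domain.
        if not (self.anchor <= a <= b <= self.anchor + self.dur):
            raise ValueError("window outside the domain")
        off = a - self.anchor
        lo = bisect_left(self.rel, off)
        hi = bisect_left(self.rel, b - self.anchor)
        return Stamps(a, b - a, tuple(t - off for t in self.rel[lo:hi]))

    def concat(self, other: "Stamps") -> "Stamps":
        # Glue `other` directly after self, re-basing its offsets onto our anchor.
        if other.anchor != self.anchor + self.dur:
            raise ValueError("domains are not adjacent")
        return Stamps(
            self.anchor,
            self.dur + other.dur,
            self.rel + tuple(t + self.dur for t in other.rel),
        )


x = Stamps(anchor=1_000, dur=100, rel=(0, 10, 10, 55, 99))
a, b, c = 1_005, 1_010, 1_060
glued = x.slice(a, b).concat(x.slice(b, c))
whole = x.slice(a, c)
```

Because every quantity is an integer, `glued` and `whole` compare equal field by field, with no tolerance involved.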
|
40
|
+
|
|
41
|
+
## `Series` — the leaf (`tdseries/series.py`)
|
|
42
|
+
|
|
43
|
+
Frozen dataclass, `eq=False`:
|
|
44
|
+
|
|
45
|
+
```python
|
|
46
|
+
Series(
|
|
47
|
+
data, # np.ndarray | torch.Tensor | None (None ⇒ index-only, dims must be ("time",))
|
|
48
|
+
dims, # tuple[str | None, ...], len == data.ndim; None = anonymous axis
|
|
49
|
+
indexes={}, # Mapping[str, DimIndex | TimeIndex]; keys ⊆ named dims
|
|
50
|
+
)
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
Validation: dims unique among named; `"time"` ⇒ `indexes["time"]` is a
|
|
54
|
+
`TimeIndex` with `n ==` axis size; non-time named dims may carry a
|
|
55
|
+
`RangeIndex`/`LabelIndex` with matching `n`; anonymous (`None`) axes carry no
|
|
56
|
+
index. Use `tdseries._array.is_tensor/take/concat/array_equal/to_numpy_f64` for
|
|
57
|
+
all data manipulation (they dispatch over the numpy and torch backends).
|
|
58
|
+
|
|
59
|
+
Properties: `shape`, `ndim`, `has_time`, `time_axis`, `tindex` (raises
|
|
60
|
+
`ValueError` if atemporal), `t_start/t_end/duration` (float seconds) and
|
|
61
|
+
`t_start_ticks/t_end_ticks/duration_ticks` (exact; raise `ValueError` if
|
|
62
|
+
atemporal), `dim_index(name)` (explicit or default `RangeIndex(size)`),
|
|
63
|
+
`dim_size(name)`.
|
|
64
|
+
|
|
65
|
+
Accessors (each a tiny helper object with `__getitem__`):
|
|
66
|
+
|
|
67
|
+
- `s.time[a:b]` — **seconds**: floats quantised once via `secs_to_ticks`; an
|
|
68
|
+
`int` here means whole seconds (`seconds_to_ticks_exact`); `None` bounds =
|
|
69
|
+
domain bounds; slice step forbidden (`ValueError`). Delegates to
|
|
70
|
+
`slice_ticks`.
|
|
71
|
+
- `s.ticks[a:b]` — **ticks**: ints only (`TypeError` for floats); `None`
|
|
72
|
+
bounds = domain bounds.
|
|
73
|
+
- `s.slice[dim, sel]` — positional selection along a named non-time dim;
|
|
74
|
+
`sel: int | slice | list[int] | np.ndarray(int|bool)`. An `int` **drops**
|
|
75
|
+
the dim (like numpy). Unknown dim on a Series → `DimensionError`. Using
|
|
76
|
+
`"time"` here → `DimensionError` pointing at `.time`/`.ticks`.
|
|
77
|
+
- `s.sel[dim, label_or_list]` — label selection; requires `LabelIndex` on the
|
|
78
|
+
dim (`DimensionError` otherwise). Scalar label drops the dim; list keeps it
|
|
79
|
+
(absent labels are skipped per `LabelIndex.locate`).
|
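The bound-coercion rules of the `.time` and `.ticks` accessors can be sketched as plain functions. The helper names below are hypothetical (they are not the library's `secs_to_ticks` or `seconds_to_ticks_exact`), and round-to-nearest for floats is an assumption, since the text only says floats are quantised once.

```python
from __future__ import annotations

TICKS_PER_SECOND = 1_000_000_000  # nanosecond ticks, as stated in the design


def time_bound_to_ticks(value: float | int | None) -> int | None:
    """Seconds -> ticks for one `.time[...]` bound (hypothetical helper)."""
    if value is None:
        return None                       # None means "use the domain bound"
    if isinstance(value, bool):
        raise TypeError("bool is not a time bound")
    if isinstance(value, int):
        return value * TICKS_PER_SECOND   # whole seconds: exact integer product
    # Floats are quantised exactly once; rounding mode is an assumption.
    return round(value * TICKS_PER_SECOND)


def tick_bound(value: int | None) -> int | None:
    """Validate one `.ticks[...]` bound: ints only, floats are rejected."""
    if value is None:
        return None
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("tick bounds must be int")
    return value


def time_slice_to_ticks(key: slice) -> tuple[int | None, int | None]:
    """Translate a seconds slice into a tick window; a step is not allowed."""
    if key.step is not None:
        raise ValueError("slice step is not supported on the time axis")
    return time_bound_to_ticks(key.start), time_bound_to_ticks(key.stop)


window = time_slice_to_ticks(slice(1.0, 4.5))
```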
|
80
|
+
|
|
81
|
+
Methods:
|
|
82
|
+
|
|
83
|
+
- `slice_ticks(a, b) -> Series` — time-slice: `tindex.slice(a, b)` → apply
|
|
84
|
+
positional selection along the time axis, replace `indexes["time"]`.
|
|
85
|
+
- `shift(dt) -> Series` — `to_ticks` coercion (float = seconds, int = ticks);
|
|
86
|
+
O(1); atemporal series raise `ValueError`.
|
|
87
|
+
- `concat(other) -> Series` — glue along time. Requires
|
|
88
|
+
identical `dims`, matching data-presence, equal non-time indexes
|
|
89
|
+
(`IncompatibleError` otherwise; `RangeIndex` defaults compared by size).
|
|
90
|
+
Time logic delegates to `tindex.concat(other.tindex)` → apply
|
|
91
|
+
`left_sel`/`right_sel` (None = take all) along the time axis, then
|
|
92
|
+
`concat` the data.
|
|
93
|
+
- `take_dim(name, sel) -> Series` — positional selection along a named
|
|
94
|
+
non-time dim; co-updates that dim's stored index via `.take(sel)` (dropped
|
|
95
|
+
index for int sel); int sel removes the dim from `dims`.
|
|
96
|
+
- `interpolate(times, kind="linear", fill="clamp") -> np.ndarray` — ported
|
|
97
|
+
semantics from `UniformSeries.interpolate`/`EventSeries.interpolate`:
|
|
98
|
+
linear only; fill ∈ {"clamp","nan","error"}; query times float seconds or
|
|
99
|
+
int ticks; works for `GridIndex` (grid = `sample_times()`) and `StampIndex`
|
|
100
|
+
(grid = `abs_stamps`); `SpanIndex` → `TypeError`. Generalised to any time
|
|
101
|
+
axis position via `np.moveaxis` (time axis of the result stays where it is
|
|
102
|
+
in `dims`). Data converted with `to_numpy_f64`.
|
|
103
|
+
- `resample(new_sr, kind="linear") -> Series` — ported from
|
|
104
|
+
`UniformSeries.resample` / `EventSeries.interpolate_uniform`: evaluate on a
|
|
105
|
+
fresh phase-0 grid over the same declared domain, result gets a `GridIndex`.
|
|
106
|
+
- `equal(other) -> bool` (exact: dims, indexes `.equal`, data
|
|
107
|
+
`array_equal`), `__eq__` delegating (NotImplemented for non-Series).
|
|
108
|
+
- `with_data(new_data) -> Series` — same shape check; `map_data(fn)` sugar.
|
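The three `fill` modes of `interpolate` can be sketched for a single scalar query. This is a simplified illustration, not the ported implementation: it works on plain lists, takes query times in ticks only, and assumes that "outside the domain" means outside the first and last grid points.

```python
from __future__ import annotations

import math
from bisect import bisect_right


def interp_linear(grid: list[int], values: list[float], t: int, fill: str = "clamp") -> float:
    """Linear interpolation at tick `t` over a sorted, non-empty tick grid."""
    if t < grid[0] or t > grid[-1]:
        if fill == "clamp":
            return values[0] if t < grid[0] else values[-1]
        if fill == "nan":
            return math.nan
        if fill == "error":
            raise ValueError(f"query {t} outside [{grid[0]}, {grid[-1]}]")
        raise ValueError(f"unknown fill mode {fill!r}")
    hi = bisect_right(grid, t)
    if hi == len(grid):           # t sits exactly on the last grid point
        return values[-1]
    lo = hi - 1                   # grid[lo] <= t < grid[hi]
    w = (t - grid[lo]) / (grid[hi] - grid[lo])
    return values[lo] * (1.0 - w) + values[hi] * w


grid = [0, 10, 20]
values = [0.0, 1.0, 3.0]
mid = interp_linear(grid, values, 15)
```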
|
109
|
+
|
|
110
|
+
Factory functions (also in `series.py`, re-exported at top level):
|
|
111
|
+
|
|
112
|
+
```python
|
|
113
|
+
uniform(data, sr, *, dims=None, t_start=0.0, phase=0.0) # dims default (None,…,"time")
|
|
114
|
+
events(timestamps, values=None, *, dims=None, t_start=None, t_end=None)
|
|
115
|
+
spans(starts, ends, values=None, *, ids=None, t_start=None, t_end=None, dims=None)
|
|
116
|
+
wrap(data, dims=None, indexes=None) # atemporal; dims default all-None
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
`events`/`spans` with `values=None` build index-only Series (`data=None`,
|
|
120
|
+
`dims=("time",)`). Timestamp/bound arrays: float = seconds (quantised once),
|
|
121
|
+
int = ticks — matching the old `from_events`/`from_segments`.
|
|
122
|
+
|
|
123
|
+
## `Frame` — the pytree node (`tdseries/frame.py`)
|
|
124
|
+
|
|
125
|
+
Frozen dataclass `Frame(entries, *, t_start=None, t_end=None)` with
|
|
126
|
+
`entries: Mapping[str, Series | Frame | Any]`.
|
|
127
|
+
|
|
128
|
+
Entry kinds:
|
|
129
|
+
|
|
130
|
+
- **temporal** — `Series` with a time dim, or a nested `Frame` that
|
|
131
|
+
(recursively) contains temporal content; a nested frame holding only
|
|
132
|
+
invariant entries (a pure metadata bundle) is itself invariant;
|
|
133
|
+
- **invariant** — atemporal `Series` (e.g. `mic_pos`), or any other Python
|
|
134
|
+
scalar/object (`recording_id`, …). Raw numpy/torch tensors passed in are
|
|
135
|
+
auto-wrapped: `wrap(x)` (all-anonymous dims).
|
|
136
|
+
|
|
137
|
+
Time anchoring (ported from `TimeFrame`): the frame owns the single absolute
|
|
138
|
+
anchor `t_start_ticks` + `dur_ticks` (hull of temporal children, inferred when
|
|
139
|
+
not given; validated to cover them; `(0, 0)` if none). Temporal children are
|
|
140
|
+
stored **relative** (constructor re-bases incoming absolute children by
|
|
141
|
+
`shift(-t_start_ticks)`); `frame[key]` hands back the child re-anchored to
|
|
142
|
+
absolute time (O(1)). Extracting then mutating never touches the parent —
|
|
143
|
+
everything is immutable. Internal `_from_local` classmethod bypasses
|
|
144
|
+
re-basing/validation.
|
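The anchoring scheme above can be sketched with two small classes. `Leaf` and `MiniFrame` are hypothetical stand-ins for illustration only; the `_from_local` name mirrors the classmethod mentioned in the text, but its signature here is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Leaf:
    """Hypothetical temporal child: an absolute start and a duration, in ticks."""

    t_start: int
    dur: int

    def shift(self, dt: int) -> "Leaf":
        return replace(self, t_start=self.t_start + dt)   # O(1)


class MiniFrame:
    """Stores children relative to one absolute frame anchor (a sketch only)."""

    def __init__(self, children: dict[str, Leaf]) -> None:
        # Hull of the temporal children; (0, 0) when there are none.
        starts = [c.t_start for c in children.values()]
        ends = [c.t_start + c.dur for c in children.values()]
        self.t_start = min(starts, default=0)
        self.dur = max(ends, default=0) - self.t_start
        # Re-base incoming absolute children onto the frame anchor.
        self._local = {k: c.shift(-self.t_start) for k, c in children.items()}

    @classmethod
    def _from_local(cls, t_start: int, dur: int, local: dict[str, Leaf]) -> "MiniFrame":
        # Bypass re-basing: `local` is already anchor-relative.
        out = cls.__new__(cls)
        out.t_start, out.dur, out._local = t_start, dur, local
        return out

    def shift(self, dt: int) -> "MiniFrame":
        # Only the frame anchor moves; the relative children are shared.
        return MiniFrame._from_local(self.t_start + dt, self.dur, self._local)

    def __getitem__(self, key: str) -> Leaf:
        # Hand back a copy re-anchored to absolute time; the stored child is untouched.
        return self._local[key].shift(self.t_start)


frame = MiniFrame({"audio": Leaf(1_000, 500), "rps": Leaf(1_200, 100)})
moved = frame.shift(50)
```

Shifting the frame costs the same regardless of how many children it holds, because each child's absolute position is only materialised on extraction.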
|
145
|
+
|
|
146
|
+
**Tree-wide dim validation** (recursively over all Series leaves, `"time"`
|
|
147
|
+
excluded): for each dim name, all `RangeIndex` occurrences must agree in size;
|
|
148
|
+
mixing `RangeIndex` and `LabelIndex` occurrences of one dim is an error
|
|
149
|
+
(`DimensionError`); `LabelIndex` occurrences may differ in size and labels.
|
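The tree-wide rule can be sketched as one pass over the flattened dim occurrences. The function and the tuple encoding are hypothetical, and `DimensionError` below is a local stand-in rather than the class from `tdseries.errors`.

```python
from __future__ import annotations


class DimensionError(Exception):
    """Local stand-in for the library's error type."""


def validate_dims(occurrences: list[tuple[str, str, int]]) -> None:
    """Check (dim name, index kind, size) triples; kind is "range" or "label"."""
    seen: dict[str, tuple[str, int]] = {}
    for name, kind, size in occurrences:
        if name == "time":
            continue                       # time is excluded from this check
        if name not in seen:
            seen[name] = (kind, size)
            continue
        first_kind, first_size = seen[name]
        if kind != first_kind:
            raise DimensionError(f"dim {name!r} mixes RangeIndex and LabelIndex")
        if kind == "range" and size != first_size:
            raise DimensionError(f"dim {name!r} has sizes {first_size} and {size}")


ok_tree = [
    ("mic", "range", 8),
    ("time", "grid", 44_100),
    ("mic", "range", 8),
    ("rotor", "label", 4),
    ("rotor", "label", 3),   # label-indexed occurrences may differ in size
]
validate_dims(ok_tree)
```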
|
150
|
+
|
|
151
|
+
API:
|
|
152
|
+
|
|
153
|
+
- dict-like: `frame[key]`, `keys/values/items`, `in`, `len`, iteration.
|
|
154
|
+
- column ops: `select(keys)`, `drop(keys)`, `with_entry(name, value)`
|
|
155
|
+
(hull expands as needed), `merge(other, overwrite=False)` (key collisions
|
|
156
|
+
error unless overwrite; hull = union).
|
|
157
|
+
- time ops: `frame.time[a:b]`, `frame.ticks[a:b]`, `slice_ticks(a, b)`,
|
|
158
|
+
`shift(dt)`, `concat(other)`.
|
|
159
|
+
- slice: window must lie inside the frame domain (`DomainError`); each
|
|
160
|
+
temporal child is clipped to its overlap with the window and dropped if
|
|
161
|
+
disjoint; invariant entries pass through.
|
|
162
|
+
- concat: `other` is glued so its domain starts at `self.t_end_ticks`;
|
|
163
|
+
union of keys; temporal∧temporal → child concat (shift other by
|
|
164
|
+
`self.dur_ticks - other.t_start_ticks` relative to frames' anchors, as in
|
|
165
|
+
`TimeFrame.concat`); invariant∧invariant → must be `.equal`/`==`
|
|
166
|
+
(`IncompatibleError` on conflict), keep one; one-sided entries carried
|
|
167
|
+
over (temporal ones shifted); temporal∧invariant → `IncompatibleError`.
|
|
168
|
+
- dim ops (recursive over the tree, non-Series entries untouched, Series
|
|
169
|
+
lacking the dim untouched):
|
|
170
|
+
- `frame.slice[dim, sel]` — positional; `DimensionError` if no leaf has the
|
|
171
|
+
dim, or if occurrences have unequal sizes (→ use `.sel`).
|
|
172
|
+
- `frame.sel[dim, labels]` — all occurrences must carry `LabelIndex`.
|
|
173
|
+
- `"time"` in either → `DimensionError`.
|
|
174
|
+
- `equal(other)` exact (anchors, key sets, children via `.equal`/`==`),
|
|
175
|
+
`__eq__` delegating.
|
|
176
|
+
- `leaves() -> Iterator[tuple[str, Series]]` (dotted paths), `map_data(fn)`
|
|
177
|
+
(apply to every Series leaf's data, e.g. numpy→torch).
|
|
178
|
+
|
|
179
|
+
Frame invariant (property-tested, composed across all leaf kinds):
|
|
180
|
+
|
|
181
|
+
frame.ticks[a:b].concat(frame.ticks[b:c]) == frame.ticks[a:c]
|
|
182
|
+
|
|
183
|
+
## Operators (decided 2026-07-02)
|
|
184
|
+
|
|
185
|
+
Arithmetic dunders (`+`, `*`, ...) are **reserved for aligned element-wise
|
|
186
|
+
operations** (mixing, gain), matching numpy/xarray intuition — a video-editor
|
|
187
|
+
mix should read `0.5 * a + 0.3 * b`. Concatenation is the named `.concat()`
|
|
188
|
+
method only; `__add__` is NOT concat. Until the align/apply layer lands
|
|
189
|
+
(below), the arithmetic dunders are simply absent.
|
|
190
|
+
|
|
191
|
+
## Exact rational sample rates (decided 2026-07-02)
|
|
192
|
+
|
|
193
|
+
`GridIndex` stores its rate as a normalized integer fraction
|
|
194
|
+
`(sr_num, sr_den)` — samples per second — not a float. Motivation: framed
|
|
195
|
+
transforms (STFT with hop `h`) produce rates `sr / h` that are non-integer
|
|
196
|
+
but exactly rational; a float rate would demote them to an epsilon-tolerance
|
|
197
|
+
path. With rational rates every grid computation is exact integer
|
|
198
|
+
arithmetic (`divmod` against `TICKS_PER_SECOND * sr_den`), the epsilon
|
|
199
|
+
fallback is deleted outright, and **isomorphism roundtrips preserve the index
|
|
200
|
+
exactly**:
|
|
201
|
+
|
|
202
|
+
inverse_spec(forward_spec(idx)).equal(idx) # wav -> STFT -> iSTFT -> wav
|
|
203
|
+
|
|
204
|
+
Constructors accept `int`, integral `float`, `fractions.Fraction`, or a
|
|
205
|
+
`(num, den)` tuple; non-integral floats are rejected (pass the exact
|
|
206
|
+
fraction instead).
|
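A minimal sketch of the rational-rate idea, using the standard library's `fractions.Fraction`. The function names are hypothetical and the real `GridIndex` constructor may differ; the point is only that an STFT frame rate such as 44100/256 stays exact and that the tick-to-sample conversion is a single integer `divmod`.

```python
from __future__ import annotations

from fractions import Fraction

TICKS_PER_SECOND = 1_000_000_000


def normalize_rate(sr: int | float | Fraction | tuple[int, int]) -> Fraction:
    """Coerce a sample rate to an exact, normalized fraction (samples per second)."""
    if isinstance(sr, tuple):
        rate = Fraction(sr[0], sr[1])
    elif isinstance(sr, float):
        if not sr.is_integer():
            raise ValueError("non-integral float rate; pass an exact Fraction")
        rate = Fraction(int(sr))
    else:
        rate = Fraction(sr)
    if rate <= 0:
        raise ValueError("sample rate must be positive")
    return rate


def samples_elapsed(ticks: int, rate: Fraction) -> tuple[int, int]:
    """Whole samples elapsed after `ticks`, plus the exact integer remainder."""
    return divmod(ticks * rate.numerator, TICKS_PER_SECOND * rate.denominator)


frame_rate = normalize_rate((44_100, 256))          # STFT frames per second, hop 256
whole, rem = samples_elapsed(TICKS_PER_SECOND, frame_rate)
```

The remainder is kept as an integer numerator over `TICKS_PER_SECOND * sr_den`, so no rounding happens until a caller explicitly asks for a float.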
|
207
|
+
|
|
208
|
+
## Transforms, alignment, streaming — v1.1 design (agreed, not yet implemented)
|
|
209
|
+
|
|
210
|
+
Regularizer domains used to pressure-test these decisions: the drone project
|
|
211
|
+
(audio+telemetry), video-editor backend, finance (ticks/bars/as-of joins),
|
|
212
|
+
physiological monitoring (multi-rate biosignals + categorical stages),
|
|
213
|
+
football analytics (LabelIndex joins, ragged player dims), and server
|
|
214
|
+
observability (logs/traces as stamps/spans, windowed aggregation).
|
|
215
|
+
|
|
216
|
+
### Unary framed transforms
|
|
217
|
+
|
|
218
|
+
The library owns no DSP — only the *time bookkeeping* of transforms. All
|
|
219
|
+
practically relevant grid-warping transforms (STFT, conv/pool stacks,
|
|
220
|
+
resamplers) share one structure: output step `k` depends on the input span
|
|
221
|
+
`[k*hop - left, k*hop - left + win)`. This is captured by a declarative
|
|
222
|
+
spec:
|
|
223
|
+
|
|
224
|
+
```python
|
|
225
|
+
spec = Framed(win=1024, hop=256, center=True)  # centered frames; a causal spec would use left=win-1, right=0
|
|
226
|
+
latents = audio.transform(stft_fn, time=spec, dims=("mic", "freq", "time"))
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
`transform` runs the user's arbitrary `data -> data` function; the spec is
|
|
230
|
+
used for exactly two derived quantities:
|
|
231
|
+
|
|
232
|
+
1. **forward map** — `GridIndex(rate, anchor, phase) ->
|
|
233
|
+
GridIndex(rate/hop, anchor', phase')`; frame-center alignment is
|
|
234
|
+
expressed through the existing `phase` mechanism (center=True gives
|
|
235
|
+
output phase -0.5).
|
|
236
|
+
2. **pullback** — `spec.required_input(a, b)`: the input tick window needed
|
|
237
|
+
to compute output window `[a, b)` (receptive-field arithmetic). This
|
|
238
|
+
powers patch inference and streaming.
|
|
239
|
+
|
|
240
|
+
Specs compose (`spec_a >> spec_b`, standard stride/receptive-field
|
|
241
|
+
composition), so a deep encoder's time semantics is derived, never
|
|
242
|
+
hand-computed. Invertible pairs (STFT/iSTFT) must satisfy the roundtrip
|
|
243
|
+
identity above exactly. New non-time dims (`"freq"`) arrive as ordinary
|
|
244
|
+
named dims (bare `RangeIndex` in v1; the numeric `CoordIndex` is deferred —
|
|
245
|
+
see `ROADMAP.md`).
|
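The pullback and the composition rule can be written down in a few lines. This is a sketch under stated assumptions: `FramedSpec` is a hypothetical class (the text only names `Framed`), windows are expressed in input and output step indices rather than ticks, and `left` is passed explicitly instead of being derived from `center`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FramedSpec:
    """Output step k reads input steps [k*hop - left, k*hop - left + win)."""

    win: int
    hop: int
    left: int = 0

    def required_input(self, a: int, b: int) -> tuple[int, int]:
        # Input window needed to compute the non-empty output window [a, b).
        if a >= b:
            raise ValueError("empty output window")
        return a * self.hop - self.left, (b - 1) * self.hop - self.left + self.win

    def __rshift__(self, other: "FramedSpec") -> "FramedSpec":
        # self runs first, then other: strides multiply, receptive fields grow.
        return FramedSpec(
            win=(other.win - 1) * self.hop + self.win,
            hop=self.hop * other.hop,
            left=other.left * self.hop + self.left,
        )


stft = FramedSpec(win=1024, hop=256, left=512)
pool = FramedSpec(win=4, hop=2)
both = stft >> pool

direct = both.required_input(3, 7)
chained = stft.required_input(*pool.required_input(3, 7))
```

Pulling back through the composed spec gives the same input window as pulling back through each stage in turn, which is what lets a deep stack's time semantics be derived rather than hand-computed.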
|
246
|
+
|
|
247
|
+
**v1 scope (decided 2026-07-02): `transform` accepts `GridIndex` inputs
|
|
248
|
+
only** (uniform -> uniform). Sliding-window operators over non-uniform
|
|
249
|
+
event series (moving averages over ticks, event-rate counters) go through
|
|
250
|
+
`resample_on(grid)` first; that detour is semantically complete but
|
|
251
|
+
inefficient for sparse events, so a native `FramedEvents` generalization is
|
|
252
|
+
on the roadmap (`ROADMAP.md` § Framed transforms over non-uniform series).
|
|
253
|
+
|
|
254
|
+
### N-ary operations = align, then broadcast by name, then ufunc
|
|
255
|
+
|
|
256
|
+
Three orthogonal primitives instead of an n-ary op zoo:
|
|
257
|
+
|
|
258
|
+
1. `align(*series, domain="intersection"|"union", to=<series|index>,
|
|
259
|
+
fill=...)` — the ONLY place where domains/grids are reconciled.
|
|
260
|
+
Union+fill=0 is the video-editor clip mix; `resample_on(other,
|
|
261
|
+
kind="previous")` (zero-order hold) is simultaneously the DAW automation
|
|
262
|
+
curve and the finance as-of join. `interpolate` gains
|
|
263
|
+
`kind="previous"|"nearest"` alongside `"linear"`.
|
|
264
|
+
2. **Named-dim broadcasting**: dims are matched by NAME, not position —
|
|
265
|
+
`(mic, time) * (time,)` broadcasts, `(mic, time) * (rotor, time)`
|
|
266
|
+
outer-broadcasts to `(mic, rotor, time)`.
|
|
267
|
+
3. **Strict element-wise application**: `tdseries.apply(fn, *series)` and the
|
|
268
|
+
arithmetic dunders require *identical* time indexes and raise
|
|
269
|
+
`IncompatibleError` otherwise. No silent xarray-style intersection —
|
|
270
|
+
the finance regularizer forbids implicit data loss; magic never crosses
|
|
271
|
+
domain boundaries, only explicit `align` does.
|
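Name-based broadcasting can be sketched at the level of dim bookkeeping alone. The function is hypothetical, and the ordering rule (non-time dims in order of first appearance, `"time"` last) is an assumption chosen to reproduce the `(mic, rotor, time)` example above.

```python
from __future__ import annotations


def broadcast_dims(*operands: dict[str, int]) -> dict[str, int]:
    """Result dims (name -> size) for operands given as ordered name -> size maps."""
    out: dict[str, int] = {}
    for dims in operands:
        for name, size in dims.items():
            # Dims are matched by name; a shared name must agree in size.
            if out.setdefault(name, size) != size:
                raise ValueError(f"dim {name!r}: size {out[name]} vs {size}")
    if "time" in out:
        out["time"] = out.pop("time")   # keep the time dim last
    return out


same = broadcast_dims({"mic": 8, "time": 100}, {"time": 100})
outer = broadcast_dims({"mic": 8, "time": 100}, {"rotor": 4, "time": 100})
```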
|
272
|
+
|
|
273
|
+
### Streaming
|
|
274
|
+
|
|
275
|
+
Streaming introduces no new ontology; the exactness algebra is the
|
|
276
|
+
correctness proof for chunking:
|
|
277
|
+
|
|
278
|
+
f(x).ticks[a:b] == f(x.ticks[a-left : b+right]).ticks[a:b]
|
|
279
|
+
|
|
280
|
+
with `left`/`right` from the spec pullback — property-testable. Three thin
|
|
281
|
+
pieces make it real:
|
|
282
|
+
|
|
283
|
+
1. **Lazy leaf protocol** (v1 scope: protocol only): `Series.data` requires
|
|
284
|
+
only `shape`, `dtype`, `ndim`, and `__getitem__` returning an ndarray.
|
|
285
|
+
numpy views already make slicing zero-copy and `np.memmap` works today;
|
|
286
|
+
video decoders (PyAV/decord) plug in later behind a small adapter.
|
|
287
|
+
2. **Stateless pull execution**: `frame.stream(start, chunk)` yields
|
|
288
|
+
successive `frame.ticks[t : t+chunk]`; a transform in the pipeline pulls
|
|
289
|
+
its `required_input` window per chunk. Stateful streaming (RNN carry)
|
|
290
|
+
is explicitly out of scope — that state belongs to the model.
|
|
291
|
+
3. **Streaming mixing** is per-chunk `align(union, fill=0)` — no special
|
|
292
|
+
case.
|
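The chunking identity can be checked on a toy windowed operator. This is an illustration only: it uses plain lists and sample positions instead of Series and ticks, and `moving_sum` is a hypothetical stand-in for a transform whose receptive field reaches `left` samples back and `right` samples forward.

```python
from __future__ import annotations


def moving_sum(x: list[int], left: int, right: int) -> list[int]:
    """Windowed sum over [i - left, i + right]; samples outside x count as zero."""
    n = len(x)
    return [sum(x[max(0, i - left): min(n, i + right + 1)]) for i in range(n)]


def chunked(x: list[int], a: int, b: int, left: int, right: int) -> list[int]:
    """Compute moving_sum(x)[a:b] from the pulled-back input window only."""
    lo, hi = max(0, a - left), min(len(x), b + right)
    out = moving_sum(x[lo:hi], left, right)
    return out[a - lo: b - lo]


x = list(range(20))
full = moving_sum(x, left=2, right=1)
piece = chunked(x, 5, 9, left=2, right=1)
```

Each chunk reads only `x[a-left : b+right]`, yet reproduces the corresponding stretch of the full result, so a pipeline can stream without carrying state between chunks.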
|
293
|
+
|
|
294
|
+
## The motivating example
|
|
295
|
+
|
|
296
|
+
```python
|
|
297
|
+
frame = Frame({
|
|
298
|
+
"audio": uniform(audio_8xT, sr=44100, dims=("mic", "time")),
|
|
299
|
+
"mic_pos": wrap(pos_8x3, dims=("mic", None)),
|
|
300
|
+
"rps": events(ts, rps_4xM, dims=("rotor", "time")),
|
|
301
|
+
"rotor_pos": wrap(rpos_4x3, dims=("rotor", None)),
|
|
302
|
+
"vad": spans(starts, ends),
|
|
303
|
+
"meta": Frame({"recording_id": "FLY124"}),
|
|
304
|
+
})
|
|
305
|
+
|
|
306
|
+
sub = frame.slice["mic", 0] # audio (T,), mic_pos (3,) — same mic 0
|
|
307
|
+
clip = frame.time[1.0:4.5] # all temporal leaves cut, invariants kept
|
|
308
|
+
one = frame["rps"] # absolute-time Series; frame unchanged
|
|
309
|
+
```
|
|
310
|
+
|
|
311
|
+
## Code standards
|
|
312
|
+
|
|
313
|
+
- ruff + pyright (basic) must pass — a PostToolUse hook enforces this on
|
|
314
|
+
every write. **No `# type: ignore`**: if pyright complains, change the
|
|
315
|
+
design (make fields required, hoist optionality into constructors, narrow
|
|
316
|
+
with `isinstance`) rather than suppressing.
|
|
317
|
+
- Frozen dataclasses, exact int64 tick arithmetic, no float tolerances
|
|
318
|
+
outside the documented `phase`/`INDEX_EPSILON` cases.
|
|
319
|
+
|
|
320
|
+
## Non-goals (v1)
|
|
321
|
+
|
|
322
|
+
- Lazy / disk-backed storage beyond the array protocol; datetime64 interop;
|
|
323
|
+
value-merging of coincident events; pytree registration with torch/jax.
|
|
324
|
+
Deferred directions live in [`ROADMAP.md`](./ROADMAP.md).
|
tdseries-0.1.0/PKG-INFO
ADDED
|
@@ -0,0 +1,86 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: tdseries
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: Immutable pytree frames of tensors with named, indexed dimensions — time as a first-class tick-exact dimension
|
|
5
|
+
Author-email: Dmitrii Mukhutdinov <flyingleafe@gmail.com>
|
|
6
|
+
License: MIT
|
|
7
|
+
Requires-Python: >=3.11
|
|
8
|
+
Requires-Dist: numpy>=1.24
|
|
9
|
+
Provides-Extra: torch
|
|
10
|
+
Requires-Dist: torch>=2.0; extra == 'torch'
|
|
11
|
+
Description-Content-Type: text/markdown
|
|
12
|
+
|
|
13
|
+
# tdseries
|
|
14
|
+
|
|
15
|
+
Immutable pytree frames of tensors with **named, indexed dimensions**, where
|
|
16
|
+
**time** is a first-class dimension stored in exact int64 ticks (nanoseconds).
|
|
17
|
+
|
|
18
|
+
Built for bundling media signals with co-recorded telemetry *and* the static
|
|
19
|
+
geometry that shares their non-time dimensions — e.g. multichannel audio
|
|
20
|
+
`(mic, time)` together with microphone positions `(mic, 3)`, rotor speeds
|
|
21
|
+
`(rotor, time)` together with rotor positions `(rotor, 3)`:
|
|
22
|
+
|
|
23
|
+
```python
|
|
24
|
+
import tdseries as td
|
|
25
|
+
|
|
26
|
+
frame = td.Frame({
|
|
27
|
+
"audio": td.uniform(audio_8xT, sr=44100, dims=("mic", "time")),
|
|
28
|
+
"mic_pos": td.wrap(pos_8x3, dims=("mic", None)),
|
|
29
|
+
"rps": td.events(stamps, rps_4xM, dims=("rotor", "time")),
|
|
30
|
+
"rotor_pos": td.wrap(rpos_4x3, dims=("rotor", None)),
|
|
31
|
+
"vad": td.spans(starts, ends),
|
|
32
|
+
"recording_id": "FLY124",
|
|
33
|
+
})
|
|
34
|
+
|
|
35
|
+
sub = frame.slice["mic", 0] # audio -> (T,), mic_pos -> (3,): same mic
|
|
36
|
+
clip = frame.time[1.0:4.5] # seconds; all temporal leaves cut exactly
|
|
37
|
+
raw = frame.ticks[a:b] # exact int64 tick window
|
|
38
|
+
rps = frame["rps"] # self-contained absolute-time Series
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
## The model
|
|
42
|
+
|
|
43
|
+
A *dimension* is a name plus a domain; every tensor axis carrying the name has
|
|
44
|
+
an **index** mapping positions into that domain. Slices are expressed in
|
|
45
|
+
domain coordinates and translated per-tensor — which is how tensors of
|
|
46
|
+
*different lengths* share one dimension:
|
|
47
|
+
|
|
48
|
+
| Index | Domain | Example |
|
|
49
|
+
|---|---|---|
|
|
50
|
+
| `RangeIndex` (default) | positions | `mic`, `rotor` |
|
|
51
|
+
| `LabelIndex` | hashable labels | mic serial numbers |
|
|
52
|
+
| `GridIndex` | int64 ticks, uniform grid + sub-sample phase | audio |
|
|
53
|
+
| `StampIndex` | int64 ticks, sorted point events | RPS, IMU |
|
|
54
|
+
| `SpanIndex` | int64 ticks, half-open intervals with identity tags | VAD |
|
|
55
|
+
|
|
56
|
+
Time is exact: all anchors and cut points are int64 tick counts, `shift` is
|
|
57
|
+
O(1), and the algebra satisfies, at the sample level and without tolerances,
|
|
58
|
+
|
|
59
|
+
```
|
|
60
|
+
x.ticks[a:b].concat(x.ticks[b:c]) == x.ticks[a:c]
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
for any tick cut points — including sub-sample cuts through audio and cuts
|
|
64
|
+
through the middle of a VAD span (split spans re-merge via identity tags).
|
|
65
|
+
|
|
66
|
+
Everything is immutable. Extracting an entry from a frame hands back a
|
|
67
|
+
self-contained series re-anchored to absolute time; the frame is untouched.
|
|
68
|
+
|
|
69
|
+
See [`DESIGN.md`](./DESIGN.md) for the full contract.
|
|
70
|
+
|
|
71
|
+
## Development
|
|
72
|
+
|
|
73
|
+
```sh
|
|
74
|
+
nix develop # or: uv sync
|
|
75
|
+
uv run pytest
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
`nix develop` provides Python + uv and pre-commit hooks (ruff, ruff-format,
|
|
79
|
+
pyright). Data leaves may be numpy arrays or torch tensors (torch optional:
|
|
80
|
+
`uv sync --extra torch`).
|
|
81
|
+
|
|
82
|
+
## Lineage
|
|
83
|
+
|
|
84
|
+
Extracted from the `utils.data` module of the harmonic-noise-suppression
|
|
85
|
+
research project; the tick/phase arithmetic and the slice/concat/shift
|
|
86
|
+
invariants are ported from there, while the named-dimension frame model is new.
|
tdseries-0.1.0/README.md
ADDED
|
@@ -0,0 +1,74 @@
|
|
|
1
|
+
# tdseries
|
|
2
|
+
|
|
3
|
+
Immutable pytree frames of tensors with **named, indexed dimensions**, where
|
|
4
|
+
**time** is a first-class dimension stored in exact int64 ticks (nanoseconds).
|
|
5
|
+
|
|
6
|
+
Built for bundling media signals with co-recorded telemetry *and* the static
|
|
7
|
+
geometry that shares their non-time dimensions — e.g. multichannel audio
|
|
8
|
+
`(mic, time)` together with microphone positions `(mic, 3)`, rotor speeds
|
|
9
|
+
`(rotor, time)` together with rotor positions `(rotor, 3)`:
|
|
10
|
+
|
|
11
|
+
```python
|
|
12
|
+
import tdseries as td
|
|
13
|
+
|
|
14
|
+
frame = td.Frame({
|
|
15
|
+
"audio": td.uniform(audio_8xT, sr=44100, dims=("mic", "time")),
|
|
16
|
+
"mic_pos": td.wrap(pos_8x3, dims=("mic", None)),
|
|
17
|
+
"rps": td.events(stamps, rps_4xM, dims=("rotor", "time")),
|
|
18
|
+
"rotor_pos": td.wrap(rpos_4x3, dims=("rotor", None)),
|
|
19
|
+
"vad": td.spans(starts, ends),
|
|
20
|
+
"recording_id": "FLY124",
|
|
21
|
+
})
|
|
22
|
+
|
|
23
|
+
sub = frame.slice["mic", 0] # audio -> (T,), mic_pos -> (3,): same mic
|
|
24
|
+
clip = frame.time[1.0:4.5] # seconds; all temporal leaves cut exactly
|
|
25
|
+
raw = frame.ticks[a:b] # exact int64 tick window
|
|
26
|
+
rps = frame["rps"] # self-contained absolute-time Series
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
## The model
|
|
30
|
+
|
|
31
|
+
A *dimension* is a name plus a domain; every tensor axis carrying the name has
|
|
32
|
+
an **index** mapping positions into that domain. Slices are expressed in
|
|
33
|
+
domain coordinates and translated per-tensor — which is how tensors of
|
|
34
|
+
*different lengths* share one dimension:
|
|
35
|
+
|
|
36
|
+
| Index | Domain | Example |
|
|
37
|
+
|---|---|---|
|
|
38
|
+
| `RangeIndex` (default) | positions | `mic`, `rotor` |
|
|
39
|
+
| `LabelIndex` | hashable labels | mic serial numbers |
|
|
40
|
+
| `GridIndex` | int64 ticks, uniform grid + sub-sample phase | audio |
|
|
41
|
+
| `StampIndex` | int64 ticks, sorted point events | RPS, IMU |
|
|
42
|
+
| `SpanIndex` | int64 ticks, half-open intervals with identity tags | VAD |
|
|
43
|
+
|
|
44
|
+
Time is exact: all anchors and cut points are int64 tick counts, `shift` is
|
|
45
|
+
O(1), and the algebra satisfies, at the sample level and without tolerances,
|
|
46
|
+
|
|
47
|
+
```
|
|
48
|
+
x.ticks[a:b].concat(x.ticks[b:c]) == x.ticks[a:c]
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
for any tick cut points — including sub-sample cuts through audio and cuts
|
|
52
|
+
through the middle of a VAD span (split spans re-merge via identity tags).
|
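How a span cut in the middle can re-merge on concat is easiest to see on plain tuples. The sketch below is hypothetical and is not the library's `SpanIndex`; it assumes spans are sorted, non-overlapping, and carry unique integer identity tags.

```python
from __future__ import annotations

Span = tuple[int, int, int]   # (start tick, end tick, identity tag), half-open


def slice_spans(spans: list[Span], a: int, b: int) -> list[Span]:
    """Clip spans to [a, b); a span cut by the window keeps its identity tag."""
    out: list[Span] = []
    for start, end, ident in spans:
        lo, hi = max(start, a), min(end, b)
        if lo < hi:
            out.append((lo, hi, ident))
    return out


def concat_spans(left: list[Span], right: list[Span]) -> list[Span]:
    """Glue two adjacent pieces; halves of one span that meet at the seam re-merge."""
    if left and right and left[-1][2] == right[0][2] and left[-1][1] == right[0][0]:
        merged = (left[-1][0], right[0][1], left[-1][2])
        return left[:-1] + [merged] + right[1:]
    return left + right


vad = [(0, 40, 0), (50, 90, 1)]
a, b, c = 10, 70, 85                      # b cuts through the second span
glued = concat_spans(slice_spans(vad, a, b), slice_spans(vad, b, c))
whole = slice_spans(vad, a, c)
```

Two distinct spans that merely touch at the seam keep different tags and therefore stay separate; only the two halves of one original span are fused.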
|
53
|
+
|
|
54
|
+
Everything is immutable. Extracting an entry from a frame hands back a
|
|
55
|
+
self-contained series re-anchored to absolute time; the frame is untouched.
|
|
56
|
+
|
|
57
|
+
See [`DESIGN.md`](./DESIGN.md) for the full contract.
|
|
58
|
+
|
|
59
|
+
## Development
|
|
60
|
+
|
|
61
|
+
```sh
|
|
62
|
+
nix develop # or: uv sync
|
|
63
|
+
uv run pytest
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
`nix develop` provides Python + uv and pre-commit hooks (ruff, ruff-format,
|
|
67
|
+
pyright). Data leaves may be numpy arrays or torch tensors (torch optional:
|
|
68
|
+
`uv sync --extra torch`).
|
|
69
|
+
|
|
70
|
+
## Lineage
|
|
71
|
+
|
|
72
|
+
Extracted from the `utils.data` module of the harmonic-noise-suppression
|
|
73
|
+
research project; the tick/phase arithmetic and the slice/concat/shift
|
|
74
|
+
invariants are ported from there, while the named-dimension frame model is new.
|