torchrir 0.1.0__tar.gz → 0.1.4__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- torchrir-0.1.4/PKG-INFO +70 -0
- torchrir-0.1.4/README.md +57 -0
- torchrir-0.1.4/pyproject.toml +37 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/__init__.py +11 -1
- torchrir-0.1.4/src/torchrir/animation.py +175 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/config.py +11 -2
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/core.py +63 -16
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/cmu_arctic.py +21 -3
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/template.py +3 -1
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/utils.py +25 -3
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/dynamic.py +14 -3
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/logging_utils.py +17 -3
- torchrir-0.1.4/src/torchrir/metadata.py +216 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/plotting.py +124 -24
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/plotting_utils.py +19 -31
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/results.py +7 -1
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/room.py +20 -32
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/scene.py +6 -1
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/scene_utils.py +28 -6
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/signal.py +30 -10
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/simulators.py +17 -5
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/utils.py +40 -8
- torchrir-0.1.4/src/torchrir.egg-info/PKG-INFO +70 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir.egg-info/SOURCES.txt +2 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_compare_pyroomacoustics.py +5 -3
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_core.py +6 -6
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_device_parity.py +4 -4
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_plotting.py +1 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_scene.py +25 -13
- torchrir-0.1.0/PKG-INFO +0 -213
- torchrir-0.1.0/README.md +0 -201
- torchrir-0.1.0/pyproject.toml +0 -18
- torchrir-0.1.0/src/torchrir.egg-info/PKG-INFO +0 -213
- {torchrir-0.1.0 → torchrir-0.1.4}/LICENSE +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/NOTICE +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/setup.cfg +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/__init__.py +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/base.py +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/directivity.py +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir.egg-info/dependency_links.txt +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir.egg-info/requires.txt +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir.egg-info/top_level.txt +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_room.py +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_signal.py +0 -0
- {torchrir-0.1.0 → torchrir-0.1.4}/tests/test_utils.py +0 -0
torchrir-0.1.4/PKG-INFO
ADDED
@@ -0,0 +1,70 @@
+Metadata-Version: 2.4
+Name: torchrir
+Version: 0.1.4
+Summary: PyTorch-based room impulse response (RIR) simulation toolkit for static and dynamic scenes.
+Project-URL: Repository, https://github.com/taishi-n/torchrir
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+License-File: NOTICE
+Requires-Dist: numpy>=2.2.6
+Requires-Dist: torch>=2.10.0
+Dynamic: license-file
+
+# TorchRIR
+
+PyTorch-based room impulse response (RIR) simulation toolkit focused on a clean, modern API with GPU support.
+This project has been substantially assisted by AI using Codex.
+
+## Installation
+```bash
+pip install torchrir
+```
+
+## Examples
+- `examples/static.py`: fixed sources/mics with binaural output.
+  `uv run python examples/static.py --plot`
+- `examples/dynamic_src.py`: moving sources, fixed mics.
+  `uv run python examples/dynamic_src.py --plot`
+- `examples/dynamic_mic.py`: fixed sources, moving mics.
+  `uv run python examples/dynamic_mic.py --plot`
+- `examples/cli.py`: unified CLI for static/dynamic scenes, JSON/YAML configs.
+  `uv run python examples/cli.py --mode static --plot`
+- `examples/cmu_arctic_dynamic_dataset.py`: small dynamic dataset generator (fixed room/mics, randomized source motion).
+  `uv run python examples/cmu_arctic_dynamic_dataset.py --num-scenes 4 --num-sources 2`
+- `examples/benchmark_device.py`: CPU/GPU benchmark for RIR simulation.
+  `uv run python examples/benchmark_device.py --dynamic`
+
+## Core API Overview
+- Geometry: `Room`, `Source`, `MicrophoneArray`
+- Static RIR: `simulate_rir`
+- Dynamic RIR: `simulate_dynamic_rir`
+- Dynamic convolution: `DynamicConvolver`
+- Metadata export: `build_metadata`, `save_metadata_json`
+
+```python
+from torchrir import DynamicConvolver, MicrophoneArray, Room, Source, simulate_rir
+
+room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
+sources = Source.from_positions([[1.0, 2.0, 1.5]])
+mics = MicrophoneArray.from_positions([[2.0, 2.0, 1.5]])
+
+rir = simulate_rir(room=room, sources=sources, mics=mics, max_order=6, tmax=0.3)
+# For dynamic scenes, compute rirs with simulate_dynamic_rir and convolve:
+# y = DynamicConvolver(mode="trajectory").convolve(signal, rirs)
+```
+
+For detailed documentation, see the docs under `docs/` and Read the Docs.
+
+## Future Work
+- Ray tracing backend: implement `RayTracingSimulator` with frequency-dependent absorption/scattering.
+- CUDA-native acceleration: introduce dedicated CUDA kernels for large-scale RIR generation.
+- Dataset expansion: add additional dataset integrations beyond CMU ARCTIC (see `TemplateDataset`).
+- Add regression tests comparing generated RIRs against gpuRIR outputs.
+
+## Related Libraries
+- [gpuRIR](https://github.com/DavidDiazGuerra/gpuRIR)
+- [Cross3D](https://github.com/DavidDiazGuerra/Cross3D)
+- [pyroomacoustics](https://github.com/LCAV/pyroomacoustics)
+- [das-generator](https://github.com/ehabets/das-generator)
+- [rir-generator](https://github.com/audiolabs/rir-generator)
torchrir-0.1.4/README.md
ADDED
@@ -0,0 +1,57 @@
+# TorchRIR
+
+PyTorch-based room impulse response (RIR) simulation toolkit focused on a clean, modern API with GPU support.
+This project has been substantially assisted by AI using Codex.
+
+## Installation
+```bash
+pip install torchrir
+```
+
+## Examples
+- `examples/static.py`: fixed sources/mics with binaural output.
+  `uv run python examples/static.py --plot`
+- `examples/dynamic_src.py`: moving sources, fixed mics.
+  `uv run python examples/dynamic_src.py --plot`
+- `examples/dynamic_mic.py`: fixed sources, moving mics.
+  `uv run python examples/dynamic_mic.py --plot`
+- `examples/cli.py`: unified CLI for static/dynamic scenes, JSON/YAML configs.
+  `uv run python examples/cli.py --mode static --plot`
+- `examples/cmu_arctic_dynamic_dataset.py`: small dynamic dataset generator (fixed room/mics, randomized source motion).
+  `uv run python examples/cmu_arctic_dynamic_dataset.py --num-scenes 4 --num-sources 2`
+- `examples/benchmark_device.py`: CPU/GPU benchmark for RIR simulation.
+  `uv run python examples/benchmark_device.py --dynamic`
+
+## Core API Overview
+- Geometry: `Room`, `Source`, `MicrophoneArray`
+- Static RIR: `simulate_rir`
+- Dynamic RIR: `simulate_dynamic_rir`
+- Dynamic convolution: `DynamicConvolver`
+- Metadata export: `build_metadata`, `save_metadata_json`
+
+```python
+from torchrir import DynamicConvolver, MicrophoneArray, Room, Source, simulate_rir
+
+room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
+sources = Source.from_positions([[1.0, 2.0, 1.5]])
+mics = MicrophoneArray.from_positions([[2.0, 2.0, 1.5]])
+
+rir = simulate_rir(room=room, sources=sources, mics=mics, max_order=6, tmax=0.3)
+# For dynamic scenes, compute rirs with simulate_dynamic_rir and convolve:
+# y = DynamicConvolver(mode="trajectory").convolve(signal, rirs)
+```
+
+For detailed documentation, see the docs under `docs/` and Read the Docs.
+
+## Future Work
+- Ray tracing backend: implement `RayTracingSimulator` with frequency-dependent absorption/scattering.
+- CUDA-native acceleration: introduce dedicated CUDA kernels for large-scale RIR generation.
+- Dataset expansion: add additional dataset integrations beyond CMU ARCTIC (see `TemplateDataset`).
+- Add regression tests comparing generated RIRs against gpuRIR outputs.
+
+## Related Libraries
+- [gpuRIR](https://github.com/DavidDiazGuerra/gpuRIR)
+- [Cross3D](https://github.com/DavidDiazGuerra/Cross3D)
+- [pyroomacoustics](https://github.com/LCAV/pyroomacoustics)
+- [das-generator](https://github.com/ehabets/das-generator)
+- [rir-generator](https://github.com/audiolabs/rir-generator)
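The README's code block only hints at the dynamic path in comments. Below is a minimal end-to-end sketch assembled from the docstring examples that appear later in this diff (`linear_trajectory` usage and `simulate_dynamic_rir` shapes are taken from the core.py hunks; the shape of `signal` is an assumption for illustration):

```python
# Dynamic-scene sketch: one source moving along a line, one fixed mic over T=8 steps.
import torch

from torchrir import DynamicConvolver, Room, linear_trajectory, simulate_dynamic_rir

room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)

# (T, n_src, dim) source trajectory and (T, n_mic, dim) mic trajectory.
src_traj = torch.stack(
    [linear_trajectory(torch.tensor([1.0, 2.0, 1.5]), torch.tensor([4.0, 2.0, 1.5]), 8)],
    dim=1,
)
mic_traj = torch.tensor([[2.0, 2.0, 1.5]]).unsqueeze(0).repeat(8, 1, 1)

# rirs has shape (T, n_src, n_mic, nsample) per the simulate_dynamic_rir docstring.
rirs = simulate_dynamic_rir(
    room=room, src_traj=src_traj, mic_traj=mic_traj, max_order=4, tmax=0.3
)
signal = torch.randn(1, 16000)  # assumed (n_src, n_samples) dry signal
y = DynamicConvolver(mode="trajectory").convolve(signal, rirs)
```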
torchrir-0.1.4/pyproject.toml
ADDED
@@ -0,0 +1,37 @@
+[project]
+name = "torchrir"
+version = "0.1.4"
+description = "PyTorch-based room impulse response (RIR) simulation toolkit for static and dynamic scenes."
+readme = "README.md"
+requires-python = ">=3.10"
+dependencies = [
+    "numpy>=2.2.6",
+    "torch>=2.10.0",
+]
+
+[project.urls]
+Repository = "https://github.com/taishi-n/torchrir"
+
+[dependency-groups]
+dev = [
+    "commitizen>=3.29.0",
+    "git-cliff>=2.10.1",
+    "matplotlib>=3.10.8",
+    "ruff>=0.12.2",
+    "pillow>=11.2.1",
+    "pyroomacoustics>=0.9.0",
+    "pytest>=9.0.2",
+    "soundfile>=0.13.1",
+    "sphinx>=7.0,<8.2.3",
+    "sphinx-rtd-theme>=2.0.0",
+    "myst-parser>=2.0,<4.0",
+    "ty>=0.0.14,<0.1",
+]
+
+[tool.commitizen]
+name = "cz_conventional_commits"
+tag_format = "v$version"
+version_scheme = "pep440"
+version_provider = "pep621"
+update_changelog_on_bump = true
+changelog_file = "CHANGELOG.md"
{torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/__init__.py
@@ -4,6 +4,8 @@ from .config import SimulationConfig, default_config
 from .core import simulate_dynamic_rir, simulate_rir
 from .dynamic import DynamicConvolver
 from .logging_utils import LoggingConfig, get_logger, setup_logging
+from .animation import animate_scene_gif
+from .metadata import build_metadata, save_metadata_json
 from .plotting import plot_scene_dynamic, plot_scene_static
 from .plotting_utils import plot_scene_and_save
 from .room import MicrophoneArray, Room, Source
@@ -24,7 +26,12 @@ from .datasets import (
     load_wav_mono,
     save_wav,
 )
-from .scene_utils import
+from .scene_utils import (
+    binaural_mic_positions,
+    clamp_positions,
+    linear_trajectory,
+    sample_positions,
+)
 from .utils import (
     att2t_SabineEstimation,
     att2t_sabine_estimation,
@@ -61,6 +68,8 @@ __all__ = [
     "get_logger",
     "list_cmu_arctic_speakers",
     "LoggingConfig",
+    "animate_scene_gif",
+    "build_metadata",
     "resolve_device",
     "SentenceLike",
     "load_dataset_sources",
@@ -75,6 +84,7 @@ __all__ = [
     "plot_scene_and_save",
     "plot_scene_static",
     "save_wav",
+    "save_metadata_json",
     "Scene",
     "setup_logging",
     "SimulationConfig",
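These hunks grow the package root's public surface. A quick smoke test that the 0.1.4 names resolve (all names taken directly from the import and `__all__` hunks above):

```python
# Verify the new 0.1.4 top-level exports are importable.
from torchrir import (
    animate_scene_gif,
    binaural_mic_positions,
    build_metadata,
    clamp_positions,
    linear_trajectory,
    sample_positions,
    save_metadata_json,
)
```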
torchrir-0.1.4/src/torchrir/animation.py
ADDED
@@ -0,0 +1,175 @@
+from __future__ import annotations
+
+"""Animation helpers for dynamic scenes."""
+
+from pathlib import Path
+from typing import Optional, Sequence
+
+import torch
+
+from .plotting_utils import _positions_to_cpu, _to_cpu, _traj_steps, _trajectory_to_cpu
+
+
+def animate_scene_gif(
+    *,
+    out_path: Path,
+    room: Sequence[float] | torch.Tensor,
+    sources: object | torch.Tensor | Sequence,
+    mics: object | torch.Tensor | Sequence,
+    src_traj: Optional[torch.Tensor | Sequence] = None,
+    mic_traj: Optional[torch.Tensor | Sequence] = None,
+    step: int = 1,
+    fps: Optional[float] = None,
+    signal_len: Optional[int] = None,
+    fs: Optional[float] = None,
+    duration_s: Optional[float] = None,
+    plot_2d: bool = True,
+    plot_3d: bool = False,
+) -> Path:
+    """Render a GIF showing source/mic trajectories.
+
+    Args:
+        out_path: Destination GIF path.
+        room: Room size tensor or sequence.
+        sources: Source positions or Source-like object.
+        mics: Microphone positions or MicrophoneArray-like object.
+        src_traj: Optional source trajectory (T, n_src, dim).
+        mic_traj: Optional mic trajectory (T, n_mic, dim).
+        step: Subsampling step for trajectories.
+        fps: Frames per second for the GIF (auto if None).
+        signal_len: Optional signal length (samples) to infer elapsed time.
+        fs: Sample rate used with signal_len.
+        duration_s: Optional total duration in seconds (overrides signal_len/fs).
+        plot_2d: Use 2D projection if True.
+        plot_3d: Use 3D projection if True and dim == 3.
+
+    Returns:
+        The output path.
+
+    Example:
+        >>> animate_scene_gif(
+        ...     out_path=Path("outputs/scene.gif"),
+        ...     room=[6.0, 4.0, 3.0],
+        ...     sources=[[1.0, 2.0, 1.5]],
+        ...     mics=[[2.0, 2.0, 1.5]],
+        ...     src_traj=src_traj,
+        ...     mic_traj=mic_traj,
+        ...     signal_len=16000,
+        ...     fs=16000,
+        ... )
+    """
+    import matplotlib.pyplot as plt
+    from matplotlib import animation
+
+    out_path = Path(out_path)
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+
+    room_size = _to_cpu(room)
+    src_pos = _positions_to_cpu(sources)
+    mic_pos = _positions_to_cpu(mics)
+    dim = int(room_size.numel())
+    view_dim = 3 if (plot_3d and dim == 3) else 2
+    view_room = room_size[:view_dim]
+    view_src = src_pos[:, :view_dim]
+    view_mic = mic_pos[:, :view_dim]
+
+    if src_traj is None and mic_traj is None:
+        raise ValueError("at least one trajectory is required for animation")
+    steps = _traj_steps(src_traj, mic_traj)
+    src_traj = _trajectory_to_cpu(src_traj, src_pos, steps)
+    mic_traj = _trajectory_to_cpu(mic_traj, mic_pos, steps)
+    view_src_traj = src_traj[:, :, :view_dim]
+    view_mic_traj = mic_traj[:, :, :view_dim]
+
+    if view_dim == 3:
+        fig = plt.figure()
+        ax = fig.add_subplot(111, projection="3d")
+        ax.set_xlim(0, view_room[0].item())
+        ax.set_ylim(0, view_room[1].item())
+        ax.set_zlim(0, view_room[2].item())
+        ax.set_xlabel("x")
+        ax.set_ylabel("y")
+        ax.set_zlabel("z")
+    else:
+        fig, ax = plt.subplots()
+        ax.set_xlim(0, view_room[0].item())
+        ax.set_ylim(0, view_room[1].item())
+        ax.set_aspect("equal", adjustable="box")
+        ax.set_xlabel("x")
+        ax.set_ylabel("y")
+
+    src_scatter = ax.scatter([], [], marker="^", color="tab:green", label="sources")
+    mic_scatter = ax.scatter([], [], marker="o", color="tab:orange", label="mics")
+    src_lines = []
+    mic_lines = []
+    for _ in range(view_src_traj.shape[1]):
+        if view_dim == 2:
+            (line,) = ax.plot([], [], color="tab:green", alpha=0.6)
+        else:
+            (line,) = ax.plot([], [], [], color="tab:green", alpha=0.6)
+        src_lines.append(line)
+    for _ in range(view_mic_traj.shape[1]):
+        if view_dim == 2:
+            (line,) = ax.plot([], [], color="tab:orange", alpha=0.6)
+        else:
+            (line,) = ax.plot([], [], [], color="tab:orange", alpha=0.6)
+        mic_lines.append(line)
+
+    ax.legend(loc="best")
+
+    if duration_s is None and signal_len is not None and fs is not None:
+        duration_s = float(signal_len) / float(fs)
+
+    def _frame(i: int):
+        idx = min(i * step, view_src_traj.shape[0] - 1)
+        src_frame = view_src_traj[: idx + 1]
+        mic_frame = view_mic_traj[: idx + 1]
+        src_pos_frame = view_src_traj[idx]
+        mic_pos_frame = view_mic_traj[idx]
+
+        if view_dim == 2:
+            src_scatter.set_offsets(src_pos_frame)
+            mic_scatter.set_offsets(mic_pos_frame)
+            for s_idx, line in enumerate(src_lines):
+                xy = src_frame[:, s_idx, :]
+                line.set_data(xy[:, 0], xy[:, 1])
+            for m_idx, line in enumerate(mic_lines):
+                xy = mic_frame[:, m_idx, :]
+                line.set_data(xy[:, 0], xy[:, 1])
+        else:
+            setattr(
+                src_scatter,
+                "_offsets3d",
+                (src_pos_frame[:, 0], src_pos_frame[:, 1], src_pos_frame[:, 2]),
+            )
+            setattr(
+                mic_scatter,
+                "_offsets3d",
+                (mic_pos_frame[:, 0], mic_pos_frame[:, 1], mic_pos_frame[:, 2]),
+            )
+            for s_idx, line in enumerate(src_lines):
+                xyz = src_frame[:, s_idx, :]
+                line.set_data(xyz[:, 0], xyz[:, 1])
+                line.set_3d_properties(xyz[:, 2])
+            for m_idx, line in enumerate(mic_lines):
+                xyz = mic_frame[:, m_idx, :]
+                line.set_data(xyz[:, 0], xyz[:, 1])
+                line.set_3d_properties(xyz[:, 2])
+        if duration_s is not None and steps > 1:
+            t = (idx / (steps - 1)) * duration_s
+            ax.set_title(f"t = {t:.2f} s")
+        return [src_scatter, mic_scatter, *src_lines, *mic_lines]
+
+    frames = max(1, (view_src_traj.shape[0] + step - 1) // step)
+    if fps is None or fps <= 0:
+        if duration_s is not None and duration_s > 0:
+            fps = frames / duration_s
+        else:
+            fps = 6.0
+    anim = animation.FuncAnimation(
+        fig, _frame, frames=frames, interval=1000 / fps, blit=False
+    )
+    fps_int = None if fps is None else max(1, int(round(fps)))
+    anim.save(out_path, writer="pillow", fps=fps_int)
+    plt.close(fig)
+    return out_path
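The docstring example above leaves `src_traj`/`mic_traj` undefined; a complete sketch using `linear_trajectory` (signature taken from the `simulate_dynamic_rir` docstring later in this diff; matplotlib and pillow come from the dev dependency group):

```python
# Render a one-second, 8-step scene to a GIF.
from pathlib import Path

import torch

from torchrir import animate_scene_gif, linear_trajectory

src_traj = torch.stack(
    [linear_trajectory(torch.tensor([1.0, 2.0, 1.5]), torch.tensor([4.0, 2.0, 1.5]), 8)],
    dim=1,
)
mic_traj = torch.tensor([[2.0, 2.0, 1.5]]).unsqueeze(0).repeat(8, 1, 1)

animate_scene_gif(
    out_path=Path("outputs/scene.gif"),
    room=[6.0, 4.0, 3.0],
    sources=[[1.0, 2.0, 1.5]],
    mics=[[2.0, 2.0, 1.5]],
    src_traj=src_traj,
    mic_traj=mic_traj,
    signal_len=16000,  # with fs=16000, titles the frames 0..1 s
    fs=16000,
)
```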
{torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/config.py
@@ -10,7 +10,12 @@ import torch
 
 @dataclass(frozen=True)
 class SimulationConfig:
-    """Configuration values for RIR simulation and convolution.
+    """Configuration values for RIR simulation and convolution.
+
+    Example:
+        >>> cfg = SimulationConfig(max_order=6, tmax=0.3, device="auto")
+        >>> cfg.validate()
+    """
 
     fs: Optional[float] = None
     max_order: Optional[int] = None
@@ -53,7 +58,11 @@ class SimulationConfig:
 
 
 def default_config() -> SimulationConfig:
-    """Return the default simulation configuration.
+    """Return the default simulation configuration.
+
+    Example:
+        >>> cfg = default_config()
+    """
     cfg = SimulationConfig()
     cfg.validate()
     return cfg
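A config built this way can be passed into the simulators; the core.py hunk below shows `simulate_rir` falling back to `default_config()` when `config` is None. A sketch, assuming the config's values are picked up when the corresponding keyword arguments are omitted:

```python
# Build and validate one config, then reuse it across simulate_rir calls.
from torchrir import MicrophoneArray, Room, Source, SimulationConfig, simulate_rir

cfg = SimulationConfig(max_order=6, tmax=0.3, device="auto")
cfg.validate()

room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
sources = Source.from_positions([[1.0, 2.0, 1.5]])
mics = MicrophoneArray.from_positions([[2.0, 2.0, 1.5]])

rir = simulate_rir(room=room, sources=sources, mics=mics, config=cfg)
```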
{torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/core.py
@@ -3,6 +3,7 @@ from __future__ import annotations
 """Core RIR simulation functions (static and dynamic)."""
 
 import math
+from collections.abc import Callable
 from typing import Optional, Tuple
 
 import torch
@@ -58,6 +59,18 @@ def simulate_rir(
 
     Returns:
         Tensor of shape (n_src, n_mic, nsample).
+
+    Example:
+        >>> room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
+        >>> sources = Source.from_positions([[1.0, 2.0, 1.5]])
+        >>> mics = MicrophoneArray.from_positions([[2.0, 2.0, 1.5]])
+        >>> rir = simulate_rir(
+        ...     room=room,
+        ...     sources=sources,
+        ...     mics=mics,
+        ...     max_order=6,
+        ...     tmax=0.3,
+        ... )
     """
     cfg = config or default_config()
     cfg.validate()
@@ -78,9 +91,9 @@ def simulate_rir(
 
     if not isinstance(room, Room):
         raise TypeError("room must be a Room instance")
-    if nsample is None and tmax is None:
-        raise ValueError("nsample or tmax must be provided")
     if nsample is None:
+        if tmax is None:
+            raise ValueError("nsample or tmax must be provided")
         nsample = int(math.ceil(tmax * room.fs))
     if nsample <= 0:
         raise ValueError("nsample must be positive")
@@ -208,6 +221,24 @@ def simulate_dynamic_rir(
 
     Returns:
         Tensor of shape (T, n_src, n_mic, nsample).
+
+    Example:
+        >>> room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
+        >>> from torchrir import linear_trajectory
+        >>> src_traj = torch.stack(
+        ...     [linear_trajectory(torch.tensor([1.0, 2.0, 1.5]),
+        ...                        torch.tensor([4.0, 2.0, 1.5]), 8)],
+        ...     dim=1,
+        ... )
+        >>> mic_pos = torch.tensor([[2.0, 2.0, 1.5]])
+        >>> mic_traj = mic_pos.unsqueeze(0).repeat(8, 1, 1)
+        >>> rirs = simulate_dynamic_rir(
+        ...     room=room,
+        ...     src_traj=src_traj,
+        ...     mic_traj=mic_traj,
+        ...     max_order=4,
+        ...     tmax=0.3,
+        ... )
     """
     cfg = config or default_config()
     cfg.validate()
@@ -465,7 +496,11 @@ def _compute_image_contributions_batch(
     if mic_pattern != "omni":
         if mic_dir is None:
             raise ValueError("mic orientation required for non-omni directivity")
-        mic_dir =
+        mic_dir = (
+            mic_dir[None, :, None, :]
+            if mic_dir.ndim == 2
+            else mic_dir.view(1, 1, 1, -1)
+        )
         cos_theta = _cos_between(-vec, mic_dir)
         gain = gain * directivity_gain(mic_pattern, cos_theta)
 
@@ -512,9 +547,9 @@ def _accumulate_rir(
     if use_lut:
         sinc_lut = _get_sinc_lut(fdl, lut_gran, device=rir.device, dtype=dtype)
 
-        mic_offsets = (
-            n_mic,
-        )
+        mic_offsets = (
+            torch.arange(n_mic, device=rir.device, dtype=torch.int64) * nsample
+        ).view(n_mic, 1, 1)
         rir_flat = rir.view(-1)
 
     chunk_size = cfg.accumulate_chunk_size
@@ -529,7 +564,9 @@ def _accumulate_rir(
         x_off_frac = (1.0 - frac_m) * lut_gran
         lut_gran_off = torch.floor(x_off_frac).to(torch.int64)
         x_off = x_off_frac - lut_gran_off.to(dtype)
-        lut_pos = lut_gran_off[..., None] + (
+        lut_pos = lut_gran_off[..., None] + (
+            n[None, None, :].to(torch.int64) * lut_gran
+        )
 
         s0 = torch.take(sinc_lut, lut_pos)
         s1 = torch.take(sinc_lut, lut_pos + 1)
@@ -588,9 +625,9 @@ def _accumulate_rir_batch_impl(
     if use_lut:
         sinc_lut = _get_sinc_lut(fdl, lut_gran, device=rir.device, dtype=sample.dtype)
 
-        sm_offsets = (
-            n_sm,
-        )
+        sm_offsets = (
+            torch.arange(n_sm, device=rir.device, dtype=torch.int64) * nsample
+        ).view(n_sm, 1, 1)
         rir_flat = rir.view(-1)
 
     n_img = idx0.shape[1]
@@ -604,7 +641,9 @@ def _accumulate_rir_batch_impl(
         x_off_frac = (1.0 - frac_m) * lut_gran
         lut_gran_off = torch.floor(x_off_frac).to(torch.int64)
         x_off = x_off_frac - lut_gran_off.to(sample.dtype)
-        lut_pos = lut_gran_off[..., None] + (
+        lut_pos = lut_gran_off[..., None] + (
+            n[None, None, :].to(torch.int64) * lut_gran
+        )
 
         s0 = torch.take(sinc_lut, lut_pos)
         s1 = torch.take(sinc_lut, lut_pos + 1)
@@ -630,12 +669,13 @@ _SINC_LUT_CACHE: dict[tuple[int, int, str, torch.dtype], Tensor] = {}
 _FDL_GRID_CACHE: dict[tuple[int, str, torch.dtype], Tensor] = {}
 _FDL_OFFSETS_CACHE: dict[tuple[int, str], Tensor] = {}
 _FDL_WINDOW_CACHE: dict[tuple[int, str, torch.dtype], Tensor] = {}
-
+_AccumFn = Callable[[Tensor, Tensor, Tensor], None]
+_ACCUM_BATCH_COMPILED: dict[tuple[str, torch.dtype, int, int, bool, int], _AccumFn] = {}
 
 
 def _get_accumulate_fn(
     cfg: SimulationConfig, device: torch.device, dtype: torch.dtype
-) ->
+) -> _AccumFn:
     """Return an accumulation function with config-bound constants."""
     use_lut = cfg.use_lut and device.type != "mps"
     fdl = cfg.frac_delay_length
@@ -691,7 +731,9 @@ def _get_fdl_window(fdl: int, *, device: torch.device, dtype: torch.dtype) -> Tensor:
     return cached
 
 
-def _get_sinc_lut(
+def _get_sinc_lut(
+    fdl: int, lut_gran: int, *, device: torch.device, dtype: torch.dtype
+) -> Tensor:
     """Create a sinc lookup table for fractional delays."""
     key = (fdl, lut_gran, str(device), dtype)
     cached = _SINC_LUT_CACHE.get(key)
@@ -735,7 +777,12 @@ def _apply_diffuse_tail(
 
     gen = torch.Generator(device=rir.device)
     gen.manual_seed(0 if seed is None else seed)
-    noise = torch.randn(
-
+    noise = torch.randn(
+        rir[..., tdiff_idx:].shape, device=rir.device, dtype=rir.dtype, generator=gen
+    )
+    scale = (
+        torch.linalg.norm(rir[..., tdiff_idx - 1 : tdiff_idx], dim=-1, keepdim=True)
+        + 1e-8
+    )
     rir[..., tdiff_idx:] = noise * decay * scale
     return rir
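The reworked guard in `simulate_rir` only demands `tmax` when `nsample` is absent, and `nsample` is otherwise derived as `ceil(tmax * fs)`. A sketch of the equivalence at fs=16000 (the assert is an assumption about exact shape agreement, not a documented guarantee):

```python
# tmax=0.3 at fs=16000 implies nsample = ceil(0.3 * 16000) = 4800,
# so these two calls should produce same-shaped RIRs.
from torchrir import MicrophoneArray, Room, Source, simulate_rir

room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
sources = Source.from_positions([[1.0, 2.0, 1.5]])
mics = MicrophoneArray.from_positions([[2.0, 2.0, 1.5]])

rir_a = simulate_rir(room=room, sources=sources, mics=mics, max_order=6, tmax=0.3)
rir_b = simulate_rir(room=room, sources=sources, mics=mics, max_order=6, nsample=4800)
assert rir_a.shape == rir_b.shape == (1, 1, 4800)
```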
{torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/cmu_arctic.py
@@ -44,12 +44,22 @@ def list_cmu_arctic_speakers() -> List[str]:
 @dataclass
 class CmuArcticSentence:
     """Sentence metadata from CMU ARCTIC."""
+
     utterance_id: str
     text: str
 
 
 class CmuArcticDataset:
-    def __init__(self, root: Path, speaker: str = "bdl", download: bool = False) -> None:
+    """CMU ARCTIC dataset loader.
+
+    Example:
+        >>> dataset = CmuArcticDataset(Path("datasets/cmu_arctic"), speaker="bdl", download=True)
+        >>> audio, fs = dataset.load_wav("arctic_a0001")
+    """
+
+    def __init__(
+        self, root: Path, speaker: str = "bdl", download: bool = False
+    ) -> None:
         """Initialize a CMU ARCTIC dataset handle.
 
         Args:
@@ -182,7 +192,11 @@ def _parse_text_line(line: str) -> Tuple[str, str]:
 
 
 def load_wav_mono(path: Path) -> Tuple[torch.Tensor, int]:
-    """Load a wav file and return mono audio and sample rate.
+    """Load a wav file and return mono audio and sample rate.
+
+    Example:
+        >>> audio, fs = load_wav_mono(Path("datasets/cmu_arctic/ARCTIC/.../wav/arctic_a0001.wav"))
+    """
     import soundfile as sf
 
     audio, sample_rate = sf.read(str(path), dtype="float32", always_2d=True)
@@ -195,7 +209,11 @@ def load_wav_mono(path: Path) -> Tuple[torch.Tensor, int]:
 
 
 def save_wav(path: Path, audio: torch.Tensor, sample_rate: int) -> None:
-    """Save a mono or multi-channel wav to disk.
+    """Save a mono or multi-channel wav to disk.
+
+    Example:
+        >>> save_wav(Path("outputs/example.wav"), audio, sample_rate)
+    """
     import soundfile as sf
 
     audio = audio.detach().cpu().clamp(-1.0, 1.0).to(torch.float32)
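Putting the loader and the wav helpers together, following the docstring examples above (the dataset path is illustrative; `download=True` fetches the corpus on first use):

```python
# Load one CMU ARCTIC utterance and write a clamped float32 copy to disk.
from pathlib import Path

from torchrir import CmuArcticDataset, save_wav

dataset = CmuArcticDataset(Path("datasets/cmu_arctic"), speaker="bdl", download=True)
audio, fs = dataset.load_wav("arctic_a0001")
save_wav(Path("outputs/arctic_a0001_copy.wav"), audio, fs)
```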
{torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/template.py
@@ -34,7 +34,9 @@ class TemplateDataset(BaseDataset):
     protocol intact.
     """
 
-    def __init__(
+    def __init__(
+        self, root: Path, speaker: str = "default", download: bool = False
+    ) -> None:
         self.root = Path(root)
         self.speaker = speaker
         if download:
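`TemplateDataset` is the skeleton new dataset integrations start from while keeping the `BaseDataset` protocol intact. A hypothetical instantiation (the module path is assumed from the file layout in this diff, and the root directory is illustrative):

```python
# Instantiate the template skeleton; real integrations replace its internals.
from pathlib import Path

from torchrir.datasets.template import TemplateDataset

dataset = TemplateDataset(Path("datasets/custom"), speaker="default", download=False)
```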
{torchrir-0.1.0 → torchrir-0.1.4}/src/torchrir/datasets/utils.py
@@ -10,8 +10,15 @@ import torch
 from .base import BaseDataset, SentenceLike
 
 
-def choose_speakers(
-
+def choose_speakers(
+    dataset: BaseDataset, num_sources: int, rng: random.Random
+) -> List[str]:
+    """Select unique speakers for the requested number of sources.
+
+    Example:
+        >>> rng = random.Random(0)
+        >>> speakers = choose_speakers(dataset, num_sources=2, rng=rng)
+    """
     speakers = dataset.list_speakers()
     if not speakers:
         raise RuntimeError("no speakers available")
@@ -27,7 +34,20 @@ def load_dataset_sources(
     duration_s: float,
     rng: random.Random,
 ) -> Tuple[torch.Tensor, int, List[Tuple[str, List[str]]]]:
-    """Load and concatenate utterances for each speaker into fixed-length signals.
+    """Load and concatenate utterances for each speaker into fixed-length signals.
+
+    Example:
+        >>> from pathlib import Path
+        >>> from torchrir import CmuArcticDataset
+        >>> rng = random.Random(0)
+        >>> root = Path("datasets/cmu_arctic")
+        >>> signals, fs, info = load_dataset_sources(
+        ...     dataset_factory=lambda spk: CmuArcticDataset(root, speaker=spk, download=True),
+        ...     num_sources=2,
+        ...     duration_s=10.0,
+        ...     rng=rng,
+        ... )
+    """
     dataset0 = dataset_factory(None)
     speakers = choose_speakers(dataset0, num_sources, rng)
     signals: List[torch.Tensor] = []
@@ -71,4 +91,6 @@ def load_dataset_sources(
         info.append((speaker, utterance_ids))
 
     stacked = torch.stack(signals, dim=0)
+    if fs is None:
+        raise RuntimeError("no audio loaded from dataset sources")
     return stacked, int(fs), info
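The new `fs is None` guard turns an empty load into an explicit `RuntimeError` rather than letting `int(None)` fail later. A defensive call pattern around it (`make_dataset` is a hypothetical factory for illustration only):

```python
# Catch the explicit failure mode introduced by the fs guard.
import random

from torchrir import load_dataset_sources

try:
    signals, fs, info = load_dataset_sources(
        dataset_factory=make_dataset,  # hypothetical factory, not part of torchrir
        num_sources=2,
        duration_s=10.0,
        rng=random.Random(0),
    )
except RuntimeError as err:
    print(f"dataset loading failed: {err}")
```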