sdf-sampler 0.1.0__tar.gz → 0.2.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/.gitignore +1 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/CHANGELOG.md +13 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/PKG-INFO +150 -39
- sdf_sampler-0.2.0/README.md +293 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/pyproject.toml +5 -2
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/__init__.py +1 -1
- sdf_sampler-0.2.0/src/sdf_sampler/__main__.py +17 -0
- sdf_sampler-0.2.0/src/sdf_sampler/cli.py +457 -0
- sdf_sampler-0.1.0/.github_token.env +0 -1
- sdf_sampler-0.1.0/README.md +0 -182
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/LICENSE +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/__init__.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/flood_fill.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/normal_idw.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/normal_offset.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/pocket.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/voxel_grid.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/algorithms/voxel_regions.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/analyzer.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/config.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/io.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/models/__init__.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/models/analysis.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/models/constraints.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/models/samples.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/sampler.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/sampling/__init__.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/sampling/box.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/sampling/brush.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/sampling/ray_carve.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/src/sdf_sampler/sampling/sphere.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/tests/__init__.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/tests/test_analyzer.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/tests/test_equivalence.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/tests/test_integration.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/tests/test_models.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/tests/test_sampler.py +0 -0
- {sdf_sampler-0.1.0 → sdf_sampler-0.2.0}/uv.lock +0 -0
@@ -5,6 +5,19 @@ All notable changes to sdf-sampler will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.2.0] - 2025-01-29
+
+### Added
+
+- **Command-Line Interface** for batch processing
+- `sdf-sampler pipeline` - Full workflow (analyze + sample + export)
+- `sdf-sampler analyze` - Detect SOLID/EMPTY regions
+- `sdf-sampler sample` - Generate training samples from constraints
+- `sdf-sampler info` - Inspect point clouds, constraints, and sample files
+- Support for `python -m sdf_sampler` invocation
+- Console script entry point (`sdf-sampler` command)
+- Comprehensive README with SDK and CLI documentation
+
 ## [0.1.0] - 2025-01-29
 
 ### Added
@@ -1,8 +1,8 @@
 Metadata-Version: 2.4
 Name: sdf-sampler
-Version: 0.1.0
+Version: 0.2.0
 Summary: Auto-analysis and sampling of point clouds for SDF (Signed Distance Field) training data generation
-Project-URL: Repository, https://github.com/
+Project-URL: Repository, https://github.com/Chiark-Collective/sdf-sampler
 Author-email: Liam <liam@example.com>
 License: MIT
 License-File: LICENSE
@@ -60,7 +60,88 @@ For additional I/O format support (PLY, LAS/LAZ):
 pip install sdf-sampler[io]
 ```
 
-## 
+## Command-Line Interface
+
+sdf-sampler provides a CLI for common workflows:
+
+```bash
+# Run as module
+python -m sdf_sampler --help
+
+# Or use the installed command
+sdf-sampler --help
+```
+
+### Commands
+
+#### `pipeline` - Full workflow (recommended)
+
+Run the complete pipeline: analyze point cloud → generate samples → export.
+
+```bash
+# Basic usage
+sdf-sampler pipeline scan.ply -o training_data.parquet
+
+# With options
+sdf-sampler pipeline scan.ply \
+  -o training_data.parquet \
+  -n 50000 \
+  -s inverse_square \
+  --save-constraints constraints.json \
+  -v
+```
+
+Options:
+- `-o, --output`: Output parquet file (default: `<input>_samples.parquet`)
+- `-n, --total-samples`: Number of samples to generate (default: 10000)
+- `-s, --strategy`: Sampling strategy: `constant`, `density`, `inverse_square` (default: `inverse_square`)
+- `-a, --algorithms`: Specific algorithms to run (default: all)
+- `--save-constraints`: Also save constraints to JSON
+- `--seed`: Random seed for reproducibility
+- `-v, --verbose`: Verbose output
+
+#### `analyze` - Detect regions
+
+Analyze a point cloud to detect SOLID/EMPTY regions.
+
+```bash
+sdf-sampler analyze scan.ply -o constraints.json -v
+```
+
+Options:
+- `-o, --output`: Output JSON file (default: `<input>_constraints.json`)
+- `-a, --algorithms`: Algorithms to run (see below)
+- `--no-hull-filter`: Disable hull filtering
+- `-v, --verbose`: Verbose output
+
+#### `sample` - Generate training samples
+
+Generate training samples from a constraints file.
+
+```bash
+sdf-sampler sample scan.ply constraints.json -o samples.parquet -n 50000
+```
+
+Options:
+- `-o, --output`: Output parquet file
+- `-n, --total-samples`: Number of samples (default: 10000)
+- `-s, --strategy`: Sampling strategy (default: `inverse_square`)
+- `--seed`: Random seed
+- `-v, --verbose`: Verbose output
+
+#### `info` - Inspect files
+
+Show information about point clouds, constraints, or sample files.
+
+```bash
+sdf-sampler info scan.ply
+sdf-sampler info constraints.json
+sdf-sampler info samples.parquet
+```
+
+## Python SDK
+
+### Quick Start
 
 ```python
 from sdf_sampler import SDFAnalyzer, SDFSampler, load_point_cloud
@@ -86,28 +167,13 @@ samples = sampler.generate(
 sampler.export_parquet(samples, "training_data.parquet")
 ```
 
-## Features
-
-### Auto-Analysis Algorithms
-
-- **flood_fill**: Detects EMPTY (outside) regions by ray propagation from sky
-- **voxel_regions**: Detects SOLID (underground) regions
-- **normal_offset**: Generates paired SOLID/EMPTY boxes along surface normals
-- **normal_idw**: Inverse distance weighted sampling along normals
-- **pocket**: Detects interior cavities
-
-### Sampling Strategies
-
-- **CONSTANT**: Fixed number of samples per constraint
-- **DENSITY**: Samples proportional to constraint volume
-- **INVERSE_SQUARE**: More samples near surface, fewer far away (recommended)
-
-## API Reference
-
 ### SDFAnalyzer
 
+Analyzes point clouds to detect SOLID and EMPTY regions.
+
 ```python
-from sdf_sampler import SDFAnalyzer
+from sdf_sampler import SDFAnalyzer
+from sdf_sampler.config import AnalyzerConfig, AutoAnalysisOptions
 
 # With default config
 analyzer = SDFAnalyzer()
@@ -136,10 +202,23 @@ print(f"EMPTY: {result.summary.empty_constraints}")
 constraints = result.constraints
 ```
 
+#### Analysis Algorithms
+
+| Algorithm | Description | Output |
+|-----------|-------------|--------|
+| `flood_fill` | Detects EMPTY (outside) regions by ray propagation from sky | Box or SamplePoint constraints |
+| `voxel_regions` | Detects SOLID (underground) regions | Box or SamplePoint constraints |
+| `normal_offset` | Generates paired SOLID/EMPTY boxes along surface normals | Box constraints |
+| `normal_idw` | Inverse distance weighted sampling along normals | SamplePoint constraints |
+| `pocket` | Detects interior cavities | Pocket constraints |
+
 ### SDFSampler
 
+Generates training samples from constraints.
+
 ```python
-from sdf_sampler import SDFSampler
+from sdf_sampler import SDFSampler
+from sdf_sampler.config import SamplerConfig
 
 # With default config
 sampler = SDFSampler()
@@ -167,6 +246,14 @@ sampler.export_parquet(samples, "output.parquet")
 df = sampler.to_dataframe(samples)
 ```
 
+#### Sampling Strategies
+
+| Strategy | Description |
+|----------|-------------|
+| `constant` | Fixed number of samples per constraint |
+| `density` | Samples proportional to constraint volume |
+| `inverse_square` | More samples near surface, fewer far away (recommended) |
+
 ### Constraint Types
 
 The analyzer generates various constraint types:
@@ -180,6 +267,22 @@ Each constraint has:
 - `sign`: "solid" (negative SDF) or "empty" (positive SDF)
 - `weight`: Sample weight (default 1.0)
 
+### I/O Helpers
+
+```python
+from sdf_sampler import load_point_cloud, export_parquet
+
+# Load various formats
+xyz, normals = load_point_cloud("scan.ply")      # PLY (requires trimesh)
+xyz, normals = load_point_cloud("scan.las")      # LAS/LAZ (requires laspy)
+xyz, normals = load_point_cloud("scan.csv")      # CSV with x,y,z columns
+xyz, normals = load_point_cloud("scan.npz")      # NumPy archive
+xyz, normals = load_point_cloud("scan.parquet")  # Parquet
+
+# Export samples
+export_parquet(samples, "output.parquet")
+```
+
 ## Output Format
 
 The exported parquet file contains columns:
@@ -194,32 +297,40 @@ The exported parquet file contains columns:
 | is_surface | bool | Whether sample is on surface |
 | is_free | bool | Whether sample is in free space (EMPTY) |
 
-## Configuration
+## Configuration Reference
 
 ### AnalyzerConfig
 
 | Option | Default | Description |
 |--------|---------|-------------|
-| min_gap_size | 0.10 | Minimum gap size for flood fill (meters) |
-| max_grid_dim | 200 | Maximum voxel grid dimension |
-| cone_angle | 15.0 | Ray propagation cone half-angle (degrees) |
-| normal_offset_pairs | 40 | Number of box pairs for normal_offset |
-| idw_sample_count | 1000 | Total IDW samples |
-| idw_max_distance | 0.5 | Maximum IDW distance (meters) |
-| hull_filter_enabled | True | Filter outside X-Y alpha shape |
-| hull_alpha | 1.0 | Alpha shape parameter |
+| `min_gap_size` | 0.10 | Minimum gap size for flood fill (meters) |
+| `max_grid_dim` | 200 | Maximum voxel grid dimension |
+| `cone_angle` | 15.0 | Ray propagation cone half-angle (degrees) |
+| `normal_offset_pairs` | 40 | Number of box pairs for normal_offset |
+| `idw_sample_count` | 1000 | Total IDW samples |
+| `idw_max_distance` | 0.5 | Maximum IDW distance (meters) |
+| `hull_filter_enabled` | True | Filter outside X-Y alpha shape |
+| `hull_alpha` | 1.0 | Alpha shape parameter |
 
 ### SamplerConfig
 
 | Option | Default | Description |
 |--------|---------|-------------|
-| total_samples | 10000 | Default total samples |
-| samples_per_primitive | 100 | Samples per constraint (CONSTANT) |
-| samples_per_cubic_meter | 10000 | Sample density (DENSITY) |
-| inverse_square_base_samples | 100 | Base samples (INVERSE_SQUARE) |
-| inverse_square_falloff | 2.0 | Falloff exponent |
-| near_band | 0.02 | Near-band width |
-| seed | 0 | Random seed |
+| `total_samples` | 10000 | Default total samples |
+| `samples_per_primitive` | 100 | Samples per constraint (CONSTANT) |
+| `samples_per_cubic_meter` | 10000 | Sample density (DENSITY) |
+| `inverse_square_base_samples` | 100 | Base samples (INVERSE_SQUARE) |
+| `inverse_square_falloff` | 2.0 | Falloff exponent |
+| `near_band` | 0.02 | Near-band width |
+| `seed` | 0 | Random seed |
+
+## Integration with Ubik
+
+sdf-sampler is the core analysis engine for [Ubik](https://github.com/Chiark-Collective/ubik), an interactive web application for SDF labeling. Use sdf-sampler directly for:
+
+- Automated batch processing pipelines
+- Integration into ML training workflows
+- Custom analysis scripts
 
 ## License
 
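The output schema above fixes a sign convention: `phi` is negative inside SOLID regions, positive in EMPTY space, and `is_surface` marks near-surface samples. A minimal sketch of that convention; the `classify_sample` helper and its reuse of the documented `near_band` default of 0.02 are illustrative only, not part of the sdf-sampler API:

```python
# Sketch of the documented SDF sign convention: phi < 0 inside (SOLID),
# phi > 0 outside (EMPTY), |phi| <= near_band treated as on-surface.
# classify_sample is a hypothetical helper, not a function of sdf-sampler.

def classify_sample(phi: float, near_band: float = 0.02) -> str:
    """Map a signed distance to the labels used in the output schema."""
    if abs(phi) <= near_band:
        return "surface"
    return "solid" if phi < 0 else "empty"

# A sample well below the surface is SOLID, one above it is EMPTY,
# and one within the near band counts as surface.
labels = [classify_sample(phi) for phi in (-0.5, 0.3, 0.01)]
```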
@@ -0,0 +1,293 @@
+# sdf-sampler
+
+Auto-analysis and sampling of point clouds for SDF (Signed Distance Field) training data generation.
+
+A lightweight, standalone Python package for generating SDF training hints from point clouds. Automatically detects SOLID (inside) and EMPTY (outside) regions and generates training samples suitable for SDF regression models.
+
+## Installation
+
+```bash
+pip install sdf-sampler
+```
+
+For additional I/O format support (PLY, LAS/LAZ):
+
+```bash
+pip install sdf-sampler[io]
+```
+
+## Command-Line Interface
+
+sdf-sampler provides a CLI for common workflows:
+
+```bash
+# Run as module
+python -m sdf_sampler --help
+
+# Or use the installed command
+sdf-sampler --help
+```
+
+### Commands
+
+#### `pipeline` - Full workflow (recommended)
+
+Run the complete pipeline: analyze point cloud → generate samples → export.
+
+```bash
+# Basic usage
+sdf-sampler pipeline scan.ply -o training_data.parquet
+
+# With options
+sdf-sampler pipeline scan.ply \
+  -o training_data.parquet \
+  -n 50000 \
+  -s inverse_square \
+  --save-constraints constraints.json \
+  -v
+```
+
+Options:
+- `-o, --output`: Output parquet file (default: `<input>_samples.parquet`)
+- `-n, --total-samples`: Number of samples to generate (default: 10000)
+- `-s, --strategy`: Sampling strategy: `constant`, `density`, `inverse_square` (default: `inverse_square`)
+- `-a, --algorithms`: Specific algorithms to run (default: all)
+- `--save-constraints`: Also save constraints to JSON
+- `--seed`: Random seed for reproducibility
+- `-v, --verbose`: Verbose output
+
+#### `analyze` - Detect regions
+
+Analyze a point cloud to detect SOLID/EMPTY regions.
+
+```bash
+sdf-sampler analyze scan.ply -o constraints.json -v
+```
+
+Options:
+- `-o, --output`: Output JSON file (default: `<input>_constraints.json`)
+- `-a, --algorithms`: Algorithms to run (see below)
+- `--no-hull-filter`: Disable hull filtering
+- `-v, --verbose`: Verbose output
+
+#### `sample` - Generate training samples
+
+Generate training samples from a constraints file.
+
+```bash
+sdf-sampler sample scan.ply constraints.json -o samples.parquet -n 50000
+```
+
+Options:
+- `-o, --output`: Output parquet file
+- `-n, --total-samples`: Number of samples (default: 10000)
+- `-s, --strategy`: Sampling strategy (default: `inverse_square`)
+- `--seed`: Random seed
+- `-v, --verbose`: Verbose output
+
+#### `info` - Inspect files
+
+Show information about point clouds, constraints, or sample files.
+
+```bash
+sdf-sampler info scan.ply
+sdf-sampler info constraints.json
+sdf-sampler info samples.parquet
+```
+
+## Python SDK
+
+### Quick Start
+
+```python
+from sdf_sampler import SDFAnalyzer, SDFSampler, load_point_cloud
+
+# 1. Load point cloud (supports PLY, LAS, CSV, NPZ, Parquet)
+xyz, normals = load_point_cloud("scan.ply")
+
+# 2. Auto-analyze to detect EMPTY/SOLID regions
+analyzer = SDFAnalyzer()
+result = analyzer.analyze(xyz=xyz, normals=normals)
+print(f"Generated {len(result.constraints)} constraints")
+
+# 3. Generate training samples
+sampler = SDFSampler()
+samples = sampler.generate(
+    xyz=xyz,
+    constraints=result.constraints,
+    strategy="inverse_square",
+    total_samples=50000,
+)
+
+# 4. Export to parquet
+sampler.export_parquet(samples, "training_data.parquet")
+```
+
+### SDFAnalyzer
+
+Analyzes point clouds to detect SOLID and EMPTY regions.
+
+```python
+from sdf_sampler import SDFAnalyzer
+from sdf_sampler.config import AnalyzerConfig, AutoAnalysisOptions
+
+# With default config
+analyzer = SDFAnalyzer()
+
+# With custom config
+analyzer = SDFAnalyzer(config=AnalyzerConfig(
+    min_gap_size=0.10,         # Minimum gap for flood fill
+    max_grid_dim=200,          # Maximum voxel grid dimension
+    cone_angle=15.0,           # Ray propagation cone angle
+    hull_filter_enabled=True,  # Filter outside X-Y hull
+))
+
+# Run analysis
+result = analyzer.analyze(
+    xyz=xyz,          # (N, 3) point positions
+    normals=normals,  # (N, 3) point normals (optional)
+    algorithms=["flood_fill", "voxel_regions"],  # Which algorithms to run
+)
+
+# Access results
+print(f"Total constraints: {result.summary.total_constraints}")
+print(f"SOLID: {result.summary.solid_constraints}")
+print(f"EMPTY: {result.summary.empty_constraints}")
+
+# Get constraint dicts for sampling
+constraints = result.constraints
+```
+
+#### Analysis Algorithms
+
+| Algorithm | Description | Output |
+|-----------|-------------|--------|
+| `flood_fill` | Detects EMPTY (outside) regions by ray propagation from sky | Box or SamplePoint constraints |
+| `voxel_regions` | Detects SOLID (underground) regions | Box or SamplePoint constraints |
+| `normal_offset` | Generates paired SOLID/EMPTY boxes along surface normals | Box constraints |
+| `normal_idw` | Inverse distance weighted sampling along normals | SamplePoint constraints |
+| `pocket` | Detects interior cavities | Pocket constraints |
+
+### SDFSampler
+
+Generates training samples from constraints.
+
+```python
+from sdf_sampler import SDFSampler
+from sdf_sampler.config import SamplerConfig
+
+# With default config
+sampler = SDFSampler()
+
+# With custom config
+sampler = SDFSampler(config=SamplerConfig(
+    total_samples=10000,
+    inverse_square_base_samples=100,
+    inverse_square_falloff=2.0,
+    near_band=0.02,
+))
+
+# Generate samples
+samples = sampler.generate(
+    xyz=xyz,                    # Point cloud for distance computation
+    constraints=constraints,    # From analyzer.analyze().constraints
+    strategy="inverse_square",  # Sampling strategy
+    seed=42,                    # For reproducibility
+)
+
+# Export
+sampler.export_parquet(samples, "output.parquet")
+
+# Or get DataFrame
+df = sampler.to_dataframe(samples)
+```
+
+#### Sampling Strategies
+
+| Strategy | Description |
+|----------|-------------|
+| `constant` | Fixed number of samples per constraint |
+| `density` | Samples proportional to constraint volume |
+| `inverse_square` | More samples near surface, fewer far away (recommended) |
+
+### Constraint Types
+
+The analyzer generates various constraint types:
+
+- **BoxConstraint**: Axis-aligned bounding box
+- **SphereConstraint**: Spherical region
+- **SamplePointConstraint**: Direct point with signed distance
+- **PocketConstraint**: Detected cavity region
+
+Each constraint has:
+- `sign`: "solid" (negative SDF) or "empty" (positive SDF)
+- `weight`: Sample weight (default 1.0)
+
+### I/O Helpers
+
+```python
+from sdf_sampler import load_point_cloud, export_parquet
+
+# Load various formats
+xyz, normals = load_point_cloud("scan.ply")      # PLY (requires trimesh)
+xyz, normals = load_point_cloud("scan.las")      # LAS/LAZ (requires laspy)
+xyz, normals = load_point_cloud("scan.csv")      # CSV with x,y,z columns
+xyz, normals = load_point_cloud("scan.npz")      # NumPy archive
+xyz, normals = load_point_cloud("scan.parquet")  # Parquet
+
+# Export samples
+export_parquet(samples, "output.parquet")
+```
+
+## Output Format
+
+The exported parquet file contains columns:
+
+| Column | Type | Description |
+|--------|------|-------------|
+| x, y, z | float | 3D position |
+| phi | float | Signed distance (negative=solid, positive=empty) |
+| nx, ny, nz | float | Normal vector (if available) |
+| weight | float | Sample weight |
+| source | string | Sample origin (e.g., "box_solid", "flood_fill_empty") |
+| is_surface | bool | Whether sample is on surface |
+| is_free | bool | Whether sample is in free space (EMPTY) |
+
+## Configuration Reference
+
+### AnalyzerConfig
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `min_gap_size` | 0.10 | Minimum gap size for flood fill (meters) |
+| `max_grid_dim` | 200 | Maximum voxel grid dimension |
+| `cone_angle` | 15.0 | Ray propagation cone half-angle (degrees) |
+| `normal_offset_pairs` | 40 | Number of box pairs for normal_offset |
+| `idw_sample_count` | 1000 | Total IDW samples |
+| `idw_max_distance` | 0.5 | Maximum IDW distance (meters) |
+| `hull_filter_enabled` | True | Filter outside X-Y alpha shape |
+| `hull_alpha` | 1.0 | Alpha shape parameter |
+
+### SamplerConfig
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `total_samples` | 10000 | Default total samples |
+| `samples_per_primitive` | 100 | Samples per constraint (CONSTANT) |
+| `samples_per_cubic_meter` | 10000 | Sample density (DENSITY) |
+| `inverse_square_base_samples` | 100 | Base samples (INVERSE_SQUARE) |
+| `inverse_square_falloff` | 2.0 | Falloff exponent |
+| `near_band` | 0.02 | Near-band width |
+| `seed` | 0 | Random seed |
+
+## Integration with Ubik
+
+sdf-sampler is the core analysis engine for [Ubik](https://github.com/Chiark-Collective/ubik), an interactive web application for SDF labeling. Use sdf-sampler directly for:
+
+- Automated batch processing pipelines
+- Integration into ML training workflows
+- Custom analysis scripts
+
+## License
+
+MIT
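The README above recommends the `inverse_square` strategy: more samples near the surface, fewer far away. A toy sketch of how such an allocation could behave; the `inverse_square_counts` function, the `1 + d` distance offset, and the reuse of the documented defaults `base=100` and `falloff=2.0` are assumptions for illustration, not the package's actual implementation:

```python
# Illustrative sketch (not sdf-sampler's real code) of an inverse-square
# allocation: per-constraint sample counts fall off with distance to the
# surface as base / (1 + d)**falloff, so near-surface constraints dominate.

def inverse_square_counts(distances, base=100, falloff=2.0):
    """Per-constraint sample counts that decay with distance to surface."""
    return [max(1, round(base / (1.0 + d) ** falloff)) for d in distances]

# Constraints at distance 0, 1, and 3 units from the surface:
counts = inverse_square_counts([0.0, 1.0, 3.0])
```

The `max(1, ...)` floor keeps every constraint represented, which is one plausible reason this family of strategies is recommended over `constant` for uneven scenes.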
@@ -1,6 +1,6 @@
 [project]
 name = "sdf-sampler"
-version = "0.1.0"
+version = "0.2.0"
 description = "Auto-analysis and sampling of point clouds for SDF (Signed Distance Field) training data generation"
 readme = "README.md"
 license = { text = "MIT" }
@@ -42,8 +42,11 @@ dev = [
 ]
 all = ["sdf-sampler[io,dev]"]
 
+[project.scripts]
+sdf-sampler = "sdf_sampler.cli:main"
+
 [project.urls]
-Repository = "https://github.com/
+Repository = "https://github.com/Chiark-Collective/sdf-sampler"
 
 [build-system]
 requires = ["hatchling"]
@@ -0,0 +1,17 @@
+# ABOUTME: Entry point for running sdf-sampler as a module
+# ABOUTME: Enables `python -m sdf_sampler` invocation
+
+"""
+Run sdf-sampler as a module.
+
+Usage:
+    python -m sdf_sampler --help
+    python -m sdf_sampler analyze input.ply -o constraints.json
+    python -m sdf_sampler sample input.ply constraints.json -o samples.parquet
+    python -m sdf_sampler pipeline input.ply -o samples.parquet
+"""
+
+from sdf_sampler.cli import main
+
+if __name__ == "__main__":
+    raise SystemExit(main())
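The `__main__.py` above follows the standard `python -m` pattern: `main()` returns an integer exit code and `raise SystemExit(main())` turns it into the process exit status. A self-contained sketch of that same pattern with a toy subcommand; the `demo` program name and its `info` subcommand are hypothetical, not sdf-sampler's CLI:

```python
# Sketch of the `python -m` entry-point pattern: main(argv) -> int, with
# SystemExit converting the return value into the process exit status.
# The "demo" program and "info" subcommand here are illustrative only.
import argparse


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="demo")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("info", help="Show information")
    args = parser.parse_args(argv)
    if args.command is None:
        # No subcommand given: print usage and signal a usage error.
        parser.print_usage()
        return 2
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Returning an int (rather than calling `sys.exit` deep inside the function) keeps `main` directly testable, which is likely why the package's `__main__.py` wraps it in `SystemExit` at the top level.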
@@ -0,0 +1,457 @@
+# ABOUTME: Command-line interface for sdf-sampler
+# ABOUTME: Provides analyze, sample, and pipeline commands
+
+"""
+CLI for sdf-sampler.
+
+Usage:
+    python -m sdf_sampler analyze input.ply -o constraints.json
+    python -m sdf_sampler sample input.ply constraints.json -o samples.parquet
+    python -m sdf_sampler pipeline input.ply -o samples.parquet
+"""
+
+import argparse
+import json
+import sys
+from pathlib import Path
+
+import numpy as np
+
+
+def main(argv: list[str] | None = None) -> int:
+    """Main CLI entry point."""
+    parser = argparse.ArgumentParser(
+        prog="sdf-sampler",
+        description="Auto-analysis and sampling of point clouds for SDF training data",
+    )
+    parser.add_argument(
+        "--version", action="store_true", help="Show version and exit"
+    )
+
+    subparsers = parser.add_subparsers(dest="command", help="Available commands")
+
+    # analyze command
+    analyze_parser = subparsers.add_parser(
+        "analyze",
+        help="Analyze point cloud to detect SOLID/EMPTY regions",
+    )
+    analyze_parser.add_argument(
+        "input",
+        type=Path,
+        help="Input point cloud file (PLY, LAS, NPZ, CSV, Parquet)",
+    )
+    analyze_parser.add_argument(
+        "-o", "--output",
+        type=Path,
+        default=None,
+        help="Output constraints JSON file (default: <input>_constraints.json)",
+    )
+    analyze_parser.add_argument(
+        "-a", "--algorithms",
+        type=str,
+        nargs="+",
+        default=None,
+        help="Algorithms to run (flood_fill, voxel_regions, normal_offset, normal_idw, pocket)",
+    )
+    analyze_parser.add_argument(
+        "--no-hull-filter",
+        action="store_true",
+        help="Disable hull filtering",
+    )
+    analyze_parser.add_argument(
+        "-v", "--verbose",
+        action="store_true",
+        help="Verbose output",
+    )
+
+    # sample command
+    sample_parser = subparsers.add_parser(
+        "sample",
+        help="Generate training samples from constraints",
+    )
+    sample_parser.add_argument(
+        "input",
+        type=Path,
+        help="Input point cloud file",
+    )
+    sample_parser.add_argument(
+        "constraints",
+        type=Path,
+        help="Constraints JSON file (from analyze command)",
+    )
+    sample_parser.add_argument(
+        "-o", "--output",
+        type=Path,
+        default=None,
+        help="Output parquet file (default: <input>_samples.parquet)",
+    )
+    sample_parser.add_argument(
+        "-n", "--total-samples",
+        type=int,
+        default=10000,
+        help="Total number of samples to generate (default: 10000)",
+    )
+    sample_parser.add_argument(
+        "-s", "--strategy",
+        type=str,
+        choices=["constant", "density", "inverse_square"],
+        default="inverse_square",
+        help="Sampling strategy (default: inverse_square)",
+    )
+    sample_parser.add_argument(
+        "--seed",
+        type=int,
+        default=None,
+        help="Random seed for reproducibility",
+    )
+    sample_parser.add_argument(
+        "-v", "--verbose",
+        action="store_true",
+        help="Verbose output",
+    )
+
+    # pipeline command
+    pipeline_parser = subparsers.add_parser(
+        "pipeline",
+        help="Full pipeline: analyze + sample + export",
+    )
+    pipeline_parser.add_argument(
+        "input",
+        type=Path,
+        help="Input point cloud file",
+    )
+    pipeline_parser.add_argument(
+        "-o", "--output",
+        type=Path,
+        default=None,
+        help="Output parquet file (default: <input>_samples.parquet)",
+    )
+    pipeline_parser.add_argument(
+        "-a", "--algorithms",
+        type=str,
+        nargs="+",
+        default=None,
+        help="Algorithms to run",
+    )
+    pipeline_parser.add_argument(
+        "-n", "--total-samples",
+        type=int,
+        default=10000,
+        help="Total number of samples to generate (default: 10000)",
+    )
+    pipeline_parser.add_argument(
+        "-s", "--strategy",
+        type=str,
+        choices=["constant", "density", "inverse_square"],
+        default="inverse_square",
+        help="Sampling strategy (default: inverse_square)",
+    )
+    pipeline_parser.add_argument(
+        "--seed",
+        type=int,
+        default=None,
+        help="Random seed for reproducibility",
+    )
+    pipeline_parser.add_argument(
+        "--save-constraints",
+        type=Path,
+        default=None,
+        help="Also save constraints to JSON file",
+    )
+    pipeline_parser.add_argument(
+        "-v", "--verbose",
+        action="store_true",
+        help="Verbose output",
+    )
+
+    # info command
+    info_parser = subparsers.add_parser(
|
|
169
|
+
"info",
|
|
170
|
+
help="Show information about a point cloud or constraints file",
|
|
171
|
+
)
|
|
172
|
+
info_parser.add_argument(
|
|
173
|
+
"input",
|
|
174
|
+
type=Path,
|
|
175
|
+
help="Input file (point cloud or constraints JSON)",
|
|
176
|
+
)
|
|
177
|
+
|
|
178
|
+
args = parser.parse_args(argv)
|
|
179
|
+
|
|
180
|
+
if args.version:
|
|
181
|
+
from sdf_sampler import __version__
|
|
182
|
+
print(f"sdf-sampler {__version__}")
|
|
183
|
+
return 0
|
|
184
|
+
|
|
185
|
+
if args.command is None:
|
|
186
|
+
parser.print_help()
|
|
187
|
+
return 0
|
|
188
|
+
|
|
189
|
+
if args.command == "analyze":
|
|
190
|
+
return cmd_analyze(args)
|
|
191
|
+
elif args.command == "sample":
|
|
192
|
+
return cmd_sample(args)
|
|
193
|
+
elif args.command == "pipeline":
|
|
194
|
+
return cmd_pipeline(args)
|
|
195
|
+
elif args.command == "info":
|
|
196
|
+
return cmd_info(args)
|
|
197
|
+
|
|
198
|
+
return 0
|
|
199
|
+
|
|
200
|
+
|
|
201
|
+
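The subcommand wiring added above is the standard `argparse` subparsers pattern: one top-level parser, `add_subparsers(dest="command")`, and dispatch on `args.command`. A minimal self-contained sketch of that pattern (the command name and flags mirror the CLI, but this stand-alone parser is an illustration, not the package's own code):

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Top-level parser with a `command` destination for dispatch
    parser = argparse.ArgumentParser(prog="sdf-sampler")
    subparsers = parser.add_subparsers(dest="command")

    # One subcommand, mirroring the `sample` flags above
    sample = subparsers.add_parser("sample", help="Generate training samples")
    sample.add_argument("input", type=Path)
    sample.add_argument("-n", "--total-samples", type=int, default=10000)
    sample.add_argument(
        "-s", "--strategy",
        choices=["constant", "density", "inverse_square"],
        default="inverse_square",
    )
    return parser


args = build_parser().parse_args(["sample", "scan.ply", "-n", "50000"])
print(args.command, args.total_samples, args.strategy)
# → sample 50000 inverse_square
```

Note that `--total-samples` becomes the attribute `args.total_samples` (hyphens map to underscores), which is why the command handlers below can read it directly.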
+def cmd_analyze(args: argparse.Namespace) -> int:
+    """Run analyze command."""
+    from sdf_sampler import SDFAnalyzer, load_point_cloud
+    from sdf_sampler.config import AutoAnalysisOptions
+
+    if not args.input.exists():
+        print(f"Error: Input file not found: {args.input}", file=sys.stderr)
+        return 1
+
+    output = args.output or args.input.with_suffix(".constraints.json")
+
+    if args.verbose:
+        print(f"Loading point cloud: {args.input}")
+
+    try:
+        xyz, normals = load_point_cloud(str(args.input))
+    except Exception as e:
+        print(f"Error loading point cloud: {e}", file=sys.stderr)
+        return 1
+
+    if args.verbose:
+        print(f"  Points: {len(xyz):,}")
+        print(f"  Normals: {'yes' if normals is not None else 'no'}")
+
+    options = AutoAnalysisOptions(
+        hull_filter_enabled=not args.no_hull_filter,
+    )
+
+    if args.verbose:
+        algos = args.algorithms or ["all"]
+        print(f"Running analysis: {', '.join(algos)}")
+
+    analyzer = SDFAnalyzer()
+    result = analyzer.analyze(
+        xyz=xyz,
+        normals=normals,
+        algorithms=args.algorithms,
+        options=options,
+    )
+
+    if args.verbose:
+        print(f"Generated {len(result.constraints)} constraints")
+        print(f"  SOLID: {result.summary.solid_constraints}")
+        print(f"  EMPTY: {result.summary.empty_constraints}")
+
+    # Save constraints
+    with open(output, "w") as f:
+        json.dump(result.constraints, f, indent=2, default=_json_serializer)
+
+    print(f"Saved constraints to: {output}")
+    return 0
+
+
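One detail worth noticing in `cmd_analyze`: the default output comes from `with_suffix(".constraints.json")`, which replaces the input's final suffix rather than appending to the name, so the actual default is `scan.constraints.json` while the help text advertises `<input>_constraints.json`. A quick stdlib check of the two spellings (the `with_name` variant is only an illustration of what the help text describes, not code from the package):

```python
from pathlib import Path

inp = Path("scan.ply")

# with_suffix swaps only the last suffix: scan.ply -> scan.constraints.json
print(inp.with_suffix(".constraints.json"))  # scan.constraints.json

# The underscore pattern from the help text would need to be built by hand
print(inp.with_name(inp.stem + "_constraints.json"))  # scan_constraints.json
```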
+def cmd_sample(args: argparse.Namespace) -> int:
+    """Run sample command."""
+    from sdf_sampler import SDFSampler, load_point_cloud
+
+    if not args.input.exists():
+        print(f"Error: Input file not found: {args.input}", file=sys.stderr)
+        return 1
+
+    if not args.constraints.exists():
+        print(f"Error: Constraints file not found: {args.constraints}", file=sys.stderr)
+        return 1
+
+    output = args.output or args.input.with_suffix(".samples.parquet")
+
+    if args.verbose:
+        print(f"Loading point cloud: {args.input}")
+
+    try:
+        xyz, normals = load_point_cloud(str(args.input))
+    except Exception as e:
+        print(f"Error loading point cloud: {e}", file=sys.stderr)
+        return 1
+
+    if args.verbose:
+        print(f"Loading constraints: {args.constraints}")
+
+    with open(args.constraints) as f:
+        constraints = json.load(f)
+
+    if args.verbose:
+        print(f"  Constraints: {len(constraints)}")
+        print(f"Generating {args.total_samples:,} samples with strategy: {args.strategy}")
+
+    sampler = SDFSampler()
+    samples = sampler.generate(
+        xyz=xyz,
+        normals=normals,
+        constraints=constraints,
+        total_samples=args.total_samples,
+        strategy=args.strategy,
+        seed=args.seed,
+    )
+
+    if args.verbose:
+        print(f"Generated {len(samples)} samples")
+
+    sampler.export_parquet(samples, str(output))
+    print(f"Saved samples to: {output}")
+    return 0
+
+
+def cmd_pipeline(args: argparse.Namespace) -> int:
+    """Run full pipeline: analyze + sample + export."""
+    from sdf_sampler import SDFAnalyzer, SDFSampler, load_point_cloud
+    from sdf_sampler.config import AutoAnalysisOptions
+
+    if not args.input.exists():
+        print(f"Error: Input file not found: {args.input}", file=sys.stderr)
+        return 1
+
+    output = args.output or args.input.with_suffix(".samples.parquet")
+
+    if args.verbose:
+        print(f"Loading point cloud: {args.input}")
+
+    try:
+        xyz, normals = load_point_cloud(str(args.input))
+    except Exception as e:
+        print(f"Error loading point cloud: {e}", file=sys.stderr)
+        return 1
+
+    if args.verbose:
+        print(f"  Points: {len(xyz):,}")
+        print(f"  Normals: {'yes' if normals is not None else 'no'}")
+
+    # Analyze
+    if args.verbose:
+        algos = args.algorithms or ["all"]
+        print(f"Running analysis: {', '.join(algos)}")
+
+    options = AutoAnalysisOptions()
+    analyzer = SDFAnalyzer()
+    result = analyzer.analyze(
+        xyz=xyz,
+        normals=normals,
+        algorithms=args.algorithms,
+        options=options,
+    )
+
+    if args.verbose:
+        print(f"Generated {len(result.constraints)} constraints")
+        print(f"  SOLID: {result.summary.solid_constraints}")
+        print(f"  EMPTY: {result.summary.empty_constraints}")
+
+    # Optionally save constraints
+    if args.save_constraints:
+        with open(args.save_constraints, "w") as f:
+            json.dump(result.constraints, f, indent=2, default=_json_serializer)
+        if args.verbose:
+            print(f"Saved constraints to: {args.save_constraints}")
+
+    # Sample
+    if args.verbose:
+        print(f"Generating {args.total_samples:,} samples with strategy: {args.strategy}")
+
+    sampler = SDFSampler()
+    samples = sampler.generate(
+        xyz=xyz,
+        normals=normals,
+        constraints=result.constraints,
+        total_samples=args.total_samples,
+        strategy=args.strategy,
+        seed=args.seed,
+    )
+
+    if args.verbose:
+        print(f"Generated {len(samples)} samples")
+
+    # Export
+    sampler.export_parquet(samples, str(output))
+    print(f"Saved samples to: {output}")
+    return 0
+
+
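Every sampling path above threads `seed=args.seed` through to the sampler so that repeated runs draw identical samples. A generic stdlib illustration of that principle (the package's sampler presumably uses its own random generator; `draw` here is a hypothetical stand-in):

```python
import random


def draw(seed, n=5):
    # A dedicated Random instance seeded once makes the "sampling"
    # deterministic from run to run, without touching global state.
    rng = random.Random(seed)
    return [round(rng.uniform(-1.0, 1.0), 6) for _ in range(n)]


assert draw(seed=42) == draw(seed=42)  # same seed, same samples
assert draw(seed=42) != draw(seed=7)   # different seed, different samples
```

With `--seed` left at its default of `None`, each run would instead produce a fresh, unrepeatable sample set.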
+def cmd_info(args: argparse.Namespace) -> int:
+    """Show information about a file."""
+    if not args.input.exists():
+        print(f"Error: File not found: {args.input}", file=sys.stderr)
+        return 1
+
+    suffix = args.input.suffix.lower()
+
+    if suffix == ".json":
+        # Constraints file
+        with open(args.input) as f:
+            constraints = json.load(f)
+
+        print(f"Constraints file: {args.input}")
+        print(f"  Total constraints: {len(constraints)}")
+
+        # Count by type and sign
+        by_type: dict[str, int] = {}
+        by_sign: dict[str, int] = {}
+        for c in constraints:
+            ctype = c.get("type", "unknown")
+            sign = c.get("sign", "unknown")
+            by_type[ctype] = by_type.get(ctype, 0) + 1
+            by_sign[sign] = by_sign.get(sign, 0) + 1
+
+        print("  By type:")
+        for t, count in sorted(by_type.items()):
+            print(f"    {t}: {count}")
+        print("  By sign:")
+        for s, count in sorted(by_sign.items()):
+            print(f"    {s}: {count}")
+
+    elif suffix == ".parquet":
+        import pandas as pd
+        df = pd.read_parquet(args.input)
+
+        print(f"Parquet file: {args.input}")
+        print(f"  Samples: {len(df):,}")
+        print(f"  Columns: {', '.join(df.columns)}")
+
+        if "source" in df.columns:
+            print("  By source:")
+            for source, count in df["source"].value_counts().items():
+                print(f"    {source}: {count:,}")
+
+        if "phi" in df.columns:
+            print(f"  Phi range: [{df['phi'].min():.4f}, {df['phi'].max():.4f}]")
+
+    else:
+        # Point cloud file
+        from sdf_sampler import load_point_cloud
+
+        try:
+            xyz, normals = load_point_cloud(str(args.input))
+        except Exception as e:
+            print(f"Error loading file: {e}", file=sys.stderr)
+            return 1
+
+        print(f"Point cloud: {args.input}")
+        print(f"  Points: {len(xyz):,}")
+        print(f"  Normals: {'yes' if normals is not None else 'no'}")
+        print("  Bounds:")
+        print(f"    X: [{xyz[:, 0].min():.4f}, {xyz[:, 0].max():.4f}]")
+        print(f"    Y: [{xyz[:, 1].min():.4f}, {xyz[:, 1].max():.4f}]")
+        print(f"    Z: [{xyz[:, 2].min():.4f}, {xyz[:, 2].max():.4f}]")
+
+    return 0
+
+
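The manual `by_type`/`by_sign` tallies in `cmd_info` are the classic `dict.get(..., 0) + 1` counting idiom; `collections.Counter` expresses the same aggregation more compactly. A sketch over a hypothetical constraints list shaped like the analyzer's output (the dicts below are illustrative, not real package data):

```python
from collections import Counter

# Hypothetical constraint dicts with "type" and "sign" keys
constraints = [
    {"type": "box", "sign": "solid"},
    {"type": "box", "sign": "empty"},
    {"type": "sphere", "sign": "empty"},
    {"sign": "empty"},  # a missing "type" falls back to "unknown"
]

# Counter consumes a generator of keys; missing keys default like the loop does
by_type = Counter(c.get("type", "unknown") for c in constraints)
by_sign = Counter(c.get("sign", "unknown") for c in constraints)

print(dict(by_type))  # {'box': 2, 'sphere': 1, 'unknown': 1}
print(dict(by_sign))  # {'solid': 1, 'empty': 3}
```

`Counter.most_common()` would also replace the `sorted(...)` reporting loops if frequency order is preferred over alphabetical.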
+def _json_serializer(obj):
+    """JSON serializer for numpy types."""
+    if isinstance(obj, np.ndarray):
+        return obj.tolist()
+    if isinstance(obj, (np.integer, np.floating)):
+        return obj.item()
+    raise TypeError(f"Object of type {type(obj)} is not JSON serializable")
+
+
+if __name__ == "__main__":
+    sys.exit(main())
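`_json_serializer` is the standard `json.dump(default=...)` hook: the function is called only for objects the encoder cannot handle natively, and must either return a JSON-serializable substitute or raise `TypeError`. The same pattern works for any non-native type; a stdlib-only sketch using `Decimal` and `set` as stand-ins for the numpy scalars and arrays handled above:

```python
import json
from decimal import Decimal


def to_jsonable(obj):
    # Mirror of _json_serializer: convert known types, reject the rest
    if isinstance(obj, Decimal):
        return float(obj)
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj)} is not JSON serializable")


doc = {"weight": Decimal("1.5"), "signs": {"empty", "solid"}}
print(json.dumps(doc, default=to_jsonable))
# {"weight": 1.5, "signs": ["empty", "solid"]}
```

Raising `TypeError` for unrecognized objects (rather than silently stringifying them) is what lets bad constraint data fail loudly at export time.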
@@ -1 +0,0 @@
-GITHUB_TOKEN=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
sdf_sampler-0.1.0/README.md DELETED
@@ -1,182 +0,0 @@
-# sdf-sampler
-
-Auto-analysis and sampling of point clouds for SDF (Signed Distance Field) training data generation.
-
-A lightweight, standalone Python package for generating SDF training hints from point clouds. Automatically detects SOLID (inside) and EMPTY (outside) regions and generates training samples suitable for SDF regression models.
-
-## Installation
-
-```bash
-pip install sdf-sampler
-```
-
-For additional I/O format support (PLY, LAS/LAZ):
-
-```bash
-pip install sdf-sampler[io]
-```
-
-## Quick Start
-
-```python
-from sdf_sampler import SDFAnalyzer, SDFSampler, load_point_cloud
-
-# 1. Load point cloud (supports PLY, LAS, CSV, NPZ, Parquet)
-xyz, normals = load_point_cloud("scan.ply")
-
-# 2. Auto-analyze to detect EMPTY/SOLID regions
-analyzer = SDFAnalyzer()
-result = analyzer.analyze(xyz=xyz, normals=normals)
-print(f"Generated {len(result.constraints)} constraints")
-
-# 3. Generate training samples
-sampler = SDFSampler()
-samples = sampler.generate(
-    xyz=xyz,
-    constraints=result.constraints,
-    strategy="inverse_square",
-    total_samples=50000,
-)
-
-# 4. Export to parquet
-sampler.export_parquet(samples, "training_data.parquet")
-```
-
-## Features
-
-### Auto-Analysis Algorithms
-
-- **flood_fill**: Detects EMPTY (outside) regions by ray propagation from sky
-- **voxel_regions**: Detects SOLID (underground) regions
-- **normal_offset**: Generates paired SOLID/EMPTY boxes along surface normals
-- **normal_idw**: Inverse distance weighted sampling along normals
-- **pocket**: Detects interior cavities
-
-### Sampling Strategies
-
-- **CONSTANT**: Fixed number of samples per constraint
-- **DENSITY**: Samples proportional to constraint volume
-- **INVERSE_SQUARE**: More samples near surface, fewer far away (recommended)
-
-## API Reference
-
-### SDFAnalyzer
-
-```python
-from sdf_sampler import SDFAnalyzer, AnalyzerConfig
-
-# With default config
-analyzer = SDFAnalyzer()
-
-# With custom config
-analyzer = SDFAnalyzer(config=AnalyzerConfig(
-    min_gap_size=0.10,         # Minimum gap for flood fill
-    max_grid_dim=200,          # Maximum voxel grid dimension
-    cone_angle=15.0,           # Ray propagation cone angle
-    hull_filter_enabled=True,  # Filter outside X-Y hull
-))
-
-# Run analysis
-result = analyzer.analyze(
-    xyz=xyz,          # (N, 3) point positions
-    normals=normals,  # (N, 3) point normals (optional)
-    algorithms=["flood_fill", "voxel_regions"],  # Which algorithms to run
-)
-
-# Access results
-print(f"Total constraints: {result.summary.total_constraints}")
-print(f"SOLID: {result.summary.solid_constraints}")
-print(f"EMPTY: {result.summary.empty_constraints}")
-
-# Get constraint dicts for sampling
-constraints = result.constraints
-```
-
-### SDFSampler
-
-```python
-from sdf_sampler import SDFSampler, SamplerConfig
-
-# With default config
-sampler = SDFSampler()
-
-# With custom config
-sampler = SDFSampler(config=SamplerConfig(
-    total_samples=10000,
-    inverse_square_base_samples=100,
-    inverse_square_falloff=2.0,
-    near_band=0.02,
-))
-
-# Generate samples
-samples = sampler.generate(
-    xyz=xyz,                    # Point cloud for distance computation
-    constraints=constraints,    # From analyzer.analyze().constraints
-    strategy="inverse_square",  # Sampling strategy
-    seed=42,                    # For reproducibility
-)
-
-# Export
-sampler.export_parquet(samples, "output.parquet")
-
-# Or get DataFrame
-df = sampler.to_dataframe(samples)
-```
-
-### Constraint Types
-
-The analyzer generates various constraint types:
-
-- **BoxConstraint**: Axis-aligned bounding box
-- **SphereConstraint**: Spherical region
-- **SamplePointConstraint**: Direct point with signed distance
-- **PocketConstraint**: Detected cavity region
-
-Each constraint has:
-- `sign`: "solid" (negative SDF) or "empty" (positive SDF)
-- `weight`: Sample weight (default 1.0)
-
-## Output Format
-
-The exported parquet file contains columns:
-
-| Column | Type | Description |
-|--------|------|-------------|
-| x, y, z | float | 3D position |
-| phi | float | Signed distance (negative=solid, positive=empty) |
-| nx, ny, nz | float | Normal vector (if available) |
-| weight | float | Sample weight |
-| source | string | Sample origin (e.g., "box_solid", "flood_fill_empty") |
-| is_surface | bool | Whether sample is on surface |
-| is_free | bool | Whether sample is in free space (EMPTY) |
-
-## Configuration Options
-
-### AnalyzerConfig
-
-| Option | Default | Description |
-|--------|---------|-------------|
-| min_gap_size | 0.10 | Minimum gap size for flood fill (meters) |
-| max_grid_dim | 200 | Maximum voxel grid dimension |
-| cone_angle | 15.0 | Ray propagation cone half-angle (degrees) |
-| normal_offset_pairs | 40 | Number of box pairs for normal_offset |
-| idw_sample_count | 1000 | Total IDW samples |
-| idw_max_distance | 0.5 | Maximum IDW distance (meters) |
-| hull_filter_enabled | True | Filter outside X-Y alpha shape |
-| hull_alpha | 1.0 | Alpha shape parameter |
-
-### SamplerConfig
-
-| Option | Default | Description |
-|--------|---------|-------------|
-| total_samples | 10000 | Default total samples |
-| samples_per_primitive | 100 | Samples per constraint (CONSTANT) |
-| samples_per_cubic_meter | 10000 | Sample density (DENSITY) |
-| inverse_square_base_samples | 100 | Base samples (INVERSE_SQUARE) |
-| inverse_square_falloff | 2.0 | Falloff exponent |
-| near_band | 0.02 | Near-band width |
-| seed | 0 | Random seed |
-
-## License
-
-MIT
File without changes