torchbvh 0.1.0__tar.gz

Files changed (68)
  1. torchbvh-0.1.0/LICENSE +21 -0
  2. torchbvh-0.1.0/MANIFEST.in +14 -0
  3. torchbvh-0.1.0/PKG-INFO +104 -0
  4. torchbvh-0.1.0/README.md +79 -0
  5. torchbvh-0.1.0/docs/algorithms.md +121 -0
  6. torchbvh-0.1.0/docs/api_reference.md +458 -0
  7. torchbvh-0.1.0/docs/examples.md +17 -0
  8. torchbvh-0.1.0/docs/index.md +71 -0
  9. torchbvh-0.1.0/docs/lifecycle_and_gradients.md +133 -0
  10. torchbvh-0.1.0/docs/performance.md +65 -0
  11. torchbvh-0.1.0/docs/testing.md +511 -0
  12. torchbvh-0.1.0/docs/user_guide.md +155 -0
  13. torchbvh-0.1.0/examples/basic_bvh_knn.ipynb +63 -0
  14. torchbvh-0.1.0/examples/batched_displaced_query.ipynb +103 -0
  15. torchbvh-0.1.0/examples/fps_downsampling_geometry.ipynb +132 -0
  16. torchbvh-0.1.0/examples/mls_interpolation.ipynb +136 -0
  17. torchbvh-0.1.0/pyproject.toml +7 -0
  18. torchbvh-0.1.0/setup.cfg +4 -0
  19. torchbvh-0.1.0/setup.py +90 -0
  20. torchbvh-0.1.0/tests/test_batched_interpolate.py +163 -0
  21. torchbvh-0.1.0/tests/test_batched_knn.py +325 -0
  22. torchbvh-0.1.0/tests/test_batched_python_api.py +124 -0
  23. torchbvh-0.1.0/tests/test_benchmark_flowers_grid_sample.py +87 -0
  24. torchbvh-0.1.0/tests/test_benchmark_knn.py +536 -0
  25. torchbvh-0.1.0/tests/test_bvh_build.py +320 -0
  26. torchbvh-0.1.0/tests/test_bvh_class.py +298 -0
  27. torchbvh-0.1.0/tests/test_displaced_query.py +395 -0
  28. torchbvh-0.1.0/tests/test_fps.py +741 -0
  29. torchbvh-0.1.0/tests/test_implicit_tree.py +123 -0
  30. torchbvh-0.1.0/tests/test_knn.py +410 -0
  31. torchbvh-0.1.0/tests/test_mls_interpolate.py +147 -0
  32. torchbvh-0.1.0/tests/test_morton.py +151 -0
  33. torchbvh-0.1.0/tests/test_noncontiguous_public_api.py +242 -0
  34. torchbvh-0.1.0/tests/test_python_api.py +216 -0
  35. torchbvh-0.1.0/tests/test_ragged_knn.py +408 -0
  36. torchbvh-0.1.0/tests/test_ragged_python_api.py +205 -0
  37. torchbvh-0.1.0/tests/test_smoke.py +12 -0
  38. torchbvh-0.1.0/tests/test_stage7_gradient_boundaries.py +139 -0
  39. torchbvh-0.1.0/tests/test_training_step_proxy.py +54 -0
  40. torchbvh-0.1.0/torchbvh/__init__.py +76 -0
  41. torchbvh-0.1.0/torchbvh/_bvh_class.py +125 -0
  42. torchbvh-0.1.0/torchbvh/_constants.py +5 -0
  43. torchbvh-0.1.0/torchbvh/_fps.py +487 -0
  44. torchbvh-0.1.0/torchbvh/_handles.py +144 -0
  45. torchbvh-0.1.0/torchbvh/_mls.py +644 -0
  46. torchbvh-0.1.0/torchbvh/_multihead.py +190 -0
  47. torchbvh-0.1.0/torchbvh/_prototypes/__init__.py +5 -0
  48. torchbvh-0.1.0/torchbvh/_query.py +396 -0
  49. torchbvh-0.1.0/torchbvh/_reorder.py +97 -0
  50. torchbvh-0.1.0/torchbvh/_tree.py +66 -0
  51. torchbvh-0.1.0/torchbvh/_validation.py +96 -0
  52. torchbvh-0.1.0/torchbvh/csrc/bindings.cpp +644 -0
  53. torchbvh-0.1.0/torchbvh/csrc/bvh_build.cu +554 -0
  54. torchbvh-0.1.0/torchbvh/csrc/fps_sample.cu +2233 -0
  55. torchbvh-0.1.0/torchbvh/csrc/geometry.cuh +37 -0
  56. torchbvh-0.1.0/torchbvh/csrc/implicit_tree.cuh +131 -0
  57. torchbvh-0.1.0/torchbvh/csrc/knn_query.cu +935 -0
  58. torchbvh-0.1.0/torchbvh/csrc/mls_fused.cu +739 -0
  59. torchbvh-0.1.0/torchbvh/csrc/morton.cuh +71 -0
  60. torchbvh-0.1.0/torchbvh/csrc/morton_sort.cu +195 -0
  61. torchbvh-0.1.0/torchbvh/csrc/smoke.cu +51 -0
  62. torchbvh-0.1.0/torchbvh/ops.py +98 -0
  63. torchbvh-0.1.0/torchbvh.egg-info/PKG-INFO +104 -0
  64. torchbvh-0.1.0/torchbvh.egg-info/SOURCES.txt +73 -0
  65. torchbvh-0.1.0/torchbvh.egg-info/dependency_links.txt +1 -0
  66. torchbvh-0.1.0/torchbvh.egg-info/not-zip-safe +1 -0
  67. torchbvh-0.1.0/torchbvh.egg-info/requires.txt +1 -0
  68. torchbvh-0.1.0/torchbvh.egg-info/top_level.txt +1 -0
torchbvh-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Roberto Hart-Villamil
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
torchbvh-0.1.0/MANIFEST.in ADDED
@@ -0,0 +1,14 @@
1
+ include README.md
2
+ include LICENSE
3
+ include pyproject.toml
4
+ recursive-include torchbvh/csrc *.cpp *.cu *.cuh
5
+ recursive-include docs *.md
6
+ recursive-include examples *.ipynb
7
+ global-exclude __pycache__
8
+ global-exclude *.py[cod]
9
+ global-exclude *.pyd
10
+ global-exclude *.so
11
+ global-exclude *.dll
12
+ global-exclude *.obj
13
+ global-exclude *.exp
14
+ global-exclude *.lib
torchbvh-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,104 @@
1
+ Metadata-Version: 2.1
2
+ Name: torchbvh
3
+ Version: 0.1.0
4
+ Summary: GPU-native BVH, k-NN, MLS interpolation, and FPS primitives for PyTorch.
5
+ Home-page: https://github.com/Robh96/torchbvh
6
+ License: MIT
7
+ Project-URL: Documentation, https://torchbvh.readthedocs.io/
8
+ Project-URL: Source, https://github.com/Robh96/torchbvh
9
+ Classifier: Development Status :: 3 - Alpha
10
+ Classifier: Intended Audience :: Developers
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: License :: OSI Approved :: MIT License
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3 :: Only
15
+ Classifier: Programming Language :: Python :: 3.10
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Programming Language :: Python :: 3.13
19
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
20
+ Classifier: Topic :: Scientific/Engineering :: Mathematics
21
+ Requires-Python: >=3.10
22
+ Description-Content-Type: text/markdown
23
+ License-File: LICENSE
24
+ Requires-Dist: torch>=2.0
25
+
26
+ # torchbvh
27
+
28
+ GPU-native geometry primitives for PyTorch point-cloud workflows: BVH construction,
29
+ exact k-NN search, MLS interpolation, displaced-query helpers, and FPS downsampling.
30
+
31
+ ## Performance
32
+
33
+ k-NN at N=10k, 3D, k=8 (RTX 3500 Ada, uniform distribution):
34
+
35
+ | | Build + query |
36
+ |---|---|
37
+ | `scipy_cKD-Tree` CPU | ~23 ms |
38
+ | `torch_cluster` GPU | ~6.8 ms |
39
+ | `cupy_knn` GPU | ~4.1 ms |
40
+ | `torchbvh` GPU | ~1.4 ms |
41
+
42
+ FPS at B=16, N=10k, 25% selection (RTX 3500 Ada):
43
+
44
+ | | Time |
45
+ |---|---|
46
+ | `fpsample` CPU | ~833 ms |
47
+ | `torch_fpsample` h=7 (CPU, fastest setting) | ~37 ms |
48
+ | `torchbvh` GPU | ~21 ms |
49
+
50
+ See `benchmarks/third_party_algorithm_comparison.ipynb` for more detailed comparisons.
51
+
52
+
53
+ ## Install
54
+
55
+ ```bash
56
+ pip install torchbvh
57
+ ```
58
+
59
+ `torchbvh` builds a PyTorch CUDA extension. Source installs require PyTorch, a
60
+ compatible CUDA toolkit/NVCC, and a supported host compiler in the build environment.
61
+
62
+ ## Docs
63
+
64
+ Documentation can be found at [torchbvh.readthedocs.io](https://torchbvh.readthedocs.io/).
65
+
66
+ ## Quickstart
67
+
68
+ ```python
69
+ import torch
70
+ import torchbvh as tb
71
+
72
+ points = torch.randn(1024, 3, device="cuda")
73
+ bvh = tb.BVH(points)
74
+
75
+ # k-NN
76
+ idx, dists = bvh.knn(points, k=8) # (N,8) int64, (N,8) float32
77
+
78
+ # MLS interpolation — gradients flow through features
79
+ feat = torch.randn(1024, 16, device="cuda", requires_grad=True)
80
+ out = bvh.interpolate(points, feat, k=8) # (N, 16)
81
+
82
+ # FPS downsampling geometry
83
+ fps = tb.fps(points, target_tokens=256)
84
+ # fps.indices, fps.points, fps.nearest_anchor, fps.anchor_radius, ...
85
+
86
+ # Batched: pass (B, N, D) → returns (B, N, k)
87
+ ```
88
+
89
+ Supports `D in {2, 3}`, `k in {4, 8, 16}`, CUDA float32 contiguous inputs.
90
+
91
+
92
+ ## References
93
+ `torchbvh` builds an implicit bounding volume hierarchy over 2-D or 3-D points.
94
+ The BVH layout follows the ostensibly-implicit tree formulation of Chitalu, Dubach, and
95
+ Komura, and the Python/CUDA implementation was ported from the Julia
96
+ `ImplicitBVH.jl` implementation.
97
+
98
+ - Floyd M. Chitalu, Christophe Dubach, and Taku Komura. "Binary
99
+ Ostensibly-Implicit Trees for Fast Collision Detection." Computer Graphics Forum,
100
+ 39(2), 509-521, 2020. DOI:
101
+ [10.1111/cgf.13948](https://doi.org/10.1111/cgf.13948).
102
+ - `ImplicitBVH.jl`, StellaOrg. Julia implementation of the implicitly indexed BVH
103
+ formulation from which the `torchbvh` BVH code was ported:
104
+ [github.com/StellaOrg/ImplicitBVH.jl](https://github.com/StellaOrg/ImplicitBVH.jl).
torchbvh-0.1.0/README.md ADDED
@@ -0,0 +1,79 @@
1
+ # torchbvh
2
+
3
+ GPU-native geometry primitives for PyTorch point-cloud workflows: BVH construction,
4
+ exact k-NN search, MLS interpolation, displaced-query helpers, and FPS downsampling.
5
+
6
+ ## Performance
7
+
8
+ k-NN at N=10k, 3D, k=8 (RTX 3500 Ada, uniform distribution):
9
+
10
+ | | Build + query |
11
+ |---|---|
12
+ | `scipy_cKD-Tree` CPU | ~23 ms |
13
+ | `torch_cluster` GPU | ~6.8 ms |
14
+ | `cupy_knn` GPU | ~4.1 ms |
15
+ | `torchbvh` GPU | ~1.4 ms |
16
+
17
+ FPS at B=16, N=10k, 25% selection (RTX 3500 Ada):
18
+
19
+ | | Time |
20
+ |---|---|
21
+ | `fpsample` CPU | ~833 ms |
22
+ | `torch_fpsample` h=7 (CPU, fastest setting) | ~37 ms |
23
+ | `torchbvh` GPU | ~21 ms |
24
+
25
+ See `benchmarks/third_party_algorithm_comparison.ipynb` for more detailed comparisons.
26
+
27
+
28
+ ## Install
29
+
30
+ ```bash
31
+ pip install torchbvh
32
+ ```
33
+
34
+ `torchbvh` builds a PyTorch CUDA extension. Source installs require PyTorch, a
35
+ compatible CUDA toolkit/NVCC, and a supported host compiler in the build environment.
36
+
37
+ ## Docs
38
+
39
+ Documentation can be found at [torchbvh.readthedocs.io](https://torchbvh.readthedocs.io/).
40
+
41
+ ## Quickstart
42
+
43
+ ```python
44
+ import torch
45
+ import torchbvh as tb
46
+
47
+ points = torch.randn(1024, 3, device="cuda")
48
+ bvh = tb.BVH(points)
49
+
50
+ # k-NN
51
+ idx, dists = bvh.knn(points, k=8) # (N,8) int64, (N,8) float32
52
+
53
+ # MLS interpolation — gradients flow through features
54
+ feat = torch.randn(1024, 16, device="cuda", requires_grad=True)
55
+ out = bvh.interpolate(points, feat, k=8) # (N, 16)
56
+
57
+ # FPS downsampling geometry
58
+ fps = tb.fps(points, target_tokens=256)
59
+ # fps.indices, fps.points, fps.nearest_anchor, fps.anchor_radius, ...
60
+
61
+ # Batched: pass (B, N, D) → returns (B, N, k)
62
+ ```
63
+
64
+ Supports `D in {2, 3}`, `k in {4, 8, 16}`, CUDA float32 contiguous inputs.
65
+
66
+
67
+ ## References
68
+ `torchbvh` builds an implicit bounding volume hierarchy over 2-D or 3-D points.
69
+ The BVH layout follows the ostensibly-implicit tree formulation of Chitalu, Dubach, and
70
+ Komura, and the Python/CUDA implementation was ported from the Julia
71
+ `ImplicitBVH.jl` implementation.
72
+
73
+ - Floyd M. Chitalu, Christophe Dubach, and Taku Komura. "Binary
74
+ Ostensibly-Implicit Trees for Fast Collision Detection." Computer Graphics Forum,
75
+ 39(2), 509-521, 2020. DOI:
76
+ [10.1111/cgf.13948](https://doi.org/10.1111/cgf.13948).
77
+ - `ImplicitBVH.jl`, StellaOrg. Julia implementation of the implicitly indexed BVH
78
+ formulation from which the `torchbvh` BVH code was ported:
79
+ [github.com/StellaOrg/ImplicitBVH.jl](https://github.com/StellaOrg/ImplicitBVH.jl).
torchbvh-0.1.0/docs/algorithms.md ADDED
@@ -0,0 +1,121 @@
1
+ # Algorithms
2
+
3
+ This page describes the public algorithms behind `torchbvh` operations. It focuses on the algorithm steps and observable behavior, not CUDA implementation details.
4
+
5
+ ## BVH Construction
6
+
7
+ `torchbvh` builds an implicit bounding volume hierarchy over 2-D or 3-D points.
8
+ The BVH layout follows the ostensibly-implicit tree formulation of Chitalu, Dubach, and
9
+ Komura, and the Python/CUDA implementation was ported from the Julia
10
+ `ImplicitBVH.jl` implementation.
11
+
12
+ 1. Compute the scene axis-aligned bounding box for the input points.
13
+ 2. Normalize each point into that scene box and assign it a Morton code.
14
+ 3. Sort source point indices by Morton code. The source point tensor itself stays in original order.
15
+ 4. Treat the sorted points as leaves of an implicit binary tree. If the leaf count is not a power of two, virtual leaves fill the rightmost missing positions.
16
+ 5. Store each real leaf's AABB as the point coordinate repeated as lower and upper bounds.
17
+ 6. Build internal node AABBs bottom-up by merging real child AABBs. If an internal node has only one real child, its AABB equals that child's AABB.
18
+ 7. Keep `sorted_indices` so leaf positions can map back to original source indices.
19
+
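The construction steps above can be sketched on the CPU in plain Python. This is a minimal illustration and not the package's CUDA code: the helper names (`expand_bits_10`, `morton3`, `merge`, `build_bvh`), the 10-bit-per-axis quantization, and the use of `None` to mark virtual leaves are all assumptions made for readability. The documented layout stores only real node AABBs, whereas this sketch keeps explicit placeholders.

```python
def expand_bits_10(v: int) -> int:
    """Spread the low 10 bits of v so that two zero bits follow each bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v


def morton3(p, lo, hi):
    """30-bit Morton code of p after normalizing it into the scene box."""
    code = 0
    for axis in range(3):
        extent = hi[axis] - lo[axis]
        t = (p[axis] - lo[axis]) / extent if extent > 0 else 0.0
        q = min(int(t * 1024), 1023)  # quantize to 10 bits (assumed width)
        code |= expand_bits_10(q) << (2 - axis)
    return code


def merge(a, b):
    """Merge two AABBs; a virtual (None) child contributes nothing."""
    if a is None or b is None:
        return a if b is None else b
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return (lo, hi)


def build_bvh(points):
    """Return (sorted_indices, levels) with levels[0] holding the root AABB."""
    n = len(points)
    lo = tuple(min(p[a] for p in points) for a in range(3))
    hi = tuple(max(p[a] for p in points) for a in range(3))
    # Sort source indices by Morton code; the points stay in original order.
    sorted_indices = sorted(range(n), key=lambda i: morton3(points[i], lo, hi))
    # Pad the leaf level to a power of two with virtual leaves on the right.
    n_leaves = 1 << max(n - 1, 0).bit_length()
    level = [(points[i], points[i]) for i in sorted_indices]
    level += [None] * (n_leaves - n)
    levels = [level]
    # Build internal levels bottom-up by merging sibling pairs.
    while len(level) > 1:
        level = [merge(level[j], level[j + 1]) for j in range(0, len(level), 2)]
        levels.append(level)
    levels.reverse()
    return sorted_indices, levels


points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.9, 0.1, 0.2),
          (0.1, 0.1, 0.1), (0.5, 0.5, 0.5)]
sorted_indices, levels = build_bvh(points)
print(sorted_indices, levels[0][0])
```

Each level is a pure function of the level below it, so the same structure maps onto a sort followed by one parallel reduction per level.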
20
+ Fixed-size batched BVHs repeat the same process independently for each sample in the batch. Ragged BVHs apply the same single-sample process to each packed segment described by `batch_offsets`.
21
+
22
+ Why it is fast: The construction avoids serial tree insertion and pointer-heavy node allocation. Morton sorting converts spatial hierarchy construction into a parallel sort-plus-reduction problem, and the compact implicit tree stores only real node AABBs
23
+ and source-index mappings. The main bottleneck in pointer or CPU tree builders is irregular allocation and recursive dependency; this design replaces it with contiguous tensor work that can be built and traversed with predictable memory access.
24
+
25
+ ## k-NN Query
26
+
27
+ k-NN is an exact branch-and-bound search over the BVH. Returned distances are squared Euclidean distances sorted from nearest to farthest.
28
+
29
+ For each query point:
30
+
31
+ 1. Start with an empty sorted neighbor list of length `k`, initialized to infinite distances.
32
+ 2. Start traversal at the BVH root.
33
+ 3. For a candidate node, compute the minimum possible squared distance from the query point to that node's AABB.
34
+ 4. If that minimum distance is greater than the current kth-best distance, skip the whole node.
35
+ 5. If the node is a leaf, compute the exact squared distance to its source point and insert the point into the sorted neighbor list if it improves the current list.
36
+ 6. If the node is internal, evaluate both real children and visit the closer child first.
37
+ 7. Continue until no reachable node can improve the neighbor list.
38
+ 8. Return original source indices and their matching squared distances.
39
+
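The traversal above can be illustrated with a small pure-Python sketch. It assumes a heap-style implicit tree (node `i` has children `2i+1` and `2i+2`) and uses a lexicographic sort as a stand-in for Morton order. Both are simplifications chosen for brevity, and the names `build_tree`, `box_dist_sq`, and `knn` are hypothetical rather than the package's API. The search is exact for any leaf order because pruning relies only on the AABB lower bound.

```python
import math


def build_tree(points):
    """Implicit binary tree stored as a flat list of AABBs (None = virtual)."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    n_leaves = 1 << max(len(points) - 1, 0).bit_length()
    first_leaf = n_leaves - 1
    boxes = [None] * (2 * n_leaves - 1)
    for pos, src in enumerate(order):
        boxes[first_leaf + pos] = (points[src], points[src])
    for node in range(first_leaf - 1, -1, -1):
        a, b = boxes[2 * node + 1], boxes[2 * node + 2]
        if a is None or b is None:
            boxes[node] = a or b
        else:
            boxes[node] = (tuple(map(min, a[0], b[0])),
                           tuple(map(max, a[1], b[1])))
    return order, boxes, first_leaf


def box_dist_sq(q, box):
    """Minimum possible squared distance from q to any point inside box."""
    lo, hi = box
    return sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi))


def knn(points, query, k):
    """Exact k-NN for one query; returns (source indices, squared distances)."""
    order, boxes, first_leaf = build_tree(points)
    best = [(math.inf, -1)] * k  # sorted list of (dist_sq, source index)
    stack = [0]
    while stack:
        node = stack.pop()
        box = boxes[node]
        # Skip virtual nodes and nodes that cannot beat the k-th best.
        if box is None or box_dist_sq(query, box) > best[-1][0]:
            continue
        if node >= first_leaf:
            src = order[node - first_leaf]
            d = sum((a - b) ** 2 for a, b in zip(query, points[src]))
            if d < best[-1][0]:
                best = sorted(best[:-1] + [(d, src)])
        else:
            kids = [c for c in (2 * node + 1, 2 * node + 2)
                    if boxes[c] is not None]
            # Push the farther child first so the closer one is popped first.
            kids.sort(key=lambda c: box_dist_sq(query, boxes[c]), reverse=True)
            stack.extend(kids)
    return [i for _, i in best], [d for d, _ in best]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
idx, dist_sq = knn(points, (0.9, 0.1), k=2)
print(idx, dist_sq)
```

Visiting the closer child first tightens the k-th best distance early, which lets the lower-bound test discard more of the remaining subtrees.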
40
+ Self-neighbors are included when source points are queried against themselves. When two or more points are exactly tied, the returned tied indices are valid exact neighbors, but their order is not part of the public contract.
41
+
42
+ For fixed-size batches, each sample is searched independently and indices are local to that sample. For ragged batches, each packed segment is searched independently and indices are local to the segment, not global packed-row indices.
43
+
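To make the local-versus-global distinction for ragged batches concrete, the sketch below converts a segment-local neighbor index into a packed-row index. It assumes `batch_offsets` is a cumulative list of length `B + 1` whose entry `b` is the first packed row of segment `b`. That layout is an assumption for illustration and should be checked against the API reference; `segment_bounds` and `local_to_global` are hypothetical helper names.

```python
def segment_bounds(batch_offsets):
    """Return the (start, end) packed-row range of every segment."""
    return list(zip(batch_offsets[:-1], batch_offsets[1:]))


def local_to_global(local_idx, batch_offsets, segment):
    """Map an index local to `segment` onto its packed-row index."""
    start, end = batch_offsets[segment], batch_offsets[segment + 1]
    if not 0 <= local_idx < end - start:
        raise IndexError("local index outside the segment")
    return start + local_idx


# Two segments packed back to back: rows 0..2 and rows 3..6.
batch_offsets = [0, 3, 7]
print(segment_bounds(batch_offsets))
print(local_to_global(2, batch_offsets, segment=1))
```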
44
+ Why it is fast: Brute-force k-NN evaluates every query against every source point, so its distance-work scales as `M * N`. BVH traversal replaces most of those exact point distance evaluations with cheap AABB lower-bound tests. Once a query has `k` good neighbors, any subtree whose nearest possible point is farther than the current k-th neighbor is skipped entirely. The remaining per-query work is independent, so many queries can run in parallel without CPU-side tree traversal or Python loops.
45
+
46
+ ## MLS Interpolation
47
+
48
+ Moving least squares interpolation fits a local linear field around each query position using k-NN neighborhoods.
49
+
50
+ For each displaced query point:
51
+
52
+ 1. Run exact k-NN against the source geometry to get neighbor indices, squared distances, and neighbor positions. This selection is discrete and non-differentiable.
53
+ 2. Gather neighbor features with the returned indices.
54
+ 3. If one or more neighbors have squared distance at or below the exact-hit epsilon, use the unweighted mean of those exact-hit neighbor features as the interpolated value and return a zero spatial field gradient.
55
+ 4. Otherwise, compute offsets from the query to each neighbor position.
56
+ 5. Choose a local bandwidth from the sorted neighbor distances and clamp it away from zero.
57
+ 6. Assign each neighbor a smooth weight from its squared offset distance and the local bandwidth.
58
+ 7. Fit a regularized weighted linear model in local coordinates:
59
+ value = constant term + spatial slope dot offset.
60
+ 8. Return the fitted constant term as the interpolated feature.
61
+ 9. When `return_grad=True`, return the fitted spatial slope as the field gradient.
62
+
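The per-query solve can be sketched for scalar features in plain Python. Several choices here are assumptions, since the text above does not fix them: a Gaussian weight `exp(-d^2 / h^2)`, a bandwidth equal to the farthest neighbor's distance, a small ridge term on the diagonal as the regularizer, and the default values of `eps` and `ridge`. The function names are hypothetical, and the neighbors are taken as given rather than found by k-NN.

```python
import math


def solve(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mls_interpolate(query, nbr_pos, nbr_feat, eps=1e-12, ridge=1e-9):
    """Return (value, spatial slope) of a weighted local linear fit."""
    offsets = [[p - q for p, q in zip(pos, query)] for pos in nbr_pos]
    d2 = [sum(o * o for o in off) for off in offsets]
    # Exact hits: unweighted mean of the hit features, zero field gradient.
    hits = [f for f, d in zip(nbr_feat, d2) if d <= eps]
    if hits:
        return sum(hits) / len(hits), [0.0] * len(query)
    h2 = max(max(d2), eps)  # squared bandwidth, clamped away from zero
    w = [math.exp(-d / h2) for d in d2]
    basis = [[1.0] + off for off in offsets]  # model: value = c + slope . offset
    dim = len(query) + 1
    ata = [[sum(wi * b[r] * b[c] for wi, b in zip(w, basis))
            + (ridge if r == c else 0.0) for c in range(dim)]
           for r in range(dim)]
    atb = [sum(wi * b[r] * f for wi, b, f in zip(w, basis, nbr_feat))
           for r in range(dim)]
    coef = solve(ata, atb)
    return coef[0], coef[1:]


field = lambda p: 2.0 + 3.0 * p[0] - p[1]
nbrs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
value, grad = mls_interpolate((0.25, 0.5), nbrs, [field(p) for p in nbrs])
print(value, grad)
```

Because the offsets are measured from the query, the fitted constant term is the model's value at the query itself, which is why step 8 returns it directly. For a field that is exactly linear, the fit reproduces the field up to the small bias introduced by the ridge term.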
63
+ Gradients flow through the MLS solve to `features` and live `displaced_points`.
64
+ Gradients do not flow through BVH construction, Morton sorting, k-NN selection, integer indices, squared distances, or detached neighbor positions.
65
+
66
+ Why it is fast: a dense differentiable interpolation would either compare each query to all source points or build large intermediate tensors for weights and gradients. MLS uses the exact k-NN result to restrict the solve to `k` local samples, then solves only a
67
+ small regularized linear system per query. The discrete geometry search is detached, so autograd tracks the continuous MLS solve for `features` and `displaced_points` without recording the BVH traversal, sorting, or integer neighbor selection.
68
+
69
+ ## FPS
70
+
71
+ Farthest point sampling selects anchor points that cover the source geometry and returns assignment metadata for the selected anchors.
72
+
73
+ All FPS modes maintain this state:
74
+
75
+ 1. `indices`: selected source point indices in selection order.
76
+ 2. `nearest_anchor`: for each source point, the selection-order anchor currently closest to it.
77
+ 3. `nearest_anchor_dist_sq`: the squared distance from each source point to that nearest anchor.
78
+
79
+ ### Exact FPS
80
+
81
+ The exact modes follow the standard farthest point sampling rule.
82
+
83
+ 1. Choose the first anchor from `seed`. If `seed=-1`, choose the source point nearest the input AABB center.
84
+ 2. Initialize every source point's nearest-anchor distance to its squared distance from the first anchor.
85
+ 3. Select the next anchor as a point with maximum current nearest-anchor distance.
86
+ 4. Update every source point: if the new anchor is closer than its current nearest anchor, replace the stored distance and assignment.
87
+ 5. Repeat selection and update until `target_tokens` anchors have been selected.
88
+ 6. Gather selected anchor coordinates.
89
+ 7. Compute per-anchor assignment counts and per-anchor radius from the final nearest-anchor assignments.
90
+ 8. Compute `coarse_order`, which orders selected anchors by BVH leaf position for locality-aware downstream gathering.
91
+
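A direct, unoptimized rendering of the exact rule is sketched below in plain Python. It is closest in spirit to the full-scan description and omits `coarse_order`, which needs the BVH. The function name, the return layout, the tie-breaking by lowest index, and the definition of radius as the square root of the largest assigned squared distance are assumptions made for illustration. It also assumes `target_tokens` does not exceed the number of distinct points.

```python
import math


def dist_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_fps(points, target_tokens, seed=-1):
    """Return (indices, nearest_anchor, nearest_anchor_dist_sq, counts, radius)."""
    n = len(points)
    if seed == -1:
        # Seed policy from step 1: the point nearest the input AABB center.
        dims = range(len(points[0]))
        center = [(min(p[a] for p in points) + max(p[a] for p in points)) / 2
                  for a in dims]
        seed = min(range(n), key=lambda i: dist_sq(points[i], center))
    indices = [seed]
    nearest_anchor = [0] * n  # selection-order slot, not a source index
    nearest_d2 = [dist_sq(p, points[seed]) for p in points]
    while len(indices) < target_tokens:
        # Select a point with maximum current nearest-anchor distance.
        nxt = max(range(n), key=lambda i: nearest_d2[i])
        slot = len(indices)
        indices.append(nxt)
        # Update every point that is closer to the new anchor.
        for i, p in enumerate(points):
            d = dist_sq(p, points[nxt])
            if d < nearest_d2[i]:
                nearest_d2[i], nearest_anchor[i] = d, slot
    counts = [0] * len(indices)
    radius = [0.0] * len(indices)
    for slot, d in zip(nearest_anchor, nearest_d2):
        counts[slot] += 1
        radius[slot] = max(radius[slot], math.sqrt(d))
    return indices, nearest_anchor, nearest_d2, counts, radius


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
indices, nearest_anchor, _, counts, radius = exact_fps(points, 3, seed=0)
print(indices, nearest_anchor, counts, radius)
```

Each round scans all `N` points twice, once for the maximum and once for the update. That repeated global pass is the cost the bucketed mode is described as reducing.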
92
+ `mode="exact_full_scan"` expresses this rule directly. `mode="exact_bucketed"` preserves the same exact result while using BVH/Morton bucket structure to organize the work.
93
+
94
+ Why it is fast: standard exact FPS is dominated by the repeated global update and maximum search over all `N` points for each of `M` anchors. `exact_full_scan` keeps that state on the GPU and avoids CPU round trips. `exact_bucketed` keeps the exact selection rule but groups points by BVH/Morton buckets, so bucket maxima identify farthest candidates and AABB-based pruning can skip bucket refreshes that cannot improve any point's current nearest-anchor distance. This reduces memory traffic while preserving the exact FPS sequence.
95
+
96
+ ### Approximate Bucketed FPS
97
+
98
+ `mode="approx_bucketed"` keeps the same output metadata but relaxes the anchor selection rule for speed.
99
+
100
+ 1. Build the same BVH/Morton spatial structure used by the exact bucketed path.
101
+ 2. Choose the first anchor with the same seed policy.
102
+ 3. Maintain exact nearest-anchor assignments and distances for the anchors that have already been selected.
103
+ 4. Track candidate farthest points from spatial buckets instead of scanning the full point set for one global maximum every round.
104
+ 5. Generate a small candidate set, reject duplicates, and commit one or more anchors per round according to the `r`, `c`, and `alpha` settings.
105
+ 6. After committed anchors are chosen, update every affected source point's exact nearest-anchor distance and assignment against those committed anchors.
106
+ 7. Refresh bucket state and repeat until `target_tokens` anchors have been selected.
107
+ 8. Compute the same public metadata as exact FPS.
108
+
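The idea of committing several anchors per round can be illustrated with a deliberately simplified sketch. It does not reproduce the package's `r`, `c`, and `alpha` semantics, which are not specified here. Instead it uses hypothetical parameters `bucket_size`, `per_round`, and `slack`, plus a lexicographic sort as a stand-in for Morton bucketing. Unlike the single update pass described in step 6, it refreshes distances after each commit, which keeps the code short.

```python
def dist_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def approx_fps(points, target_tokens, bucket_size=4, per_round=2,
               slack=0.5, seed=0):
    """Return (indices, nearest_anchor, nearest_anchor_dist_sq)."""
    n = len(points)
    order = sorted(range(n), key=lambda i: points[i])
    buckets = [order[s:s + bucket_size] for s in range(0, n, bucket_size)]
    indices = [seed]
    nearest_anchor = [0] * n
    nearest_d2 = [dist_sq(p, points[seed]) for p in points]

    def commit(src):
        """Add src as an anchor and update exact assignments against it."""
        slot = len(indices)
        indices.append(src)
        for i, p in enumerate(points):
            d = dist_sq(p, points[src])
            if d < nearest_d2[i]:
                nearest_d2[i], nearest_anchor[i] = d, slot

    while len(indices) < target_tokens:
        # One farthest candidate per bucket instead of one global maximum.
        cands = [max(b, key=lambda i: nearest_d2[i]) for b in buckets]
        cands.sort(key=lambda i: nearest_d2[i], reverse=True)
        top = nearest_d2[cands[0]]
        if top == 0.0:
            break  # every point already coincides with an anchor
        committed = 0
        for c in cands:
            if committed == per_round or len(indices) == target_tokens:
                break
            # nearest_d2 already reflects anchors committed this round, so a
            # candidate that is now too close (or a duplicate) is rejected.
            if nearest_d2[c] > 0.0 and nearest_d2[c] >= slack * top:
                commit(c)
                committed += 1
    return indices, nearest_anchor, nearest_d2


points = [(float(i % 5), float(i // 5)) for i in range(25)]
indices, nearest_anchor, nearest_d2 = approx_fps(points, target_tokens=6)
print(indices)
```

The first candidate in each round is the true global farthest point, because the global maximum is always one of the bucket maxima. Later commits in the same round are where the sequence can diverge from exact FPS. The assignments remain exact for the anchors that were actually chosen.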
109
+ The approximate mode is useful when throughput is more important than matching the exact serial FPS sequence. Its returned assignments are exact with respect to the anchors it selected.
110
+
111
+ Why it is fast: the fundamental bottleneck in exact FPS is the serial dependency that selects one anchor, updates all distances, then selects the next anchor. The approximate bucketed mode breaks part of that dependency by proposing farthest candidates from spatial buckets and committing multiple anchors per round. Distance assignments are then updated against the committed anchor set in one pass. The algorithm trades exact global anchor order for fewer full distance-update and bucket-refresh cycles, while keeping the final nearest-anchor metadata exact for the anchors it actually selected.
112
+
113
+ ## References
114
+
115
+ - Floyd M. Chitalu, Christophe Dubach, and Taku Komura. "Binary
116
+ Ostensibly-Implicit Trees for Fast Collision Detection." Computer Graphics Forum,
117
+ 39(2), 509-521, 2020. DOI:
118
+ [10.1111/cgf.13948](https://doi.org/10.1111/cgf.13948).
119
+ - `ImplicitBVH.jl`, StellaOrg. Julia implementation of the implicitly indexed BVH
120
+ formulation from which the `torchbvh` BVH code was ported:
121
+ [github.com/StellaOrg/ImplicitBVH.jl](https://github.com/StellaOrg/ImplicitBVH.jl).