TY - JOUR
T1 - Chunked Bounding Volume Hierarchies for fast digital prototyping using volumetric meshes
AU - Schmidtke, Robert
AU - Erleben, Kenny
PY - 2018
Y1 - 2018
N2 - We present a novel approach to using Bounding Volume Hierarchies (BVHs) for collision detection of volumetric meshes for digital prototyping based on accurate simulation. In general, volumetric meshes contain more primitives than surface meshes, which in turn means larger BVHs. To manage these larger BVHs, we propose an algorithm for splitting meshes into smaller chunks with a limited-size BVH each. Limited-height BVHs make guided, all-pairs testing of two chunked meshes well-suited for GPU implementation. This is because the dynamically generated work during BVH traversal becomes bounded. Chunking is simple to implement compared to dynamic load balancing methods and can result in an overall two orders of magnitude speedup on GPUs. This indicates that dynamic load balancing may not be a well suited scheme for the GPU. The overall application timings showed that data transfers were not the bottleneck. Instead, the conversion to and from OpenCL friendly data structures was causing serious performance impediments. Still, a simple OpenMP acceleration of the conversion allowed the GPU solution to beat the CPU solution by 20 percent. We demonstrate our results using rigid and deformable body scenes of varying complexities on a variety of GPUs.
AB - We present a novel approach to using Bounding Volume Hierarchies (BVHs) for collision detection of volumetric meshes for digital prototyping based on accurate simulation. In general, volumetric meshes contain more primitives than surface meshes, which in turn means larger BVHs. To manage these larger BVHs, we propose an algorithm for splitting meshes into smaller chunks with a limited-size BVH each. Limited-height BVHs make guided, all-pairs testing of two chunked meshes well-suited for GPU implementation. This is because the dynamically generated work during BVH traversal becomes bounded. Chunking is simple to implement compared to dynamic load balancing methods and can result in an overall two orders of magnitude speedup on GPUs. This indicates that dynamic load balancing may not be a well suited scheme for the GPU. The overall application timings showed that data transfers were not the bottleneck. Instead, the conversion to and from OpenCL friendly data structures was causing serious performance impediments. Still, a simple OpenMP acceleration of the conversion allowed the GPU solution to beat the CPU solution by 20 percent. We demonstrate our results using rigid and deformable body scenes of varying complexities on a variety of GPUs.
KW - Collision detection
KW - bounding volume hierarchies
KW - parallel tandem traversal
KW - scheduling algorithm
UR - http://www.scopus.com/inward/record.url?scp=85039761803&partnerID=8YFLogxK
U2 - 10.1109/TVCG.2017.2784441
DO - 10.1109/TVCG.2017.2784441
M3 - Journal article
C2 - 29990042
SN - 1077-2626
VL - 24
SP - 3044
EP - 3057
JO - IEEE Transactions on Visualization and Computer Graphics
JF - IEEE Transactions on Visualization and Computer Graphics
IS - 12
ER -