

Poster

3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction

Jongmin Lee · Minsu Cho

Thu 12 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Determining the 3D orientation of an object in an image, known as single-image pose estimation, is a crucial task in 3D vision applications. Existing methods typically learn 3D rotations parametrized in the spatial domain, using Euler angles, quaternions, or axis-angle representations. To learn 3D rotations effectively, SO(3)-equivariant networks are used, since they capture pose patterns in a structured way and train data-efficiently. However, for efficient SO(3)-equivariant networks such as spherical CNNs, a spatial-domain parametrization is problematic because their convolutions operate in the frequency domain. To overcome this issue, we introduce a novel approach that predicts Wigner-D coefficients directly in the frequency domain for 3D rotation regression. Our method employs SO(3)-equivariant networks alongside a spherical mapper that projects 2D image features onto the sphere, ensuring consistent and reliable performance under arbitrary rotations. Our approach achieves state-of-the-art performance on the standard pose estimation benchmarks ModelNet10-SO(3) and PASCAL3D+, demonstrating significant improvements in accuracy, robustness, and data efficiency over existing 3D rotation estimation methods.
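To make the idea of a frequency-domain rotation target concrete, the following is a minimal sketch and not the authors' implementation. It covers only the degree-1 block, and it assumes a real spherical-harmonic convention in which the degree-1 basis is ordered (y, z, x) with no Condon-Shortley phase, so that the degree-1 Wigner-D matrix is the rotation matrix under a cyclic permutation of axes. All function names are hypothetical, the Gaussian noise stands in for a network's prediction error, and the SVD projection is one common way to decode a rotation, chosen here for illustration.

```python
import numpy as np

# Permutation taking Cartesian (x, y, z) to the assumed degree-1 order (y, z, x).
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])

def rotation_from_axis_angle(axis, angle):
    """Rodrigues formula: rotation by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def wigner_d1_real(R):
    """Degree-1 real Wigner-D block under the assumed (y, z, x) convention."""
    return P @ R @ P.T

def rotation_from_d1(D):
    """Decode a (possibly noisy) degree-1 block to the nearest rotation via SVD."""
    R_approx = P.T @ D @ P
    U, _, Vt = np.linalg.svd(R_approx)
    d = np.sign(np.linalg.det(U @ Vt))  # enforce det = +1 (proper rotation)
    return U @ np.diag([1.0, 1.0, d]) @ Vt

def geodesic_error_deg(R_a, R_b):
    """Angle in degrees of the relative rotation between R_a and R_b."""
    cos = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

rng = np.random.default_rng(0)
R_true = rotation_from_axis_angle([1.0, 2.0, 3.0], 0.7)
D_target = wigner_d1_real(R_true)                         # regression target
D_pred = D_target + 0.01 * rng.standard_normal((3, 3))    # stand-in for a prediction
R_pred = rotation_from_d1(D_pred)
err = geodesic_error_deg(R_true, R_pred)
print(f"geodesic error: {err:.3f} degrees")
```

Higher-degree blocks follow the same pattern with larger matrices of size (2l+1) x (2l+1); this sketch stops at degree 1 because that block can be written down without any additional library.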
