"3D understanding of a scene is of fundamental importance to Extended Reality (XR), which includes both Augmented Reality (AR) and Virtual Reality (VR). In addition, being able to efficiently run all the needed 3D algorithms on a resource-constrained XR device is critical for delivering a satisfactory XR experience.
In this demo, we showcase a full-stack approach in which we efficiently deploy 3D understanding on an XR headset by optimizing across the AI algorithms, software, and Snapdragon® hardware. In particular, we first adopt a self-supervised learning strategy to train a monocular depth estimation network on unlabeled video sequences previously captured on the headset, using information from the 6-degrees-of-freedom (6DoF) camera tracking algorithm to provide scale-correct training and inference. The trained depth network is then quantized and deployed on the device using the Qualcomm® Neural Processing SDK. Given our accurate, low-latency depth and 6DoF pose estimates, we perform 3D reconstruction of the scene as well as plane estimation in real time on the XR headset.
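
To make the scale-correction idea concrete, the sketch below shows one common way such a self-supervised loss can be set up: a photometric-reprojection objective in which the relative camera pose is taken from the headset's 6DoF tracker instead of a learned pose network. Because the tracked pose is metric, minimizing the loss drives the depth network toward metrically scaled predictions. This is a minimal illustration under assumed names (`depth_net`, `T_src_tgt`, the frame tensors), not the authors' actual implementation.

```python
# Minimal sketch: scale-correct self-supervised depth loss (assumed names).
import torch
import torch.nn.functional as F

def backproject(depth, K_inv):
    """Lift every pixel to a metric 3D point using the predicted depth."""
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=depth.dtype, device=depth.device),
        torch.arange(w, dtype=depth.dtype, device=depth.device),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)])          # (3, h, w)
    rays = (K_inv @ pix.reshape(3, -1)).reshape(1, 3, h, w)   # camera rays
    return depth * rays                                       # (b, 3, h, w)

def project(points, K, T_src_tgt):
    """Map target-frame 3D points into source-frame sampling coordinates."""
    b, _, h, w = points.shape
    pts = torch.cat([points.reshape(b, 3, -1),
                     points.new_ones(b, 1, h * w)], dim=1)    # homogeneous
    cam = (T_src_tgt @ pts)[:, :3]                            # rigid motion
    pix = K @ cam
    uv = pix[:, :2] / pix[:, 2:3].clamp(min=1e-6)             # perspective divide
    u = 2.0 * uv[:, 0] / (w - 1) - 1.0                        # to [-1, 1]
    v = 2.0 * uv[:, 1] / (h - 1) - 1.0
    return torch.stack([u, v], dim=-1).reshape(b, h, w, 2)

def photometric_loss(depth_net, tgt, src, K, K_inv, T_src_tgt):
    """L1 error between the target frame and the source frame warped into it.

    T_src_tgt is the metric relative pose reported by the 6DoF tracker,
    so minimizing this loss pins the predicted depth to metric scale.
    """
    depth = depth_net(tgt)                     # (b, 1, h, w), in metres
    grid = project(backproject(depth, K_inv), K, T_src_tgt)
    warped = F.grid_sample(src, grid, align_corners=True)
    return (warped - tgt).abs().mean()
```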
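
For the quantization and deployment step, a typical offline workflow with the SDK's command-line tools converts the exported network to the DLC format and then applies post-training 8-bit quantization against a list of representative calibration inputs. The sketch below follows the tool and flag names in the public SNPE documentation, but exact names can vary by SDK version, and the file names are placeholders.

```python
# Sketch of an offline quantize-and-package step using the SNPE
# command-line tools (names per public docs; may vary by SDK version).
import subprocess

# 1. Convert the exported ONNX depth network to the DLC format.
subprocess.run([
    "snpe-onnx-to-dlc",
    "--input_network", "depth_net.onnx",     # placeholder file name
    "--output_path", "depth_net.dlc",
], check=True)

# 2. Post-training 8-bit quantization, calibrated on a text file
#    listing representative headset input tensors, one per line.
subprocess.run([
    "snpe-dlc-quantize",
    "--input_dlc", "depth_net.dlc",
    "--input_list", "calibration_frames.txt",
    "--output_dlc", "depth_net_quantized.dlc",
], check=True)
```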
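
The abstract does not specify how planes are extracted, but one standard formulation is a RANSAC plane fit over the back-projected depth points, sketched below for illustration. Note that the metre-scale inlier threshold is only meaningful because the depth map, and hence the point cloud, is metrically scaled by the 6DoF tracker.

```python
# Illustrative RANSAC plane fit over an Nx3 metric point cloud.
import numpy as np

def fit_plane_ransac(points, n_iters=200, inlier_thresh=0.02, rng=None):
    """Fit a dominant plane (n, d) with n . p + d = 0 to Nx3 points.

    inlier_thresh is in metres, which presumes a metrically
    scaled point cloud (here guaranteed by the 6DoF tracker).
    """
    rng = rng or np.random.default_rng(0)
    best_inliers, best_plane = None, None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:              # degenerate (near-collinear) sample
            continue
        normal /= norm
        d = -normal @ sample[0]
        inliers = np.abs(points @ normal + d) < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers
```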