Poster in Affinity Workshop: Black in AI
Set2Set Transformer: Towards End-to-End 3D Object Detection from Point Clouds
Yeabsira Tessema · Abel Mekonnen · Michael Desta · Selameab Demilew
Keywords: [ Computer Vision ]
Accurate and robust perception of the surrounding environment is a key component of autonomous systems such as self-driving cars. Widely used 3D object detectors currently rely on complex handcrafted feature extractors and post-processors to produce semantic object interpretations from raw point clouds. In essence, detecting 3D bounding boxes from point clouds can be reduced to a set-to-set transformation: the input is a set of points and the output is a set of bounding boxes. In this work, we streamline the 3D object detection pipeline by using a simple transformer architecture. We address the memory and computation cost of transformers, which grows quadratically with the number of input points, by subsampling the point cloud with a differentiable sampling network. We demonstrate the efficacy of our method on the widely used KITTI benchmark.
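To illustrate why subsampling matters for the quadratic cost mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes uniform random sampling as a stand-in for the learned differentiable sampling network (which is not differentiable and is only used here to show the size reduction), and a single-head self-attention with identity projections instead of a trained transformer. The names `random_subsample`, `self_attention`, and `attention_matrix_entries` are hypothetical, and the point cloud is synthetic.

```python
import math
import random


def random_subsample(points, m, seed=0):
    """Pick m points uniformly at random.

    Assumption: a simple, non-differentiable stand-in for the learned
    sampling network described in the abstract.
    """
    rng = random.Random(seed)
    return rng.sample(points, m)


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the raw tokens (identity
    projections, no learned weights). Every token attends to every other
    token, so the score matrix has len(tokens) ** 2 entries.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w * v[i] for w, v in zip(weights, tokens)) / total for i in range(d)]
        )
    return out


def attention_matrix_entries(n):
    """Number of pairwise attention scores for n input tokens."""
    return n * n


# Synthetic point cloud: 2000 points with (x, y, z) coordinates in [-1, 1].
rng = random.Random(42)
cloud = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(2000)]

sampled = random_subsample(cloud, 64)
features = self_attention(sampled)

print(attention_matrix_entries(len(cloud)))    # scores needed for the full cloud
print(attention_matrix_entries(len(sampled)))  # scores needed after subsampling
```

In this sketch, attending over all 2000 points would require 4,000,000 pairwise scores, while attending over 64 sampled points requires 4,096. A full detector would additionally need a decoder that maps the attended features to a set of bounding boxes, which is not shown here.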