Poster in Workshop: Symmetry and Geometry in Neural Representations

Does equivariance matter at scale?

Johann Brehmer · Sönke Behrends · Pim de Haan · Taco Cohen

Keywords: geometric deep learning, neural scaling laws, equivariance


Abstract:

Given large data sets and sufficient compute, is it beneficial to design neural architectures for the structure and symmetries of a problem, or is it more efficient to learn them from data? We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. First, equivariance improves data efficiency, but training non-equivariant models with data augmentation closes this gap. Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget. Finally, the optimal allocation of a compute budget across model size and training duration differs between equivariant and non-equivariant models.
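The second finding, that loss scales with compute as a power law, is the kind of relationship typically estimated by a linear fit in log-log space. The sketch below is not the authors' code; the function name `fit_power_law` and all numeric values are hypothetical and serve only to illustrate how a law of the form loss ≈ a · C^(−b) can be fit and compared between an equivariant and a non-equivariant model family.

```python
# Minimal sketch (not the authors' code): fitting a compute scaling law
# loss ≈ a * C**(-b) to hypothetical (compute, loss) measurements by
# least-squares regression in log-log space.

import numpy as np

def fit_power_law(compute, loss):
    """Fit loss ≈ a * compute**(-b) via a linear fit on log-transformed data."""
    log_c = np.log(compute)
    log_l = np.log(loss)
    # Linear model: log(loss) = log(a) - b * log(compute)
    slope, intercept = np.polyfit(log_c, log_l, deg=1)
    return np.exp(intercept), -slope  # (a, b)

# Hypothetical measurements (training FLOPs, validation loss); purely illustrative.
compute = np.array([1e15, 1e16, 1e17, 1e18])
loss_equivariant = np.array([0.52, 0.30, 0.17, 0.10])
loss_non_equivariant = np.array([0.70, 0.42, 0.25, 0.15])

for name, loss in [("equivariant", loss_equivariant),
                   ("non-equivariant", loss_non_equivariant)]:
    a, b = fit_power_law(compute, loss)
    print(f"{name}: loss ≈ {a:.3g} * C^(-{b:.3f})")
```

Comparing the fitted prefactor a and exponent b between the two families is one way to make the claim "equivariant models outperform non-equivariant ones at each tested compute budget" concrete, and the fitted law can be extrapolated to predict loss at larger budgets.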
