

Poster in Workshop: Associative Memory & Hopfield Networks in 2023

Sparse Modern Hopfield Networks

André Martins · Vlad Niculae · Daniel McNamee


Abstract: Ramsauer et al. (2021) recently pointed out a connection between modern Hopfield networks and attention heads in transformers. In this paper, we extend their framework to a broader family of energy functions which can be written as a difference of a quadratic regularizer and a Fenchel-Young loss (Blondel et al., 2020), parametrized by a generalized negentropy function $\Omega$. By working with Tsallis negentropies, the resulting update rules become end-to-end differentiable sparse transformations, establishing a new link to adaptively sparse transformers (Correia et al., 2019) and allowing for exact convergence to single memory patterns. Experiments on simulated data show that the resulting sparse networks have a higher tendency to avoid metastable states.
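As a minimal illustration of the idea described in the abstract, the NumPy sketch below implements a retrieval step for the Tsallis case with $\alpha = 2$, where the Fenchel-Young predictive map is sparsemax (Euclidean projection onto the probability simplex) in place of the softmax used by Ramsauer et al. (2021). The function names `sparsemax` and `sparse_hopfield_update`, the inverse temperature `beta`, and the convention of storing memory patterns as columns of `X` are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sparsemax(z):
    """Project z onto the probability simplex (sparsemax).

    This is the Fenchel-Young predictive map for the Tsallis
    negentropy with alpha = 2; unlike softmax, it can assign
    exactly zero probability to low-scoring entries.
    """
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum       # which entries stay in the support
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max   # threshold shared by the support
    return np.maximum(z - tau, 0.0)

def sparse_hopfield_update(X, xi, beta=1.0, n_steps=1):
    """Sparse modern Hopfield retrieval (illustrative sketch).

    X  : (d, N) matrix whose columns are stored memory patterns.
    xi : (d,) query / state vector.

    The dense update xi <- X softmax(beta * X^T xi) is replaced by a
    sparsemax-based update, so the returned convex combination can put
    zero weight on irrelevant memories.
    """
    for _ in range(n_steps):
        p = sparsemax(beta * (X.T @ xi))   # sparse attention weights over memories
        xi = X @ p                          # new state: sparse convex combination
    return xi, p
```

Because sparsemax can zero out most memories, a retrieval step can return a single stored pattern exactly, which is the mechanism behind the exact-convergence property mentioned in the abstract; the Shannon-negentropy case recovers the dense softmax update of Ramsauer et al. (2021).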
