

Oral in Workshop: Associative Memory & Hopfield Networks in 2023

Associative Transformer Is A Sparse Representation Learner

Yuwei Sun · Hideya Ochiai · Zhirong Wu · Stephen Lin · Ryota Kanai


Abstract:

Moving beyond the monolithic pairwise attention mechanism of conventional Transformer models, there is growing interest in leveraging sparse interactions that align more closely with biological principles. Approaches such as the Set Transformer and the Perceiver employ cross-attention with a latent space that forms an attention bottleneck of limited capacity. Building on recent neuroscience studies of Global Workspace Theory and associative memory, we propose the Associative Transformer (AiT). AiT induces a low-rank explicit memory that serves both as priors to guide bottleneck attention in the shared workspace and as attractors within the associative memory of a Hopfield network. We show that AiT is a sparse representation learner, learning distinct priors through the bottleneck that are complexity-invariant to input quantities and dimensions. AiT demonstrates its superiority over methods such as the Set Transformer, Vision Transformer, and Coordination in various vision tasks.
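To make the two roles of the explicit memory concrete, the following is a minimal, illustrative sketch (not the authors' implementation): a small set of learned low-rank memory slots acts as priors that attend to input tokens through a cross-attention bottleneck, and the same slots act as attractors in a modern-Hopfield-style retrieval step. All module names, dimensions, and the inverse-temperature parameter beta are assumptions for illustration.

```python
import torch
import torch.nn as nn


class BottleneckWithHopfieldMemory(nn.Module):
    """Hypothetical sketch: low-rank memory as bottleneck priors and Hopfield attractors."""

    def __init__(self, dim=64, num_slots=8, rank=16, beta=4.0):
        super().__init__()
        # Low-rank factorization of the explicit memory: M = U @ V, shape (num_slots, dim).
        self.U = nn.Parameter(torch.randn(num_slots, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(rank, dim) * 0.02)
        self.to_q = nn.Linear(dim, dim, bias=False)  # queries from memory slots (priors)
        self.to_k = nn.Linear(dim, dim, bias=False)  # keys from input tokens
        self.to_v = nn.Linear(dim, dim, bias=False)  # values from input tokens
        self.beta = beta  # inverse temperature for the Hopfield-style retrieval

    def forward(self, x):
        # x: (batch, num_tokens, dim) input token representations.
        memory = self.U @ self.V                       # (num_slots, dim) explicit memory
        q = self.to_q(memory)                          # (num_slots, dim)
        k, v = self.to_k(x), self.to_v(x)              # (batch, num_tokens, dim)

        # (1) Attention bottleneck: a fixed, small number of slots attends to the
        # inputs, so cost scales with num_slots rather than num_tokens**2.
        attn = torch.softmax(
            torch.einsum("sd,btd->bst", q, k) / q.shape[-1] ** 0.5, dim=-1
        )                                              # (batch, num_slots, num_tokens)
        slots = torch.einsum("bst,btd->bsd", attn, v)  # (batch, num_slots, dim)

        # (2) Hopfield-style update: each token is pulled toward the memory slot
        # (attractor) it is most similar to, via softmax-weighted retrieval.
        retrieval = torch.softmax(
            self.beta * torch.einsum("btd,sd->bts", x, memory), dim=-1
        )                                              # (batch, num_tokens, num_slots)
        x_retrieved = torch.einsum("bts,sd->btd", retrieval, memory)
        return slots, x_retrieved


# Usage: a batch of 4 inputs with 196 patch tokens of dimension 64.
tokens = torch.randn(4, 196, 64)
slots, retrieved = BottleneckWithHopfieldMemory()(tokens)
print(slots.shape, retrieved.shape)  # torch.Size([4, 8, 64]) torch.Size([4, 196, 64])
```

The bottleneck cost is governed by the fixed number of memory slots rather than by the number or dimensionality of input tokens, which is one way to read the abstract's claim that the learned priors are complexity-invariant to input quantities and dimensions.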
