Spotlight Poster
Neural networks with fast and bounded units learn flexible task abstractions
Jan Bauer · Alexandra Proca · Kai Sandbrink · Andrew Saxe · Christopher Summerfield · Ali Hummos
Artificial neural networks are an important model of neural circuits and cognition, but they differ from animals in how they handle dynamically changing environments. Whereas animals leverage distribution changes to segment their continuous stream of experience and respond flexibly (flexible and adaptive regime), neural networks forget previous distributions after a change and are best trained with shuffled datasets that animals find challenging (forgetful regime). Here, we analyze a linear gated network where the weights and gates are jointly optimized via gradient descent, but with neuron-like constraints on the gates including non-negativity, faster timescale, and regularization. We observe that modules in the weights layer specialize to the tasks encountered, and gating units switch these modules as needed. We analytically reduce the learning dynamics to an effective eigenspace and show that efficient gating drives weight specialization by protecting previous knowledge, while weight specialization increases the update rate of the gating layer. Task switching in the gating layer accelerates as a function of curriculum block size and task training, mirroring key findings in cognitive neuroscience. We show that the discovered task abstractions support generalization through both task and subtask composition, and we extend our findings to the non-linear setting of a neural network trained on two tasks. Finally, to generalize our findings, we apply the neuronal constraints to the second-layer weights of a deep monolithic network and induce the adaptive learning regime, without any other architectural assumptions. Overall, our work offers a theory of blocked online learning in animals as arising from joint gradient descent on synaptic and neural gating in a neural network architecture.
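The setup described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `train_gated_linear`, the random linear teacher tasks, the learning rates, the clipping of gates to [0, 1], and the particular regularizer (a penalty pulling the gates toward summing to one) are all assumptions chosen for illustration. The sketch only shows the ingredients named above: several linear weight modules combined by gates, joint gradient descent on both, and gates that are non-negative, bounded, regularized, and updated on a faster timescale than the weights.

```python
import numpy as np

def train_gated_linear(n_blocks=6, block_len=200, dim=8, n_paths=2,
                       lr_w=0.01, lr_c=0.1, reg=0.01, seed=0):
    """Online training of y = sum_p c_p * W_p x on two tasks presented in blocks."""
    rng = np.random.default_rng(seed)
    # Two hypothetical teacher tasks, each a random linear map (assumption).
    teachers = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(2)]
    W = rng.standard_normal((n_paths, dim, dim)) * 0.01  # slow weight modules
    c = np.full(n_paths, 0.5)                            # fast gating units
    gate_hist, loss_hist = [], []
    for block in range(n_blocks):
        W_star = teachers[block % 2]          # the task alternates every block
        for _ in range(block_len):
            x = rng.standard_normal(dim)
            y_paths = W @ x                   # per-module outputs, shape (n_paths, dim)
            err = c @ y_paths - W_star @ x    # prediction error, shape (dim,)
            loss_hist.append(0.5 * err @ err)
            # Gradients of the squared error with respect to weights and gates.
            grad_W = c[:, None, None] * np.outer(err, x)[None]
            # Assumed regularizer: 0.5 * reg * (sum(c) - 1)^2.
            grad_c = y_paths @ err + reg * (c.sum() - 1.0)
            W -= lr_w * grad_W                # slow timescale
            # Fast timescale (lr_c > lr_w); clipping keeps gates non-negative and bounded.
            c = np.clip(c - lr_c * grad_c, 0.0, 1.0)
            gate_hist.append(c.copy())
    return np.array(gate_hist), np.array(loss_hist)

if __name__ == "__main__":
    gates, losses = train_gated_linear()
    print("final gates:", gates[-1])
    print("mean loss in last block:", losses[-200:].mean())
```

Plotting the returned gate trajectories against the block boundaries is one way to inspect whether the gates switch between modules when the task changes. Whether and how quickly that happens in this toy version depends on the assumed hyperparameters.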