

Poster in Workshop: MATH-AI: The 3rd Workshop on Mathematical Reasoning and AI

Continual Learning and Out of Distribution Generalization in a Systematic Reasoning Task

Mustafa Abdool · Andrew Nam · James McClelland

Keywords: [ transformers ] [ out of distribution generalization ] [ games ] [ Deep Neural Networks ] [ continual learning ] [ abstract reasoning ] [ systematic reasoning ]


Abstract: Humans have the remarkable ability to rapidly learn new problem-solving strategies from a narrow range of examples and extend them to examples outside the distribution (OOD) used in learning, but such generalization remains a challenge for neural networks. This seems especially important for learning new mathematical techniques, which apply to huge problem spaces (e.g., all real numbers). We explore this limitation by training neural networks on strategies for solving specified cells in $6\times6$ Sudoku puzzles using a novel curriculum of tasks that build upon each other. We train transformers sequentially on two preliminary tasks, then assess OOD generalization of a more complex solution strategy from a range of restricted training distributions. Baseline models master the training distribution but fail to generalize to OOD data. However, we find that a combination of extensions is sufficient to support highly accurate and reliable OOD generalization. These results suggest directions for improving the robustness of larger transformer models under the highly imbalanced data distributions provided by natural data sets.
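To make the experimental setup concrete, the following is a minimal sketch (not the authors' code) of what sequential training on a curriculum of Sudoku cell-prediction tasks followed by an OOD evaluation could look like. The model size, the tokenization of the $6\times6$ grid, the dummy task generator, and all hyperparameters are assumptions for illustration only.

```python
# Hypothetical sketch: sequential (continual) training of a small transformer
# on a curriculum of tasks, then an out-of-distribution evaluation step.
# Task data, grid encoding, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn

VOCAB = 8          # assumed encoding: digits 1-6 plus blank and query markers
SEQ_LEN = 36       # one token per cell of a 6x6 grid
N_CLASSES = 6      # predict the digit for a queried cell

class TinySudokuTransformer(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)
        self.pos = nn.Parameter(torch.zeros(1, SEQ_LEN, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, N_CLASSES)

    def forward(self, grids):                  # grids: (batch, 36) integer tokens
        h = self.encoder(self.embed(grids) + self.pos)
        return self.head(h.mean(dim=1))        # pooled prediction for the queried cell

def make_dummy_task(n=256):
    """Placeholder for one curriculum task's (grid, target) pairs."""
    x = torch.randint(0, VOCAB, (n, SEQ_LEN))
    y = torch.randint(0, N_CLASSES, (n,))
    return torch.utils.data.TensorDataset(x, y)

def train_on_task(model, dataset, epochs=3, lr=1e-3):
    """Train on a single task; called sequentially to mimic continual learning."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

@torch.no_grad()
def accuracy(model, dataset):
    x, y = dataset.tensors
    return (model(x).argmax(dim=1) == y).float().mean().item()

model = TinySudokuTransformer()
curriculum = [make_dummy_task(), make_dummy_task()]   # two preliminary tasks, trained in order
for task in curriculum:
    train_on_task(model, task)                        # sequential training, no task mixing
ood_split = make_dummy_task()                         # stand-in for a held-out OOD distribution
print(f"OOD accuracy (dummy data): {accuracy(model, ood_split):.3f}")
```

In this sketch the OOD split is just another random dataset; in the study it would instead contain puzzle configurations drawn from outside the restricted training distribution, which is where baseline models are reported to fail.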
