Poster in Workshop: Mathematics of Modern Machine Learning (M3L)
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Francesco Mori · Stefano Sarao Mannelli · Francesca Mignacco
Keywords: [ statistical physics ] [ continual learning ] [ optimal control theory ] [ sequential learning ] [ training dynamics ] [ multi-task learning ]
Artificial neural networks often struggle with catastrophic forgetting when learning tasks sequentially, as training on new tasks degrades performance on earlier ones. Recent theoretical work has tackled this issue by analysing learning curves in synthetic settings with predefined training protocols. However, these protocols were heuristic and lacked a solid theoretical foundation for assessing their optimality. We address this gap by combining exact training dynamics equations, derived using statistical physics, with optimal control methods. We apply this approach to teacher-student models of continual learning, obtaining a theory for task-selection protocols that optimise performance while minimising forgetting. Our analysis offers non-trivial yet interpretable strategies, showing how optimal learning protocols modulate established effects, such as the influence of task similarity on forgetting. We validate our theoretical findings on real-world data.
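As a rough illustration of the setting described in the abstract, the sketch below simulates a toy teacher-student problem with two tasks and compares two hand-written task-selection protocols. It is not the authors' model or their optimal control procedure: the linear student, the noiseless labels, the fixed learning rate, and the names `make_teachers`, `generalisation_error`, and `train` are all assumptions made for this example. It only shows how a protocol (a sequence of task indices) and the teacher overlap enter such a simulation.

```python
import numpy as np


def make_teachers(dim, similarity, rng):
    """Return two unit-norm teacher vectors with overlap (cosine) `similarity`."""
    w1 = rng.standard_normal(dim)
    w1 /= np.linalg.norm(w1)
    v = rng.standard_normal(dim)
    v -= (v @ w1) * w1              # keep only the part orthogonal to w1
    v /= np.linalg.norm(v)
    w2 = similarity * w1 + np.sqrt(1.0 - similarity**2) * v
    return np.stack([w1, w2])


def generalisation_error(student, teacher):
    """0.5 * E[(student.x - teacher.x)^2] for standard Gaussian inputs x."""
    diff = student - teacher
    return 0.5 * float(diff @ diff)


def train(protocol, teachers, lr=0.5, seed=0):
    """Online SGD on a linear student; protocol[t] is the task used at step t.

    Assumption: one fresh Gaussian sample per step and noiseless labels.
    Returns the final generalisation error on each task.
    """
    rng = np.random.default_rng(seed)
    dim = teachers.shape[1]
    student = np.zeros(dim)
    for task in protocol:
        x = rng.standard_normal(dim)
        err = student @ x - teachers[task] @ x
        student -= (lr / dim) * err * x
    return [generalisation_error(student, t) for t in teachers]


if __name__ == "__main__":
    teachers = make_teachers(dim=50, similarity=0.3, rng=np.random.default_rng(1))
    steps = 2000
    sequential = [0] * steps + [1] * steps   # task 0 first, then task 1
    interleaved = [0, 1] * steps             # alternate tasks at every step
    print("sequential :", train(sequential, teachers))
    print("interleaved:", train(interleaved, teachers))
```

In this linear toy, a student that ends up aligned with the second teacher has error 1 − similarity on the first task, so the sequential protocol forgets task 0 almost completely when the teachers are dissimilar, while the interleaved protocol settles near a compromise between the two teachers. A control-based approach would search over the protocol itself rather than fixing it by hand.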