Invited Talk in Affinity Event: Black in AI
Keynote: Scale is Not All You Need!
Elvis Dohmatob
Guided by neural scaling laws, the training of current large language models (LLMs) follows a simple mantra about dataset and model size: the bigger, the better. The success of these models has led to an explosion of synthetic data, some of which has now unintentionally become part of the training corpora for future generations of LLMs. However, recent research highlights several fundamental issues with this self-reinforcing loop: (1) synthetic data tends to exacerbate algorithmic bias, and (2) synthetic data can lead to model collapse, a phenomenon where the model's performance eventually ceases to improve with additional training data, contrary to the picture predicted by idealized neural scaling laws.
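For readers unfamiliar with model collapse, the short Python sketch below is a minimal, hypothetical toy simulation (not taken from the talk) of the self-consuming loop described above: a one-dimensional Gaussian "model" is repeatedly refit to synthetic samples drawn from its own previous fit. Over generations the fitted distribution drifts and its spread tends to shrink, a toy analogue of the loss of diversity underlying model collapse. The sample size, number of generations, and Gaussian setup are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

n_samples = 100      # small sample size per generation makes the effect visible
n_generations = 100

# Generation 0: "real" data drawn from a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=n_samples)

for gen in range(1, n_generations + 1):
    # "Train" the model: fit a Gaussian by maximum likelihood (mean and std).
    mu, sigma = data.mean(), data.std()
    # The next generation is trained purely on synthetic samples from this fit.
    data = rng.normal(loc=mu, scale=sigma, size=n_samples)
    if gen % 10 == 0:
        print(f"generation {gen:3d}: fitted mean = {mu:+.3f}, fitted std = {sigma:.3f}")

Running this, the fitted standard deviation typically drifts away from 1 and decays over generations, so later "models" no longer resemble the original data distribution even though each one was trained on plenty of (synthetic) samples.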
About the speaker: Dr. Elvis Dohmatob is a Cameroonian computer scientist working on various topics in artificial intelligence (AI) and machine learning (ML). His current research agenda focuses on the following themes: Trustworthy AI/ML (algorithmic bias, adversarial robustness, out-of-distribution generalization, etc.); Learning Theory (neural scaling laws, model collapse, etc.); Neural Networks (attention, associative memories, etc.); Representation Learning; and Optimization (convex, discrete, etc.).
Recently, he joined Concordia University, Montreal, as an Associate Professor. He is also an Affiliate Professor at the Mila Institute, and a Visiting Professor at Meta. Prior to Concordia, he held positions at INRIA (Paris, France), Criteo (Paris, France), and Meta (Paris, France).