Mon 3:15 a.m. - 3:50 a.m.
|
Welcome event (gather.town)
(
Social event/Break
)
>
|
🔗
|
Mon 3:58 a.m. - 4:00 a.m.
|
Opening Remarks to Session 1
(
Organizer intro
)
>
SlidesLive Video
|
Sebastian Stich
🔗
|
Mon 4:00 a.m. - 4:25 a.m.
|
Deep Learning: Success, Failure, and the Border between them, Shai Shalev-Shwartz
(
Plenary Speaker
)
>
SlidesLive Video
|
Shai Shalev-Shwartz
🔗
|
Mon 4:25 a.m. - 4:30 a.m.
|
Q&A with Shai Shalev-Shwartz
(
Q&A
)
>
|
Shai Shalev-Shwartz
🔗
|
Mon 4:30 a.m. - 4:55 a.m.
|
Learning with Strange Gradients, Martin Jaggi
(
Plenary Speaker
)
>
SlidesLive Video
|
Martin Jaggi
🔗
|
Mon 4:55 a.m. - 5:00 a.m.
|
Q&A with Martin Jaggi
(
Q&A
)
>
|
Martin Jaggi
🔗
|
Mon 5:00 a.m. - 5:30 a.m.
|
Contributed Talks in Session 1 (Zoom)
(
Orals and spotlights
)
>
SlidesLive Video
|
Sebastian Stich · Futong Liu · Abdurakhmon Sadiev · Frederik Benzing · Simon Roburin
🔗
|
Mon 5:30 a.m. - 6:30 a.m.
|
Poster Session 1 (gather.town)
(
Poster session
)
>
|
20 presenters
Hamed Jalali · Robert Hönig · Maximus Mutschler · Manuel Madeira · Abdurakhmon Sadiev · Egor Shulgin · Alasdair Paren · Pascal Esser · Simon Roburin · Julius Kunze · Agnieszka Słowik · Frederik Benzing · Futong Liu · Hongyi Li · Ryotaro Mitsuboshi · Grigory Malinovsky · Jayadev Naram · Zhize Li · Igor Sokolov · Sharan Vaswani
🔗
|
Mon 6:30 a.m. - 6:58 a.m.
|
Break (gather.town)
|
🔗
|
Mon 6:58 a.m. - 7:00 a.m.
|
Opening Remarks to Session 2
(
Organizer intro
)
>
|
Courtney Paquette
🔗
|
Mon 7:00 a.m. - 7:25 a.m.
|
The global optimization of functions with low effective dimension - better than worst-case?, Coralia Cartis
(
Plenary Speaker
)
>
SlidesLive Video
|
Coralia Cartis
🔗
|
Mon 7:25 a.m. - 7:30 a.m.
|
Q&A with Coralia Cartis
(
Q&A
)
>
|
Coralia Cartis
🔗
|
Mon 7:30 a.m. - 7:55 a.m.
|
Non-Euclidean Differentially Private Stochastic Convex Optimization, Cristóbal Guzmán
(
Plenary Speaker
)
>
SlidesLive Video
|
Cristóbal Guzmán
🔗
|
Mon 7:55 a.m. - 8:00 a.m.
|
Q&A with Cristóbal Guzmán
(
Q&A
)
>
|
Cristóbal Guzmán
🔗
|
Mon 8:00 a.m. - 8:30 a.m.
|
Contributed Talks in Session 2 (Zoom)
(
Orals and spotlights
)
>
SlidesLive Video
|
Courtney Paquette · Chris Junchi Li · Jeffery Kline · Junhyung Lyle Kim · Pascal Esser
🔗
|
Mon 8:30 a.m. - 9:58 a.m.
|
Break
|
🔗
|
Mon 9:58 a.m. - 10:00 a.m.
|
Opening Remarks to Session 3
(
Organizer intro
)
>
|
Oliver Hinder
🔗
|
Mon 10:00 a.m. - 10:25 a.m.
|
Avoiding saddle points in nonsmooth optimization, Damek Davis
(
Plenary Speaker
)
>
SlidesLive Video
|
Damek Davis
🔗
|
Mon 10:25 a.m. - 10:30 a.m.
|
Q&A with Damek Davis
(
Q&A
)
>
|
Damek Davis
🔗
|
Mon 10:30 a.m. - 10:55 a.m.
|
Faster Empirical Risk Minimization, Jelena Diakonikolas
(
Plenary Speaker
)
>
SlidesLive Video
|
Jelena Diakonikolas
🔗
|
Mon 10:55 a.m. - 11:00 a.m.
|
Q&A with Jelena Diakonikolas
(
Q&A
)
>
|
Jelena Diakonikolas
🔗
|
Mon 11:00 a.m. - 11:30 a.m.
|
Contributed Talks in Session 3 (Zoom)
(
Orals and spotlights
)
>
SlidesLive Video
|
Oliver Hinder · Wenhao Zhan · Akhilesh Soni · Grigory Malinovsky · Boyue Li
🔗
|
Mon 11:30 a.m. - 12:30 p.m.
|
Poster Session 2 (gather.town)
(
Poster session
)
>
|
28 presenters
Wenjie Li · Akhilesh Soni · Jinwuk Seok · Jianhao Ma · Jeffery Kline · Mathieu Tuli · Miaolan Xie · Robert Gower · Quanqi Hu · Matteo Cacciola · Yuanlu Bai · Boyue Li · Wenhao Zhan · Shentong Mo · Junhyung Lyle Kim · Sajad Fathi Hafshejani · Chris Junchi Li · Zhishuai Guo · Harshvardhan Harshvardhan · Neha Wadia · Tatjana Chavdarova · Difan Zou · Zixiang Chen · Aman Gupta · Jacques Chen · Betty Shea · Benoit Dherin · Aleksandr Beznosikov
🔗
|
Mon 12:30 p.m. - 12:58 p.m.
|
Break (gather.town)
|
🔗
|
Mon 12:58 p.m. - 1:00 p.m.
|
Opening Remarks to Session 4
(
Organizer intro
)
>
|
Quanquan Gu
🔗
|
Mon 1:00 p.m. - 1:25 p.m.
|
Online Learning via Linear Programming, Yinyu Ye
(
Plenary Speaker
)
>
SlidesLive Video
|
Yinyu Ye
🔗
|
Mon 1:25 p.m. - 1:30 p.m.
|
Q&A with Yinyu Ye
(
Q&A
)
>
|
Yinyu Ye
🔗
|
Mon 1:30 p.m. - 1:55 p.m.
|
Putting Randomized Matrix Algorithms in LAPACK, and Connections with Second-order Stochastic Optimization, Michael Mahoney
(
Plenary Speaker
)
>
SlidesLive Video
|
Michael Mahoney
🔗
|
Mon 1:55 p.m. - 2:00 p.m.
|
Q&A with Michael Mahoney
(
Q&A
)
>
|
Michael Mahoney
🔗
|
Mon 2:00 p.m. - 2:30 p.m.
|
Contributed Talks in Session 4 (Zoom)
(
Orals and spotlights
)
>
SlidesLive Video
|
Quanquan Gu · Agnieszka Słowik · Jacques Chen · Neha Wadia · Difan Zou
🔗
|
Mon 2:30 p.m. - 2:35 p.m.
|
Closing remarks
(
Organizer closing
)
>
|
Courtney Paquette
🔗
|
-
|
Integer Programming Approaches To Subspace Clustering With Missing Data
(
Poster
)
>
|
Akhilesh Soni · Daniel Pimentel-Alarcón
🔗
|
-
|
Integer Programming Approaches To Subspace Clustering With Missing Data
(
Spotlight
)
>
|
Akhilesh Soni · Daniel Pimentel-Alarcón
🔗
|
-
|
Farkas' Theorem of the Alternative for Prior Knowledge in Deep Networks
(
Poster
)
>
|
Jeffery Kline · Joseph Bockhorst
🔗
|
-
|
Farkas' Theorem of the Alternative for Prior Knowledge in Deep Networks
(
Spotlight
)
>
|
Jeffery Kline · Joseph Bockhorst
🔗
|
-
|
Decentralized Personalized Federated Learning: Lower Bounds and Optimal Algorithm for All Personalization Modes
(
Poster
)
>
|
Abdurakhmon Sadiev · Ekaterina Borodich · Darina Dvinskikh · Aleksandr Beznosikov · Alexander Gasnikov
🔗
|
-
|
Decentralized Personalized Federated Learning: Lower Bounds and Optimal Algorithm for All Personalization Modes
(
Spotlight
)
>
|
Abdurakhmon Sadiev · Ekaterina Borodich · Darina Dvinskikh · Aleksandr Beznosikov · Alexander Gasnikov
🔗
|
-
|
Towards Modeling and Resolving Singular Parameter Spaces using Stratifolds
(
Poster
)
>
|
Pascal Esser · Frank Nielsen
🔗
|
-
|
Towards Modeling and Resolving Singular Parameter Spaces using Stratifolds
(
Spotlight
)
>
|
Pascal Esser · Frank Nielsen
🔗
|
-
|
Spherical Perspective on Learning with Normalization Layers
(
Poster
)
>
|
Simon Roburin · Yann de Mont-Marin · Andrei Bursuc · Renaud Marlet · Patrick Pérez · Mathieu Aubry
🔗
|
-
|
Spherical Perspective on Learning with Normalization Layers
(
Spotlight
)
>
|
Simon Roburin · Yann de Mont-Marin · Andrei Bursuc · Renaud Marlet · Patrick Pérez · Mathieu Aubry
🔗
|
-
|
Optimization with Adaptive Step Size Selection from a Dynamical Systems Perspective
(
Poster
)
>
|
Neha Wadia · Michael Jordan · Michael Muehlebach
🔗
|
-
|
Optimization with Adaptive Step Size Selection from a Dynamical Systems Perspective
(
Spotlight
)
>
|
Neha Wadia · Michael Jordan · Michael Muehlebach
🔗
|
-
|
Better Linear Rates for SGD with Data Shuffling
(
Poster
)
>
|
Grigory Malinovsky · Alibek Sailanbayev · Peter Richtarik
🔗
|
-
|
Better Linear Rates for SGD with Data Shuffling
(
Spotlight
)
>
|
Grigory Malinovsky · Alibek Sailanbayev · Peter Richtarik
🔗
|
-
|
Fast, Exact Subsampled Natural Gradients and First-Order KFAC
(
Poster
)
>
|
Frederik Benzing
🔗
|
-
|
Fast, Exact Subsampled Natural Gradients and First-Order KFAC
(
Spotlight
)
>
|
Frederik Benzing
🔗
|
-
|
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
(
Poster
)
>
|
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu
🔗
|
-
|
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
(
Spotlight
)
>
|
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu
🔗
|
-
|
DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization
(
Poster
)
>
|
Boyue Li · Zhize Li · Yuejie Chi
🔗
|
-
|
DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization
(
Spotlight
)
>
|
Boyue Li · Zhize Li · Yuejie Chi
🔗
|
-
|
Heavy-tailed noise does not explain the gap between SGD and Adam on Transformers
(
Poster
)
>
|
Jacques Chen · Frederik Kunstner · Mark Schmidt
🔗
|
-
|
Heavy-tailed noise does not explain the gap between SGD and Adam on Transformers
(
Spotlight
)
>
|
Jacques Chen · Frederik Kunstner · Mark Schmidt
🔗
|
-
|
Acceleration and Stability of the Stochastic Proximal Point Algorithm
(
Poster
)
>
|
Junhyung Lyle Kim · Panos Toulis · Anastasios Kyrillidis
🔗
|
-
|
Acceleration and Stability of the Stochastic Proximal Point Algorithm
(
Spotlight
)
>
|
Junhyung Lyle Kim · Panos Toulis · Anastasios Kyrillidis
🔗
|
-
|
Gaussian Graphical Models as an Ensemble Method for Distributed Gaussian Processes
(
Poster
)
>
|
Hamed Jalali · Gjergji Kasneci
🔗
|
-
|
DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning
(
Poster
)
>
|
Robert Hönig · Yiren Zhao · Robert Mullins
🔗
|
-
|
Barzilai and Borwein conjugate gradient method equipped with a non-monotone line search technique
(
Poster
)
>
|
Sajad Fathi Hafshejani · Daya Gaur · Shahadat Hossain · Robert Benkoczi
🔗
|
-
|
Using a one dimensional parabolic model of the full-batch loss to estimate learning rates during training
(
Poster
)
>
|
Maximus Mutschler · Andreas Zell
🔗
|
-
|
Community-based Layerwise Distributed Training of Graph Convolutional Networks
(
Poster
)
>
|
Hongyi Li · Junxiang Wang · Yongchao Wang · Yue Cheng · Liang Zhao
🔗
|
-
|
Optimum-statistical Collaboration Towards Efficient Black-box Optimization
(
Poster
)
>
|
Wenjie Li · Chi-Hua Wang · Guang Cheng
🔗
|
-
|
COCO Denoiser: Using Co-Coercivity for Variance Reduction in Stochastic Convex Optimization
(
Poster
)
>
|
Manuel Madeira · Renato Negrinho · Joao Xavier · Pedro Aguiar
🔗
|
-
|
Stochastic Learning Equation using Monotone Increasing Resolution of Quantization
(
Poster
)
>
|
Jinwuk Seok
🔗
|
-
|
Sign-RIP: A Robust Restricted Isometry Property for Low-rank Matrix Recovery
(
Poster
)
>
|
Jianhao Ma · Salar Fattahi
🔗
|
-
|
Practice-Consistent Analysis of Adam-Style Methods
(
Poster
)
>
|
Zhishuai Guo · Yi Xu · Wotao Yin · Rong Jin · Tianbao Yang
🔗
|
-
|
Towards Robust and Automatic Hyper-Parameter Tunning
(
Poster
)
>
|
Mathieu Tuli · Mahdi Hosseini · Konstantinos N Plataniotis
🔗
|
-
|
Random-reshuffled SARAH does not need a full gradient computations
(
Poster
)
>
|
Aleksandr Beznosikov · Martin Takac
🔗
|
-
|
Shifted Compression Framework: Generalizations and Improvements
(
Poster
)
>
|
Egor Shulgin · Peter Richtarik
🔗
|
-
|
A New Scheme for Boosting with an Average Margin Distribution Oracle
(
Poster
)
>
|
Ryotaro Mitsuboshi · Kohei Hatano · Eiji Takimoto
🔗
|
-
|
The Geometric Occam Razor Implicit in Deep Learning
(
Poster
)
>
|
Benoit Dherin · Michael Munn · David Barrett
🔗
|
-
|
Escaping Local Minima With Stochastic Noise
(
Poster
)
>
|
Harshvardhan Harshvardhan · Sebastian Stich
🔗
|
-
|
Faking Interpolation Until You Make It
(
Poster
)
>
|
Alasdair Paren · Rudra Poudel · Pawan K Mudigonda
🔗
|
-
|
High Probability Step Size Lower Bound for Adaptive Stochastic Optimization
(
Poster
)
>
|
Katya Scheinberg · Miaolan Xie
🔗
|
-
|
Adaptive Optimization with Examplewise Gradients
(
Poster
)
>
|
Julius Kunze · James Townsend · David Barber
🔗
|
-
|
Structured Low-Rank Tensor Learning
(
Poster
)
>
|
Jayadev Naram · Tanmay Sinha · Pawan Kumar
🔗
|
-
|
ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method
(
Poster
)
>
|
Zhize Li
🔗
|
-
|
EF21 with Bells & Whistles: Practical Algorithmic Extensions of Modern Error Feedback
(
Poster
)
>
|
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin · Eduard Gorbunov · Zhize Li
🔗
|
-
|
Stochastic Polyak Stepsize with a Moving Target
(
Poster
)
>
|
Robert Gower · Aaron Defazio · Mike Rabbat
🔗
|
-
|
Last-Iterate Convergence of Saddle Point Optimizers via High-Resolution Differential Equations
(
Poster
)
>
|
Tatjana Chavdarova · Michael Jordan · Emmanouil Zampetakis
🔗
|
-
|
Towards Noise-adaptive, Problem-adaptive Stochastic Gradient Descent
(
Poster
)
>
|
Sharan Vaswani · Benjamin Dubois-Taine · Reza Babanezhad Harikandeh
🔗
|
-
|
On Server-Side Stepsizes in Federated Optimization: Theory Explaining the Heuristics
(
Poster
)
>
|
Grigory Malinovsky · Konstantin Mishchenko · Peter Richtarik
🔗
|
-
|
A Stochastic Momentum Method for Min-max Bilevel Optimization
(
Poster
)
>
|
Quanqi Hu · Bokun Wang · Tianbao Yang
🔗
|
-
|
Deep Neural Networks pruning via the Structured Perspective Regularization
(
Poster
)
>
|
Matteo Cacciola · Andrea Lodi · Xinlin Li
🔗
|
-
|
Efficient Calibration of Multi-Agent Market Simulators from Time Series with Bayesian Optimization
(
Poster
)
>
|
Yuanlu Bai · Svitlana Vyetrenko · Henry Lam · Tucker Balch
🔗
|
-
|
Faster Perturbed Stochastic Gradient Methods for Finding Local Minima
(
Poster
)
>
|
Zixiang Chen · Dongruo Zhou · Quanquan Gu
🔗
|
-
|
Adam vs. SGD: Closing the generalization gap on image classification
(
Poster
)
>
|
Aman Gupta · Rohan Ramanath · Jun Shi · Sathiya Keerthi
🔗
|
-
|
Simulated Annealing for Neural Architecture Search
(
Poster
)
>
|
Shentong Mo · Jingfei Xia · Pinxu Ren
🔗
|
-
|
Faster Quasi-Newton Methods for Linear Composition Problems
(
Poster
)
>
|
Betty Shea · Mark Schmidt
🔗
|
-
|
On the convergence of stochastic extragradient for bilinear games using restarted iteration averaging
(
Poster
)
>
|
Chris Junchi Li · Yaodong Yu · Nicolas Loizou · Gauthier Gidel · Yi Ma · Nicolas Le Roux · Michael Jordan
🔗
|
-
|
On the convergence of stochastic extragradient for bilinear games using restarted iteration averaging
(
Oral
)
>
|
Chris Junchi Li · Yaodong Yu · Nicolas Loizou · Gauthier Gidel · Yi Ma · Nicolas Le Roux · Michael Jordan
🔗
|
-
|
On the Relation between Distributionally Robust Optimization and Data Curation
(
Poster
)
>
|
Agnieszka Słowik · Leon Bottou
🔗
|
-
|
On the Relation between Distributionally Robust Optimization and Data Curation
(
Oral
)
>
|
Agnieszka Słowik · Leon Bottou
🔗
|
-
|
Policy Mirror Descent for Regularized RL: A Generalized Framework with Linear Convergence
(
Poster
)
>
|
Wenhao Zhan · Shicong Cen · Baihe Huang · Yuxin Chen · Jason Lee · Yuejie Chi
🔗
|
-
|
Policy Mirror Descent for Regularized RL: A Generalized Framework with Linear Convergence
(
Oral
)
>
|
Wenhao Zhan · Shicong Cen · Baihe Huang · Yuxin Chen · Jason Lee · Yuejie Chi
🔗
|
-
|
Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation
(
Poster
)
>
|
Futong Liu · Tao Lin · Martin Jaggi
🔗
|
-
|
Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation
(
Oral
)
>
|
Futong Liu · Tao Lin · Martin Jaggi
🔗
|