Fri 7:00 a.m. - 7:01 a.m.
|
Opening Remarks
(
Opening remarks
)
|
Cristóbal Guzmán
|
Fri 7:00 a.m. - 7:30 a.m.
|
DoG is SGD’s best friend: toward tuning-free stochastic optimization, Yair Carmon
(
Plenary speaker
)
|
Yair Carmon
|
Fri 7:30 a.m. - 8:00 a.m.
|
Contributed Talks 1: *Escaping mediocrity: how two-layer networks learn hard generalized linear models* and *Last Iterate Convergence of Popov Method for Non-monotone Stochastic Variational Inequalities*
(
Contributed talks
)
|
Bruno Loureiro · Daniil Vankov · Courtney Paquette
|
Fri 8:00 a.m. - 9:00 a.m.
|
Poster Session 1
|
42 presenters
Egor Shulgin · Mingzhen He · Hanmin Li · Thibault Lahire · Eric Zelikman · Damien Scieur · Rajat Vadiraj Dwaraknath · Gene Li · Zhanhong Jiang · Rahul Jain · Zihan Zhou · Tianyue Zhang · Ilyas Fatkhullin · Frederik Kunstner · Utkarsh Singhal · Bruno Loureiro · Krishna C Kalagarla · Kai Liu · Michal Derezinski · Ross Clarke · Dimitri Papadimitriou · Mo Zhou · Jörg Franke · Chandler Smith · Darshan Chakrabarti · Trang H. Tran · Mokhwa Lee · Wei Kuang · Vincent Roulet · John Lazarsfeld · Donghyun Oh · Yihe Deng · Fu Wang · Junchi YANG · Dániel Rácz · Jeffrey Flanigan · Aaron Mishkin · Luca Scharr · Robert Gower · Chaoyue Liu · Yushen Huang · Nicholas Recker
|
Fri 9:00 a.m. - 9:30 a.m.
|
Contributed Talks 2: *An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization* and *Practical Principled Policy Optimization for Finite MDPs*
(
Contributed talks
)
|
Guy Kornowski · Michael Lu · Aaron Sidford
|
Fri 9:30 a.m. - 10:00 a.m.
|
Aiming towards the minimizers: fast convergence of SGD for overparameterized problems, Dmitriy Drusvyatskiy
(
Plenary speaker
)
|
Dmitriy Drusvyatskiy
|
Fri 10:00 a.m. - 12:00 p.m.
|
Lunch
|
|
Fri 12:00 p.m. - 12:30 p.m.
|
Evaluating Large-Scale Learning Systems, Virginia Smith
(
Plenary speaker
)
|
Virginia Smith
|
Fri 12:30 p.m. - 1:00 p.m.
|
Contributed Talks 3: *Dueling Optimization with a Monotone Adversary* and *High-Dimensional Prediction for Sequential Decision Making*
(
Contributed talks
)
|
Naren Manoj · Georgy Noarov · Cristóbal Guzmán
|
Fri 1:00 p.m. - 2:00 p.m.
|
Poster Session 2
|
43 presenters
Xiao-Yang Liu · Guy Kornowski · Philipp Dahlinger · Abbas Ehsanfar · Binyamin Perets · David Martinez-Rubio · Sudeep Raja Putta · Runlong Zhou · Connor Lawless · Julian J Stier · Chen Fan · Michal Šustr · James Spann · Jung Hun Oh · Yao Xie · Qi Zhang · Krishna Acharya · Sourabh Medapati · Sharan Vaswani · Sruthi Gorantla · Mohamed Elsayed · Hongyang Zhang · Reza Asad · Viktor Pavlovic · Betty Shea · Georgy Noarov · Chuan He · Daniil Vankov · Taoan Huang · Michael Lu · Anant Mathur · Konstantin Mishchenko · Stanley Wei · Francesco Faccio · Yuchen Zeng · Tianyue Zhang · Chris Junchi Li · Aaron Mishkin · Sina Baharlouei · Chen Xu · Sasha Abramowitz · Sebastian Stich · Felix Dangel
|
Fri 2:00 p.m. - 2:30 p.m.
|
Sharply predicting the behavior of complex iterative algorithms with random data, Ashwin Pananjady
(
Plenary speaker
)
|
Ashwin Pananjady
|
Fri 2:30 p.m. - 3:00 p.m.
|
Provable Feature Learning in Gradient Descent, Jason Lee
(
Plenary speaker
)
|
Jason Lee
|
Fri 3:00 p.m. - 3:01 p.m.
|
Closing Remarks
(
Closing
)
|
Cristóbal Guzmán
|
-
|
Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization
(
Poster
)
|
Hanmin Li · Avetik Karagulyan · Peter Richtarik
|
-
|
Accelerated Methods for Riemannian Min-Max Optimization Ensuring Bounded Geometric Penalties
(
Poster
)
|
David Martinez-Rubio · Christophe Roux · Christopher Criscitiello · Sebastian Pokutta
|
-
|
Risk Bounds of Accelerated SGD for Overparameterized Linear Regression
(
Poster
)
|
Xuheng Li · Yihe Deng · Jingfeng Wu · Dongruo Zhou · Quanquan Gu
|
-
|
Follow the flow: Proximal flow inspired multi-step methods
(
Poster
)
|
Yushen Huang · Yifan Sun
|
-
|
A Predicting Clipping Asynchronous Stochastic Gradient Descent Method in Distributed Learning
(
Poster
)
|
Haoxiang Wang · Zhanhong Jiang · Chao Liu · Soumik Sarkar · Dongxiang Jiang · Young Lee
|
-
|
Last Iterate Convergence of Popov Method for Non-monotone Stochastic Variational Inequalities
(
Oral
)
|
Daniil Vankov · Angelia Nedich · Lalitha Sankar
|
-
|
Generalisable Agents for Neural Network Optimisation
(
Poster
)
|
Kale-ab Tessera · Callum R. Tilbury · Sasha Abramowitz · Ruan John de Kock · Omayma Mahjoub · Benjamin Rosman · Sara Hooker · Arnu Pretorius
|
-
|
Accelerated gradient descent: A guaranteed bound for a heuristic restart strategy
(
Poster
)
|
Walaa Moursi · Stephen Vavasis · Viktor Pavlovic
|
-
|
Adagrad Promotes Diffuse Solutions In Overparameterized Regimes
(
Poster
)
|
Andrew Rambidis · Jiayi Wang
|
-
|
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
(
Poster
)
|
Zihan Zhou · Honghao Wei · Lei Ying
|
-
|
Reducing Predict and Optimize to Convex Feasibility
(
Poster
)
|
Saurabh Mishra · Sharan Vaswani
|
-
|
Diversity-adjusted adaptive step size
(
Poster
)
|
Parham Yazdkhasti · Xiaowen Jiang · Sebastian Stich
|
-
|
Global CFR: Meta-Learning in Self-Play Regret Minimization
(
Poster
)
|
David Sychrovský · Michal Sustr · Michael Bowling · Martin Schmid
|
-
|
Noise Injection Irons Out Local Minima and Saddle Points
(
Poster
)
|
Konstantin Mishchenko · Sebastian Stich
|
-
|
How to Guess a Gradient
(
Poster
)
|
Utkarsh Singhal · Brian Cheung · Kartik Chandra · Jonathan Ragan-Kelley · Josh Tenenbaum · Tomaso Poggio · Stella X. Yu
|
-
|
Stochastic FISTA Step Search Algorithm for Convex Optimization
(
Poster
)
|
Trang H. Tran · Lam Nguyen · Katya Scheinberg
|
-
|
K-Spin Ising Model for Combinatorial Optimizations over Graphs: A Reinforcement Learning Approach
(
Poster
)
|
Xiao-Yang Liu · Ming Zhu
|
-
|
Parameter-Agnostic Optimization under Relaxed Smoothness
(
Poster
)
|
Florian Hübler · Junchi YANG · Xiang Li · Niao He
|
-
|
Escaping mediocrity: how two-layer networks learn hard generalized linear models
(
Oral
)
|
Luca Arnaboldi · Florent Krzakala · Bruno Loureiro · Ludovic Stephan
|
-
|
The Expressive Power of Low-Rank Adaptation
(
Poster
)
|
Yuchen Zeng · Kangwook Lee
|
-
|
FaDE: Fast DARTS Estimator on Hierarchical NAS Spaces
(
Poster
)
|
Simon Neumeyer · Julian J Stier · Michael Granitzer
|
-
|
Nesterov Meets Robust Multitask Learning Twice
(
Poster
)
|
Yifan Kang · Kai Liu
|
-
|
On the Interplay Between Stepsize Tuning and Progressive Sharpening
(
Poster
)
|
Vincent Roulet · Atish Agarwala · Fabian Pedregosa
|
-
|
Why Adam Outperforms Gradient Descent on Language Models: A Heavy-Tailed Class Imbalance Problem
(
Poster
)
|
Robin Yadav · Frederik Kunstner · Mark Schmidt · Alberto Bietti
|
-
|
Level Set Teleportation: the Good, the Bad, and the Ugly
(
Poster
)
|
Aaron Mishkin · Alberto Bietti · Robert Gower
|
-
|
An alternative approach to train neural networks using monotone variational inequality
(
Poster
)
|
Chen Xu · Xiuyuan Cheng · Yao Xie
|
-
|
Safe Posterior Sampling for Constrained MDPs with Bounded Constraint Violation
(
Poster
)
|
Krishna C Kalagarla · Rahul Jain · Pierluigi Nuzzo
|
-
|
Average-Constrained Policy Optimization
(
Poster
)
|
Akhil Agnihotri · Rahul Jain · Haipeng Luo
|
-
|
A novel analysis of gradient descent under directional smoothness
(
Poster
)
|
Aaron Mishkin · Ahmed Khaled · Aaron Defazio · Robert Gower
|
-
|
The Sharp Power Law of Local Search on Expanders
(
Poster
)
|
Nicholas Recker · Simina Branzei · Davin Choo
|
-
|
Regret Bounds for Optimistic Follow The Leader: Applications in Portfolio Selection and Linear Regression
(
Poster
)
|
Sudeep Raja Putta · Shipra Agrawal
|
-
|
Bandit-Driven Batch Selection for Robust Learning under Label Noise
(
Poster
)
|
Michal Lisicki · Mihai Nica · Graham Taylor
|
-
|
Practical Principled Policy Optimization for Finite MDPs
(
Oral
)
|
Michael Lu · Matin Aghaei · Anant Raj · Sharan Vaswani
|
-
|
Adaptive Gradient Methods at the Edge of Stability
(
Poster
)
|
Jeremy M Cohen · Behrooz Ghorbani · Shankar Krishnan · Naman Agarwal · Sourabh Medapati · Michal Badura · Daniel Suo · Zachary Nado · George Dahl · Justin Gilmer
|
-
|
Non-Uniform Sampling and Adaptive Optimizers in Deep Learning
(
Poster
)
|
Thibault Lahire
|
-
|
Large-scale Non-convex Stochastic Constrained Distributionally Robust Optimization
(
Poster
)
|
Qi Zhang · Shaofeng Zou · Yi Zhou · Lixin Shen · Ashley Prater-Bennette
|
-
|
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
(
Poster
)
|
Philipp Dahlinger · Philipp Becker · Maximilian Hüttenrauch · Gerhard Neumann
|
-
|
Decentralized Learning Dynamics in the Gossip Model
(
Poster
)
|
John Lazarsfeld · Dan Alistarh
|
-
|
Almost multisecant BFGS quasi-Newton method
(
Poster
)
|
Mokhwa Lee · Yifan Sun
|
-
|
From 6235149080811616882909238708 to 29: Vanilla Thompson Sampling Revisited
(
Poster
)
|
Bingshan Hu · Tianyue Zhang
|
-
|
Utility-based Perturbed Gradient Descent: An Optimizer for Continual Learning
(
Poster
)
|
Mohamed Elsayed · Rupam Mahmood
|
-
|
Revisiting Random Weight Perturbation for Efficiently Improving Generalization
(
Poster
)
|
Tao Li · Weihao weihao · Qinghua Tao · Zehao Lei · Yingwen Wu · Kun Fang · Mingzhen He · Xiaolin Huang
|
-
|
MSL: An Adaptive Momentum-based Stochastic Line-search Framework
(
Poster
)
|
Chen Fan · Sharan Vaswani · Christos Thrampoulidis · Mark Schmidt
|
-
|
Noise Stability Optimization for Flat Minima with Tight Rates
(
Poster
)
|
Haotian Ju · Dongyue Li · Hongyang Zhang
|
-
|
Dueling Optimization with a Monotone Adversary
(
Oral
)
|
Avrim Blum · Meghal Gupta · Gene Li · Naren Manoj · Aadirupa Saha · Yuanyuan Yang
|
-
|
Noise-adaptive (Accelerated) Stochastic Heavy-Ball Momentum
(
Poster
)
|
Anh Dang · Reza Babanezhad Harikandeh · Sharan Vaswani
|
-
|
Unnormalized Density Estimation with Root Sobolev Norm Regularization
(
Poster
)
|
Mark Kozdoba · Binyamin Perets · Shie Mannor
|
-
|
Accelerating Inexact HyperGradient Descent for Bilevel Optimization
(
Poster
)
|
Yang Haikuo · Luo Luo · Chris Junchi Li · Michael Jordan · Maryam Fazel
|
-
|
High Dimensional Unbiased Estimation for Sequential Decision Making
(
Oral
)
|
Georgy Noarov · Ramya Ramalingam · Aaron Roth · Stephan Xie
|
-
|
Efficient Learning in Polyhedral Games via Best Response Oracles
(
Poster
)
|
Darshan Chakrabarti · Gabriele Farina · Christian Kroer
|
-
|
On the Convergence of Local SGD Under Third-Order Smoothness and Hessian Similarity
(
Poster
)
|
Ali Zindari · Ruichen Luo · Sebastian Stich
|
-
|
Adam through a Second-Order Lens
(
Poster
)
|
Ross Clarke · Baiyu Su · José Miguel Hernández-Lobato
|
-
|
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
(
Poster
)
|
Nuoya Xiong · Lijun Ding · Simon Du
|
-
|
Exploring Modern Evolution Strategies in Portfolio Optimization
(
Poster
)
|
Ramin Hasani · Abbas Ehsanfar · Greg Banis · Rusty Bealer · Amir Ahmadi
|
-
|
Greedy Newton: Newton's Method with Exact Line Search
(
Poster
)
|
Betty Shea · Mark Schmidt
|
-
|
A proximal augmented Lagrangian based algorithm for federated learning with constraints
(
Poster
)
|
Chuan He · Le Peng · Ju Sun
|
-
|
Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC for Large Neural Nets
(
Poster
)
|
Wu Lin · Felix Dangel · Runa Eschenhagen · Kirill Neklyudov · Agustinus Kristiadi · Richard Turner · Alireza Makhzani
|
-
|
Statistical Inference of Adaptive Inexact Stochastic Newton Method
(
Poster
)
|
Wei Kuang · Sen Na · Mihai Anitescu
|
-
|
$f$-FERM: A Scalable Framework for Robust Fair Empirical Risk Minimization
(
Poster
)
|
Sina Baharlouei · Shivam Patel · Meisam Razaviyayn
|
-
|
Oracle Efficient Algorithms for Groupwise Regret
(
Poster
)
|
Krishna Acharya · Eshwar Ram Arunachaleswaran · Juba Ziani · Aaron Roth · Sampath Kannan
|
-
|
(Un)certainty selection methods for Active Learning on Label Distributions
(
Poster
)
|
James Spann · Christopher Homan
|
-
|
SGD batch saturation for training wide neural networks
(
Poster
)
|
Chaoyue Liu · Dmitriy Drusvyatskiy · Misha Belkin · Damek Davis · Yian Ma
|
-
|
Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
(
Poster
)
|
Michal Derezinski
|
-
|
Enhancing the Misreport Network for Optimal Auction Design
(
Poster
)
|
Haiying Wu · Shuyuan You · Zhiqiang Zhuang · Kewen Wang · Zhe Wang
|
-
|
Towards a Better Theoretical Understanding of Independent Subnetwork Training
(
Poster
)
|
Egor Shulgin · Peter Richtarik
|
-
|
Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates
(
Poster
)
|
Damien Scieur
|
-
|
Sion's Minimax Theorem in Geodesic Metric Spaces and a Riemannian Extragradient Algorithm
(
Poster
)
|
Peiyuan Zhang · Jingzhao Zhang · Suvrit Sra
|
-
|
Cup Curriculum: Curriculum Learning on Model Capacity
(
Poster
)
|
Luca Scharr · Vanessa Toborek
|
-
|
An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization
(
Oral
)
|
Guy Kornowski · Ohad Shamir
|
-
|
Fair Minimum Representation Clustering
(
Poster
)
|
Connor Lawless · Oktay Gunluk
|
-
|
Fair Representation in Submodular Subset Selection: A Pareto Optimization Approach
(
Poster
)
|
Adriano Fazzone · Yanhao Wang · Francesco Bonchi
|
-
|
New Horizons in Parameter Regularization: A Constraint Approach
(
Poster
)
|
Jörg Franke · Michael Hefenbrock · Gregor Koehler · Frank Hutter
|
-
|
Continually Adapting Optimizers Improve Meta-Generalization
(
Poster
)
|
Wenyi Wang · Louis Kirsch · Francesco Faccio · Mingchen Zhuge · Jürgen Schmidhuber
|
-
|
Surrogate Minimization: An Optimization Algorithm for Training Large Neural Networks with Model Parallelism
(
Poster
)
|
Reza Asad · Reza Babanezhad Harikandeh · Issam Hadj Laradji · Nicolas Le Roux · Sharan Vaswani
|
-
|
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
(
Poster
)
|
Kei Ishikawa
|
-
|
Pruning Neural Networks with Velocity-Constrained Optimization
(
Poster
)
|
Donghyun Oh · Jinseok Chung · Namhoon Lee
|
-
|
Feature Selection in Generalized Linear Models via the Lasso: To Scale or Not to Scale?
(
Poster
)
|
Anant Mathur · Sarat Moka
|
-
|
DIRECT Optimisation with Bayesian Insights: Assessing Reliability Under Fixed Computational Budgets
(
Poster
)
|
Fu Wang · Zeyu Fu · Xiaowei Huang · Wenjie Ruan
|
-
|
Understanding the Role of Optimization in Double Descent
(
Poster
)
|
Chris Liu · Jeffrey Flanigan
|
-
|
Variance Reduced Model Based Methods: New rates and adaptive step sizes
(
Poster
)
|
Robert Gower · Frederik Kunstner · Mark Schmidt
|
-
|
On the convergence of warped proximal iterations for solving nonmonotone inclusions and applications
(
Poster
)
|
Dimitri Papadimitriou · Bang Cong Vu
|
-
|
On the Synergy Between Label Noise and Learning Rate Annealing in Neural Network Training
(
Poster
)
|
Stanley Wei · Tongzheng Ren · Simon Du
|
-
|
Optimizing Group-Fair Plackett-Luce Ranking Models for Relevance and Ex-Post Fairness
(
Poster
)
|
Sruthi Gorantla · Eshaan Bhansali · Amit Deshpande · Anand Louis
|
-
|
Contrastive Predict-and-Search for Mixed Integer Linear Programs
(
Poster
)
|
Taoan Huang · Aaron Ferber · Arman Zharmagambetov · Yuandong Tian · Bistra Dilkina
|
-
|
Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle
(
Poster
)
|
Dániel Rácz · Mihaly Petreczky · Balint Daroczy
|
-
|
Riemannian Optimization for Euclidean Distance Geometry
(
Poster
)
|
Chandler Smith · Samuel Lichtenberg · HanQin Cai · Abiy Tasissa
|
-
|
GUC: Unsupervised non-parametric Global Clustering and Anomaly Detection
(
Poster
)
|
Chris Solomou
|
-
|
Testing Approximate Stationarity Concepts for Piecewise Smooth Functions
(
Poster
)
|
Lai Tian · Anthony Man-Cho So
|
-
|
Multi-head CLIP: Improving CLIP with Diverse Representations and Flat Minima
(
Poster
)
|
Mo Zhou · Xiong Zhou · Erran Li Li · Stefano Ermon · Rong Ge
|
-
|
DynaLay: An Introspective Approach to Dynamic Layer Selection for Deep Networks
(
Poster
)
|
Mrinal Mathur · Sergey Plis
|
-
|
Optimal Transport for Kernel Gaussian Mixture Models
(
Poster
)
|
Jung Hun Oh · Rena Elkin · Anish Simhal · Jiening Zhu · Joseph Deasy · Allen Tannenbaum
|
-
|
Stochastic Optimization under Hidden Convexity
(
Poster
)
|
Ilyas Fatkhullin · Niao He · Yifan Hu
|
-
|
On Optimization Formulations of Finite Horizon MDPs
(
Poster
)
|
Rajat Vadiraj Dwaraknath · Lexing Ying
|
-
|
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
(
Poster
)
|
Eric Zelikman · Eliana Lorch · Lester Mackey · Adam Tauman Kalai
|
-
|
Learning Multi-Objective Optimization Problem Through Online Learning
(
Poster
)
|
Chaosheng Dong · Yijia Wang · Bo Zeng
|