Posters presented at this session:
Fast decentralized gradient tracking for federated learning with local updates: From mini to minimax optimization, Chris Junchi Li
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou
An Elementary Predictor Obtaining 2√T Distance to Calibration, Eshwar Ram Arunachaleswaran, Natalie Collina, Aaron Roth, Mirah Shi
Stochastic Quasi-Variational Inequalities: Convergence Analysis Beyond Strong Monotonicity, Zeinab Alizadeh, Afrooz Jalilzadeh
DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction, Xinwei Zhang, Zhiqi Bu, Borja Balle, Mingyi Hong, Meisam Razaviyayn, Vahab Mirrokni
DADA: Dual Averaging with Distance Adaptation, Mohammad Moshtaghifar, Anton Rodomanov, Daniil Vankov, Sebastian U Stich
Efficient Levenberg-Marquardt for SLAM, Amir Belder, Refael Vivanti
Estimating Vote Choice in U.S. Elections with Approximate Poisson-Binomial Logistic Regression, Nic Fishman, Evan Rosenman
Online Nonconvex Bilevel Optimization with Bregman Divergences, Jason Bohne, David S Rosenberg, Gary Kazantsev, Pawel Polak
Hierarchical Simplicity Bias of Neural Networks, Zhehang Du
The Crucial Role of Samplers in Online Direct Preference Optimization, Ruizhe Shi, Runlong Zhou, Simon Shaolei Du
AdEMAMix: Better and Faster Training with Older Gradients, Matteo Pagliardini, Pierre Ablin, David Grangier
Aligned Multi-Objective Optimization, Yonathan Efroni, Daniel Jiang, Ben Kretzu, Jalaj Bhandari, Zheqing Zhu, Karen Ullrich
A fast and efficient randomized quasi-Newton method, Danny Duan, Hanbaek Lyu
Spurious Stationarity and Hardness Results for Mirror Descent, He Chen, Jiajin Li, Anthony Man-Cho So
On the Crucial Role of Initialization for Matrix Factorization, Bingcong Li, Liang Zhang, Aryan Mokhtari, Niao He
On the Convergence of DP-SGD with Adaptive Clipping, Egor Shulgin, Peter Richtárik
Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking, Sahar Rajabi, Sirisha Rambhatla
Intuitive Analysis of the Quantization based Optimization: From establishing a SDE to Quantum Mechanical Perspective, Jinwuk Seok, Changsik Cho
From Gradient Clipping to Normalization for Heavy Tailed SGD, Florian Hübler, Ilyas Fatkhullin, Niao He
Scalable Second-Order Optimization Algorithms for Minimizing Low-rank Functions, Edward Tansley, Coralia Cartis
Dimensionality Reduction Techniques for Global Bayesian Optimisation, Luo Long, Coralia Cartis, Paz Fink Shustin
Solving hidden monotone variational inequalities with surrogate losses, Ryan D'Orazio, Danilo Vucetic, Zichu Liu, Junhyung Lyle Kim, Ioannis Mitliagkas, Gauthier Gidel
On the Convergence of FedProx with Extrapolation and Inexact Prox, Hanmin Li, Peter Richtárik
SICNN: Sparsity-induced Input Convex Neural Network for Optimal Transport, Peter Chen, Yue Xie, Qingpeng Zhang
Policy Optimization for Strictly Batch Imitation Learning, Rishabh Agrawal, Nathan Dahlin, Rahul Jain, Ashutosh Nayyar
Understanding Adam Requires Better Rotation Dependent Assumptions, Tianyue H. Zhang, Lucas Maes, Charles Guille-Escuret, Alexia Jolicoeur-Martineau, Ioannis Mitliagkas, Simon Lacoste-Julien, Damien Scieur
In the Search for Optimal Portfolios of Counterstrategies in the Large Imperfect Information Games, Karolina Drabent, David Milec, Ondrej Kubicek, Viliam Lisý
Accelerated Stability in Performative Prediction, Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami, Simon Lacoste-Julien, Gauthier Gidel
Memory Efficient Adaptive Stochastic Optimization via Subset-Norm, Thien Hang Nguyen, Huy Nguyen
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers, Benjamin Thérien, Charles-Étienne Joseph, Boris Knyazev, Edouard Oyallon, Irina Rish, Eugene Belilovsky
Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent, Peter Richtárik, Simone Maria Giancola, Dymitr Lubczyk, Robin Yadav
Dueling in the Dark: An Efficient and Optimal Mirror Descent Approach for Online Optimization with Adversarial Preferences, Aadirupa Saha, Yonathan Efroni, Barry-John Theobald
Adaptive Partitioning Schemes for Black-Box Optimization, Raja Sunkara, Ardhendu Tripathy
Aggregating Data for Optimal and Private Learning, Sushant Agarwal, Yukti Makhija, Rishi Saket, Aravindan Raghuveer
Optimizing Attention, Hanno Ackermann, Hong Cai, Markus Nagel, Leyla Mirvakhabova, Farhad G. Zanjani, Fatih Porikli
Amplitude Modulated Riemannian Optimization for QAP, Timothee Leleu, Aron Vizkeleti, Sam Reifenstein
The Dimension Strikes Back with Gradients: Generalization of Gradient Methods in Stochastic Convex Optimization, Matan Schliserman, Uri Sherman, Tomer Koren
Statistical Inference in Latent Convex Objectives with Stream Data, Rohan Chauhan, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Michael Jordan
MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times, Arto Maranjyan, Omar Shaikh Omar, Peter Richtárik
Personalized Federated Learning via Low-Rank Matrix Factorization, Ali Dadras, Sebastian U Stich, Alp Yurtsever
Communication-Efficient Loss Minimization over Heterogeneous Data with Federated Hierarchical Ensemble Aggregation via Distillation, Sayantan Chowdhury, Ben Liang, Ali Tizghadam, Ilijc Albanese
Differentially Private Random Block Coordinate Descent, Arto Maranjyan, Abdurakhmon Sadiev, Peter Richtárik
Connections between Schedule-Free SGD, Accelerated SGD Variants, and Weight Averaging, Depen Morwani, Nikhil Vyas, Hanlin Zhang, Sham M. Kakade
Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks, Shikai Qiu, Atish Agarwala, Lechao Xiao, Jeffrey Pennington
Learning Morphisms with Gauss-Newton Approximation for Growing Networks, Neal Gregory Lawton, Aram Galstyan, Greg Ver Steeg
Addax: Resource-Efficient Fine-Tuning of Language Models with a Combination of Forward-Backward and Forward-Only Passes, Zeman Li, Xinwei Zhang, Peilin Zhong, Yuan Deng, Vahab Mirrokni, Meisam Razaviyayn
u-muP: The Unit-Scaled Maximal Update Parametrization, Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
Old Optimizer, New Norm: An Anthology, Jeremy Bernstein, Laker Newhouse
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems, Andrew Nicholas Cheng, Kiwon Lee, Courtney Paquette
Stochastic Proximal Point Methods for Monotone Inclusions under Expected Similarity, Abdurakhmon Sadiev, Laurent Condat, Peter Richtárik