Posters presented in this session:
Second-Order Forward-Mode Automatic Differentiation for Optimization, Adam D. Cobb, Atilim Gunes Baydin, Barak A. Pearlmutter, Susmit Jha
On the Hardness of Meaningful Local Guarantees in Nonsmooth Nonconvex Optimization, Guy Kornowski, Swati Padmanabhan, Ohad Shamir
Nonmonotone Line Searches Operate at the Edge of Stability, Curtis Fox, Leonardo Galli, Mark Schmidt, Holger Rauhut
Multimodal Federated Learning with Model Personalization, Ratun Rahman, Dinh C. Nguyen
A Stochastic Algorithm for Sinkhorn Distance-Regularized Distributionally Robust Optimization, Yufeng Yang, Yi Zhou, Zhaosong Lu
Applications of fractional calculus in learned optimization, Teodor Alexandru Szente, James Harrison, Mihai Zanfir, Cristian Sminchisescu
Uncoupled and Convergent Learning in Monotone Games under Bandit Feedback, Jing Dong, Baoxiang Wang, Yaoliang Yu
A Second-Order Algorithm for Empirical Group Distributionally Robust Regression, Naren Sarayu Manoj, Kumar Kshitij Patel
Glocal Smoothness: Line Search can really help!, Curtis Fox, Mark Schmidt
Don't Be So Positive: Negative Step Sizes in Second-Order Methods, Betty Shea, Mark Schmidt
ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training, Adel Nabli, Louis Fournier, Pierre Erbacher, Louis Serrano, Eugene Belilovsky, Edouard Oyallon
A Unified Convergence Theory for Large Language Model Efficient Fine-tuning, Zhanhong Jiang, Nastaran Saadati, Aditya Balu, Minh Pham, Joshua Russell Waite, Nasla Saleem, Chinmay Hegde, Soumik Sarkar
Remove Symmetries to Control Model Expressivity and Improve Optimization, Liu Ziyin, Yizhou Xu, Isaac L. Chuang
Deconstructing What Makes a Good Optimizer for Language Models, Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham M. Kakade
Graph Neural Networks for Hyperparameter Inference in Ising Solvers, Edward Jiang, Timothee Leleu, Sam Reifenstein, Milin Doppalapudi
Neural Entropic Multimarginal Optimal Transport, Dor Tsur, Ziv Goldfeld, Kristjan Greenewald, Haim H. Permuter
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition, Robert Joseph George, David Pitt, Jiawei Zhao, Jean Kossaifi, Cheng Luo, Yuandong Tian, Anima Anandkumar
Revisiting the Initial Steps in Adaptive Gradient Descent Optimization, Abulikemu Abuduweili, Changliu Liu
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training, Hiroki Naganuma, Xinzhi Zhang, Man-Chung Yue, Ioannis Mitliagkas, Russell J. Hewett, Philipp Andre Witte, Yin Tat Lee
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks, Amrutha Varshini Ramesh, Vignesh Ganapathiraman, Issam H. Laradji, Mark Schmidt
Fast Convergence of Softmax Policy Mirror Ascent for Bandits & Tabular MDPs, Reza Asad, Reza Babanezhad Harikandeh, Issam H. Laradji, Nicolas Le Roux, Sharan Vaswani
On the Inherent Privacy of Two Point Zeroth Order Projected Gradient Descent, Devansh Gupta, Meisam Razaviyayn, Vatsal Sharan
Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials, August Y Chen, Ayush Sekhari, Karthik Sridharan
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts, Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Stephen Rawls, Sambit Sahu, Supriyo Chakraborty, Tom Goldstein
Improving Deep Learning Speed and Performance through Synaptic Neural Balance, Antonios Alexos, Ian Domingo, Pierre Baldi
Incentivizing Truthful Collaboration in Heterogeneous Federated Learning, Dimitar Chakarov, Nikita Tsoy, Kristian Minchev, Nikola Konstantinov
Normalization Matters for Optimization Performance on Graph Neural Networks, Alan Milligan, Frederik Kunstner, Hamed Shirzad, Mark Schmidt, Danica J. Sutherland
LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression, Laurent Condat, Arto Maranjyan, Peter Richtárik
Optimal Transport for Probabilistic Circuits, Adrian Ciotinga, YooJung Choi
Extra-Gradient and Optimistic Gradient Descent Converge in Iterates Faster than O(1/\sqrt{T}) in All Monotone Lipschitz Variational Inequalities, Kimon Antonakopoulos
Weak to Strong Learning from Aggregate Labels, Yukti Makhija, Rishi Saket
A Continuous Variable Optimization method for the Quadratic Assignment Problem, Aron Vizkeleti, Timothee Leleu
Neural Networks with Complex-Valued Weights Have No Spurious Local Minima, Xingtu Liu
SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Nonconvex Cross-Device Federated Learning, Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtárik
Modularity aided consistent attributed graph clustering via coarsening, Samarth Bhatia, Yukti Makhija, Manoj Kumar, Sandeep Kumar
Simple and Scalable Federated Learning with Uncertainty via Improved Variational Online Newton, Shivam Pal, Aishwarya Gupta, Saqib Sarwar, Piyush Rai
Path Integral Optimiser: Global Optimisation via Neural Schrödinger-Föllmer Diffusion, Max McGuinness, Eirik Fladmark, Francisco Vargas
Consensus Based Optimization Accelerates Gradient Descent, Anagha Satish, Ricardo Baptista, Franca Hoffmann
SOAP: Improving and Stabilizing Shampoo using Adam, Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham M. Kakade
Nonlinear tomographic reconstruction via nonsmooth optimization, Vasileios Charisopoulos, Rebecca Willett
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average, Louis Fournier, Adel Nabli, Masih Aminbeidokhti, Marco Pedersoli, Eugene Belilovsky, Edouard Oyallon
Cyclic Data Parallelism for Efficient Parallelism of Deep Neural Networks, Louis Fournier, Edouard Oyallon
Discrete-Continuous Variational Optimization with Local Gradients, Jonathan H Warrell, Francesco Alesiani, Cameron Smith, Anja Mösch, Martin Renqiang Min
Structured Regularization on the SPD Manifold, Andrew Nicholas Cheng, Melanie Weber
Communication-efficient Algorithms Under Generalized Smoothness Assumptions, Sarit Khirirat, Abdurakhmon Sadiev, Artem Riabinin, Eduard Gorbunov, Peter Richtárik