Sun 9:00 a.m. - 9:05 a.m.
|
Opening Remarks
(
Opening Remarks
)
>
SlidesLive Video
|
Courtney Paquette
🔗
|
Sun 9:01 a.m. - 9:30 a.m.
|
Optimizing Optimization Methods with Computer Assistance, Ben Grimmer
(
Plenary Speaker
)
>
SlidesLive Video
|
Benjamin Grimmer
🔗
|
Sun 9:30 a.m. - 10:00 a.m.
|
Talk 1: *On the Inherent Privacy of Two Point Zeroth Order Projected Gradient Descent* and Talk 2: *The Dimension Strikes Back with Gradients: Generalization of Gradient Methods in Stochastic Convex Optimization*
(
Contributed Talks
)
>
SlidesLive Video
|
Devansh Gupta · Matan Schliserman
🔗
|
Sun 10:00 a.m. - 11:00 a.m.
|
Poster Session 1
(
Poster Session
)
>
|
🔗
|
Sun 11:00 a.m. - 11:30 a.m.
|
Talk 1: *SOAP: Improving and Stabilizing Shampoo using Adam* and Talk 2: *μLO: Compute-Efficient Meta-Generalization of Learned Optimizers*
(
Contributed Talks
)
>
SlidesLive Video
|
Depen Morwani · Benjamin Thérien
🔗
|
Sun 11:30 a.m. - 12:00 p.m.
|
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning, Misha Belkin
(
Plenary Speaker
)
>
SlidesLive Video
|
Misha Belkin
🔗
|
Sun 12:00 p.m. - 2:00 p.m.
|
Lunch
|
🔗
|
Sun 2:00 p.m. - 2:30 p.m.
|
Acceleration by Stepsize Hedging, Jason Altschuler
(
Plenary Speaker
)
>
SlidesLive Video
|
Jason Altschuler
🔗
|
Sun 2:30 p.m. - 3:00 p.m.
|
Talk 1: *MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times* and Talk 2: *Provable non-accelerations of the heavy-ball method*
(
Contributed Talks
)
>
SlidesLive Video
|
Artavazd Maranjyan · Gauthier Gidel
🔗
|
Sun 3:00 p.m. - 4:00 p.m.
|
Poster Session 2
(
Poster Session
)
>
|
🔗
|
Sun 4:00 p.m. - 4:30 p.m.
|
Online Learning Guided Quasi-Newton Methods: Improved Global Non-asymptotic Guarantees, Aryan Mokhtari
(
Plenary Speaker
)
>
SlidesLive Video
|
Aryan Mokhtari
🔗
|
Sun 4:30 p.m. - 5:00 p.m.
|
Future of OPT-ML, Discussion
(
Panel Discussion
)
>
SlidesLive Video
|
Cristóbal Guzmán
🔗
|
Sun 5:00 p.m. - 5:05 p.m.
|
Closing Remarks
(
Closing Remarks
)
>
SlidesLive Video
|
Cristóbal Guzmán
🔗
|
-
|
Remove Symmetries to Control Model Expressivity and Improve Optimization
(
Poster
)
>
link
|
Liu Ziyin · Yizhou Xu · Isaac Chuang
🔗
|
-
|
Second-Order Forward-Mode Automatic Differentiation for Optimization
(
Poster
)
>
link
|
Adam Cobb · Atilim Gunes Baydin · Barak Pearlmutter · Susmit Jha
🔗
|
-
|
Graph Neural Networks for Hyperparameter Inference in Ising Solvers
(
Poster
)
>
link
|
Edward Jiang · Timothee Leleu · Sam Reifenstein · Milin Doppalapudi
🔗
|
-
|
Multimodal Federated Learning with Model Personalization
(
Poster
)
>
link
|
Ratun Rahman · Dinh C. Nguyen
🔗
|
-
|
Extra-Gradient and Optimistic Gradient Descent Converge in Iterates Faster than $O(1/\sqrt{T})$ in All Monotone Lipschitz Variational Inequalities
(
Poster
)
>
link
|
Kimon Antonakopoulos
🔗
|
-
|
A theoretical study of the $(L_0,L_1)$-smoothness condition in deep learning
(
Poster
)
>
link
|
Y Cooper
🔗
|
-
|
Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent
(
Poster
)
>
link
|
Peter Richtarik · Simone Maria Giancola · Dymitr Lubczyk · Robin Yadav
🔗
|
-
|
Old Optimizer, New Norm: An Anthology
(
Poster
)
>
link
|
Jeremy Bernstein · Laker Newhouse
🔗
|
-
|
Improving Deep Learning Speed and Performance through Synaptic Neural Balance
(
Poster
)
>
link
|
Antonios Alexos · Ian Domingo · Pierre Baldi
🔗
|
-
|
Uncoupled and Convergent Learning in Monotone Games under Bandit Feedback
(
Poster
)
>
link
|
Jing Dong · Baoxiang Wang · Yaoliang Yu
🔗
|
-
|
Statistical Inference in Latent Convex Objectives with Stream Data
(
Poster
)
>
link
|
Rohan Chauhan · Emmanouil-Vasileios Vlatakis-Gkaragkounis · Michael Jordan
🔗
|
-
|
On the Convergence of FedProx with Extrapolation and Inexact Prox
(
Poster
)
>
link
|
Hanmin Li · Peter Richtarik
🔗
|
-
|
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition
(
Poster
)
>
link
|
Robert Joseph George · David Pitt · Jiawei Zhao · Jean Kossaifi · Cheng Luo · Yuandong Tian · Animashree Anandkumar
🔗
|
-
|
Neural Entropic Multimarginal Optimal Transport
(
Poster
)
>
link
|
Dor Tsur · Ziv Goldfeld · Kristjan Greenewald · Haim Permuter
🔗
|
-
|
Addax: Resource-Efficient Fine-Tuning of Language Models with a Combination of Forward-Backward and Forward-Only Passes
(
Poster
)
>
link
|
Zeman Li · Xinwei Zhang · Peilin Zhong · Yuan Deng · Vahab Mirrokni · Meisam Razaviyayn
🔗
|
-
|
Adaptive Partitioning Schemes for Black-Box Optimization
(
Poster
)
>
link
|
Raja Sunkara · Ardhendu S Tripathy
🔗
|
-
|
Optimal Transport for Probabilistic Circuits
(
Poster
)
>
link
|
Adrian Ciotinga · YooJung Choi
🔗
|
-
|
A Stochastic Algorithm for Sinkhorn Distance-Regularized Distributionally Robust Optimization
(
Poster
)
>
link
|
Yufeng Yang · Yi Zhou · Zhaosong Lu
🔗
|
-
|
Accelerated Stability in Performative Prediction
(
Poster
)
>
link
|
Pedram Khorsandi · Rushil Gupta · Mehrnaz Mofakhami · Simon Lacoste-Julien · Gauthier Gidel
🔗
|
-
|
Consensus Based Optimization Accelerates Gradient Descent
(
Poster
)
>
link
|
Anagha Satish · Ricardo Baptista · Franca Hoffmann
🔗
|
-
|
Hierarchical Simplicity Bias of Neural Networks
(
Poster
)
>
link
|
Zhehang Du
🔗
|
-
|
Understanding Critical Batch Sizes: Scheduling and Batch-Size Invariance in Data-constrained Pre-training
(
Poster
)
>
link
|
Hanlin Zhang · Depen Morwani · Nikhil Vyas · Jingfeng Wu · Difan Zou · Udaya Ghai · Dean Foster · Sham Kakade
🔗
|
-
|
Efficient Levenberg-Marquardt for SLAM
(
Poster
)
>
link
|
Amir Belder · Refael Vivanti
🔗
|
-
|
Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials
(
Poster
)
>
link
|
August Chen · Ayush Sekhari · Karthik Sridharan
🔗
|
-
|
On the Inherent Privacy of Two Point Zeroth Order Projected Gradient Descent
(
Poster
)
>
link
|
Devansh Gupta · Meisam Razaviyayn · Vatsal Sharan
🔗
|
-
|
An Elementary Predictor Obtaining $2\sqrt{T}$ Distance to Calibration
(
Poster
)
>
link
|
Eshwar Ram Arunachaleswaran · Natalie Collina · Aaron Roth · Mirah Shi
🔗
|
-
|
Simple and Scalable Federated Learning with Uncertainty via Improved Variational Online Newton
(
Poster
)
>
link
|
Shivam Pal · Aishwarya Gupta · Saqib Sarwar · Piyush Rai
🔗
|
-
|
A fast and efficient randomized quasi-Newton method
(
Poster
)
>
link
|
Danny Duan · Hanbaek Lyu
🔗
|
-
|
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average
(
Poster
)
>
link
|
Louis Fournier · Adel Nabli · Masih Aminbeidokhti · Marco Pedersoli · Eugene Belilovsky · Edouard Oyallon
🔗
|
-
|
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
(
Poster
)
>
link
|
Ashwinee Panda · Vatsal Baherwani · Zain Sarwar · Benjamin Thérien · Stephen Rawls · Sambit Sahu · Supriyo Chakraborty · Tom Goldstein
🔗
|
-
|
Normalization Matters for Optimization Performance on Graph Neural Networks
(
Poster
)
>
link
|
Alan Milligan · Frederik Kunstner · Hamed Shirzad · Mark Schmidt · Danica J. Sutherland
🔗
|
-
|
AdEMAMix: Better and Faster Training with Older Gradients
(
Poster
)
>
link
|
Matteo Pagliardini · Pierre Ablin · David Grangier
🔗
|
-
|
Batch size invariant Adam
(
Poster
)
>
link
|
Xi Wang · Laurence Aitchison
🔗
|
-
|
Cyclic Data Parallelism for Efficient Parallelism of Deep Neural Networks
(
Poster
)
>
link
|
Louis Fournier · Edouard Oyallon
🔗
|
-
|
Neural Networks with Complex-Valued Weights Have No Spurious Local Minima
(
Poster
)
>
link
|
Xingtu Liu
🔗
|
-
|
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems
(
Poster
)
>
link
|
Andrew Cheng · Kiwon Lee · Courtney Paquette
🔗
|
-
|
Amplitude Modulated Riemannian Optimization for QAP
(
Poster
)
>
link
|
Timothee Leleu · Aron Vizkeleti · Sam Reifenstein
🔗
|
-
|
A Continuous Variable Optimization method for the Quadratic Assignment Problem
(
Poster
)
>
link
|
Aron Vizkeleti · Timothee Leleu
🔗
|
-
|
Aligned Multi-Objective Optimization
(
Poster
)
>
link
|
Yonathan Efroni · Daniel Jiang · Ben Kretzu · Jalaj Bhandari · Zheqing Zhu · Karen Ullrich
🔗
|
-
|
SICNN: Sparsity-induced Input Convex Neural Network for Optimal Transport
(
Poster
)
>
link
|
Peter Chen · Yue Xie · Qingpeng Zhang
🔗
|
-
|
Deconstructing What Makes a Good Optimizer for Language Models
(
Poster
)
>
link
|
Rosie Zhao · Depen Morwani · David Brandfonbrener · Nikhil Vyas · Sham Kakade
🔗
|
-
|
Scalable Second-Order Optimization Algorithms for Minimizing Low-rank Functions
(
Poster
)
>
link
|
Edward Tansley · Coralia Cartis
🔗
|
-
|
Role of Parametrization in Learning Dynamics of Recurrent Neural Networks
(
Poster
)
>
link
|
Adwait Datar · Chinmay Datar · Zahra Monfared · Felix Dietrich
🔗
|
-
|
SOAP: Improving and Stabilizing Shampoo using Adam
(
Poster
)
>
link
|
Nikhil Vyas · Depen Morwani · Rosie Zhao · Itai Shapira · David Brandfonbrener · Lucas Janson · Sham Kakade
🔗
|
-
|
Fast Convergence of Softmax Policy Mirror Ascent for Bandits & Tabular MDPs
(
Poster
)
>
link
|
Reza Asad · Reza Babanezhad Harikandeh · Issam Hadj Laradji · Nicolas Le Roux · Sharan Vaswani
🔗
|
-
|
A Unified Convergence Theory for Large Language Model Efficient Fine-tuning
(
Poster
)
>
link
|
Zhanhong Jiang · Nastaran Saadati · Aditya Balu · Minh Pham · Joshua R Waite · Nasla Saleem · Chinmay Hegde · Soumik Sarkar
🔗
|
-
|
KFOpt: Noise Reduction with Kalman Filter for Improving Differentially Private Optimization
(
Poster
)
>
link
|
Xinwei Zhang · Zhiqi Bu · Borja Balle · Mingyi Hong · Meisam Razaviyayn · Vahab Mirrokni
🔗
|
-
|
Connections between Schedule-Free SGD, Accelerated SGD Variants, and Weight Averaging
(
Poster
)
>
link
|
Depen Morwani · Nikhil Vyas · Hanlin Zhang · Sham Kakade
🔗
|
-
|
On the Hypomonotone Class of Variational Inequalities
(
Poster
)
>
link
|
Khaled Alomar · Tatjana Chavdarova
🔗
|
-
|
Nonlinear tomographic reconstruction via nonsmooth optimization
(
Poster
)
>
link
|
Vasileios Charisopoulos · Rebecca Willett
🔗
|
-
|
Path Integral Optimiser: Global Optimisation via Neural Schrödinger-Föllmer Diffusion
(
Poster
)
>
link
|
Max McGuinness · Eirik Fladmark · Francisco Vargas
🔗
|
-
|
On the Hardness of Meaningful Local Guarantees in Nonsmooth Nonconvex Optimization
(
Poster
)
>
link
|
Guy Kornowski · Swati Padmanabhan · Ohad Shamir
🔗
|
-
|
MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times
(
Poster
)
>
link
|
Artavazd Maranjyan · Omar Shaikh Omar · Peter Richtarik
🔗
|
-
|
Multi Objective Regionalized Bayesian Optimization via Entropy Search
(
Poster
)
>
link
|
Thomas James · Sinnu Thomas
🔗
|
-
|
Incentivizing Truthful Collaboration in Heterogeneous Federated Learning
(
Poster
)
>
link
|
Dimitar Chakarov · Nikita Tsoy · Kristian Minchev · Nikola Konstantinov
🔗
|
-
|
Solving hidden monotone variational inequalities with surrogate losses
(
Poster
)
>
link
|
Ryan D'Orazio · Danilo Vucetic · Zichu Liu · Junhyung Lyle Kim · Ioannis Mitliagkas · Gauthier Gidel
🔗
|
-
|
Linear Attention Sequence Parallelism
(
Poster
)
>
link
|
Weigao Sun · Zhen Qin · Dong Li · Xuyang Shen · Yu Qiao · Yiran Zhong
🔗
|
-
|
Revisiting the Initial Steps in Adaptive Gradient Descent Optimization
(
Poster
)
>
link
|
Abulikemu Abuduweili · Changliu Liu
🔗
|
-
|
Modularity aided consistent attributed graph clustering via coarsening
(
Poster
)
>
link
|
Samarth Bhatia · Yukti Makhija · Manoj Kumar · Sandeep Kumar
🔗
|
-
|
On the Crucial Role of Initialization for Matrix Factorization
(
Poster
)
>
link
|
Bingcong Li · Liang Zhang · Aryan Mokhtari · Niao He
🔗
|
-
|
Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking
(
Poster
)
>
link
|
Sahar Rajabi · Sirisha Rambhatla
🔗
|
-
|
ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training
(
Poster
)
>
link
|
Adel Nabli · Louis Fournier · Pierre Erbacher · Louis Serrano · Eugene Belilovsky · Edouard Oyallon
🔗
|
-
|
Stochastic Proximal Point Methods for Monotone Inclusions under Expected Similarity
(
Poster
)
>
link
|
Abdurakhmon Sadiev · Laurent Condat · Peter Richtarik
🔗
|
-
|
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
(
Poster
)
>
link
|
Hiroki Naganuma · Xinzhi Zhang · Man-Chung Yue · Ioannis Mitliagkas · Russell J. Hewett · Philipp Witte · Yin Tat Lee
🔗
|
-
|
Weak to Strong Learning from Aggregate Labels
(
Poster
)
>
link
|
Yukti Makhija · Rishi Saket
🔗
|
-
|
Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks
(
Poster
)
>
link
|
Shikai Qiu · Atish Agarwala · Lechao Xiao · Jeffrey Pennington
🔗
|
-
|
Stop Being So Positive: Negative Step Sizes in Second-Order Methods
(
Poster
)
>
link
|
Betty Shea · Mark Schmidt
🔗
|
-
|
Structured Regularization on the SPD Manifold
(
Poster
)
>
link
|
Andrew Cheng · Melanie Weber
🔗
|
-
|
Dual Feature Reduction for the Sparse-Group Lasso and its Adaptive Variant
(
Poster
)
>
link
|
Fabio Feser · Marina Evangelou
🔗
|
-
|
Communication-Efficient Loss Minimization over Heterogeneous Data with Federated Hierarchical Ensemble Aggregation via Distillation
(
Poster
)
>
link
|
Sayantan Chowdhury · Ben Liang · Ali Tizghadam · Ilijc Albanese
🔗
|
-
|
Memory Efficient Adaptive Stochastic Optimization via Subset-Norm
(
Poster
)
>
link
|
Thien H Nguyen · Huy Nguyen
🔗
|
-
|
Spurious Stationarity and Hardness Results for Mirror Descent
(
Poster
)
>
link
|
He Chen · Jiajin Li · Anthony Man-Cho So
🔗
|
-
|
LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression
(
Poster
)
>
link
|
Laurent Condat · Artavazd Maranjyan · Peter Richtarik
🔗
|
-
|
Discrete-Continuous Variational Optimization with Local Gradients
(
Poster
)
>
link
|
Jonathan Warrell · Francesco Alesiani · Cameron Smith · Anja Mösch · Martin Renqiang Min
🔗
|
-
|
Differentially Private Random Block Coordinate Descent
(
Poster
)
>
link
|
Artavazd Maranjyan · Abdurakhmon Sadiev · Peter Richtarik
🔗
|
-
|
$\mu$LO: Compute-Efficient Meta-Generalization of Learned Optimizers
(
Poster
)
>
link
|
Benjamin Thérien · Charles-Étienne Joseph · Boris Knyazev · Edouard Oyallon · Irina Rish · Eugene Belilovsky
🔗
|
-
|
SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Nonconvex Cross-Device Federated Learning
(
Poster
)
>
link
|
Avetik Karagulyan · Egor Shulgin · Abdurakhmon Sadiev · Peter Richtarik
🔗
|
-
|
The Dimension Strikes Back with Gradients: Generalization of Gradient Methods in Stochastic Convex Optimization
(
Poster
)
>
link
|
Matan Schliserman · Uri Sherman · Tomer Koren
🔗
|
-
|
On Convergence of SGD with Adaptive Clipping
(
Poster
)
>
link
|
Egor Shulgin · Peter Richtarik
🔗
|
-
|
Optimizing Attention
(
Poster
)
>
link
|
Hanno Ackermann · Hong Cai · Markus Nagel · Leyla Mirvakhabova · Farhad G. Zanjani · Fatih Porikli
🔗
|
-
|
The Crucial Role of Samplers in Online Direct Preference Optimization
(
Poster
)
>
link
|
Ruizhe Shi · Runlong Zhou · Simon Du
🔗
|
-
|
Applications of fractional calculus in learned optimization
(
Poster
)
>
link
|
Teodor Szente · James Harrison · Mihai Zanfir · Cristian Sminchisescu
🔗
|
-
|
Stochastic Quasi-Variational Inequalities: Convergence Analysis Beyond Strong Monotonicity
(
Poster
)
>
link
|
Zeinab Alizadeh · Afrooz Jalilzadeh
🔗
|
-
|
From Gradient Clipping to Normalization for Heavy Tailed SGD
(
Poster
)
>
link
|
Florian Hübler · Ilyas Fatkhullin · Niao He
🔗
|
-
|
Estimating Vote Choice in U.S. Elections with Approximate Poisson-Binomial Logistic Regression
(
Poster
)
>
link
|
Nic Fishman · Evan Rosenman
🔗
|
-
|
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
(
Poster
)
>
link
|
Charles Blake · Constantin Eichenberg · Josef Dean · Lukas Balles · Luke Prince · Björn Deiseroth · Andres Felipe Cruz-Salinas · Carlo Luschi · Samuel Weinbach · Douglas Orr
🔗
|
-
|
Personalized Federated Learning via Low-Rank Matrix Factorization
(
Poster
)
>
link
|
Ali Dadras · Sebastian Stich · Alp Yurtsever
🔗
|
-
|
Online Nonconvex Bilevel Optimization with Bregman Divergences
(
Poster
)
>
link
|
Jason Bohne · David Rosenberg · Gary Kazantsev · Pawel Polak
🔗
|
-
|
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
(
Poster
)
>
link
|
Neal G. Lawton · Aram Galstyan · Greg Ver Steeg
🔗
|
-
|
A Second-Order Algorithm for Empirical Group Distributionally Robust Regression
(
Poster
)
>
link
|
Naren Manoj · Kumar Kshitij Patel
🔗
|
-
|
Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior
(
Poster
)
>
link
|
Anming Gu · Edward Chien · Kristjan Greenewald
🔗
|
-
|
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks
(
Poster
)
>
link
|
Amrutha Varshini Ramesh · Vignesh Ganapathiraman · Issam Hadj Laradji · Mark Schmidt
🔗
|
-
|
Nonmonotone Line Searches Operate at the Edge of Stability
(
Poster
)
>
link
|
Curtis Fox · Leonardo Galli · Mark Schmidt · Holger Rauhut
🔗
|
-
|
Glocal Smoothness: Line Search can really help!
(
Poster
)
>
link
|
Curtis Fox · Mark Schmidt
🔗
|
-
|
Dueling in the Dark: An Efficient and Optimal Mirror Descent Approach for Online Optimization with Adversarial Preferences
(
Poster
)
>
link
|
Aadirupa Saha · Barry-John Theobald · Yonathan Efroni
🔗
|
-
|
Fast decentralized gradient tracking for federated learning with local updates: From mini to minimax optimization
(
Poster
)
>
link
|
Chris Junchi Li
🔗
|
-
|
Communication-efficient Algorithms Under Generalized Smoothness Assumptions
(
Poster
)
>
link
|
Sarit Khirirat · Abdurakhmon Sadiev · Artem Riabinin · Eduard Gorbunov · Peter Richtarik
🔗
|
-
|
Policy Optimization for Strictly Batch Imitation Learning
(
Poster
)
>
link
|
Rishabh Agrawal · Nathan Dahlin · Rahul Jain · Ashutosh Nayyar
🔗
|
-
|
Understanding Adam Requires Better Rotation Dependent Assumptions
(
Poster
)
>
link
|
Tianyue Zhang · Lucas Maes · Charles Guille-Escuret · Alexia Jolicoeur-Martineau · Ioannis Mitliagkas · Simon Lacoste-Julien · Damien Scieur
🔗
|
-
|
Aggregating Data for Optimal and Private Learning
(
Poster
)
>
link
|
Sushant Agarwal · Yukti Makhija · Rishi Saket · Aravindan Raghuveer
🔗
|
-
|
Intuitive Analysis of the Quantization based Optimization: From establishing a SDE to Quantum Mechanical Perspective
(
Poster
)
>
link
|
Jinwuk Seok · Changsik Cho
🔗
|
-
|
In the Search for Optimal Portfolios of Counterstrategies in the Large Imperfect Information Games
(
Poster
)
>
link
|
Karolina Drabent · David Milec · Ondrej Kubicek · Viliam Lisy
🔗
|
-
|
Lion's sign noise can make training more stable
(
Poster
)
>
link
|
Simon Elistratov · Andrey Podivilov · Timofei Iuzhakov · Dmitry Vetrov
🔗
|
-
|
Dimensionality Reduction Techniques for Global Bayesian Optimisation
(
Poster
)
>
link
|
Luo Long · Coralia Cartis · Paz Fink Shustin
🔗
|
-
|
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time
(
Poster
)
>
link
|
Yingyu Liang · Zhizhou Sha · Zhenmei Shi · Zhao Song · Yufa Zhou
🔗
|