Workshop
The Fourth Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV): Highlighting New Architectures for Future Foundation Models
Mehdi Rezagholizadeh · Peyman Passban · Yu Cheng · Soheila Samiee · Yue Dong · Vahid Partovi Nia · Qun Liu · Boxing Chen
West Meeting Room 301
Sat 14 Dec, 8:15 a.m. PST
The fourth edition of the Efficient Natural Language and Speech Processing (ENLSP-IV) workshop will focus on how to make large language and foundation models more efficient in terms of architecture, training, and inference in real-world applications. This year, following the trend in industry and academia, we put more emphasis on investigating new architectures to make future language and foundation models more efficient. Moreover, we highlight the importance of comprehensive evaluation and benchmarking of new efficient models from different practical aspects. The workshop program offers an interactive platform for gathering experts and talent from academia and industry through invited talks, a panel discussion, paper submissions and reviews, interactive poster sessions, oral presentations, and a couple of mentorship sessions for new researchers. This will be a unique opportunity to discuss and share challenging problems, build connections, exchange ideas and brainstorm, and foster future collaborations. The topics of this workshop may be of interest to people working on general machine learning, deep learning, hardware, optimization, theory, and applications.
Schedule
Sat 8:15 a.m. - 8:30 a.m.
|
Opening Remarks
(
Opening
)
>
link
SlidesLive Video |
Mehdi Rezagholizadeh 🔗 |
Sat 8:30 a.m. - 9:00 a.m.
|
Efficiency through Learning from Experience
(
KeyNote Talk
)
>
SlidesLive Video |
Bhavana Dalvi Mishra · Peter Clark 🔗 |
Sat 9:00 a.m. - 9:30 a.m.
|
Multi-Teacher Distillation: An Ensemble-Then-Distill Approach
(
KeyNote Talk
)
>
SlidesLive Video |
Lili Mou 🔗 |
Sat 9:30 a.m. - 10:00 a.m.
|
Morning Break
|
🔗 |
Sat 10:00 a.m. - 10:30 a.m.
|
Hardware-aware Algorithms for Language Modeling
(
KeyNote Talk
)
>
SlidesLive Video |
Tri Dao 🔗 |
Sat 10:30 a.m. - 11:00 a.m.
|
Speech generative modeling with little tokenization
(
KeyNote Talk
)
>
SlidesLive Video |
Navdeep Jaitly 🔗 |
Sat 11:00 a.m. - 11:30 a.m.
|
Optimizing Data Use for Efficient Pre-training
(
KeyNote Talk
)
>
SlidesLive Video |
Danqi Chen 🔗 |
Sat 11:30 a.m. - 11:36 a.m.
|
Sparsified State-Space Models are Efficient Highway Networks
(
Oral
)
>
SlidesLive Video |
Woomin Song · Jihoon Tack · Sangwoo Mo · Seunghyuk Oh · Jinwoo Shin 🔗 |
Sat 11:36 a.m. - 11:42 a.m.
|
Longhorn: State Space Models are Amortized Online Learners
(
Oral
)
>
SlidesLive Video |
Bo Liu · Rui Wang · Lemeng Wu · Yihao Feng · Peter Stone · Qiang Liu 🔗 |
Sat 11:42 a.m. - 11:48 a.m.
|
GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference
(
Oral
)
>
SlidesLive Video |
Qingru Zhang · Souvik Kundu · Geonhwa Jeong · Zaoxing Liu · Tushar Krishna · Tuo Zhao 🔗 |
Sat 11:48 a.m. - 11:54 a.m.
|
An Evolved Universal Transformer Memory
(
Oral
)
>
SlidesLive Video |
Edoardo Cetin · Qi Sun · Tianyu Zhao · Yujin Tang 🔗 |
Sat 11:54 a.m. - 12:00 p.m.
|
OLMoE: Open Mixture-of-Experts Language Models
(
Oral
)
>
SlidesLive Video |
23 presenters: Niklas Muennighoff · Luca Soldaini · Dirk Groeneveld · Kyle Lo · Jacob Morrison · Sewon Min · Weijia Shi · Evan Walsh · Oyvind Tafjord · Nathan Lambert · Yuling Gu · Shane Arora · Akshita Bhagia · Dustin Schwenk · David Wadden · Alexander Wettig · Binyuan Hui · Tim Dettmers · Douwe Kiela · Noah Smith · Pang Wei Koh · Amanpreet Singh · Hannaneh Hajishirzi |
Sat 12:00 p.m. - 1:30 p.m.
|
Lunch Break
|
🔗 |
Sat 12:30 p.m. - 1:30 p.m.
|
Poster Session I- (Paper IDs #1 - #50) ( Posters ) > link | 🔗 |
Sat 1:30 p.m. - 1:36 p.m.
|
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
(
Oral
)
>
link
SlidesLive Video |
14 presenters: Di Liu · Meng Chen · Baotong Lu · Huiqiang Jiang · Zhenhua Han · Qianxi Zhang · Qi Chen · Chengruidong Zhang · Bailu Ding · Kai Zhang · Chen Chen · Fan Yang · Yuqing Yang · Lili Qiu |
Sat 1:36 p.m. - 1:42 p.m.
|
Post-Training Statistical Calibration for Higher Activation Sparsity
(
Oral
)
>
SlidesLive Video |
Vui Seng Chua · Yujie Pan · Nilesh Jain 🔗 |
Sat 1:42 p.m. - 1:48 p.m.
|
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
(
Oral
)
>
SlidesLive Video |
Niklas Schmidinger · Lisa Schneckenreiter · Philipp Seidl · Johannes Schimunek · Pieter-Jan Hoedt · Johannes Brandstetter · Andreas Mayr · Sohvi Luukkonen · Sepp Hochreiter · Günter Klambauer 🔗 |
Sat 1:48 p.m. - 1:54 p.m.
|
Inference-Friendly Models With MixAttention
(
Oral
)
>
SlidesLive Video |
Shashank Rajput · Ying Sheng · Sean Owen · Vitaliy Chiley 🔗 |
Sat 1:54 p.m. - 2:00 p.m.
|
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation
(
Oral
)
>
SlidesLive Video |
Fabian Paischer · Lukas Hauzenberger · Thomas Schmied · Benedikt Alkin · Marc Deisenroth · Sepp Hochreiter 🔗 |
Sat 2:00 p.m. - 2:30 p.m.
|
The LoRA Journey and Learnings: from Creation to Industrial-Scale Adoption
(
KeyNote Talk
)
>
SlidesLive Video |
Weizhu Chen 🔗 |
Sat 2:30 p.m. - 3:00 p.m.
|
How to build fully open language models: from pre-training to post-training
(
KeyNote Talk
)
>
SlidesLive Video |
Hannaneh Hajishirzi 🔗 |
Sat 3:00 p.m. - 3:30 p.m.
|
Afternoon Break
|
🔗 |
Sat 3:30 p.m. - 4:20 p.m.
|
Panel Discussion
(
Panel
)
>
SlidesLive Video |
Joel Hestness · Navdeep Jaitly · Marjan Ghazvininejad · Katie Derthick · Yue Dong · Soheila Samiee 🔗 |
Sat 4:20 p.m. - 4:30 p.m.
|
Best Paper Awards and Closing Remarks
(
Closing
)
>
SlidesLive Video |
Mehdi Rezagholizadeh 🔗 |
Sat 4:30 p.m. - 5:30 p.m.
|
Poster Session II- (Paper IDs #51 - #105) ( Posters ) > link | 🔗 |
-
|
BiRNA-BERT: Adaptive Tokenization for Efficient RNA Language Modeling
(
Oral
)
>
SlidesLive Video |
Toki Tahmid · Haz Sameen Shahgir · Sazan Mahbub · Yue Dong · Md. Shamsuzzoha Bayzid 🔗 |
-
|
Snakes and Ladders: Accelerating SSM Inference with Speculative Decoding
(
Poster
)
>
SlidesLive Video |
Yangchao Wu · Yonatan Dukler · Matthew Trager · Alessandro Achille · Wei Xia · Stefano Soatto 🔗 |
-
|
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
(
Poster
)
>
SlidesLive Video |
12 presenters: Michael Pieler · Marco Bellagente · Hannah Teufel · Duy Phung · Nathan Cooper · Jonathan Tow · Paulo Rocha · Reshinth Adithyan · Zaid Alyafeai · Nikhil Pinnaparaju · Maksym Zhuravinskyi · Carlos Riquelme Ruiz |
-
|
ThinK: Thinner Key Cache by Query-Driven Pruning
(
Poster
)
>
SlidesLive Video |
Yuhui Xu · Allan Jie · Hanze Dong · Lei Wang · Xudong LU · Aojun Zhou · Amrita Saha · Caiming Xiong · Doyen Sahoo 🔗 |
-
|
A dynamic parallel method for performance optimization on hybrid CPUs
(
Poster
)
>
|
Yu Luo · Yucheng Liu · Haihao Shen 🔗 |
-
|
Disentangling Questions from Query Generation for Task-Adaptive Retrieval
(
Poster
)
>
SlidesLive Video |
Yoonsang Lee · Minsoo Kim · seung-won hwang 🔗 |
-
|
The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation
(
Poster
)
>
|
Lawrence Stewart · Matthew Trager · Sujan Gonugondla · Stefano Soatto 🔗 |
-
|
Different Rates for Different Weights: Decoupled Relative Learning Rate Schedules
(
Poster
)
>
SlidesLive Video |
Jan Ludziejewski · Jan Małaśnicki · Maciej Pióro · Michał Krutul · Kamil Ciebiera · Jakub Krajewski · Marek Cygan · Kamil Adamczewski · Sebastian Jaszczur 🔗 |
-
|
Distributed Speculative Inference of Large Language Models is Provably Faster
(
Poster
)
>
SlidesLive Video |
Nadav Timor · Jonathan Mamou · Oren Pereg · Moshe Berchansky · Daniel Korat · Moshe Wasserblat · Tomer Galanti · Michal Gordon (Kiwkowitz) · David Harel 🔗 |
-
|
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
(
Poster
)
>
|
Joao Monteiro · Etienne Marcotte · Pierre-Andre Noel · Valentina Zantedeschi · David Vazquez · Nicolas Chapados · Christopher Pal · Perouz Taslakian 🔗 |
-
|
Text Summarization With Graph Attention Networks
(
Poster
)
>
SlidesLive Video |
Mohammadreza Ardestani · Yllias Chali 🔗 |
-
|
How Redundant Is the Transformer Stack in Speech Representation Models?
(
Poster
)
>
SlidesLive Video |
Albert Kjøller Jacobsen · Teresa Scheidt · Lenka Hýlová · Lars Kai Hansen 🔗 |
-
|
Dynamic Speculation Lookahead Accelerates Speculative Decoding of Large Language Models
(
Poster
)
>
SlidesLive Video |
Jonathan Mamou · Oren Pereg · Daniel Korat · Moshe Berchansky · Nadav Timor · Moshe Wasserblat · Roy Schwartz 🔗 |
-
|
AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability
(
Poster
)
>
SlidesLive Video |
Sudhanshu Agrawal · Wonseok Jeon · Mingu Lee 🔗 |
-
|
OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters
(
Poster
)
>
SlidesLive Video |
Zexin Chen · Chengxi Li · Xiangyu Xie · Parijat Dube 🔗 |
-
|
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
(
Poster
)
>
|
Ashwinee Panda · Vatsal Baherwani · Zain Sarwar · Benjamin Therien · Sambit Sahu · Stephen Rawls · Supriyo Chakraborty · Tom Goldstein 🔗 |
-
|
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
(
Poster
)
>
SlidesLive Video |
11 presenters: Yun Zhu · Jia-Chen Gu · Caitlin Sikora · Ho Ko · Yinxiao Liu · Chu-Cheng Lin · Lei Shu · Liangchen Luo · Lei Meng · Bang Liu · Jindong Chen |
-
|
VL-Mamba: Exploring State Space Models for Multimodal Learning
(
Poster
)
>
|
Yanyuan Qiao · Zheng Yu · Zijia Zhao · Sihan Chen · Mingzhen Sun · Longteng Guo · Qi Wu · Jing Liu 🔗 |
-
|
Less is Enough: Adapting Pre-trained Vision Transformers for Audio-Visual Speaker Verification
(
Poster
)
>
SlidesLive Video |
Gnana Praveen Rajasekhar · MD JAHANGIR ALAM 🔗 |
-
|
Composite Attention: A Framework for Combining Sequence Mixing Primitives
(
Poster
)
>
|
Jake Cunningham · Marc Deisenroth 🔗 |
-
|
S2D: Sorted Speculative Decoding For More Efficient Deployment of Large Language Models
(
Poster
)
>
|
Parsa Kavehzadeh · Mohammadreza Pourreza · Mojtaba Valipour · Tianshu Zhu · Haoli Bai · Ali Ghodsi · Boxing Chen · Mehdi Rezagholizadeh 🔗 |
-
|
On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning
(
Poster
)
>
SlidesLive Video |
Soheila Samiee · Anton Thielmann 🔗 |
-
|
Scaling Smart: Accelerating Large Language Model Pre-Training with Small Model Initialization
(
Poster
)
>
SlidesLive Video |
Mohammad Samragh · Iman Mirzadeh · Keivan Alizadeh-Vahid · Fartash Faghri · Minsik Cho · Moin Nabi · Devang Naik · Mehrdad Farajtabar 🔗 |
-
|
MisD-MoE: A Multimodal Misinformation Detection Framework with Adaptive Feature Selection
(
Poster
)
>
SlidesLive Video |
Moyang Liu · Kaiying Yan · Yukun Liu · Ruibo Fu · zhengqi wen · Xuefei Liu · Chenxing Li 🔗 |
-
|
Approximations may be all you need: Towards Pre-training LLMs with Low-Rank Decomposition and Optimizers
(
Poster
)
>
SlidesLive Video |
Namrata Shivagunde · Mayank Kulkarni · Giannis Karamanolakis · Jack FitzGerald · Yannick Versley · Saleh Soltan · Volkan Cevher · Jianhua Lu · Anna Rumshisky 🔗 |
-
|
Partially Shared Query-Key for Lightweight Language Models
(
Poster
)
>
SlidesLive Video |
Kai Yang · Vahid Partovi Nia · Boxing Chen · Masoud Asgharian 🔗 |
-
|
RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection
(
Poster
)
>
SlidesLive Video |
Ali Saheb Pasand · Pouya Bashivan 🔗 |
-
|
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities
(
Poster
)
>
SlidesLive Video |
Vicky Zayats · Peter Chen · Melissa Ferrari · Dirk Padfield 🔗 |
-
|
Improving Multi-candidate Speculative Decoding
(
Poster
)
>
SlidesLive Video |
XiaoFan Lu · Yixiao Zeng · marco levorato · Feiyang Ma · ZiXu Yu 🔗 |
-
|
Efficient Alignment of Large Language Models via Data Sampling
(
Poster
)
>
SlidesLive Video |
Amrit Khera · Rajat Ghosh · Debojyoti Dutta 🔗 |
-
|
Towards Low-bit Communication for Tensor Parallel LLM Inference
(
Poster
)
>
SlidesLive Video |
Harry Dong · Tyler Johnson · Minsik Cho · Emad Soroush 🔗 |
-
|
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
(
Poster
)
>
SlidesLive Video |
Luning Wang · Shiyao Li · Xuefei Ning · Zhihang Yuan · Shengen Yan · Guohao Dai · Yu Wang 🔗 |
-
|
StructMoE: Structured Mixture of Experts Using Low Rank Experts
(
Poster
)
>
|
11 presenters: Zain Sarwar · Ashwinee Panda · Benjamin Thérien · Stephen Rawls · Anirban Das · Kartik Balasubramaniam · Berkcan Kapusuzoglu · Shixiong Zhang · Sambit Sahu · MILIND NAPHADE · Supriyo Chakraborty |
-
|
A Systematic Evaluation of Decoding-Free Generative Candidate Selection Methods
(
Poster
)
>
|
Mingyu Derek Ma · Yanna Ding · Zijie Huang · Jianxi Gao · Yizhou Sun · Wei Wang 🔗 |
-
|
Dynamic layer selection in decoder-only transformers
(
Poster
)
>
SlidesLive Video |
Theodore Glavas · Joud Chataoui · florence regol · Wassim Jabbour · Antonios Valkanas · Mark Coates · Boris Oreshkin 🔗 |
-
|
KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation
(
Poster
)
>
SlidesLive Video |
Rambod Azimi · Rishav Rishav · Marek Teichmann · Samira Ebrahimi Kahou 🔗 |
-
|
Sparse Upcycling: Inference Inefficient Finetuning
(
Poster
)
>
SlidesLive Video |
Sasha Doubov · Nikhil Sardana · Vitaliy Chiley 🔗 |
-
|
A Unified Framework for Speculative Decoding with Multiple Drafters as a Bandit
(
Poster
)
>
|
Taehyeon Kim · Hojung Jung · Se-Young Yun 🔗 |
-
|
Residual vector quantization for KV cache compression in large language model
(
Poster
)
>
SlidesLive Video |
Ankur Kumar 🔗 |
-
|
Accelerating the Low-Rank Decomposed Models
(
Poster
)
>
|
Habib Hajimolahoseini · Walid Ahmed · Shuangyue Wen · Yang Liu 🔗 |
-
|
CROSS-JEM: Accurate and Efficient Cross-encoders for Short-text Ranking Tasks
(
Poster
)
>
SlidesLive Video |
Bhawna Paliwal · Deepak Saini · Mudit Dhawan · Siddarth Asokan · Nagarajan Natarajan · Surbhi Aggarwal · Pankaj Malhotra · Jian Jiao · Manik Varma 🔗 |
-
|
Enhanced label noise robustness through early adaptive filtering for the self-supervised speaker verification task
(
Poster
)
>
SlidesLive Video |
Abderrahim Fathan · Xiaolin Zhu · MD JAHANGIR ALAM 🔗 |
-
|
Speculative Streaming: Fast LLM Inference without Auxiliary Models
(
Poster
)
>
SlidesLive Video |
Nikhil Bhendawade · Mahyar Najibi · Irina Belousova · Qichen Fu · Henry Mason · Mohammad Rastegari 🔗 |
-
|
On the Implicit Relation between Low-Rank Adaptation and Differential Privacy
(
Poster
)
>
SlidesLive Video |
Saber Malekmohammadi · Golnoosh Farnadi 🔗 |
-
|
Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
(
Poster
)
>
|
Habib Hajimolahoseini · Walid Ahmed · Shuangyue Wen · Yang Liu 🔗 |
-
|
Hysteresis Activation Function for Efficient Inference
(
Poster
)
>
SlidesLive Video |
Moshe Kimhi · Idan Kashani · Chaim Baskin · Avi Mendelson 🔗 |
-
|
Approximate Top-k for Increased Parallelism
(
Poster
)
>
link
SlidesLive Video |
Oscar Key · Luka Ribar · Alberto Cattaneo · Luke Hudlass-Galley · Douglas Orr 🔗 |
-
|
A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression
(
Poster
)
>
SlidesLive Video |
Alessio Devoto · Yu Zhao · Simone Scardapane · Pasquale Minervini 🔗 |
-
|
RAEE: A Robust Retrieval-Augmented Early Exiting Framework for Efficient Inference
(
Poster
)
>
|
Lianming HUANG · Shangyu Wu · Yufei Cui · Ying Xiong · Xue (Steve) Liu · Tei-Wei Kuo · Nan Guan · Chun Jason XUE 🔗 |
-
|
Efficiently Dispatching Flash Attention For Partially Filled Attention Masks
(
Poster
)
>
SlidesLive Video |
Agniv Sharma · Jonas Geiping 🔗 |
-
|
Speculative Diffusion Decoding for Accelerated Language Generation
(
Poster
)
>
SlidesLive Video |
Jacob K Christopher · Brian Bartoldson · Tal Ben-Nun · Michael Cardei · Bhavya Kailkhura · Nando Fioretto 🔗 |
-
|
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
(
Poster
)
>
SlidesLive Video |
Keivan Alizadeh-Vahid · Iman Mirzadeh · Hooman Shahrkokhi · Dmitry Belenko · Frank Sun · Minsik Cho · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar 🔗 |
-
|
The EarlyBird Gets the WORM: Heuristically Accelerating EarlyBird Convergence
(
Poster
)
>
SlidesLive Video |
Adithya Vasudev 🔗 |
-
|
Post Training Quantization of Large Language Models with Microscaling Formats
(
Poster
)
>
SlidesLive Video |
Sayeh Sharify · Utkarsh Saxena · Zifei Xu · Wanzin Yazar · Ilya Soloveychik · Xin Wang 🔗 |
-
|
Computational Bottlenecks of Training Small-scale Large Language Models
(
Poster
)
>
SlidesLive Video |
Saleh Ashkboos · Iman Mirzadeh · Keivan Alizadeh-Vahid · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar · Fartash Faghri 🔗 |
-
|
QuAILoRA: Quantization-Aware Initialization for LoRA
(
Poster
)
>
SlidesLive Video |
Neal G. Lawton · Aishwarya Padmakumar · Judith Gaspers · Jack FitzGerald · Anoop Kumar · Greg Ver Steeg · Aram Galstyan 🔗 |
-
|
Mai Ho`omāuna i ka `Ai: Language Models Improve Automatic Speech Recognition in Hawaiian
(
Poster
)
>
SlidesLive Video |
Kaavya Chaparala · Guido Zarrella · Bruce Torres Fischer · Larry Kimura · Oiwi Parker Jones 🔗 |
-
|
Inducing Elasticity in Foundation Models: Post-Training Techniques for Adaptable Inference
(
Poster
)
>
|
Aashiq Muhamed · Jiarui Liu · Mona Diab · Virginia Smith 🔗 |
-
|
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
(
Poster
)
>
SlidesLive Video |
Yongchang Hao · Yanshuai Cao · Lili Mou 🔗 |
-
|
Enabling Resource-Efficient On-Device Fine-Tuning of LLMs Using Only Inference Engines
(
Poster
)
>
SlidesLive Video |
Lei Gao · Amir Ziashahabi · Yue Niu · Salman Avestimehr · Murali Annavaram 🔗 |
-
|
EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models
(
Poster
)
>
|
Hossein Rajabzadeh · Aref Jafari · Aman Sharma · Benyamin Jami · HYOCK JU KWON · Ali Ghodsi · Boxing Chen · Mehdi Rezagholizadeh 🔗 |
-
|
ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain
(
Poster
)
>
SlidesLive Video |
Ali Shiraee Kasmaee · Mohammad Khodadad · Mohammad Arshi Saloot · Nick Sherck · Stephen Dokas · Hamidreza Mahyar · Soheila Samiee 🔗 |
-
|
Scaling laws for post-training quantized large language models
(
Poster
)
>
SlidesLive Video |
Zifei Xu · Alexander Lan · Wanzin Yazar · Tristan Webb · Sayeh Sharify · Xin Wang 🔗 |
-
|
Beyond Token Generation: Adaptive Chunk-Distilled Language Modeling
(
Poster
)
>
SlidesLive Video |
Yanhong Li · Karen Livescu · Jiawei Zhou 🔗 |
-
|
Dataset Distillation for Audio Classification: A Data-Efficient Alternative to Active Learning
(
Poster
)
>
SlidesLive Video |
Gautham Krishna Gudur · Edison Thomaz 🔗 |
-
|
SharedContextBench: How Lossy are Long-context Methods in KV Cache Reuse
(
Poster
)
>
SlidesLive Video |
11 presenters: Yucheng LI · Huiqiang Jiang · Qianhui Wu · Xufang Luo · Surin Ahn · Chengruidong Zhang · Amir Abdi · Dongsheng Li · Jianfeng Gao · Yuqing Yang · Lili Qiu |
-
|
Dynamic Vocabulary Pruning in Early-Exit LLMs
(
Poster
)
>
SlidesLive Video |
Karim Abdel Sadek · Joan Velja · Matteo Nulli · Metod Jazbec 🔗 |
-
|
FastDraft: How to Train Your Draft
(
Poster
)
>
SlidesLive Video |
Ofir Zafrir · Igor Margulis · Dorin Shteyman · Guy Boudoukh 🔗 |
-
|
Lightweight Neural Networks for Speech Emotion Recognition using Layer-wise Adaptive Quantization
(
Poster
)
>
|
Tushar Shinde · RITIKA JAIN · Avinash Kumar Sharma 🔗 |
-
|
LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy
(
Poster
)
>
|
Rongzhi Zhang · Kuan Wang · Liyuan Liu · Shuohang Wang · Hao Cheng · Chao Zhang · Yelong Shen 🔗 |
-
|
Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts
(
Poster
)
>
SlidesLive Video |
Youngseog Chung · Dhruv Malik · Jeff Schneider · Yuanzhi Li · Aarti Singh 🔗 |
-
|
SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings
(
Poster
)
>
SlidesLive Video |
Mohammad Ali Sadraei Javaheri · Ehsaneddin Asgari · Alice McHardy · Hamid Rabiee 🔗 |
-
|
Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning
(
Poster
)
>
SlidesLive Video |
Soumajyoti Sarkar · Leonard Lausen · Volkan Cevher · Thomas Brox · Sheng Zha · George Karypis 🔗 |
-
|
Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
(
Poster
)
>
SlidesLive Video |
Costin-Andrei Oncescu · Sanket Purandare · Stratos Idreos · Sham Kakade 🔗 |