NeurIPS Local LoRA: Memory-Efficient Fine-Tuning of Large Language Models

Poster in Workshop: Third Workshop on Efficient Natural Language and Speech Processing (ENLSP-III): Towards the Future of Large Language Models and their Emerging Descendants

Oscar Key · Jean Kaddour · Pasquale Minervini

Abstract:

We present Local LoRA, a memory-flexible fine-tuning approach that, in principle, can fine-tune an arbitrarily large model on fixed hardware, including consumer-grade GPUs. Our approach aims to decouple the size of the model from the memory required to fine-tune it by dividing the model into chunks and fine-tuning each chunk sequentially. Our results show that Local LoRA closes the gap between the un-tuned model and end-to-end LoRA on math reasoning tasks.
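
The chunking idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the helper names `split_into_chunks` and `peak_layers_in_memory`, the layer count, and the chunk size are all hypothetical, and the per-chunk training objective is left as a placeholder because the abstract does not describe it. The sketch only shows why tuning chunks one after another could bound the number of layers resident at once by the chunk size rather than by the model depth; it ignores activations, optimizer state, and how data flows between chunks.

```python
from typing import List, Tuple


def split_into_chunks(num_layers: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return contiguous [start, end) layer ranges covering all layers."""
    if num_layers <= 0 or chunk_size <= 0:
        raise ValueError("num_layers and chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_layers))
        for start in range(0, num_layers, chunk_size)
    ]


def peak_layers_in_memory(chunks: List[Tuple[int, int]]) -> int:
    """Largest number of layers resident at once if chunks are tuned in turn."""
    return max(end - start for start, end in chunks)


# Hypothetical sizes, chosen only for illustration.
chunks = split_into_chunks(num_layers=32, chunk_size=8)
for start, end in chunks:
    # Placeholder for "fine-tune layers [start, end)"; the actual
    # per-chunk objective is an assumption left unspecified here.
    print(f"tune layers {start}..{end - 1}")

print(peak_layers_in_memory(chunks))
```

Under these assumptions, doubling `num_layers` doubles the number of chunks visited but leaves the value returned by `peak_layers_in_memory` unchanged, which is the sense in which model size and fine-tuning memory are decoupled.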
