Poster in Workshop: Workshop on Machine Learning and Compression
Sustainable AI: Efficient Pruning of Large Language Models in Resource-Limited Environments
Ashhadul Islam · Samir Brahim Belhaouari · Amine Bermak
The rapid growth and deployment of large language models (LLMs) such as ChatGPT have revolutionized artificial intelligence, particularly natural language processing, but they carry significant computational and environmental costs, including high energy consumption and carbon emissions. To address these challenges, our research introduces two novel pruning techniques, "evolution of weights" and "smart pruning", that improve the efficiency of deep neural networks, especially on embedded devices. By systematically evaluating the importance of individual parameters during training, our methods achieve higher compression rates and faster computation while preserving accuracy, outperforming traditional pruning approaches. Extensive experiments with both scaled-down and larger multimodal LLMs show that moderate pruning improves efficiency and reduces resource consumption with minimal accuracy loss, whereas excessive pruning degrades performance. Our LLM experiments, with code available on GitHub, underscore the critical need for optimized AI models that balance technological advancement with ecological sustainability.
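To give a rough sense of what importance-based pruning during training looks like, the following minimal NumPy sketch accumulates a per-weight importance score over training steps and zeroes out the lowest-scoring fraction of weights. It is an illustration of the general idea only, not the poster's actual "evolution of weights" or "smart pruning" algorithms; the toy layer, random gradients, and `prune_fraction` value are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "layer": a weight matrix updated by noisy gradient steps.
weights = rng.normal(size=(128, 64))
importance = np.zeros_like(weights)  # accumulated evidence of each weight's importance

num_steps = 100
lr = 0.01
for step in range(num_steps):
    grads = rng.normal(size=weights.shape)  # stand-in for real gradients
    weights -= lr * grads
    importance += np.abs(weights)           # let the importance score evolve with training

prune_fraction = 0.5  # "moderate" pruning level
threshold = np.quantile(importance, prune_fraction)
mask = importance >= threshold               # keep only the higher-importance weights
pruned_weights = weights * mask

print(f"Sparsity after pruning: {1 - mask.mean():.2%}")
```

In practice the importance signal would come from the real training dynamics of the network rather than random gradients, and the kept weights would typically be fine-tuned after pruning to recover any lost accuracy.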