Poster in Workshop: Socially Responsible Language Modelling Research (SoLaR)
How Does LLM Compression Affect Weight Exfiltration Attacks?
Davis Brown · Mantas Mazeika
Keywords: [ compression ] [ safety ] [ security ] [ weight exfiltration ]
Abstract:
As frontier AIs become more powerful and costly to develop, adversaries have increasing incentives to mount weight exfiltration attacks. In this work, we explore how advanced compression techniques can significantly heighten this risk, particularly for large language models (LLMs). By tailoring compression specifically for exfiltration rather than inference, we demonstrate that attackers could achieve up to $16\times$ compression with minimal trade-offs, reducing exfiltration time from months to days. To quantify this risk, we propose a model of exfiltration success and show that such compression tactics substantially shorten exfiltration time and raise the attack success rate. With AIs becoming increasingly valuable to industry and government, our findings underscore the urgent need to develop defenses against weight exfiltration and to secure model weights.
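The core quantitative claim, that compression can cut exfiltration time from months to days, follows from simple transfer arithmetic. A minimal back-of-the-envelope sketch is below; the model size, byte width, and covert-channel bandwidth are illustrative assumptions, not figures from the paper:

```python
# Back-of-the-envelope exfiltration-time model.
# All concrete numbers here are assumptions for illustration only.

def exfiltration_days(n_params: float, bytes_per_param: float,
                      bandwidth_bytes_per_s: float,
                      compression_ratio: float = 1.0) -> float:
    """Days needed to move the (optionally compressed) weights off-site."""
    total_bytes = n_params * bytes_per_param / compression_ratio
    return total_bytes / bandwidth_bytes_per_s / 86_400  # seconds per day

# Hypothetical 1-trillion-parameter model stored in fp16 (2 bytes/param),
# exfiltrated over an assumed 200 KB/s covert channel.
baseline = exfiltration_days(1e12, 2, 200_000)          # months
compressed = exfiltration_days(1e12, 2, 200_000, 16.0)  # days
print(f"uncompressed: {baseline:.0f} days, 16x compressed: {compressed:.1f} days")
```

Under these assumed numbers the uncompressed transfer takes roughly four months, while a $16\times$ compression ratio brings it to about a week, consistent with the months-to-days framing above.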