Poster in Workshop: Safe Generative AI
Auditing Empirical Privacy Protection of Private LLM Adaptations
Bartłomiej Marek · Vincent Hanke · Xun Wang · Michael Backes · Adam Dziedzic · Franziska Boenisch
A recent position paper (Tramer et al., ICML'24) challenges the common approach to privacy-preserving machine learning in the era of foundation models, namely the practice of pretraining these large models on "public" data and then privately adapting them to sensitive downstream datasets. In particular, it raises the conceptual concern that the expected privacy protection might not hold in practice. To analyze the issue from a practical standpoint, we conduct a thorough investigation of privacy risks under "private" adaptations of large language models (LLMs). Relying on the latest privacy attack, robust membership inference, we study the actual privacy risks for both the pretraining and the adaptation data. We benchmark the privacy risks by systematically varying the distribution of the adaptation data, ranging from perfect overlap with the pretraining data to out-of-distribution (OOD) examples. Additionally, we evaluate how different kinds of adaptation methods and different privacy regimes impact the vulnerability. Our results reveal that distribution shifts significantly affect the vulnerability to privacy attacks: the closer the distribution of the adaptation data is to the pretraining distribution, the higher its practical privacy risk, even when there is no overlap between pretraining and adaptation data. We find that the highest empirical privacy protection can be achieved for OOD data using parameter-efficient fine-tuning (PEFT) methods, such as LoRA and prefix tuning. Finally, our results show that private adaptations, especially those done with full fine-tuning on OOD data, can also decrease the empirical leakage from the pretraining data.
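To make the notion of empirical privacy risk concrete, the sketch below shows one common way a membership inference attack is scored: the true-positive rate (TPR) at a fixed false-positive rate (FPR). It is not the robust membership inference attack used in this work; it assumes a much simpler loss-threshold baseline in which a lower loss on a sample is read as evidence of membership. The function name `tpr_at_fpr` and all loss values are hypothetical and chosen purely for illustration.

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.1):
    """Share of true members flagged when the decision threshold is set
    so that at most `target_fpr` of the non-members are flagged.

    Scores follow the convention "higher means more likely a member".
    """
    ranked = sorted(nonmember_scores, reverse=True)
    # Number of non-members we are allowed to flag (small epsilon guards
    # against floating-point rounding just below an integer).
    allowed = int(target_fpr * len(ranked) + 1e-9)
    # A sample is flagged if its score is strictly above the threshold,
    # so at most `allowed` non-members end up flagged.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(score > threshold for score in member_scores)
    return flagged / len(member_scores)


# Hypothetical per-sample losses of an adapted model (illustrative only).
member_losses = [0.2, 0.3, 0.9, 0.4]            # samples used for adaptation
nonmember_losses = [1.0, 0.8, 1.2, 0.35, 1.1]   # held-out samples

# Loss-threshold baseline: the attack score is the negated loss.
member_scores = [-loss for loss in member_losses]
nonmember_scores = [-loss for loss in nonmember_losses]

result = tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.2)
print(f"TPR at 20% FPR: {result:.2f}")
```

Under this metric, a TPR close to the chosen FPR means the attack does little better than guessing, while a markedly higher TPR indicates measurable leakage about the adaptation data.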