Invited Talk in Affinity Event: Women in Machine Learning
Invited Talk by Aditi Raghunathan (Carnegie Mellon University)
Title: Robust machine learning with foundation models
Abstract: State-of-the-art large language models (LLMs) and other foundation models, despite their remarkable success and widespread use, remain alarmingly brittle and insecure. They can hallucinate false information, such as citing fake legal cases or non-existent medical records, or generate toxic and dangerous content with potentially catastrophic consequences. Even with extensive safety measures implemented by leading providers, these models can still be "jailbroken" to bypass safety guardrails. In this talk, we explore how these vulnerabilities often stem from distribution shifts. Although pretrained on broad datasets, these models are adapted or aligned using limited fine-tuning data, which often fails to generalize beyond the narrow training distribution to the diverse scenarios encountered in deployment. Addressing this gap is critical for improving robustness and ensuring reliable performance in real-world scenarios.