Keynote Talk
in
Workshop: The Fourth Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV): Highlighting New Architectures for Future Foundation Models
How to build fully open language models: from pre-training to post-training
Hannaneh Hajishirzi
Language models (LMs) have become ubiquitous in both AI research and commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed.
In this talk, I present our OLMo project, which aims to build strong language models and make them fully accessible to researchers, along with open-source code for data, training, and inference. Training language models is expensive, so we optimize the trade-off between quality and compute cost. I focus on how improvements to data, architecture, and training advance models at both the pre-training and post-training stages at lower compute cost.