Invited Talk in Workshop: Adaptive Experimental Design and Active Learning in the Real World
RL in the Real World: From Chip Design to LLMs - Anna Goldie
Reinforcement learning (RL) is famously powerful but difficult to wield, and until recently it had demonstrated impressive results on games but little real-world impact. I will start the talk with a discussion of RL for Large Language Models (LLMs), including scalable supervision techniques to better align models with human preferences (Constitutional AI / RLAIF). Next, I will discuss RL for chip floorplanning, one of the first examples of RL solving a real-world engineering problem. This learning-based method can generate placements that are comparable or superhuman on modern accelerator chips in a matter of hours, whereas the strongest baselines require human experts in the loop and can take several weeks. The method was published in Nature and has been used in production to generate superhuman chip layouts for the last four generations of Google’s flagship AI accelerator (TPU), including the recently announced TPU v5p.