Poster in Workshop: LaReL: Language and Reinforcement Learning
Tackling AlfWorld with Action Attention and Common Sense from Language Models
Yue Wu · So Yeon Min · Yonatan Bisk · Russ Salakhutdinov · Shrimai Prabhumoye
Keywords: [ language models ] [ AlfWorld ] [ Zero shot ] [ text games ]
Pre-trained language models (LMs) capture strong prior knowledge about the world. This common-sense knowledge can be used in control tasks. However, generating actions directly from an LM may produce a reasonable narrative that a low-level agent cannot execute. We propose instead to use the knowledge in LMs to simplify the control problem and to assist the training of the low-level actor. We implement a novel question-answering framework that simplifies observations, and an agent based on action attention that handles arbitrary roll-out lengths and action-space sizes. On the AlfWorld benchmark for indoor instruction following, we achieve a significantly higher success rate (50% over the baseline) with our novel object-masking and action-attention method.
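To illustrate how attention over candidate actions can cope with an action space whose size changes from step to step, here is a minimal sketch. It is not the authors' implementation: the function name `action_attention`, the use of scaled dot-product scoring, and the plain-list vector representation are all assumptions made for illustration. The idea shown is only that a state vector scores each candidate action embedding and a softmax turns the scores into a distribution, so the same code applies to any number of candidates.

```python
import math
from typing import List, Sequence


def action_attention(
    state: Sequence[float],
    actions: Sequence[Sequence[float]],
) -> List[float]:
    """Return a probability for each candidate action.

    Hypothetical sketch: `state` acts as the attention query and each
    entry of `actions` as a key. Scores are scaled dot products, which
    is an assumption and not necessarily the scoring used in the paper.
    """
    if not actions:
        raise ValueError("at least one candidate action is required")
    if any(len(a) != len(state) for a in actions):
        raise ValueError("state and action embeddings must share a dimension")

    scale = math.sqrt(len(state))
    scores = [sum(s * a for s, a in zip(state, act)) / scale for act in actions]

    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


# The number of candidates may differ at every step.
three = action_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
five = action_attention([1.0, 0.0], [[0.5, 0.5]] * 5)
```

Because the output has one entry per candidate, no fixed-size output layer is needed; in a trained agent the state and action vectors would come from learned encoders, which this sketch leaves out.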