Keynote Talk
in
Workshop: The Fourth Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV): Highlighting New Architectures for Future Foundation Models
Efficiency through Learning from Experience
Bhavana Dalvi Mishra · Peter Clark
Despite the physiological limitations of the human brain, humans are remarkably efficient thinkers, in large part because they can learn from experience, allowing them to avoid prior reasoning errors and quickly reach conclusions that previously took substantial effort. Similarly, language models (LMs) can rapidly improve their inference-time efficiency through inference-time learning, supplementing lower-level methods like fast decoding and caching. I'll describe two agent-based systems (CLIN and SSO) that do this, using an external RAG (retrieval-augmented generation) memory to help the agent navigate a complex virtual environment. Unlike typical RAG systems, the memory is dynamic and is updated after each task (including forgetting unhelpful learnings). In addition, unlike reinforcement-based continual learning techniques, these systems rapidly learn from just a handful of examples by exploiting LMs to conjecture useful generalizations of past experiences. I'll outline three critical activities in this process (what to remember, how to index those memories, and how to retrieve from that index) and how those choices impact the effectiveness of the resulting agent. While this notion of efficiency differs from foundational architectural considerations, I'll show that it is nonetheless powerful, and an important additional tool in the toolbox for efficient future applications.
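To make the memory loop described above concrete, here is a minimal Python sketch of a dynamic RAG memory built around the three activities the talk highlights: what to remember, how to index/retrieve it, and how to forget unhelpful learnings. This is an illustration only, not the actual CLIN or SSO implementation: the class name, the usage/success bookkeeping, and the bag-of-words "embedding" are hypothetical stand-ins for a neural retriever and an LM that conjectures generalizations from past trajectories.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a neural encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class DynamicMemory:
    """External memory rewritten after every task, unlike a static RAG store."""

    def __init__(self):
        # Each entry tracks how often it was retrieved and how often it helped.
        self.entries = []  # dicts: {"text", "vec", "uses", "wins"}

    def remember(self, learnings):
        """What to remember: store generalizations an LM conjectured
        from the last trajectory (here passed in as plain strings)."""
        for text in learnings:
            self.entries.append(
                {"text": text, "vec": embed(text), "uses": 0, "wins": 0}
            )

    def retrieve(self, task_description, k=3):
        """How to retrieve: return the k learnings most similar to the task."""
        q = embed(task_description)
        ranked = sorted(self.entries,
                        key=lambda e: cosine(q, e["vec"]), reverse=True)
        hits = ranked[:k]
        for e in hits:
            e["uses"] += 1
        return [e["text"] for e in hits]

    def credit(self, used_texts, task_succeeded):
        """After the task, credit the learnings that were retrieved."""
        for e in self.entries:
            if e["text"] in used_texts and task_succeeded:
                e["wins"] += 1

    def forget(self, min_uses=3, min_win_rate=0.34):
        """Forgetting: drop learnings tried several times that rarely helped."""
        self.entries = [
            e for e in self.entries
            if e["uses"] < min_uses or e["wins"] / e["uses"] >= min_win_rate
        ]

# Hypothetical usage in an agent loop:
memory = DynamicMemory()
memory.remember(["Opening a locked door may require finding its key first"])
hints = memory.retrieve("open the locked door in the kitchen")
# ... run the agent with `hints` injected into its prompt ...
memory.credit(hints, task_succeeded=True)
memory.forget()  # prune learnings that repeatedly failed to help
```

The design choice this sketch tries to capture is that the memory, unlike a conventional RAG corpus, is part of the learning loop: entries are written by the LM itself, scored by downstream task outcomes, and pruned, which is what lets the agent improve from just a handful of episodes.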