NeurIPS Poster On the Complexity of Teaching a Family of Linear Behavior Cloning Learners

Poster

On the Complexity of Teaching a Family of Linear Behavior Cloning Learners

Shubham Bharti · Stephen Wright · Adish Singla · Jerry Zhu

West Ballroom A-D #6603

[ Abstract ]

[ Paper] [ OpenReview]

Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract: We study optimal teaching for a family of Behavior Cloning learners that learn using a linear hypothesis class. In this setup, a knowledgeable teacher can demonstrate a dataset of state and action tuples and is required to teach an optimal policy to an entire family of BC learners using the smallest possible dataset. We analyze the linear family and design a novel teaching algorithm called `TIE' that achieves the instance optimal Teaching Dimension for the entire family. However, we show that this problem is NP-hard for action spaces with $|\mathcal{A}| > 2$ and provide an efficient approximation algorithm with a $\log(|\mathcal{A}| - 1)$ guarantee on the optimal teaching size. We present empirical results to demonstrate the effectiveness of our algorithm and compare it to various baselines in different teaching environments.

Chat is not available.