NeurIPS Sample-Efficient Policy Search with a Trajectory Autoencoder

Poster in Workshop: 4th Robot Learning Workshop: Self-Supervised and Lifelong Learning

Sample-Efficient Policy Search with a Trajectory Autoencoder

Alexander Fabisch · Frank Kirchner

Abstract:

We introduce a trajectory generator that enables sample-efficient policy search with Bayesian optimization (BO). BO is a sample-efficient approach to direct policy search, but it usually does not scale well with the number of parameters. Our trajectory generator maps a compact representation of trajectories to the high-dimensional trajectory space, so that BO can search in the low-dimensional space. The trajectory generator is trained as part of a variational autoencoder on demonstrations from an expert. It contains a trajectory layer, a new building block for neural networks that enforces smoothness on generated trajectories. We evaluate our approach on a grasping task with a real robot.
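
The idea of searching a low-dimensional latent space and decoding each candidate into a full trajectory can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: `decode` is a fixed random linear map onto radial basis functions standing in for a trained decoder (the smooth basis only mimics the role of a smoothness-enforcing layer, it is not the paper's trajectory layer), `episode_return` is a hypothetical reward, and the Gaussian process with an expected-improvement acquisition is one generic choice of BO, since the abstract does not specify one.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 2   # size of the compact representation (assumed)
N_STEPS = 100    # time steps per trajectory (assumed)
N_DOF = 3        # trajectory dimensions, e.g. joints (assumed)
N_BASIS = 8      # number of radial basis functions (assumed)

# Stand-in for a trained decoder: latent vector -> basis weights -> trajectory.
W = rng.normal(size=(LATENT_DIM, N_BASIS * N_DOF))
t = np.linspace(0.0, 1.0, N_STEPS)
centers = np.linspace(0.0, 1.0, N_BASIS)
PHI = np.exp(-0.5 * ((t[:, None] - centers[None, :]) / 0.15) ** 2)

def decode(z):
    """Map a latent vector to a smooth trajectory of shape (N_STEPS, N_DOF)."""
    weights = (z @ W).reshape(N_BASIS, N_DOF)
    return PHI @ weights

def episode_return(trajectory):
    """Hypothetical reward: negative distance of the final point to a target."""
    target = np.array([0.5, -0.2, 0.1])
    return -np.linalg.norm(trajectory[-1] - target)

def rbf_kernel(A, B, length_scale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at points Xs."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - (v**2).sum(0), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def bayesian_optimization(n_init=5, n_iter=20, bound=2.0):
    """Maximize the return over the latent space; one rollout per iteration."""
    Z = rng.uniform(-bound, bound, size=(n_init, LATENT_DIM))
    R = np.array([episode_return(decode(z)) for z in Z])
    for _ in range(n_iter):
        # Score random latent candidates with the acquisition function.
        cand = rng.uniform(-bound, bound, size=(500, LATENT_DIM))
        y = R - R.mean()  # center returns for the zero-mean GP
        mu, sigma = gp_posterior(Z, y, cand)
        z_next = cand[np.argmax(expected_improvement(mu, sigma, y.max()))]
        Z = np.vstack([Z, z_next])
        R = np.append(R, episode_return(decode(z_next)))
    best = int(np.argmax(R))
    return Z[best], R[best], R
```

Calling `bayesian_optimization()` returns the best latent vector found, its return, and the history of returns. The optimizer only ever sees `LATENT_DIM` parameters, while each evaluated trajectory has `N_STEPS * N_DOF` values, which is the scaling argument the abstract makes.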
