Poster in Workshop: Safe and Robust Control of Uncertain Systems
Safe Online Exploration with Nonlinear Constraints
Eleanor Quint · Garrett Wirka · Stephen Scott
Abstract:
Safe exploration is critical to using reinforcement learning in complex, hazardous, real-world environments for which offline data aren't available. We propose a nonlinear safety layer that, unlike prior work, places no restrictions on the policy or environment and requires no offline training. We demonstrate that a nonlinear model has higher prediction accuracy than a similar linear model, and that in Safety Gym environments a linear safety layer fails to learn a non-conservative policy whereas the nonlinear layer succeeds.