Oral Presentation in Affinity Workshop: LatinX in AI
Oral Presentation 7: Adapting the Function Approximation Architecture in Online Reinforcement Learning
Fatima Davelouis
The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. Deep learning methods provide architectures for approximating nonlinear functions from noisy, high-dimensional observations. However, their prevailing optimization techniques are not designed for strictly incremental online updates. Nor are standard architectures designed to efficiently represent observational patterns from an a priori unknown structure: for example, light receptors randomly dispersed in space. We propose an online RL algorithm for adapting a value function’s architecture and efficiently finding useful nonlinear features. The algorithm is evaluated in a spatial domain with high-dimensional, stochastic observations. We further show that the algorithm outperforms baselines and approaches the performance of an architecture given side-channel information about observational structure.