
Poster in Workshop: Machine Learning for Systems

Reward Copilot for RL-driven Systems Optimization

Karan Tandon · Manav Mishra · Gagan Somashekar · Mayukh Das · Nagarajan Natarajan


Abstract:

Systems optimization problems arising in large-scale enterprise infrastructure, such as workload auto-scaling, kernel parameter tuning, and cluster management, are increasingly RL-driven. While effective, setting up the RL framework for such real-world problems is difficult: designing correct and useful reward functions or state spaces is highly challenging and requires substantial domain expertise. We propose a novel reward copilot that helps design suitable and interpretable reward functions, guided by client-provided specifications, for any RL framework. Through experiments on standard benchmarks as well as systems-specific optimization problems, we show that our solution returns reward functions with a certain (informal) feasibility certificate in addition to Pareto-optimality.
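To make the idea of an interpretable, specification-guided reward concrete, here is a minimal sketch of the kind of reward function a copilot might propose for workload auto-scaling. All metric names, the SLO threshold, and the weights are hypothetical illustrations, not the paper's actual output; the explicit weighted trade-off between an SLO-violation term and a resource-cost term is what keeps the reward auditable.

```python
# Illustrative sketch only: a hand-written, interpretable reward of the kind
# a reward copilot might propose for workload auto-scaling. All names,
# metrics, and weights are hypothetical, not the paper's actual method.

def make_reward(latency_slo_ms, w_latency=1.0, w_cost=0.5):
    """Build a reward penalizing SLO violations and resource cost."""
    def reward(state):
        # state: dict with observed p99 latency (ms) and replica count
        slo_violation = max(0.0, state["p99_latency_ms"] - latency_slo_ms)
        cost = state["replicas"]
        # Weighted sum keeps the latency/cost trade-off explicit and auditable,
        # exposing the Pareto trade-off as two tunable weights.
        return -(w_latency * slo_violation + w_cost * cost)
    return reward

r = make_reward(latency_slo_ms=200.0)
print(r({"p99_latency_ms": 180.0, "replicas": 4}))  # -2.0: SLO met, only cost
print(r({"p99_latency_ms": 250.0, "replicas": 4}))  # -52.0: violation penalized
```

Sweeping `w_latency` and `w_cost` traces out candidate points on the latency/cost Pareto frontier, which is one way a copilot could surface trade-offs to the client for approval.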
