Poster in Workshop: NeurIPS 2023 Workshop: Machine Learning and the Physical Sciences
Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam Intensity Control in Mu2e
Jerry Yao-Chieh Hu · Chenwei Xu · Aakaash Narayanan · Mattson Thieme · Vladimir Nagaslaev · Mark Austin · Jeremy Arnold · Jose Berlioz · Pierrick Hanlet · Aisha Ibrahim · Dennis Nicklaus · Jovan Mitrevski · Gauri Pradhan · Andrea Saewert · Kiyomi Seiya · Brian Schupbach · Randy Thurman-Keup · Nhan Tran · Rui Shi · Seda Ogrenci · Alexis Maya-Isabelle Shuping · Kyle Hazelwood · Han Liu
We introduce a novel Proximal Policy Optimization (PPO) algorithm for maintaining uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab). Our primary objective is to regulate the spill process so that the intensity profile stays consistent, with the ultimate goal of an automated controller that provides real-time feedback and calibration of the Spill Regulation System (SRS) parameters on a millisecond timescale. We treat the Mu2e accelerator system as a Markov Decision Process suitable for Reinforcement Learning (RL), using PPO to reduce bias and enhance training stability. A key innovation in our approach is the integration of a neuralized PID controller into the policy function, which improves the Spill Duty Factor (SDF) by 9.4% and surpasses the current PID controller baseline by an additional 2.2%. This paper presents preliminary offline results based on a differentiable simulator of the Mu2e accelerator. It lays the groundwork for real-time implementations and applications, representing a crucial step towards automated proton beam intensity control for the Mu2e experiment.
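
To make the two central quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a "neuralized" PID controller can be read as a PID law written as a single linear unit whose weights are the three gains, so that a policy-gradient method such as PPO could in principle tune them. It also assumes the common duty-factor definition SDF = ⟨I⟩² / ⟨I²⟩, which equals 1 / (1 + σ²) for σ the standard deviation of the spill intensity relative to its mean; the abstract itself does not define the SDF. The names `NeuralizedPID`, `act`, and `spill_duty_factor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NeuralizedPID:
    """PID law as one linear unit: u = kp*e + ki*integral(e) + kd*de/dt.

    Hypothetical sketch: the gains play the role of trainable weights;
    no training loop is shown here.
    """
    kp: float
    ki: float
    kd: float
    dt: float = 1.0
    integral: float = 0.0
    prev_error: float = 0.0

    def act(self, error: float) -> float:
        # Update the running integral and the finite-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Linear layer: dot product of the gains with the PID features.
        features = (error, self.integral, derivative)
        weights = (self.kp, self.ki, self.kd)
        return sum(w * f for w, f in zip(weights, features))


def spill_duty_factor(intensities):
    """Assumed definition: SDF = <I>^2 / <I^2>; 1.0 means a perfectly flat spill."""
    n = len(intensities)
    mean = sum(intensities) / n
    mean_sq = sum(x * x for x in intensities) / n
    return mean * mean / mean_sq


if __name__ == "__main__":
    controller = NeuralizedPID(kp=2.0, ki=0.5, kd=1.0)
    print(controller.act(1.0))                        # first control output
    print(spill_duty_factor([1.0, 1.0, 1.0, 1.0]))    # flat spill
    print(spill_duty_factor([0.0, 2.0, 0.0, 2.0]))    # strongly modulated spill
```

Under this assumed definition, a perfectly uniform spill gives an SDF of 1 and any intensity ripple lowers it, which is why a higher SDF corresponds to the more uniform delivery the abstract targets.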