Spotlight in Workshop: ML For Systems
Towards Intelligent Load Balancing in Data Centers
Zhiyuan Yao · Thomas Heide Clausen
Network load balancers (LBs) are important components of data centers (DCs) for providing scalable services. Their workload distribution algorithms are based on heuristics (ECMP, WCMP) or naive machine learning (ML) algorithms (ridge regression). Advanced ML-based approaches help achieve performance gains in various networking and system problems. However, applying ML algorithms to networking problems in real-life systems is challenging: it requires domain knowledge to collect features from low-latency, high-throughput, and scalable networking systems, which are dynamic and heterogeneous. This paper proposes Aquarius to bridge the gap between ML and networking systems and demonstrates its usage in the context of network LBs, showing that it can conduct both offline data analysis and online model deployment in realistic systems. The results show that the ML model trained and deployed using Aquarius improves load balancing performance, yet they also reveal further challenges that must be resolved before ML can be applied to networking systems.