Poster in Workshop: Machine Learning for Systems
Multi-Agent Join
Arash Termehchy · Bakhtiyar Doskenov · Bharghav Srikhakollu · Summit Haque · Huazheng Wang
Real-time performance is crucial for interactive and exploratory data analysis, where users require quick access to subsets or progressive presentations of query results. Delivering real-time results over large data for common relational binary operators like join is challenging, as join algorithms often spend considerable time scanning and attempting to join parts of relations that may not produce any results. Existing solutions often involve repetitive preprocessing, which is costly and may not be feasible for interactive workloads or evolving datasets. Additionally, these solutions may support only restricted types of joins. This paper presents a novel approach for achieving efficient progressive join processing. The scan operator of the join learns online during query execution, identifying portions of its underlying relation that satisfy the join condition. Additionally, an algorithm is introduced where both scan operators collaboratively learn to optimize join execution.
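
The idea of a scan operator that learns online which portions of its relation produce join results can be illustrated with a small sketch. The code below is not the authors' algorithm: it assumes an epsilon-greedy bandit as a stand-in learning rule, treats fixed blocks of the outer relation as the "portions", and uses an in-memory hash index on the inner relation. The names `progressive_join`, `outer_blocks`, `budget`, and `epsilon` are hypothetical, and the sketch covers only one learning scan, not the collaborative two-scan variant.

```python
import random
from collections import defaultdict


def progressive_join(outer_blocks, inner, key_outer, key_inner,
                     budget, epsilon=0.1, seed=0):
    """Equi-join sketch: scan at most `budget` outer rows, preferring
    blocks whose rows have matched most often so far (epsilon-greedy).
    This learning rule is an assumption made for illustration only."""
    rng = random.Random(seed)

    # Hash index on the inner relation (assumed to fit in memory).
    index = defaultdict(list)
    for row in inner:
        index[key_inner(row)].append(row)

    n = len(outer_blocks)
    cursors = [0] * n   # next unread row in each block
    pulls = [0] * n     # rows scanned per block
    hits = [0] * n      # scanned rows that produced at least one match
    results = []

    for _ in range(budget):
        # Blocks that still have unread rows.
        live = [b for b in range(n) if cursors[b] < len(outer_blocks[b])]
        if not live:
            break
        untried = [b for b in live if pulls[b] == 0]
        if untried:
            b = untried[0]                 # sample every block once
        elif rng.random() < epsilon:
            b = rng.choice(live)           # explore
        else:
            b = max(live, key=lambda i: hits[i] / pulls[i])  # exploit

        row = outer_blocks[b][cursors[b]]
        cursors[b] += 1
        pulls[b] += 1
        matches = index.get(key_outer(row), [])
        if matches:
            hits[b] += 1
        results.extend((row, m) for m in matches)

    return results, pulls
```

With a small budget the scan concentrates on blocks that have produced matches, so results arrive early. With a budget covering every outer row, the output is the complete equi-join, so the learning rule affects only the order in which results appear.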