DAGs with No Fears: A Closer Look at Continuous Optimization for Learning Bayesian Networks
Dennis Wei, Tian Gao, Yue Yu
Spotlight presentation: Orals & Spotlights Track 27: Unsupervised/Probabilistic
on 2020-12-10T08:20:00-08:00 - 2020-12-10T08:30:00-08:00
Poster Session 6
on 2020-12-10T09:00:00-08:00 - 2020-12-10T11:00:00-08:00
GatherTown: Probabilistic Methods (Town D1 - Spot D1)
Abstract: This paper re-examines a continuous optimization framework dubbed NOTEARS for learning Bayesian networks. We first generalize existing algebraic characterizations of acyclicity to a class of matrix polynomials. Next, focusing on a one-parameter-per-edge setting, we show that the Karush-Kuhn-Tucker (KKT) optimality conditions for the NOTEARS formulation cannot be satisfied except in a trivial case, which explains a behavior of the associated algorithm. We then derive the KKT conditions for an equivalent reformulation, show that they are indeed necessary, and relate them to explicit constraints that certain edges be absent from the graph. If the score function is convex, these KKT conditions are also sufficient for local minimality despite the non-convexity of the constraint. Informed by the KKT conditions, a local search post-processing algorithm is proposed and shown to substantially and universally improve the structural Hamming distance of all tested algorithms, typically by a factor of 2 or more. Some combinations with local search are both more accurate and more efficient than the original NOTEARS.
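For reference, the continuous acyclicity constraint at the heart of NOTEARS is h(W) = tr(exp(W ∘ W)) − d = 0, where W is the d×d weighted adjacency matrix and ∘ is the elementwise product (Zheng et al., 2018). The minimal Python/NumPy sketch below is illustrative only: the function names and the particular polynomial variant are assumptions for exposition and are not taken from this paper; the polynomial merely stands in for the class of matrix-polynomial characterizations the abstract alludes to.

    import numpy as np
    from scipy.linalg import expm

    def notears_acyclicity(W):
        # Standard NOTEARS characterization: h(W) = tr(exp(W ∘ W)) - d,
        # which equals 0 exactly when W is the weighted adjacency matrix of a DAG.
        d = W.shape[0]
        return np.trace(expm(W * W)) - d

    def polynomial_acyclicity(W):
        # Illustrative polynomial analogue: tr((I + (W ∘ W)/d)^d) - d.
        # It also vanishes exactly on DAGs; the precise polynomial class
        # studied in the paper may differ (assumption for illustration).
        d = W.shape[0]
        M = np.eye(d) + (W * W) / d
        return np.trace(np.linalg.matrix_power(M, d)) - d

    # Example: the chain 1 -> 2 -> 3 is acyclic; adding the edge 3 -> 1 creates a cycle.
    W_dag = np.array([[0., 1., 0.],
                      [0., 0., 1.],
                      [0., 0., 0.]])
    W_cyc = W_dag.copy()
    W_cyc[2, 0] = 1.0
    print(notears_acyclicity(W_dag), polynomial_acyclicity(W_dag))  # both ~0
    print(notears_acyclicity(W_cyc), polynomial_acyclicity(W_cyc))  # both > 0

Both functions are zero precisely on acyclic graphs, which is what lets acyclicity be imposed as a smooth equality constraint in the continuous optimization the paper analyzes.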