Poster in Workshop: Your Model is Wrong: Robustness and misspecification in probabilistic modeling
Inferior Clusterings in Misspecified Gaussian Mixture Models
Siva Rajesh Kasa · Vaibhav Rajan
The Gaussian Mixture Model (GMM) is a widely used probabilistic model for clustering. In many practical settings, the true data distribution is unknown, may be non-Gaussian, and may be contaminated by noise or outliers. Clustering can still be performed with a misspecified GMM in such cases, but doing so may assign points to the wrong underlying subpopulations. In this work, we examine the performance of both Expectation Maximization (EM) and Gradient Descent (GD) on unconstrained Gaussian Mixture Models under misspecification. Our simulation study reveals a previously unreported class of inferior clustering solutions, distinct from spurious solutions, that arise from asymmetry in the fitted component variances.
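The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation study. It assumes 1-D data drawn from two Laplace-distributed clusters with about 5% uniform outliers, and fits a two-component GMM with free variances by plain EM. The function names (`sample_misspecified`, `fit_gmm_em`), the sampling parameters, and the tiny variance floor (a numerical guard only) are all hypothetical choices made for illustration. The ratio of the largest to the smallest fitted variance is used here as a crude indicator of asymmetry; it is not the paper's definition of an inferior solution.

```python
import math
import random


def sample_misspecified(n, seed=0):
    """Draw 1-D data from two Laplace clusters plus ~5% uniform outliers (assumed setup)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        u = rng.random()
        if u < 0.05:
            data.append(rng.uniform(-20.0, 20.0))  # contamination
        else:
            centre = -3.0 if u < 0.525 else 3.0
            # Laplace noise: exponential magnitude with a random sign.
            noise = rng.expovariate(1.0) * rng.choice((-1.0, 1.0))
            data.append(centre + noise)
    return data


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, iters=100, seed=0):
    """Fit a k-component 1-D GMM with free variances using plain EM."""
    rng = random.Random(seed)
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    means = rng.sample(data, k)      # initialise means at random data points
    variances = [overall_var] * k    # initialise variances at the pooled variance
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # numerical floor only (assumption)
    return weights, means, variances


data = sample_misspecified(400)
weights, means, variances = fit_gmm_em(data)
variance_ratio = max(variances) / min(variances)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
print("variance ratio:", variance_ratio)
```

Rerunning with different `seed` values for `fit_gmm_em` changes the initialisation, which makes it possible to compare the fitted variances across runs. A large variance ratio on data whose true clusters have equal spread suggests that one component has widened to absorb the outliers or tails.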