Oral
in
Workshop: Learning in Presence of Strategic Behavior
Strategic clustering
Ana-Andreea Stoica · Christos Papadimitriou
This paper studies the problem of clustering through the lens of utility and welfare. In particular, we ask, how much does the quality of the clustering --- typically measured by the conductance, or by the number of edges cut, or the average distance to the centers --- deteriorate if the nodes are strategic and can change clusters? And among reasonable utilities for the nodes, which one hurts quality the least? We investigate these questions both theoretically, by studying the equilibria of hedonic games (simplified clustering games with unconstrained number of clusters), and experimentally, by measuring the quality of pure Nash equilibria of more realistic clustering games. We introduce a new utility function for the nodes which we call closeness, and which we believe is an attractive alternative to previously studied node utilities. We study the properties of the closeness utility theoretically and demonstrate experimentally its advantages over other established utilities such as the modified fractional utility. Finally, we present a polynomial-time algorithm which, given a clustering with optimal quality, finds another clustering with better average utility, and in fact the one that maximizes the ratio of the gain in average utility over the loss in quality. This submission and the topics covered are particularly related to the theme of the Strategic ML workshop, especially to welfare-aware machine learning and modeling strategic behavior. In the case of clustering, this work aims to cover gaps between the classical ML algorithms for clustering that optimize certain connectivity metrics (such as conductance) and incentives that people have in creating or changing groups. With an increasing demand for algorithmic tools in identifying and facilitating groups, understanding the incentives people may have in group formation become important in order to ensure retention, stability, and success of groups. This work aims to open avenues of research for investigating trade-offs between robustness and welfare in designing algorithms for clustering.