

Poster in Workshop: Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations

A Unified Analysis of Label Inference Attacks

Andres Munoz Medina · Travis Dick · Claudio Gentile · Róbert Busa-Fekete · Marika Swanberg


Abstract:

Randomized response and label aggregation are two common ways of sharing sensitive label information in a private way. In spite of their popularity in the privacy literature, there is a lack of consensus on how to compare the privacy properties of these two different mechanisms. In this work, we investigate the privacy risk of sharing label information for these privacy-enhancing technologies through the lens of label reconstruction advantage measures. A reconstruction advantage measure quantifies the increase in an attacker's ability to infer the true label of an unlabeled example when provided with a private version of the labels in a dataset (e.g., averages of labels from different users or noisy labels output by randomized response), compared to an attacker that only observes the feature vectors but may have prior knowledge of the correlation between features and labels. We extend the Expected Attack Utility (EAU) and Advantage of previous work to mechanisms that involve aggregation of labels across different examples. We theoretically quantify this measure for randomized response and random aggregates under various correlation assumptions with public features, and then empirically corroborate these findings by quantifying EAU on real-world data. To the best of our knowledge, these are the first experiments where randomized response and label proportions are placed on the same privacy footing. Finally, we point out that simple modifications to the random aggregate approach can provide extra DP-like protection.
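
For readers unfamiliar with the two mechanisms compared in the abstract, the following is a minimal illustrative sketch for binary labels. It is not the authors' code: the function names, the `epsilon` and `bag_size` parameters, and the toy data are all assumptions made for illustration. It shows textbook binary randomized response and a random partition into equal-size bags whose mean labels are released. It does not compute the EAU or Advantage measures.

```python
import math
import random


def randomized_response(labels, epsilon, rng):
    """Textbook binary randomized response (illustrative, hypothetical helper).

    Each label is kept with probability e^eps / (1 + e^eps) and flipped otherwise.
    """
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [y if rng.random() < keep_prob else 1 - y for y in labels]


def aggregate_labels(labels, bag_size, rng):
    """Random label aggregation (illustrative, hypothetical helper).

    Examples are shuffled and split into bags; only each bag's membership
    and mean label (the label proportion) are released.
    """
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    bags = [indices[i:i + bag_size] for i in range(0, len(indices), bag_size)]
    return [(bag, sum(labels[j] for j in bag) / len(bag)) for bag in bags]


# Toy data: 12 binary labels (an assumption, not data from the paper).
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(12)]

noisy_labels = randomized_response(labels, epsilon=1.0, rng=rng)
bag_proportions = aggregate_labels(labels, bag_size=4, rng=rng)
```

In this sketch, an attacker facing `noisy_labels` sees one perturbed label per example, while an attacker facing `bag_proportions` sees only one average per bag of four examples. This difference in what is released is what makes the two mechanisms hard to compare directly.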
