Oral in Workshop: Human Evaluation of Generative Models
Are GAN Biased? Evaluating GAN-Generated Facial Images via Crowdsourcing
Hangzhi Guo · Lizhen Zhu · Ting-Hao Huang
Generative models produce astonishingly high-resolution and realistic facial images. However, reliably evaluating the quality of these images remains challenging, let alone systematically investigating the potential biases in generative adversarial networks (GANs). In this paper, we argue that crowdsourcing can be used to measure the biases in GANs quantitatively. We showcase an investigation that examines whether GAN-generated facial images with darker skin tones are of lower quality. We ask crowd workers to guess whether an image is real or fake, and use this judgment as a proxy metric for estimating the quality of facial images generated by state-of-the-art GANs. The results show that GANs generate lower-quality images for faces with darker skin tones than for faces with lighter skin tones.
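As a rough illustration of the proxy metric described in the abstract (not code from the paper), the sketch below computes a per-group "fooled rate": the fraction of GAN-generated images that crowd workers judged to be real, grouped by a skin-tone label. All field names and data here are hypothetical.

```python
from collections import defaultdict

# Hypothetical crowd judgments: each record is one worker's guess on one
# GAN-generated image, with a skin-tone label attached to the image.
# Field names and values are illustrative only.
judgments = [
    {"image_id": "g001", "skin_tone": "darker",  "guessed_real": True},
    {"image_id": "g001", "skin_tone": "darker",  "guessed_real": False},
    {"image_id": "g002", "skin_tone": "lighter", "guessed_real": True},
    {"image_id": "g002", "skin_tone": "lighter", "guessed_real": True},
]

def fooled_rate_by_group(records):
    """Fraction of judgments that mistook a generated image for a real photo,
    aggregated per skin-tone group. A higher rate is read as higher
    perceived quality for that group's generated images."""
    totals = defaultdict(int)
    fooled = defaultdict(int)
    for r in records:
        totals[r["skin_tone"]] += 1
        fooled[r["skin_tone"]] += int(r["guessed_real"])
    return {group: fooled[group] / totals[group] for group in totals}

print(fooled_rate_by_group(judgments))
# e.g. {'darker': 0.5, 'lighter': 1.0}; a gap between groups would
# suggest a quality disparity of the kind the study investigates.
```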