Oral in Workshop: Evaluating Evaluations: Examining Best Practices for Measuring Broader Impacts of Generative AI
(Mis)use of nude images in machine learning research
Arshia Arya · Princessa Cintaqia · Deepak Kumar · Allison McDonald · Lucy Qin · Elissa Redmiles
Keywords: [ nudity detection ] [ data collection ] [ image-based sexual abuse ] [ ethics ]
Nudity detection is a task that researchers have studied for decades. To do this work, researchers need datasets of nude content for training, testing, and benchmarking their nudity detection algorithms. To assemble these datasets, prior work indicates that researchers typically scrape images from the internet or reuse existing datasets of nude images. While this practice is common for general image-recognition tasks, nude images are particularly sensitive. Moreover, the nonconsensual collection and distribution of nude images is a common form of image-based sexual abuse, a category of technology-facilitated sexual violence. Our team is conducting ongoing work investigating the use of nude-image datasets in machine learning and computer vision research. Based on our initial results, we identify several ethical challenges. In this provocation, we aim to raise questions that researchers considering work in this space should evaluate at the start of their projects and prior to dataset collection.