Poster in Workshop: Socially Responsible Language Modelling Research (SoLaR)
SocialStigmaQA Spanish and Japanese - Towards Multicultural Adaptation of Social Bias Benchmarks
Clara Higuera-Cabañes · Ryo Iwaki · Beñat San Sebastian · Rosario Uceda-Sosa · Manish Nagireddy · Hiroshi Kanayama · Mikio Takeuchi · Gakuto Kurata · Karthikeyan Natesan Ramamurthy
Keywords: [ Multilingual ] [ Large language model ] [ Social bias ] [ Multicultural ] [ Evaluation ]
Many existing benchmarks for evaluating social bias in large language models are written in English. Because comparable datasets are hard to find natively in other languages and difficult to create from scratch, one solution is to adapt these English-based benchmarks to other languages. However, such conversions are non-trivial given both the linguistic and cultural aspects of social bias. In this work, we present ongoing efforts to port an existing dataset - SocialStigmaQA - to both Spanish and Japanese. We discuss the effort required to adapt this dataset faithfully with respect to the specific societal and cultural norms associated with each language. We hope our work provides insightful guidance on adapting existing English-based bias benchmarks to other languages, and we outline further steps that can be taken for that purpose.
Warning: This paper contains examples of text which are toxic, biased, and potentially harmful.