NeurIPS Value pluralism and AI value alignment

Poster in Workshop: Pluralistic Alignment Workshop

Value pluralism and AI value alignment

Atoosa Kasirzadeh

Abstract:

This paper addresses the challenge of aligning LLMs with human values in a democratically legitimate and theoretically robust manner. We develop a multi-dimensional framework, COMAL (Criteria, Origin, Measurement, Aggregation, and Legitimacy), for analyzing value pluralism in AI alignment. We apply COMAL to demonstrate that the space of possible pluralistic values is vastly larger than typically recognized. Consequently, we argue that recent AI alignment efforts that purport to embrace pluralistic or collective approaches can fall short of true democratic legitimacy. Despite moving beyond the narrow confines of developer-centric value systems, these efforts remain theoretically fragile and democratically suspect. To address this critical gap, we propose a comprehensive evaluation of the legitimacy of value pluralism across all COMAL dimensions. Moreover, we sketch a set of metrics designed to quantify the depth and breadth of pluralistic integration in AI systems. This paper serves as a call to action for elevating standards of value pluralism in pursuit of truly legitimate and robust value alignment.
