Pluralistic Alignment Workshop
Mikhail Terekhov · Moksh Jain · Ruyuan Wan · Maarten Sap · Mitchell Gordon · Dongyeop Kang · Caglar Gulcehre · Amy Zhang · He He
West Meeting Room 116, 117
Sat 14 Dec, 9 a.m. PST
Aligning AI with human preferences and societal values is increasingly important. Yet today's AI alignment methods have been shown to be insufficient for capturing the vast space of complex and often conflicting real-world values. Our workshop will discuss how to integrate diverse perspectives, values, and expertise into pluralistic AI alignment. We aim to explore new methods for multi-objective alignment, drawing inspiration from governance and consensus-building practices to address conflicting values. Discussion will include technical approaches to dataset collection, algorithm development, and the design of human-AI interaction workflows that reflect pluralistic values across diverse populations. By gathering experts from various fields, this workshop seeks to foster interdisciplinary collaboration and push the boundaries of the understanding, development, and practice of pluralistic AI alignment.
Schedule
Sat 9:00 a.m. - 9:10 a.m. | Opening Remarks
Sat 9:10 a.m. - 9:55 a.m. | Keynote Talk: Monojit Choudhury
Sat 9:55 a.m. - 10:40 a.m. | Keynote Talk: Hannah Rose Kirk
Sat 10:40 a.m. - 11:40 a.m. | Poster Session
Sat 11:40 a.m. - 12:25 p.m. | Keynote Talk: Yejin Choi
Sat 12:25 p.m. - 1:30 p.m. | Lunch Break
Sat 1:30 p.m. - 2:15 p.m. | Keynote Talk: Seth Lazar
Sat 2:15 p.m. - 3:00 p.m. | Keynote Talk: Melanie Mitchell
Sat 3:00 p.m. - 3:30 p.m. | Coffee Break
Sat 3:30 p.m. - 3:45 p.m. | Contributed Talk: MID-Space: Aligning Diverse Communities' Needs to Inclusive Public Spaces
Sat 3:45 p.m. - 4:00 p.m. | Contributed Talk: Multilingual Trolley Problems for Language Models
Sat 4:00 p.m. - 4:15 p.m. | Contributed Talk: Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
Sat 4:15 p.m. - 4:30 p.m. | Contributed Talk: Representative Social Choice: From Learning Theory to AI Alignment
Sat 4:30 p.m. - 4:45 p.m. | Contributed Talk: Toward Democracy Levels for AI
Sat 4:45 p.m. - 5:30 p.m. | Keynote Talk: Michael Bernstein
Posters

Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning | Sriyash Poddar · Yanming Wan · Hamish Ivison · Abhishek Gupta · Natasha Jaques
Virtual Personas for Language Models via an Anthology of Backstories | Suhong Moon · Marwa Abdulhai · Minwoo Kang · Joseph Suh · Widyadewi Soedarmadji · Eran Kohen Behar · David Chan
Pareto-Optimal Learning from Preferences with Hidden Context | Ryan Boldi · Li Ding · Lee Spector · Scott Niekum
Aligning LLMs using Reinforcement Learning from Market Feedback (RLMF) for Regime Adaptation | Raeid Saqur
Efficacy of the SAGE-RT Dataset for Model Safety Alignment: A Comparative Study | Tanay Baswa · Nitin Aravind Birur · Divyanshu Kumar · Jatan Loya · Anurakt Kumar · Prashanth Harshangi · Sahil Agarwal
Multilingual Trolley Problems for Language Models | Zhijing Jin · Sydney Levine · Max Kleiman-Weiner · Giorgio Piatti · Jiarui Liu · Fernando Gonzalez Adauto · Francesco Ortu · András Strausz · Mrinmaya Sachan · Rada Mihalcea · Yejin Choi · Bernhard Schölkopf
Conditioned Language Policy: A General Framework For Steerable Multi-Objective Finetuning | Kaiwen Wang · Rahul Kidambi · Ryan Sullivan · Alekh Agarwal · Christoph Dann · Andrea Michi · Marco Gelmi · Yunxuan Li · Raghav Gupta · Kumar Avinava Dubey · Alexandre Rame · Johan Ferret · Geoffrey Cideron · Le Hou · Hongkun Yu · Amr Ahmed · Aranyak Mehta · Leonard Hussenot · Olivier Bachem · Edouard Leurent
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities | Zheyuan Zhang · Fengyuan Hu · Jayjun Lee · Freda Shi · Parisa Kordjamshidi · Joyce Chai · Ziqiao Ma
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment | Thom Lake · Eunsol Choi · Greg Durrett
Mallows-DPO: Fine-Tune Your LLM with Preference Dispersions | Haoxian Chen · Hanyang Zhao · Henry Lam · David Yao · Wenpin Tang
Contrastive Learning Neuromotor Interface From Teacher | Kilian Freitag · Ran Wei
MID-Space: Aligning Diverse Communities' Needs to Inclusive Public Spaces | Shravan Nayak · Rashid Mushkani · Hugo Berard · Allison Cohen · Shin Koseki · Hadrien Bertrand
Model Plurality: A Taxonomy for Pluralistic AI | Christina Lu · Max Van Kleek
Mechanism Design for LLM Fine-tuning with Multiple Reward Models | Haoran Sun · Yurong Chen · Siwei Wang · Wei Chen · Xiaotie Deng
Towards Representative Social Choice Using Statistical Learning Theory | Tianyi (Alex) Qiu
Multi-objective Reinforcement Learning: A Tool for Pluralistic Alignment | Peter Vamplew · Conor Hayes · Cameron Foale · Richard Dazeley · Hadassah Harland
Value-Aligned Imitation via Focused Satisficing | Rushit Shah · Nikolaos Agadakos · Synthia Sasulski · Ali Farajzadeh · Sanjiban Choudhury · Brian Ziebart
Democracy Levels for AI | Aviv Ovadya · Luke Thorburn · Kyle Redman · Manon Revel · Flynn Devine · Atoosa Kasirzadeh · Smitha Milli · Andrew Konya
Intuitions of Compromise: Utilitarianism vs. Contractualism | Jared Moore · Yejin Choi · Sydney Levine
Learning from Personal Preferences | Kelly Jiang · Berk Ustun · Jessica Hullman
AI, Pluralism, and (Social) Compensation | Nandhini Swaminathan · David Danks
PersonalLLM: Tailoring LLMs to Individual Preferences | Thomas Zollo · Andrew Siah · Naimeng Ye · Ang Li · Hongseok Namkoong
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences | Daiwei Chen · Yi Chen · Aniket Rege · Ramya Korlakai Vinayak
A Case Study in Plural Governance Design | Joel Miller · Christopher Kanich · Glen Weyl
RLDF: Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs | Ruoxi Cheng · Haoxuan Ma · Shuirong Cao · Jiaqi Li · Aihua Pei · Zhiqiang Wang · Pengliang Ji · Haoyu Wang · Jiaqi Huo
Plurals: A System for Pluralistic AI via Simulated Social Ensembles | Joshua Ashkinaze · Eric Gilbert · Ceren Budak
Can Language Models Reason about Individualistic Human Values and Preferences? | Liwei Jiang · Sydney Levine · Yejin Choi
Bottom-Up and Top-Down Analysis of Values, Agendas, and Observations in Corpora and LLMs | Scott Friedman · Noam Benkler · Drisana Mosaphir · Jeffrey Rye · Sonja Schmer-Galunder · Micah Goldwater · Matthew McLure · Ruta Wheelock · Jeremy Gottlieb · Robert Goldman · Christopher Miller
Chain of Alignment | Andrew Konya · Aviv Ovadya · K. J. Kevin Feng · Quan Ze Chen · Lisa Schirch · Colin Irwin · Amy Zhang
Being Considerate as a Pathway to Pluralist Alignment | Parand A. Alamdari · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith
Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI | Hadassah Harland · Richard Dazeley · Peter Vamplew · Hashini Senaratne · Bahareh Nakisa · Francisco Cruz
Group Robust Best-of-K Decoding of Language Models for Pluralistic Alignment | Anja Petrovic · Seongho Son · William Bankes · Xiaohang Tang · Shyam Sundhar Ramesh · Sangwoong Yoon · Ilija Bogunovic
"There are no solutions, only trade-offs." Taking A Closer Look At Safety Data Annotations | Elle Michelle Yang · Matthias Gallé · Seraphina Goldfarb-Tarrant
Value Alignment from Unstructured Text | Inkit Padhi · Karthikeyan Natesan Ramamurthy · Prasanna Sattigeri · Manish Nagireddy · Pierre Dognin · Kush Varshney
Critique-out-Loud Reward Models | Zachary Ankner · Mansheej Paul · Brandon Cui · Jonathan Chang · Prithviraj Ammanabrolu
Legal Theory for Pluralistic Alignment | Nicholas A. Caputo
Controllable Safety Alignment: Adapting LLMs to Diverse Safety Requirements without Re-Training | Jingyu Zhang · Ahmed Elgohary Ghoneim · Ahmed Magooda · Daniel Khashabi · Ben Van Durme
Pluralistic Alignment Over Time | Toryn Klassen · Parand A. Alamdari · Sheila McIlraith
Bridging in Social Media Feeds Censors Controversial Topics | Smitha Milli · Luke Thorburn · Paul Bouchaud · Yixin Wang · Nikhil Garg · Emma Pierson
Value Pluralism and AI Value Alignment | Atoosa Kasirzadeh
Are Large Language Models Consistent over Value-laden Questions? | Jared Moore · Tanvi Deshpande · Diyi Yang
Selective Preference Aggregation | Shreyas Kadekodi · Hayden McTavish · Berk Ustun
FairPlay: A Collaborative Approach to Mitigate Bias in Datasets for Improved AI Fairness | Tina Behzad · Mithilesh Kumar Singh · Anthony Ripa · Klaus Mueller
Policy Aggregation | Parand A. Alamdari · Soroush Ebadian · Ariel Procaccia
Evaluating the Prompt Steerability of Large Language Models | Erik Miehling · Michael Desmond · Karthikeyan Natesan Ramamurthy · Elizabeth Daly · Pierre Dognin · Jesus Rios · Djallel Bouneffouf · Miao Liu
AGR: Age Group fairness Reward for Bias Mitigation in LLMs | Shuirong Cao · Ruoxi Cheng · Zhiqiang Wang
Diverging Preferences: Why do Annotators Sensibly Disagree? | Michael Zhang · Zhilin Wang · Jena Hwang · Yi Dong · Olivier Delalleau · Yejin Choi · Eunsol Choi · Xiang Ren · Valentina Pyatkin
Aligning to Thousands of Varying Preferences via System Message Generalization | Seongyun Lee · Sue Hyun Park · Seungone Kim · Minjoon Seo
Trustworthy Human-AI Interaction Through Agreement Protocols | Natalie Collina · Surbhi Goel · Varun Gupta · Aaron Roth