NeurIPS Datasets and Benchmarks Dataset and Benchmark Track 1

Joaquin Vanschoren · Serena Yeung

Moderators: Ludwig Schmidt · Viorica Patraucean

Abstract:

The Datasets and Benchmarks track serves as a novel venue for high-quality publications, talks, and posters on highly valuable machine learning datasets and benchmarks, as well as a forum for discussions on how to improve dataset development. Datasets and benchmarks are crucial for the development of machine learning methods, but they also require their own publishing and reviewing guidelines. For instance, datasets often cannot be reviewed in a double-blind fashion, and hence full anonymization will not be required. On the other hand, they do require additional specific checks, such as whether the data collection is properly described, whether the data show intrinsic bias, and whether the data will remain accessible.

Schedule

Tue 12:00 a.m. - 12:05 a.m.	Intro
Tue 12:05 a.m. - 12:15 a.m.	Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management (Oral)	Cécile Logé · Emily Ross · David Dadey · Saahil Jain · Adriel Saporta · Andrew Ng · Pranav Rajpurkar
Tue 12:15 a.m. - 12:25 a.m.	It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks (Oral)	Michelle Bao · Angela Zhou · Samantha Zottola · Brian Brubach · Sarah Desmarais · Aaron Horowitz · Kristian Lum · Suresh Venkatasubramanian
Tue 12:25 a.m. - 12:35 a.m.	Mitigating dataset harms requires stewardship: Lessons from 1000 papers (Oral)	Kenneth Peng · Arunesh Mathur · Arvind Narayanan
Tue 12:35 a.m. - 12:45 a.m.	Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks (Oral)	Curtis Northcutt · Anish Athalye · Jonas Mueller
Tue 12:45 a.m. - 1:00 a.m.	Joint Q&A (Q&A)
Tue 1:00 a.m. - 1:05 a.m.	Break
Tue 1:05 a.m. - 1:15 a.m.	RadGraph: Extracting Clinical Entities and Relations from Radiology Reports (Oral)	Saahil Jain · Ashwin Agrawal · Adriel Saporta · Steven Truong · Du Nguyen Duong · Tan Bui · Pierre Chambon · Yuhao Zhang · Matthew Lungren · Andrew Ng · Curtis Langlotz · Pranav Rajpurkar
Tue 1:15 a.m. - 1:25 a.m.	CommonsenseQA 2.0: Exposing the Limits of AI through Gamification (Oral)	Alon Talmor · Ori Yoran · Ronan Le Bras · Chandra Bhagavatula · Yoav Goldberg · Yejin Choi · Jonathan Berant
Tue 1:25 a.m. - 1:35 a.m.	ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation (Oral)	Chuang Gan · Jeremy Schwartz · Seth Alter · Damian Mrowca · Martin Schrimpf · James Traer · Julian De Freitas · Jonas Kubilius · Abhishek Bhandwaldar · Nick Haber · Megumi Sano · Kuno Kim · Elias Wang · Michael Lingelbach · Aidan Curtis · Kevin Feigelis · Daniel Bear · Dan Gutfreund · David Cox · Antonio Torralba · James J DiCarlo · Josh Tenenbaum · Josh McDermott · Dan Yamins
Tue 1:35 a.m. - 1:45 a.m.	Chest ImaGenome Dataset for Clinical Reasoning (Oral)	Joy T Wu · Nkechinyere Agu · Ismini Lourentzou · Arjun Sharma · Joseph Alexander Paguio · Jasper Seth Yao · Edward C Dee · William Mitchell · Satyananda Kashyap · Andrea Giovannini · Leo Anthony Celi · Mehdi Moradi
Tue 1:45 a.m. - 2:00 a.m.	Joint Q&A (Q&A)