Datasets and Benchmarks
Dataset and Benchmark Poster Session 1
Joaquin Vanschoren · Serena Yeung
Moderators: Viorica Patraucean · Ludwig Schmidt
The Datasets and Benchmarks track serves as a novel venue for high-quality publications, talks, and posters on highly valuable machine learning datasets and benchmarks, as well as a forum for discussions on how to improve dataset development. Datasets and benchmarks are crucial for the development of machine learning methods, but they also require their own publishing and reviewing guidelines. For instance, datasets often cannot be reviewed in a double-blind fashion, and hence full anonymization will not be required. On the other hand, they do require additional specific checks, such as a proper description of how the data was collected, whether they show intrinsic bias, and whether they will remain accessible.
Schedule
-
|
Programming Puzzles
(
Poster
)
>
SlidesLive Video |
Tal Schuster · Ashwin Kalyan · Alex Polozov · Adam Kalai 🔗 |
-
|
FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information
(
Poster
)
>
SlidesLive Video |
Rami Aly · Zhijiang Guo · Michael Schlichtkrull · James Thorne · Andreas Vlachos · Christos Christodoulopoulos · Oana Cocarascu · Arpit Mittal 🔗 |
-
|
BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling
(
Poster
)
>
SlidesLive Video |
Zhaojiang Lin · Andrea Madotto · Genta Winata · Peng Xu · Feijun Jiang · Yuxiang Hu · Chen Shi · Pascale N Fung 🔗 |
-
|
Towards a robust experimental framework and benchmark for lifelong language learning
(
Poster
)
>
SlidesLive Video |
Aman Hussain · Nithin Holla · Pushkar Mishra · Helen Yannakoudakis · Ekaterina Shutova 🔗 |
-
|
The People’s Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage
(
Poster
)
>
SlidesLive Video |
Daniel Galvez · Greg Diamos · Juan Torres · Juan Cerón · Keith Achorn · Anjali Gopi · David Kanter · Max Lam · Mark Mazumder · Vijay Janapa Reddi 🔗 |
-
|
CrowdSpeech and Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription
(
Poster
)
>
SlidesLive Video |
Nikita Pavlichenko · Ivan Stelmakh · Dmitry Ustalov 🔗 |
-
|
ReaSCAN: Compositional Reasoning in Language Grounding
(
Poster
)
>
SlidesLive Video |
Zhengxuan Wu · Elisa Kreiss · Desmond Ong · Christopher Potts 🔗 |
-
|
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
(
Poster
)
>
SlidesLive Video |
22 presenters: Shuai Lu · Daya Guo · Shuo Ren · Junjie Huang · Alexey Svyatkovskiy · Ambrosio Blanco · Colin Clement · Dawn Drain · Daxin Jiang · Duyu Tang · Ge Li · Lidong Zhou · Linjun Shou · Long Zhou · Michele Tufano · MING GONG · Ming Zhou · Nan Duan · Neel Sundaresan · Shao Kun Deng · Shengyu Fu · Shujie LIU |
-
|
Variance-Aware Machine Translation Test Sets
(
Poster
)
>
SlidesLive Video |
Runzhe Zhan · Xuebo Liu · Derek Wong · Lidia Chao 🔗 |
-
|
Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers
(
Poster
)
>
|
Loren Lugosch · Piyush Papreja · Mirco Ravanelli · Abdelwahab HEBA · Titouan Parcollet 🔗 |
-
|
LiRo: Benchmark and leaderboard for Romanian language tasks
(
Poster
)
>
SlidesLive Video |
20 presenters: Stefan Dumitrescu · Petru Rebeja · Beata Lorincz · Mihaela Gaman · Andrei Avram · Mihai Ilie · Andrei Pruteanu · Adriana Stan · Lorena Rosia · Cristina Iacobescu · Luciana Morogan · George Dima · Gabriel Marchidan · Traian Rebedea · Madalina Chitez · Dani Yogatama · Sebastian Ruder · Radu Tudor Ionescu · Razvan Pascanu · Viorica Patraucean |
-
|
A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning
(
Poster
)
>
SlidesLive Video |
11 presenters: Gaoussou Kebe · Padraig Higgins · Patrick Jenkins · Kasra Darvish · Rishabh Sachdeva · Ryan Barron · John Winder · Donald Engel · Edward Raff · Francis Ferraro · Cynthia Matuszek |
-
|
NaturalProofs: Mathematical Theorem Proving in Natural Language
(
Poster
)
>
SlidesLive Video |
Sean Welleck · Jiacheng Liu · Ronan Le Bras · Hanna Hajishirzi · Yejin Choi · Kyunghyun Cho 🔗 |
-
|
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
(
Poster
)
>
SlidesLive Video |
Dan Hendrycks · Collin Burns · Anya Chen · Spencer Ball 🔗 |
-
|
CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
(
Poster
)
>
SlidesLive Video |
17 presenters: Ruchir Puri · David Kung · Geert Janssen · Wei Zhang · Giacomo Domeniconi · Vladimir Zolotov · Julian T Dolby · Jie Chen · Mihir Choudhury · Lindsey Decker · Veronika Thost · Luca Buratti · Saurabh Pujar · Shyam Ramji · Ulrich Finkler · Susan Malaika · Frederick Reiss |
-
|
DUE: End-to-End Document Understanding Benchmark
(
Poster
)
>
SlidesLive Video |
Łukasz Borchmann · Michał Pietruszka · Tomasz Stanislawek · Dawid Jurkiewicz · Michał Turski · Karolina Szyndler · Filip Graliński 🔗 |
-
|
COVID-19 Sounds: A Large-Scale Audio Dataset for Digital Respiratory Screening
(
Poster
)
>
SlidesLive Video |
12 presenters: Tong Xia · Dimitrios Spathis · Chloë Brown · J Ch · Andreas Grammenos · Jing Han · Apinan Hasthanasombat · Erika Bondareva · Ting Dang · Andres Floto · Pietro Cicuta · Cecilia Mascolo |
-
|
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
(
Poster
)
>
SlidesLive Video |
Joel Frank · Lea Schönherr 🔗 |
-
|
RP-Mod & RP-Crowd: Moderator- and Crowd-Annotated German News Comment Datasets
(
Poster
)
>
SlidesLive Video |
Dennis Assenmacher · Marco Niemann · Kilian Müller · Moritz Seiler · Dennis Riehle · Heike Trautmann 🔗 |
-
|
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models
(
Poster
)
>
SlidesLive Video |
Boxin Wang · Chejian Xu · Shuohang Wang · Zhe Gan · Yu Cheng · Jianfeng Gao · Ahmed Awadallah · Bo Li 🔗 |
-
|
Multilingual Spoken Words Corpus
(
Poster
)
>
SlidesLive Video |
14 presenters: Mark Mazumder · Sharad Chitlangia · Colby Banbury · Yiping Kang · Juan Ciro · Keith Achorn · Daniel Galvez · Mark Sabini · Peter Mattson · David Kanter · Greg Diamos · Pete Warden · Josh Meyer · Vijay Janapa Reddi |
-
|
Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension
(
Poster
)
>
SlidesLive Video |
Shusheng Xu · Yichen Liu · Xiaoyu Yi · Siyuan Zhou · Huizi Li · Yi Wu 🔗 |
-
|
Measuring Coding Challenge Competence With APPS
(
Poster
)
>
SlidesLive Video |
11 presenters: Dan Hendrycks · Steven Basart · Saurav Kadavath · Mantas Mazeika · Akul Arora · Ethan Guo · Collin Burns · Samir Puranik · Horace He · Dawn Song · Jacob Steinhardt |
-
|
NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation
(
Poster
)
>
SlidesLive Video |
David Alfonso-Hermelo · Ahmad Rashid · Abbas Ghaddar · Philippe Langlais · Mehdi Rezagholizadeh 🔗 |
-
|
CSFCube - A Test Collection of Computer Science Research Articles for Faceted Query by Example
(
Poster
)
>
SlidesLive Video |
Sheshera Mysore · Tim O'Gorman · Andrew McCallum · Hamed Zamani 🔗 |
-
|
RAFT: A Real-World Few-Shot Text Classification Benchmark
(
Poster
)
>
SlidesLive Video |
12 presenters: Neel Alex · Eli Lifland · Lewis Tunstall · Abhishek Thakur · Pegah Maham · C. Riedel · Emmie Hine · Carolyn Ashurst · Paul Sedille · Alexis Carlier · Michael Noetel · Andreas Stuhlmüller |
-
|
A Dataset for Answering Time-Sensitive Questions
(
Poster
)
>
SlidesLive Video |
Wenhu Chen · Xinyi Wang · William Yang Wang 🔗 |
-
|
DEBAGREEMENT: A comment-reply dataset for (dis)agreement detection in online debates
(
Poster
)
>
SlidesLive Video |
John Pougué-Biyong · Valentina Semenova · Alexandre Matton · Rachel Han · Aerin Kim · Renaud Lambiotte · Doyne Farmer 🔗 |
-
|
Task Agnostic and Task Specific Self-Supervised Learning from Speech with LeBenchmark
(
Poster
)
>
SlidesLive Video |
18 presenters: Solène Evain · Ha Nguyen · Hang Le · Marcely Zanon Boito · Salima Mdhaffar · Sina Alisamir · Ziyi Tong · Natalia Tomashenko · Marco Dinarelli · Titouan Parcollet · Alexandre Allauzen · Yannick Estève · Benjamin Lecouteux · François Portet · Solange Rossato · Fabien Ringeval · Didier Schwab · laurent besacier |
-
|
SynthBio: A Case Study in Faster Curation of Text Datasets
(
Poster
)
>
SlidesLive Video |
Ann Yuan · Daphne Ippolito · Vitaly Nikolaev · Chris Callison-Burch · Andy Coenen · Sebastian Gehrmann 🔗 |
-
|
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs
(
Poster
)
>
SlidesLive Video |
Zihao Wang · Hang Yin · Yangqiu Song 🔗 |
-
|
KLUE: Korean Language Understanding Evaluation
(
Poster
)
>
SlidesLive Video |
31 presenters: Sungjoon Park · Jihyung Moon · Sungdong Kim · Won Ik Cho · Ji Yoon Han · Jangwon Park · Chisung Song · Junseong Kim · Youngsook Song · Taehwan Oh · Joohong Lee · Juhyun Oh · Sungwon Lyu · Younghoon Jeong · Inkwon Lee · Sangwoo Seo · Dongjun Lee · Hyunwoo Kim · Myeonghwa Lee · Seongbo Jang · Seungwon Do · Sunkyoung Kim · Kyungtae Lim · Jongwon Lee · Kyumin Park · Jamin Shin · Seonghyun Kim · Lucy Park · Alice Oh · Jung-Woo Ha · Kyunghyun Cho |
-
|
CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
(
Poster
)
>
SlidesLive Video |
Yasumasa Onoe · Michael Zhang · Eunsol Choi · Greg Durrett 🔗 |
-
|
Few-Shot Learning Evaluation in Natural Language Understanding
(
Poster
)
>
SlidesLive Video |
Subhabrata Mukherjee · Xiaodong Liu · Guoqing Zheng · Saghar Hosseini · Hao Cheng · Ge Yang · Christopher Meek · Ahmed Awadallah · Jianfeng Gao 🔗 |
-
|
SciGen: a Dataset for Reasoning-Aware Text Generation from Scientific Tables
(
Poster
)
>
SlidesLive Video |
Nafise Moosavi · Andreas Rücklé · Dan Roth · Iryna Gurevych 🔗 |
-
|
HumBugDB: A Large-scale Acoustic Mosquito Dataset
(
Poster
)
>
SlidesLive Video |
16 presenters: Ivan Kiskin · Marianne Sinka · Adam Cobb · Waqas Rafique · Lawrence Wang · Davide Zilli · Benjamin Gutteridge · Rinita Dam · Theodoros Marinos · Yunpeng Li · Dickson Msaky · Emmanuel Kaindoa · Gerard Killeen · Eva Herreros-Moya · Kathy Willis · Stephen J Roberts |
-
|
KeSpeech: An Open Source Speech Dataset of Mandarin and Its Eight Subdialects
(
Poster
)
>
SlidesLive Video |
15 presenters: Zhiyuan Tang · Dong Wang · Yanguang Xu · Jianwei Sun · Xiaoning Lei · Shuaijiang Zhao · cheng wen · Xingjun Tan · Chuandong Xie · Shuran Zhou · Rui Yan · Chenjia Lv · Yang Han · Wei Zou · Xiangang Li |
-
|
Measuring Mathematical Problem Solving With the MATH Dataset
(
Poster
)
>
SlidesLive Video |
Dan Hendrycks · Collin Burns · Saurav Kadavath · Akul Arora · Steven Basart · Eric Tang · Dawn Song · Jacob Steinhardt 🔗 |