Poster in Workshop: NeurIPS 2024 Workshop: Machine Learning and the Physical Sciences

ChemLit-QA: A human evaluated dataset for chemistry RAG tasks

Geemi Wellawatte · Philippe Schwaller · Huixuan Guo · Marta Brucka · Anna Borisova · Matthew Hart · Magdalena Lederbauer


Abstract:

The evaluation of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems, particularly in knowledge-intensive fields like chemistry, is hindered by the scarcity of high-quality datasets. Existing datasets are often small because manual curation is labor-intensive, or they require further quality checks when generated automatically. This study addresses the need for robust scientific datasets by introducing ChemLit-QA, an open-source, expert-validated dataset with over 1,000 entries tailored for chemistry. The dataset was initially generated using an automated framework and subsequently reviewed by experts.
