Poster in Workshop: AI4Mat-2024: NeurIPS 2024 Workshop on AI for Accelerated Materials Design
LLaMat: Large Language Models for Materials Science Information Extraction
Vaibhav Mishra · Somaditya Singh · Mohd Zaki · Dhruv Ahlawat · Hargun Grover · Biswajit Mishra · Santiago Miret · Mausam · N M Anoop Krishnan
Keywords: [ large language models ] [ materials science ] [ materials discovery ] [ table understanding ] [ information extraction ]
Large language models (LLMs) have emerged as important tools for information extraction and as scientific assistants in materials science and discovery. However, their performance is limited by a lack of domain expertise. In this work, we propose the LLaMat models, namely LLaMat-2-7B and LLaMat-3-8B, obtained by continued pretraining of Meta's LLaMA-2-7B and LLaMA-3-8B models, respectively, on a large corpus of 30B tokens of materials science text to improve their domain expertise. We also develop LLaMat-Chat models, instruction fine-tuned variants of the LLaMat models trained on a dataset of one million instruction-output pairs, which enables interactive use and information extraction in the materials science domain. We show that LLaMat achieves state-of-the-art performance on several information extraction tasks from materials science text, with LLaMat-3-8B emerging as the best model. We further demonstrate the structured information extraction capabilities of the developed chat models, comparing their performance on four datasets spanning named entity recognition, relation extraction from text, and composition table understanding from materials science research papers.
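To make the continued-pretraining step concrete, below is a minimal sketch in Python using the Hugging Face transformers and datasets libraries. The corpus file matsci_corpus.txt, the block length, and all hyperparameters are illustrative assumptions and not the settings used for LLaMat; the LLaMA base checkpoints are gated and require access approval.

# Minimal continued-pretraining sketch (assumptions: hypothetical corpus
# path "matsci_corpus.txt", illustrative hyperparameters; not the actual
# LLaMat training configuration).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # the same recipe applies to Llama-3-8B
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Load the plain-text domain corpus and tokenize it for causal LM training.
raw = load_dataset("text", data_files={"train": "matsci_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llamat-2-7b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=train,
    # mlm=False gives next-token (causal) language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()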
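The instruction fine-tuning stage can likewise be illustrated with a hypothetical instruction-output pair showing how a materials science extraction task might be cast into the one-million-pair instruction format. The field names, prompt template, and example content below are assumptions for illustration, not the paper's actual dataset schema.

# Hypothetical instruction-output pair for materials information extraction;
# the schema and template are illustrative assumptions.
example = {
    "instruction": ("Extract all material compositions mentioned in the "
                    "following sentence and return them as a JSON list."),
    "input": "The glass 70SiO2-15Na2O-15CaO showed improved hardness.",
    "output": '["70SiO2-15Na2O-15CaO"]',
}

def to_prompt(ex):
    # Concatenate the fields into one training sequence; this template is
    # illustrative, not the one used for LLaMat-Chat.
    return (f"### Instruction:\n{ex['instruction']}\n\n"
            f"### Input:\n{ex['input']}\n\n"
            f"### Response:\n{ex['output']}")

print(to_prompt(example))

Casting named entity recognition, relation extraction, and table understanding into this shared instruction format is what lets a single chat model handle all four evaluation datasets.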