Poster in Workshop: Socially Responsible Language Modelling Research (SoLaR)

THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models

Mengfei Liang · Archish Arun · Zekun Wu · Cristian Villalobos · Jonathan Lutch · Emre Kazim · Adriano Koshiyama · Philip Treleaven

Keywords: [ Hallucination Mitigation ] [ Factual Consistency in LMs ] [ Auditing LMs ] [ Answer Faithfulness ] [ LLM Evaluation Metrics ] [ Hallucination Detection Framework ] [ Model Accountability ] [ AI Fairness and Equity ] [ Language Model Hallucination ] [ Ethical AI Development ] [ Robustness in Language Models ] [ Transparency in Language Models ]


Abstract:

Hallucination, the generation of factually incorrect or confabulated content, is a growing problem for Large Language Models (LLMs). While hallucination detection and mitigation methods exist, they are largely isolated and often inadequate for domain-specific use cases. There is no standardized pipeline that combines the necessary components of domain-pertinent dataset generation, hallucination detection benchmarking, and mitigation strategies into one tool. This paper proposes the THaMES framework and library---a Tool for Hallucination Mitigations and EvaluationS. THaMES is an end-to-end solution that evaluates and mitigates hallucinations in LLMs through automated testset generation, multifaceted benchmarking techniques, and flexible mitigation strategies. The THaMES framework automates testset generation from any corpus of information while achieving high data quality and diversity, maintaining cost-effectiveness through batch processing, weighted sampling, counterfactual validation, and the use of complex question types. THaMES can also evaluate a model's capability to identify hallucinations and generate less hallucinated outputs across multiple types of evaluation tasks, including text generation and binary classification. The framework also applies optimal hallucination mitigation strategies tailored to different models and knowledge bases. THaMES offers a variety of hallucination mitigation strategies, including In-Context Learning (ICL), Retrieval Augmented Generation (RAG), and Parameter-Efficient Fine-Tuning (PEFT). Evaluating a range of state-of-the-art LLMs on a knowledge base of academic papers, political news articles, and Wikipedia articles, we find that commercial models such as GPT-4o benefit more from RAG strategies than from ICL, and that while open-weight models such as Llama-3.1-8B-Instruct and Mistral-Nemo also show improvements with RAG mitigations, they benefit more from the reasoning provided by ICL. In an experiment with the open-weight model Llama-3.1-8B-Instruct, the PEFT mitigation significantly improved over the base model on both evaluation tasks.
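To make the pipeline described above concrete, the sketch below wires together the stages named in the abstract: testset generation from a corpus, hallucination evaluation via a binary detection task, and prompt-level mitigation (ICL or RAG). All function names, signatures, and the toy model here are hypothetical placeholders for illustration only; they are not the THaMES API, and PEFT is only noted in a comment since it operates at the model-weights level rather than the prompt level.

```python
"""Minimal sketch (assumed structure, not the THaMES API) of the pipeline stages
named in the abstract: testset generation -> detection benchmarking -> mitigation."""

from dataclasses import dataclass
from typing import Callable, List
import random


@dataclass
class QAPair:
    question: str
    reference_answer: str


def generate_testset(corpus: List[str], n_items: int, seed: int = 0) -> List[QAPair]:
    """Hypothetical stand-in for automated testset generation: sample passages
    from the corpus and build trivial questions. THaMES additionally uses batch
    processing, weighted sampling, counterfactual validation, and complex
    question types, none of which are modeled here."""
    rng = random.Random(seed)
    docs = rng.sample(corpus, k=min(n_items, len(corpus)))
    return [
        QAPair(question=f"What does this passage state? {d[:80]}", reference_answer=d)
        for d in docs
    ]


def binary_detection_score(model: Callable[[str], str], testset: List[QAPair]) -> float:
    """Binary-classification evaluation: ask the model whether a claim is
    supported by the context and score accuracy on trivially constructed
    positive examples."""
    correct = 0
    for item in testset:
        prompt = (
            f"Claim: {item.reference_answer}\n"
            f"Context: {item.reference_answer}\n"
            "Answer 'supported' or 'hallucinated':"
        )
        verdict = model(prompt)
        correct += int("supported" in verdict.lower())
    return correct / max(len(testset), 1)


def apply_mitigation(
    prompt: str, strategy: str, retrieved_context: str = "", icl_examples: str = ""
) -> str:
    """Prompt-level mitigation: RAG prepends retrieved context, ICL prepends
    worked examples. PEFT would instead swap in a fine-tuned adapter and is
    not represented at this level."""
    if strategy == "rag":
        return f"Context:\n{retrieved_context}\n\nQuestion: {prompt}"
    if strategy == "icl":
        return f"{icl_examples}\n\nQuestion: {prompt}"
    return prompt  # baseline


if __name__ == "__main__":
    corpus = [
        "Paris is the capital of France.",
        "Water boils at 100 degrees Celsius at sea level.",
    ]
    testset = generate_testset(corpus, n_items=2)
    toy_model = lambda p: "supported"  # placeholder model for illustration only
    print("detection accuracy:", binary_detection_score(toy_model, testset))
    print(apply_mitigation("Where is Paris?", "rag", retrieved_context=corpus[0]))
```

In a real evaluation, the toy model would be replaced by calls to the LLM under test, and the detection score would be reported alongside a text-generation faithfulness metric, mirroring the two task types the abstract describes.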
