Poster in Workshop: Table Representation Learning Workshop (TRL)
Multi-Stage QLoRA with Augmented Structured Dialogue Corpora: Efficient and Improved Conversational Healthcare AI
Dasun Wickrama Arachchi Athukoralage · Thushari Atapattu
Keywords: [ Multi-Stage QLoRA ] [ LoRA ] [ Synthetic Data ] [ Large Language Models ] [ Conversational Healthcare AI ]
This work proposes a cost-effective approach to developing a capable conversational healthcare AI, Med-Nirvana 8B, using QLoRA-based supervised fine-tuning (SFT). Given the significant computational demands of fully fine-tuning large language models (LLMs), we adopt a two-stage QLoRA fine-tuning process on the open-source LLaMA 3.1 8B Instruct model. The first stage fine-tunes the model on a mixture of medical benchmark datasets (MedQA, MedMCQA, and PubMedQA) to strengthen its factual knowledge, reasoning, and decision-making skills in a structured setting. The second stage fine-tunes the model on the NoteChat dataset of synthetic patient-physician conversations, enabling it to handle more complex, real-life situations such as diagnosing patients and managing conversations with them. Because the composition of SFT data significantly affects an LLM's ability to acquire multiple skills, we implement a novel SFT strategy, Dual-stage Mixed Fine-tuning (DMT), which mixes data across the two stages. The result is a promising, cost-effective conversational healthcare LLM: Med-Nirvana 8B performs strongly on medical benchmarks relative to similar-scale models and delivers accurate, concise, and human-like responses in real patient interactions, validating this low-resource fine-tuning methodology.
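To make the pipeline concrete, the sketch below shows how such a two-stage QLoRA SFT run might look with Hugging Face transformers, peft, trl, and datasets. It is a minimal illustration under stated assumptions, not the authors' released training code: the dataset identifiers and column names (bigbio/med_qa, akemiH/NoteChat, question/answer), the adapter rank and learning rate, the to_text formatter, and the 5% stage-1 replay ratio used to emulate DMT-style mixing are all assumptions.

```python
# Minimal sketch of a two-stage QLoRA SFT pipeline. All concrete values
# (dataset ids, column names, adapter rank, learning rate, replay ratio)
# are illustrative assumptions, not the paper's reported configuration.
import torch
from datasets import concatenate_datasets, load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

BASE = "meta-llama/Llama-3.1-8B-Instruct"
tok = AutoTokenizer.from_pretrained(BASE)

# QLoRA: the frozen base model is loaded in 4-bit NF4 precision and only
# low-rank adapters are trained, which is what keeps the approach cheap.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Adapter rank, alpha, and target modules are typical QLoRA choices,
# not values taken from the paper.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora)


def to_text(example, q_key="question", a_key="answer"):
    """Hypothetical formatter: render one QA pair with the model's chat
    template. Real preprocessing must follow each dataset's actual schema."""
    msgs = [
        {"role": "user", "content": example[q_key]},
        {"role": "assistant", "content": example[a_key]},
    ]
    return {"text": tok.apply_chat_template(msgs, tokenize=False)}


def run_stage(model, dataset, output_dir):
    """One SFT stage over a dataset with a pre-built 'text' column."""
    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset,
        args=SFTConfig(  # exact field names vary across trl versions
            output_dir=output_dir,
            dataset_text_field="text",
            per_device_train_batch_size=4,
            gradient_accumulation_steps=4,
            num_train_epochs=1,
            learning_rate=2e-4,
            logging_steps=50,
        ),
    )
    trainer.train()
    return trainer.model


# Stage 1: structured medical QA to build factual knowledge and reasoning.
stage1 = load_dataset("bigbio/med_qa", split="train").map(to_text)
model = run_stage(model, stage1, "out/stage1")

# Stage 2 (DMT-style mixing): synthetic patient-physician dialogues plus a
# small replay slice of stage-1 data to limit catastrophic forgetting.
notechat = load_dataset("akemiH/NoteChat", split="train").map(to_text)
n = min(len(stage1), int(0.05 * len(notechat)))
replay = stage1.shuffle(seed=0).select(range(n))
stage2 = concatenate_datasets([notechat, replay]).shuffle(seed=0)
model = run_stage(model, stage2, "out/stage2")
```

The last block is where the DMT idea shows up in this sketch: rather than training stage 2 on conversations alone, a small slice of stage-1 QA data is mixed back in so that the conversational stage does not erase the factual knowledge gained earlier.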