Poster
in
Affinity Event: Black in AI

Can Large Language Models Provide Emergency Medical Help Where There Is No Ambulance? A Comparative Study on Large Language Model Understanding of Emergency Medical Scenarios in Resource-Constrained Settings

Paulina Boadiwaa Mensah · Nana Seraa Quao · Sesinam Dagadu · Project Genie


Abstract:

Several medicine-oriented evaluation datasets and benchmarks exist for assessing the performance of LLMs in clinical scenarios; however, there is a paucity of information on the real-world utility of LLMs in context-specific scenarios in resource-constrained settings. In this study, 16 iterations of a decision support tool for medical emergencies were constructed using 4 distinct general-purpose LLMs combined with 4 prompt engineering techniques. In total, 428 model responses were quantitatively and qualitatively evaluated by 22 clinicians familiar with the medical scenarios and background contexts. The best model-technique pair achieved a mean rating of 8.05/10. Our study highlights the benefits of in-context learning with few-shot prompting and the utility of the relatively novel self-questioning prompting technique, and demonstrates that combining prompting techniques elicits the best performance from LLMs. We also highlight the need for continuous human expert verification in the development and deployment of LLM-based health applications, especially in use cases where context is paramount.
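The combination of few-shot prompting with self-questioning described in the abstract can be sketched as a simple prompt-assembly function. The scenario, in-context examples, and instruction wording below are illustrative assumptions only, not the study's actual prompts or models.

```python
# Illustrative sketch: combining few-shot prompting (in-context examples)
# with a self-questioning instruction in a single prompt string.
# All scenarios, guidance text, and wording here are hypothetical.

FEW_SHOT_EXAMPLES = [
    {
        "scenario": "Adult collapses at a market; no ambulance is available.",
        "guidance": "Check responsiveness and breathing; if absent, start "
                    "chest compressions and send someone for help.",
    },
    {
        "scenario": "Child with a deep arm cut is bleeding heavily at home.",
        "guidance": "Apply firm direct pressure with a clean cloth, elevate "
                    "the limb, and arrange transport to a clinic.",
    },
]

SELF_QUESTIONING_INSTRUCTION = (
    "Before answering, list the questions a clinician would ask about this "
    "scenario, answer them from the details given, then give your guidance."
)

def build_prompt(scenario: str) -> str:
    """Assemble a few-shot + self-questioning prompt for one emergency scenario."""
    parts = [
        "You are a first-aid decision support assistant for settings "
        "without ambulance services.\n"
    ]
    # Few-shot block: each example shows the scenario/guidance answer format.
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(f"Scenario: {ex['scenario']}\nGuidance: {ex['guidance']}\n")
    # Self-questioning block: ask the model to interrogate the case first.
    parts.append(SELF_QUESTIONING_INSTRUCTION + "\n")
    # The new scenario to be answered.
    parts.append(f"Scenario: {scenario}\nGuidance:")
    return "\n".join(parts)

if __name__ == "__main__":
    print(build_prompt("A motorcyclist is unconscious after a roadside crash."))
```

In this style of pipeline the assembled string would be sent to each candidate LLM, and the returned guidance rated by clinicians, as done in the study.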
