Evaluating and Monitoring Large Language Models for Effective Clinical Translation
Danielle S. Bitterman, MD
There is immense enthusiasm about the potential of large language models to support clinical and administrative workflows in healthcare. However, a barrier to effective and safe clinical translation is the lack of standardized approaches for evaluating and monitoring the knowledge quality, reasoning ability, and risks of these models. In this lecture, Dr. Bitterman will discuss current limitations of language model knowledge representation in the context of high-impact clinical applications. She will present research into measuring language model risks, including misinformation and logical reasoning errors, in ways that are systematic, generalizable, and clinically relevant.