Machine learning – in which a computer finds patterns that humans might not have recognized – is increasingly being used to develop algorithms that help clinicians diagnose and treat patients.
But machine-learning algorithms are not magic. If a computer did its learning at one hospital, its algorithm may not work as well at another. Many machine-learning algorithms produced before COVID-19 saw their performance decay during the pandemic, when so much changed about healthcare and life in general. Even without a pandemic, algorithms can become outdated and lose their effectiveness over time.
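In practice, spotting that kind of decay starts with measurement. The sketch below is illustrative only, not code from Feng's project: it shows one common way to watch for decay by scoring a deployed risk model on each calendar quarter of data and tracking its AUC. The model, dataframe, and column names are all hypothetical.

```python
# Illustrative sketch (not from Feng's project): track a deployed binary risk
# model's discrimination over calendar quarters to detect performance decay.
# `model`, the dataframe, and the column names are hypothetical placeholders.
import pandas as pd
from sklearn.metrics import roc_auc_score

def auc_by_quarter(model, df, feature_cols, outcome_col="outcome",
                   time_col="encounter_date"):
    """Return the model's AUC within each calendar quarter of `df`."""
    aucs = {}
    for quarter, group in df.groupby(pd.Grouper(key=time_col, freq="Q")):
        if group[outcome_col].nunique() < 2:
            continue  # AUC is undefined unless both outcome classes are present
        preds = model.predict_proba(group[feature_cols])[:, 1]
        aucs[quarter] = roc_auc_score(group[outcome_col], preds)
    return pd.Series(aucs, name="auc")

# A sustained drop relative to the AUC measured at validation time is the
# signal that the algorithm needs a closer look, e.g.:
#   auc_by_quarter(model, encounters, ["age", "prior_admissions"]).plot()
```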
Jean Feng, PhD, is, like many, optimistic about the promise of machine learning but concerned about its potential for inequitable effects. Unlike most, she is drawing on her background in biostatistics, computer science, and machine learning to build guardrails for clinical algorithms. Feng has received a three-year, $1 million funding award from the Patient-Centered Outcomes Research Institute (PCORI), beginning this March, to develop tools that help diagnose ailing machine-learning algorithms.
When a clinical-support algorithm doesn’t work as well as it once did, or as well as it did in another setting, it needs a "doctor" of its own to step in, figure out what might be causing its symptoms, and determine how to alleviate them. Feng will develop methods to quantify the impact of potential causes of performance decay, which she hopes will guide healthcare IT teams and machine-learning developers performing root cause analyses.
“If we can leverage methods for generating causal explanations, maybe we’ll be able to decouple how the different factors contribute to the performance change,” she said.
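One way to make that idea concrete is to ask how much of a performance drop comes from a change in the patient mix versus a change in how features relate to outcomes. The sketch below is a simplified illustration of that general approach, not Feng's method: the new cohort is reweighted to resemble the old one before re-scoring, and every name is a placeholder.

```python
# Toy illustration (not Feng's method) of attributing an AUC change to a shift
# in patient mix versus a shift in the outcome-given-features relationship.
# `model`, `X_old`, `y_old`, `X_new`, `y_new` are assumed placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def decompose_auc_change(model, X_old, y_old, X_new, y_new):
    # Density-ratio weights that make the new cohort resemble the old one,
    # estimated with a classifier that distinguishes old from new patients.
    X_both = np.vstack([X_old, X_new])
    period = np.r_[np.zeros(len(X_old)), np.ones(len(X_new))]
    p_new = LogisticRegression(max_iter=1000).fit(X_both, period).predict_proba(X_new)[:, 1]
    weights = (1 - p_new) / np.clip(p_new, 1e-3, None)  # ~ p_old(x) / p_new(x)

    auc_old = roc_auc_score(y_old, model.predict_proba(X_old)[:, 1])
    auc_new = roc_auc_score(y_new, model.predict_proba(X_new)[:, 1])
    auc_new_oldmix = roc_auc_score(y_new, model.predict_proba(X_new)[:, 1],
                                   sample_weight=weights)
    return {
        "total_change": auc_new - auc_old,
        "due_to_patient_mix": auc_new - auc_new_oldmix,
        "due_to_outcome_relationship": auc_new_oldmix - auc_old,
    }
```

If reweighting closes most of the gap, the patient population simply looks different; if the drop persists after reweighting, the relationship the model learned has itself shifted.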
Feng’s lab will also build open-source software that researchers and hospitals can use to help guard against the decay of their machine-learning algorithms. Eventually, Feng hopes that hospitals will assemble quality improvement teams that monitor all clinical-support computer models in use, to ensure that the algorithms are reliable and to fix them if and when performance decays.
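Monitoring software of that kind might, at its simplest, boil down to a check run on a schedule: compare each registered model's recent performance to the baseline recorded when it was validated, and flag it for the quality improvement team when the drop exceeds a tolerance. The sketch below is purely hypothetical.

```python
# Hypothetical sketch of the kind of check such monitoring software might run.
from dataclasses import dataclass

@dataclass
class MonitoredModel:
    name: str
    baseline_auc: float    # AUC recorded when the model was validated
    tolerance: float = 0.05

def models_needing_review(registry, recent_auc):
    """`recent_auc` maps model name -> AUC on the most recent monitoring window."""
    return [m.name for m in registry
            if recent_auc.get(m.name, 0.0) < m.baseline_auc - m.tolerance]

# Example: a readmission model validated at AUC 0.78 that now scores 0.70
# would be flagged for review:
#   models_needing_review([MonitoredModel("readmit", 0.78)], {"readmit": 0.70})
```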
“As a machine learning developer, it can be very tempting to fix the algorithm whenever we observe a performance drop. I’m guilty of that myself. This project aims to take a step back and ask the question of why the performance decayed in the first place,” Feng said. “With an explanation of what’s going on, healthcare IT teams and model developers can decide which adjustments to make. Rather than fixing the model, they may decide it’s the data they need to fix.”
PCORI spurs medical research that directly benefits patients. One way it does that is by supporting an electronic health record (EHR) network, PCORnet, in which records from multiple systems are standardized. PCORnet allows researchers to analyze data from different hospitals and diverse patient populations. PCORI sees the potential of machine-learning algorithms to unlock information in EHRs that has thus far been difficult to process and integrate into research.
Feng’s project will focus on two clinical algorithms, both of which identify patients who may need additional help, rather than pointing doctors to one type of care over another. Feng will study how well SHIELD-RT, a model developed at Duke University to predict acute care needs for radiation therapy patients, works at UCSF. She will also seek to explain why a model that identifies UCSF patients at highest risk of hospital readmission, in order to provide them with additional medical and/or social support, performs differently at UCSF and at Zuckerberg San Francisco General hospitals.