A study published in The Lancet Digital Health led by UNC School of Medicine’s Emily R. Pfaff, PhD, shows how the National COVID Cohort Collaborative used XGBoost machine learning models to better define long COVID and identify potential long-COVID patients with a high degree of accuracy.
Newswise – CHAPEL HILL, NC – Clinical scientists used machine learning (ML) models to explore de-identified electronic health record (EHR) data in the National COVID Cohort Collaborative (N3C), a National Institutes of Health-funded national clinical database, to discern the characteristics of people with long COVID and the factors in their medical records that may help identify such patients.
The findings, published in The Lancet Digital Health, have the potential to improve clinical research on long COVID and inform a more standardized care regimen for the condition.
“Characterizing, diagnosing, treating and caring for long-COVID patients has proven to be a challenge due to the list of characteristic symptoms continuously evolving over time,” said first author Emily R. Pfaff, PhD, assistant professor in the Division of Endocrinology and Metabolism at the UNC School of Medicine. “We needed to gain a better understanding of the complexities of long COVID, and for that it made sense to take advantage of modern data analysis tools and a unique big data resource like N3C, where many features of long COVID are represented.”