Machine learning with echocardiographic, EHR data improves survival predictions

Machine learning models using echocardiographic data and variables from the electronic health record (EHR) can significantly improve mortality predictions compared to traditional risk scores, researchers reported in JACC: Cardiovascular Imaging.

Manar D. Samad, PhD, and colleagues from Geisinger Health in Danville, Pennsylvania, compared a machine learning method to validated tools such as the Framingham Risk Score in 171,510 patients who underwent more than 330,000 echocardiograms.

Echocardiograms have traditionally been used to guide treatment decisions when considered alongside clinical variables, the authors noted. But an overwhelming supply of data is giving rise to the idea that machine learning algorithms may be better equipped than humans to process the results.

“There are more than 500 measurements derived from echocardiography at our institution, each echocardiogram consists of thousands of images, and there are 76 … high-level diagnostic codes that fall into the category of ‘diseases of the circulatory system,’” Samad and coauthors wrote. “With this amount of data and the limited time available to physicians for interpretation, it is highly likely that the full potential of echocardiographic data is not being realized in current clinical practice.”

Baseline clinical risk scores achieved an area under the curve (AUC) ranging from 0.61 to 0.79 in predicting five-year mortality. A nonlinear random forest model derived from machine learning improved the AUC to 0.89 when factoring in clinical variables, physician-reported left ventricular ejection fraction (LVEF) and echocardiographic measurements. The AUC for one-year mortality was 0.85 using those same factors.
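For readers less familiar with the metric, the AUC can be read as the probability that a model assigns a higher risk score to a randomly chosen patient who died than to a randomly chosen patient who survived. The short Python sketch below illustrates that definition only. The function name `auc` and the toy scores are hypothetical, made up for illustration; they are not data or code from the study.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (died, survived) pairs in which the patient
    who died received the higher risk score. Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # patients who died
    neg = [s for s, y in zip(scores, labels) if y == 0]  # patients who survived
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = died, 0 = survived.
died = [1, 1, 1, 0, 0, 0, 0, 0]
weak_score = [0.6, 0.3, 0.5, 0.4, 0.7, 0.2, 0.1, 0.35]
strong_score = [0.9, 0.6, 0.8, 0.4, 0.7, 0.2, 0.1, 0.3]

print(f"weaker score AUC:   {auc(weak_score, died):.2f}")    # 0.67
print(f"stronger score AUC: {auc(strong_score, died):.2f}")  # 0.93
```

An AUC of 0.5 corresponds to ranking patients no better than chance, and 1.0 to ranking every patient who died above every survivor, which is why a move from the 0.61 to 0.79 range up to 0.89 is a meaningful gain.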

“Machine learning models have far superior accuracy to predict survival after echocardiography compared with these standard clinical approaches, which is in line with previous studies,” Samad et al. wrote. “In the past, these clinical risk scoring systems were used out of simplicity, when the data were not readily available or easily automated as inputs into large models. However, with improved information technology systems and computational power available in healthcare, more complicated and accurate models, such as the one proposed in the present study, can be implemented in many health systems and may soon be ubiquitous.”

Age and tricuspid regurgitation jet maximum velocity, a measure of pulmonary systolic pressure, were the two most important variables for predicting survival. Next came heart rate, LDL cholesterol and physician-reported LVEF. Overall, five clinical and five echocardiographic variables were in the top 10 for one-year survival predictions, while six echocardiographic and four clinical variables were the most important for the five-year estimates.

“Two other measures of pulmonary artery systolic pressure, the pulmonary artery acceleration time and slope, were also within the 10 most important variables for predicting survival,” the authors noted. “These high rankings suggest that measures of pulmonary systolic pressure derived from echocardiography may in fact be more important than previously recognized.”

Samad and colleagues pointed out that only the top 10 variables were required to reach 96 percent of the maximum prediction accuracy, further cementing the importance of the echocardiography-derived data.

They acknowledged their study simply shows the predictive value of machine learning in this setting. More research is needed to determine whether these methods can help risk-stratify patients or lead to improved care decisions and better outcomes.