An EHR-derived model more accurately estimated cerebrovascular and cardiovascular mortality than did the Framingham Risk Score (FRS), highlighting the potential to use EHRs for clinical and policy purposes. The method was published as a feasibility study in the March issue of Medical Care.
Jeremy B. Sussman, MD, MSc, of the Veterans Affairs Center for Clinical Management Research in Ann Arbor, Mich., and colleagues observed several trends that favor the use of EHRs in predictive models: current prediction methods rely on resource-intensive chart reviews and patient interviews; large amounts of data now exist in EHRs; and the use of EHRs is growing, thanks to federal incentives. At the same time, state-of-the-art methods may be applied to nudge up predictive performance.
“Clinical risk prediction also could benefit from concurrent progress in the development of flexible and adaptive (or ‘nonparametric’) regression and machine learning methods, many of which are particularly well suited for large datasets with many variables,” Sussman et al wrote. “For example, ‘ensemble methods’ (which are also known as ‘metaclassifiers,’ and include random forests and boosting) work by incorporating predictions from a large number of small, simple models, and would have been computationally infeasible until relatively recently.”
To compare the performance of EHR data with that of traditional risk predictors, the researchers designed a retrospective cohort study with data from patients at 12 Veterans Health Administration (VHA) facilities from 2003 to 2007. The patients were all 18 years old or older with at least two clinic visits in 2003. Patients with a known cerebrovascular or cardiovascular diagnosis or event in 2003 were excluded. Data were obtained through a variety of VHA databases and resources. The outcome measure was cerebrovascular- or cardiovascular-related death within five years.
For comparative analyses, they examined the FRS versus internally developed EHR-derived models; parametric versus nonparametric regression methodology; and traditional risk predictors versus additional risk predictors such as hypertensive medications and comorbid conditions available in the VHA EHR. For their evaluation, they assessed the methods’ discrimination and calibration as well as changes in estimated risk after reclassification.
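The two evaluation criteria named above can be illustrated with a minimal sketch. Discrimination is shown here as a c-statistic (the share of event/non-event pairs in which the patient with the event received the higher predicted risk), and calibration as a simple comparison of expected versus observed deaths. The function names and the six-patient toy dataset are hypothetical and chosen only for illustration; they are not the study's code or data, and the article does not say which specific statistics the authors used.

```python
def c_statistic(risks, events):
    """Fraction of (event, non-event) pairs ranked correctly; ties count as half."""
    with_event = [r for r, e in zip(risks, events) if e]
    without_event = [r for r, e in zip(risks, events) if not e]
    concordant = 0.0
    for p in with_event:
        for n in without_event:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


def expected_vs_observed(risks, events):
    """Calibration-in-the-large: sum of predicted risks vs. count of actual events."""
    return sum(risks), sum(events)


# Hypothetical toy data: predicted five-year risk and whether the patient died.
toy_risks = [0.02, 0.04, 0.10, 0.30, 0.05, 0.20]
toy_events = [0, 0, 1, 1, 0, 0]

c_stat = c_statistic(toy_risks, toy_events)
expected, observed = expected_vs_observed(toy_risks, toy_events)
print(f"c-statistic: {c_stat:.3f}")
print(f"expected deaths: {expected:.2f}, observed deaths: {observed}")
```

A model that predicts far fewer deaths than actually occur is poorly calibrated, even if it ranks patients well.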
They identified 113,973 patients, of whom 4.4 percent died from cerebrovascular- or cardiovascular-related causes within five years. Patients who died were, on average, 10.1 years older in 2003 and were more likely to have diabetes or be prescribed hypertension or diabetes medications. Results showed the FRS predicted 3,031 cerebrovascular- and cardiovascular-related deaths, 39.9 percent fewer than the actual number of deaths, and was particularly weak at predicting low-risk patients.
Reclassification analyses favored internally developed models over the FRS, and the flexible and adaptive methods bested logistic regression. Compared with the FRS, boosting (one of the metaclassifiers) showed the largest net reclassification improvement, reclassifying 61.7 percent of patients with an event into a higher risk group and 8.3 percent of patients without an event into a lower risk group.
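For readers unfamiliar with the metric, a categorical net reclassification improvement can be sketched as follows. This uses the standard definition (net upward moves among patients with an event, plus net downward moves among patients without one). The function name, the risk-category coding, and the ten-patient toy dataset are hypothetical; the article does not give the underlying counts or say whether the reported percentages are gross or net, so the study's figures are not reproduced here.

```python
def net_reclassification_improvement(old_cat, new_cat, event):
    """Categorical NRI. Categories are ordered integers (higher = higher risk).

    Returns (event_nri, nonevent_nri, total_nri).
    """
    up_e = down_e = n_e = 0  # moves among patients with an event
    up_n = down_n = n_n = 0  # moves among patients without an event
    for old, new, had_event in zip(old_cat, new_cat, event):
        if had_event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    event_nri = (up_e - down_e) / n_e        # moving up is correct for events
    nonevent_nri = (down_n - up_n) / n_n     # moving down is correct for non-events
    return event_nri, nonevent_nri, event_nri + nonevent_nri


# Hypothetical toy data: risk category under the old and new model, and outcome.
old_model = [0, 0, 1, 1, 2, 0, 1, 2, 2, 1]
new_model = [1, 0, 2, 1, 2, 0, 0, 1, 2, 2]
had_event = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

event_nri, nonevent_nri, total_nri = net_reclassification_improvement(
    old_model, new_model, had_event
)
print(f"event NRI: {event_nri:.3f}")
print(f"non-event NRI: {nonevent_nri:.3f}")
print(f"total NRI: {total_nri:.3f}")
```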
Sussman et al put that in a clinical context, using as an example patients in their dataset with a risk of 5 percent or greater taking aspirin as a preventive. “Using boosting, 12,568 people would be treated who would not have been treated under the FRS, and 5,767 people would not be treated who would have been treated under the FRS,” they wrote. “The boosting treatment regime would be more accurate, as 1,366 of the 12,568 people (10.9 percent) who would have been treated only when using boosting went on to die of CCV [cerebrovascular- or cardiovascular]-related causes, compared with only 203 of the 5,767 (3.5 percent) who would have been treated only when using the FRS.”
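The arithmetic behind that comparison can be checked in a few lines. The four counts come from the quotation above; the variable names and the derived net change in patients treated are added here for illustration only.

```python
# Counts quoted from the study (5 percent risk threshold for aspirin).
boosting_only_treated = 12_568  # treated under boosting but not under the FRS
boosting_only_deaths = 1_366    # of those, died of CCV-related causes
frs_only_treated = 5_767        # treated under the FRS but not under boosting
frs_only_deaths = 203           # of those, died of CCV-related causes

boosting_only_rate = boosting_only_deaths / boosting_only_treated
frs_only_rate = frs_only_deaths / frs_only_treated

# Derived for illustration: net change in the number of people treated
# if boosting replaced the FRS at the same threshold.
net_additional_treated = boosting_only_treated - frs_only_treated

print(f"boosting-only death rate: {boosting_only_rate:.1%}")
print(f"FRS-only death rate: {frs_only_rate:.1%}")
print(f"net additional patients treated: {net_additional_treated}")
```

The patients treated only under boosting died at roughly three times the rate of those treated only under the FRS, which is the basis for calling the boosting regime more accurate.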
They suggested that performance and accuracy could improve further with more risk predictors (even if EHR data are imperfect), larger datasets with more clinical data, and information on changes over time, while automation may facilitate implementation.
“Our study shows that, once the data exist, developing good risk scores is within the scope of most large health systems,” they concluded, adding that the methods they applied are no more difficult than traditional methods and can be applied using freely available software. “The greatest opportunity for future work, though, is in clinical