ACC: Comparing 30-day mortality rates between hospitals "wrongheaded"
Creating hospital-specific outcome measures is often challenging, and comparing rates between hospitals in the process is difficult and often “wrongheaded,” particularly when assessing 30-day mortality, said Sharon-Lise T. Normand, PhD, of Harvard Medical School in Boston, at the American College of Cardiology (ACC) scientific sessions in Atlanta on March 15.

“There are many ways to assess or measure quality,” said Normand. These include process measures or structure measures, “but what we are actually talking about today are outcome measures,” she said.

Hospital quality of care is currently measured using all-cause 30-day mortality rates. The Centers for Medicare & Medicaid Services (CMS) selected this quality measure because the results are unambiguous, she said. “If you’re dead, you’re dead.”

Risk-adjusted mortality rates should look at the “differences between what happened at your hospital relative to what was expected to happen,” rather than be used to compare data at separate facilities, Normand noted.

The only feasible way to “compare hospital 1 to hospital 2” would be if their case mixes were exactly the same. “It’s very difficult to compare mortality between pairs of hospitals,” she said. “If we were to do that, we would have to match up each pair of hospitals and make sure the risks looked exactly the same.”

Measuring a hospital's performance against what would be expected to occur, given the case mix that facility actually sees, makes sense, Normand said. Case mix differs among facilities depending on the patient populations they treat. For example, community hospitals and primary PCI hospitals each see different kinds of patients with different illnesses and diseases.

“There is no reason to believe that a case mix would be close enough or similar enough to do a pair-wise comparison,” she explained.

She said that it is also difficult to calculate risk-standardized mortality rates when assessing facilities that have small volumes versus those with large volumes.

“Some hospitals might have 10 patients that they treat in a three-year period and some hospitals may have 500 patients,” she said. “The problem is that there will always be some sort of between-hospital variation.”

To account for this variation while measuring overall hospital care, CMS models between-hospital standard deviations. She explained that if the between-hospital standard deviation is 0.5, the odds ratio is 7.10, which tells us that “the odds of dying in a hospital one standard deviation above the average is seven times that of dying in a hospital one standard deviation below the national average.”

“An odds ratio of seven is pretty large,” said Normand, “and there does seem to be a lot of between-hospital variation after we adjust for patient risk and patient severity.”
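The quoted figure can be reproduced with a quick back-of-the-envelope calculation (an illustration, not from the talk): in a random-effects logistic model, the odds ratio between two hospitals whose random effects differ by δ on the log-odds scale is exp(δ), and with a between-hospital standard deviation of 0.5, hospitals at the 2.5th and 97.5th percentiles of the distribution sit 2 × 1.96 standard deviations apart.

```python
import math

# Between-hospital standard deviation on the log-odds scale (from the talk).
sigma = 0.5

# The odds ratio between two hospitals whose random effects differ by
# delta on the log-odds scale is exp(delta). Hospitals at the 2.5th and
# 97.5th percentiles of a normal random effect are 2 * 1.96 SDs apart.
delta = 2 * 1.96 * sigma
odds_ratio = math.exp(delta)
print(round(odds_ratio, 2))  # ~7.1, matching the figure quoted in the talk
```

Under this reading, the factor of seven describes the spread between near-extreme hospitals rather than a typical pairwise difference, which is consistent with Normand's point that the between-hospital variation is large.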

According to CMS, the heart failure (HF) 30-day mortality model adjusts for these risks. It accounts for demographic variables, including age and sex, and cardiovascular history variables, such as previous percutaneous transluminal coronary angioplasty, CABG and HF. Additionally, she said, it adjusts for comorbidities, including stroke, pneumonia, cancer and diabetes, but not for complications.

“It’s important that those particular comorbidities are really comorbidities and not complications,” she said. “Although I could probably explain a lot more variability by including complications, that would be inappropriate because then I would be adjusting for the quality of care that was rendered in any particular hospital.”

During the presentation, Normand referred to 2001 data for HF admissions. She looked at 5,000 hospitals treating 400,000 patients to evaluate annual HF volume. The data showed that one-quarter of hospitals had very small sample sizes, and the overall all-cause mortality rate was 12 percent.

“For small hospitals, with low volume of HF cases, the probability of actually observing nobody die is high,” she said. “If you had a hospital with five patients and the true mortality rate is 12 percent, we are very likely to observe no deaths at your institution.”

“That relationship tells you that you do not want to use the observed mortality rate divided by the expected mortality rate at the institution because you’d be wrong,” Normand noted. These extremes could lead to a faulty conclusion. “With a small volume, the likelihood of observing extremes—no deaths or everybody died at the hospital—is very high,” she said.

Even if a facility's true mortality rate were 12 percent, a calculation like the above could peg it as either a good performer because no deaths were observed, or an extremely bad performer because many were.
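The coarseness of the observed-over-expected ratio at small volumes can be illustrated numerically (a sketch using the article's 12 percent national rate; the five-patient hospital is the example from the talk):

```python
# Possible observed/expected (O/E) mortality ratios at a 5-patient hospital
# whose expected mortality is the national 12 percent rate.
n = 5
expected_deaths = n * 0.12  # 0.6 expected deaths

# Observed deaths can only take integer values, so the O/E ratio is
# extremely coarse: it can never land anywhere near 1.0.
oe_ratios = [deaths / expected_deaths for deaths in range(n + 1)]
print([round(r, 2) for r in oe_ratios])  # [0.0, 1.67, 3.33, 5.0, 6.67, 8.33]
```

Even a perfectly average hospital is forced to look like either a star (O/E of 0) or an outlier (O/E of 1.67 or worse), which is why the ratio misleads at small volumes.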

As hospital sample sizes increase, the probability of observing an extreme event falls, she said. A big hospital with 100 HF patients has a “very small” probability of observing no deaths. For that same hospital, then, actually observing no mortality in 100 cases of HF could be “strong evidence that it is a good performing hospital that is having lower than expected mortality,” she said.
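Both of Normand's probability claims follow from simple binomial arithmetic, assuming each patient independently carries the national 12 percent risk:

```python
# Probability of observing zero deaths when every patient independently
# has the national 12 percent 30-day mortality risk.
p = 0.12

def prob_no_deaths(n_patients: int) -> float:
    """P(0 deaths in n patients) = (1 - p) ** n."""
    return (1 - p) ** n_patients

print(prob_no_deaths(5))    # ~0.53: a 5-patient hospital likely sees no deaths
print(prob_no_deaths(100))  # ~2.8e-06: nearly impossible at 100 patients
```

The same zero-death record is thus uninformative for a five-patient hospital but strong evidence of better-than-expected care for a 100-patient one.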

She suggested the following steps be taken when facilities report 30-day mortality rates for any disease-based population:
  • Separate complications from comorbidities;
  • Separate within-hospital from between-hospital variation;
  • Validate models using various data sources (e.g., risk-adjustment software); and
  • Provide hospital-specific estimates that use "robust" sample sizes.
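The hierarchical models that separate within- from between-hospital variation are beyond the scope of this article, but their core effect, pulling small-volume hospitals' estimates toward the overall rate, can be sketched. The weighting below is a simple empirical-Bayes-style illustration with made-up parameters, not CMS's actual model:

```python
def shrunken_rate(deaths: int, n_patients: int,
                  national_rate: float = 0.12,
                  prior_strength: float = 50.0) -> float:
    """Blend a hospital's observed mortality rate with the national rate.

    prior_strength acts like a pseudo-sample size: small hospitals are
    pulled strongly toward the national rate, large ones barely move.
    (Illustrative values, not CMS parameters.)
    """
    weight = n_patients / (n_patients + prior_strength)
    observed_rate = deaths / n_patients
    return weight * observed_rate + (1 - weight) * national_rate

# A 5-patient hospital with zero deaths stays close to the 12 percent average:
print(round(shrunken_rate(0, 5), 3))    # ~0.109
# A 500-patient hospital with zero deaths is genuinely exceptional:
print(round(shrunken_rate(0, 500), 3))  # ~0.011
```

This is why a zero-death record moves a large hospital's estimate dramatically while leaving a tiny hospital's estimate near the national average, echoing the volume argument above.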

“Counting up the number of deaths and using that as a metric is really wrongheaded for several reasons but it will certainly get you into trouble if you have smaller sample sizes,” Normand concluded.