NEJM: Cost-profiling tools misclassify physician performance


Cost-profiling tools produce misleading results and often misclassify physician performance, according to a RAND Corporation report published March 18 in the New England Journal of Medicine.

“Health plans are limiting the number of physicians who receive in-network contracts, offering patients differential co-payments to encourage them to visit so-called high-performance physicians (i.e., those providing higher-quality, lower-cost services), paying bonuses to physicians whose patterns of resource use are lower than average, and publicly reporting the relative costs of physicians’ services,” the authors noted.

John L. Adams, PhD, from RAND, and colleagues examined how accurately cost-profiling tools distinguished higher-cost physicians from lower-cost physicians.

According to the authors, the reliability of a cost profile is determined by three factors:

  • The number of observations (episodes of care);
  • Variations among physicians in their use of resources to manage similar episodes; and
  • Random variations in scores.
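The three factors above map onto the standard signal-to-noise definition of measurement reliability. As an illustration (this formula is a common convention in profiling research, not quoted from the article), a physician's profile reliability can be sketched as the variance between physicians divided by that variance plus the random, within-physician variance shrunk by the number of observed episodes:

```python
def profile_reliability(between_var: float, within_var: float, n_episodes: int) -> float:
    """Illustrative signal-to-(signal + noise) reliability of a cost profile.

    between_var: variance in true resource use across physicians (factor 2)
    within_var:  random episode-to-episode variance for one physician (factor 3)
    n_episodes:  number of observed episodes of care (factor 1)
    """
    return between_var / (between_var + within_var / n_episodes)

# More episodes shrink the noise term, raising reliability:
print(round(profile_reliability(1.0, 4.0, 2), 2))   # few episodes  -> 0.33
print(round(profile_reliability(1.0, 4.0, 50), 2))  # many episodes -> 0.93
```

Under this sketch, a physician with few episodes or highly variable episode costs ends up with a low-reliability score, which is the mechanism behind the misclassification the study measures.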

During the study, the researchers analyzed claims data (professional, facility, pharmaceutical and ancillary) from four insurance companies in Massachusetts for 2004 and 2005. The data encompassed 2.8 million people, or 44 percent of the state's residents.

All patients in the cohort were between 18 and 65 years old, were continuously enrolled in a plan for two years and filed at least one claim (1.1 million patients). Additionally, the study included 12,789 physicians who provided direct patient care, contracted with one or more of the participating plans, were not in the pediatric or geriatric specialties and filed at least one claim.

During the study, the researchers:

  • Grouped service claims (office visits, lab tests, medications, etc.) into 600 types of "episodes" (diabetes treatment, heart attack, urinary tract infection, etc.);
  • Determined the costs of each episode by calculating the mean allowed charge across the four health plans for each type of service in an episode;
  • Assigned each episode to the physician who accounted for the highest proportion of its total professional costs, provided that physician billed at least 30 percent of those costs; and
  • Constructed physician summary cost profiles, the average cost of the episodes assigned to each physician across 10 specialties, adjusted using patient-specific risk scores.

Researchers studied misclassifications by measuring the probability that cost performances of randomly selected physicians in a certain specialty would be inaccurately categorized.
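That probability can be made concrete with a small simulation (all numbers below are illustrative assumptions, not figures from the study): draw noisy observed cost scores around each physician's true score and count how often the observed score lands on the wrong side of the high-cost/low-cost cutoff.

```python
import random

def misclassification_rate(true_scores, noise_sd, cutoff, trials=10_000):
    """Fraction of simulated profiles that fall on the opposite side of the
    cutoff from the physician's true cost score (hypothetical sketch)."""
    wrong = total = 0
    for true in true_scores:
        for _ in range(trials):
            observed = true + random.gauss(0.0, noise_sd)  # random score noise
            if (observed >= cutoff) != (true >= cutoff):
                wrong += 1
            total += 1
    return wrong / total

random.seed(0)
# Illustrative true cost ratios near 1.0; scores >= 1.0 are flagged "higher cost".
rate = misclassification_rate([0.8, 0.95, 1.05, 1.2], noise_sd=0.2, cutoff=1.0)
print(f"simulated misclassification rate: {rate:.0%}")
```

Physicians whose true scores sit close to the cutoff are misclassified most often, which is why specialties with noisier, lower-reliability profiles (such as vascular surgery in the study) fare worst.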

Primary care physicians represented 32 percent of the cohort, were assigned 46 percent of episodes and accounted for 23 percent of the costs.

Data were split into 10 specialties: cardiology; endocrinology; family or general practice; gastroenterology; internal medicine; obstetrics-gynecology; orthopedic surgery; otolaryngology; pulmonary and critical care; and vascular surgery.

Researchers found that “overall, the majority of physicians did not have cost profiles that met common thresholds of reliability.” Two-thirds of vascular surgeons were classified inaccurately as lower cost.

The average misclassification rate was 22 percent across all specialties, while these same rates were 16 percent for gastroenterology and otolaryngology and 36 percent for vascular surgery.

Twenty-nine percent of physicians in otolaryngology and 67 percent of physicians in vascular surgery were classified as lower cost but in fact were not. Among physicians who actually were lower cost but were not classified as such, rates ranged from 10 percent in obstetrics-gynecology to 22 percent in vascular surgery and internal medicine.

In addition, mean reliabilities ranged from 0.05 for vascular surgery to 0.79 for gastroenterology and otolaryngology. Fifty-nine percent of physicians had cost-profile scores with reliabilities below 0.70, the level the authors cited as a common threshold for acceptable reliability.

“The rates of misclassification for the one illustrative application that we examined were large enough to be cause for concern,” the authors wrote. They suggested that future efforts include better measures of cost performance, which they concluded would be “the most promising avenue for further work."

They concluded: “These findings bring into question both the utility of cost-profiling tools for high-stakes uses, such as tiered health