JACC: Randomized controlled trials subject to 'intellectual gerrymandering'

Randomized controlled trials (RCTs), the gold standard for evidence-based management policies, have flaws that can lead to suboptimal practice recommendations, according to research published Feb. 2 in the Journal of the American College of Cardiology.

Lead authors Sanjay Kaul, MD, and George A. Diamond, MD, from the division of cardiology at the David Geffen School of Medicine at the University of California, Los Angeles, examined several major clinical trials and identified three key limitations that may cause trial results to be misconstrued.

Researchers found that evidence in clinical trials relies heavily on:

  1. Statistical significance rather than clinical or practical importance of treatment effects;
  2. Composite endpoints, which can increase the proportion of outcome events and reduce sample sizes, creating shortcomings; and
  3. Subgroup analyses, which are performed without attention to the subordinate analyses, leading to “the reporting of chance findings that encourage suboptimal patterns in practice.”

“One should be cautious that extremely large studies might be more likely to find a formally statistically significant difference for a trivial effect that is not really meaningfully different from the null,” the researchers wrote.
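The point about very large studies can be illustrated with a short sketch. The event counts below are hypothetical and are not taken from the paper or from any trial; the pooled two-proportion z-test is simply one common way to compare event rates, chosen here as an assumption. The same absolute difference (10.0 percent versus 9.8 percent) is tested at two sample sizes.

```python
import math

def two_proportion_p_value(events_a, events_b, n_per_arm):
    """Two-sided p-value from a pooled two-proportion z-test (equal arm sizes)."""
    rate_a = events_a / n_per_arm
    rate_b = events_b / n_per_arm
    pooled = (events_a + events_b) / (2 * n_per_arm)
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = (rate_a - rate_b) / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: 10.0% events in the control arm vs 9.8% on treatment.
p_small = two_proportion_p_value(100, 98, 1_000)              # 1,000 per arm
p_large = two_proportion_p_value(100_000, 98_000, 1_000_000)  # 1,000,000 per arm

print(f"n = 1,000 per arm:     p = {p_small:.3f}")
print(f"n = 1,000,000 per arm: p = {p_large:.2e}")
```

Under these assumptions the 0.2 percentage-point difference is nowhere near significant in the smaller trial but clears conventional thresholds easily in the larger one, even though its clinical size has not changed.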

According to the authors, typically P values and confidence intervals are used to assess the strength of the association of an intervention and an outcome. However, they found that “a P value or observed significance level provides a measure of the inconsistency of the data with respect to a specific hypothesis.”

A survey showed that composite endpoints were used in 37 percent of 1,231 published trials within seven years. The authors wrote these uses are thought to reduce costs and sample size, but statistical significance "does not place that reality into a meaningful clinical context" and does not represent whether this difference is large, small, trivial or important.

During a typical cardiovascular clinical trial, hard outcomes (death, Q-wave MI, stroke and CABG) and soft outcomes (reintervention, periprocedural MI and rehospitalization) are combined and analyzed.

Kaul and Diamond found: “These less important disparate outcomes often drive the effect of therapy on the composite.”

In a systematic review of 114 cardiovascular trials that used such composite endpoints, 40 percent of trials reported a "large gradient in the hierarchy of clinical importance of component events." Of the 27 trials that reported this difference, only seven were driven by hard outcomes.

The researchers wrote that these “approaches can be highly subjective, thereby lending themselves to intellectual gerrymandering.”

To reaffirm the use of composite endpoints as evidence in clinical trials, the authors recommend that investigators:

  • Justify the validity of individual components;
  • Avoid clinically unimportant or uncertain outcomes;
  • Avoid components unlikely to be impacted by therapy;
  • Avoid combining efficacy outcomes with safety outcomes;
  • Report primary composite endpoints and individual components separately, preferably both hierarchical and nonhierarchical counts;
  • Examine treatment-by-end-point interaction by a formal assessment of heterogeneity;
  • Weigh components prospectively relative to their clinical importance; and
  • Conduct and report sensitivity analyses relative to weight of the component driving the composite endpoint.
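The last two recommendations, weighting components and reporting sensitivity analyses, can be sketched in a few lines. Everything here is an illustrative assumption: the event counts, the component names, the weights and the helper functions are hypothetical and do not come from the paper. The sketch shows how much of a composite's treatment difference is attributable to a soft component as that component's weight is reduced.

```python
# Hypothetical event counts per arm (illustrative only, not from any trial).
N_PER_ARM = 1000
events = {
    "death":             {"treatment": 30,  "control": 31},
    "stroke":            {"treatment": 20,  "control": 20},
    "rehospitalization": {"treatment": 100, "control": 140},
}

def weighted_difference(events, weights, n=N_PER_ARM):
    """Control-minus-treatment difference in the weighted composite event rate."""
    return sum(
        weights[name] * (counts["control"] - counts["treatment"]) / n
        for name, counts in events.items()
    )

def share_of_difference(events, weights, name):
    """Fraction of the weighted composite difference contributed by one component."""
    total = sum(
        weights[k] * (c["control"] - c["treatment"]) for k, c in events.items()
    )
    part = weights[name] * (events[name]["control"] - events[name]["treatment"])
    return part / total

# Sensitivity analysis: vary the weight given to the soft component.
for soft_weight in (1.0, 0.5, 0.1, 0.0):
    weights = {"death": 1.0, "stroke": 1.0, "rehospitalization": soft_weight}
    diff = weighted_difference(events, weights)
    share = share_of_difference(events, weights, "rehospitalization")
    print(f"soft weight {soft_weight:.1f}: "
          f"composite difference {diff:.4f}, "
          f"rehospitalization share {share:.0%}")
```

With equal weights, nearly all of the apparent benefit in this made-up example comes from rehospitalization. Reporting each component separately, alongside a table like the one this loop prints, makes that dependence visible to readers.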

“Careful attention to these caveats,” the authors wrote, “is not only key for critical evaluation of the literature, but it also has implications for the care and treatment of patients, and for the development and implementation of practice guidelines and reimbursement policy.”

In an accompanying editorial, Gregg W. Stone, MD, of the Columbia University Medical Center and the Cardiovascular Research Foundation in New York City, said that randomized controlled clinical trials are often poorly designed, have inadequate quality control and are often subject to unsuitable interpretation.

He suggested that while Kaul and Diamond’s research is “well reasoned” and should be required reading for healthcare professionals involved in clinical trials, several key aspects were left out of the discussion. These factors include the comparison of multicenter trials and single-center trials, double-blind trials versus single-blind trials, superiority versus noninferiority trials and whether trial-enrolled patients are “generalizable to other