Before I get into the nitty-gritty, I want to remind everyone why we randomize in the first place: to reduce selection bias and to prevent systematic differences among groups, which allows us to make more precise and efficient causal inferences. Many critics claim that we can't make valid causal inferences if there's an imbalance in covariates between the groups.
Here’s an example. Say, hypothetically, we had two groups and wanted to see the effect of a statin on all-cause mortality compared to placebo. We randomized our participants to the two groups. Now, imagine our placebo group ended up with more smokers in it than the statin group. We might think this imbalance in the distribution of smoking puts us in trouble, because there were already substantial between-group differences before the trial even started.
So, we haven’t even started the study, and we’ve already gotten a significant result from our test of homogeneity (a significance test comparing baseline characteristics between the groups). Now we’re thinking, “oh no, I need to randomize again, or abandon ship, or fit this covariate in a generalized linear model like an ANCOVA.”
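To see why that baseline test is uninformative, here’s a quick simulation (all numbers and function names are hypothetical, and this is a sketch, not a recommended analysis). Under pure randomization, baseline p-values are uniform, so a “significant” imbalance in smoking turns up about 5% of the time purely by chance; the test tells you nothing about whether the randomization “worked.”

```python
# Simulate many trials: 200 participants, 80 of them smokers,
# randomized half-and-half, then a baseline "balance" test on smoking.
import math
import random

def two_prop_z_pvalue(x1, n1, x2, n2):
    """Two-sided two-proportion z-test p-value (normal approximation)."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def simulate(n_trials=5000, n=200, n_smokers=80, seed=42):
    """Fraction of randomized trials flagged as 'imbalanced' at baseline."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_trials):
        # complete randomization: half the ids go to the statin arm
        ids = list(range(n))
        rng.shuffle(ids)
        arm_a = set(ids[: n // 2])
        # participants 0..n_smokers-1 are the smokers
        smokers_a = sum(1 for i in range(n_smokers) if i in arm_a)
        smokers_b = n_smokers - smokers_a
        p = two_prop_z_pvalue(smokers_a, n // 2, smokers_b, n - n // 2)
        if p < 0.05:
            significant += 1
    return significant / n_trials

print(simulate())  # roughly 0.05: the nominal rate, by construction
```

The point of the sketch: the only hypothesis this test can reject is “the groups were randomized,” which we already know to be true.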
Critics of RCTs will argue that because there’s also always the possibility of there being an imbalance of known or unknown covariates between groups, RCTs cannot make proper causal inferences, especially small RCTs that are “unable to distribute confounders effectively.”
Unfortunately, there are several problems with these beliefs and approaches.
Tests of homogeneity cannot tell you whether you’ve “completely randomized,” and they are also inappropriate according to the CONSORT statement.
It’s not always possible to rerandomize to achieve better balance in trials.
Larger trials do not necessarily produce better balance than smaller trials.
The balance of covariates has never been the goal of randomization.
Fitting covariates to a GLM after a test of significance is inappropriate.
RCTs were never designed to perfectly balance covariates, or even to produce very similar distributions of them. The purpose of randomization is to distribute hidden covariates, not perfectly, but efficiently.
Ronald Fisher’s recommendation, when he wrote Statistical Methods for Research Workers in 1925, was to block on known covariates while randomizing all the hidden ones, and to examine the effects of known covariates on both the between-group variance and the within-group variance (where everyone is getting the same treatment). The effect on the within-group variance gives you some idea of the impact of the covariate.
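A common trial-design descendant of this idea is stratified permuted-block randomization: force balance on a known covariate by randomizing within its levels, and leave everything hidden to chance. Here’s a sketch (the covariate name, block size, and arm labels are illustrative, not from any specific trial):

```python
# Permuted-block randomization within strata of a known covariate
# (smoking status here); hidden covariates are still randomized.
import random

def stratified_block_randomize(participants, stratum_key, block_size=4, seed=0):
    """Assign 'statin'/'placebo' in permuted blocks within each stratum.
    participants maps id -> dict of covariates."""
    assert block_size % 2 == 0, "block must split evenly between arms"
    rng = random.Random(seed)
    assignments = {}
    strata = {}
    for pid, covariates in participants.items():
        strata.setdefault(covariates[stratum_key], []).append(pid)
    for stratum_ids in strata.values():
        rng.shuffle(stratum_ids)
        for start in range(0, len(stratum_ids), block_size):
            block = stratum_ids[start:start + block_size]
            # each full block contains equal numbers of each arm, shuffled
            arms_in_block = ["statin", "placebo"] * (block_size // 2)
            rng.shuffle(arms_in_block)
            for pid, arm in zip(block, arms_in_block):
                assignments[pid] = arm
    return assignments

# toy data: 8 participants, the first 4 are smokers
people = {i: {"smoker": i < 4} for i in range(8)}
arms = stratified_block_randomize(people, "smoker")
smoker_statin = sum(1 for i in range(4) if arms[i] == "statin")
print(smoker_statin)  # exactly 2: smoking is balanced by construction
```

The design choice to note: smoking is balanced deterministically, while everything not listed as a stratum remains subject to randomization.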
Now, will imbalances of hidden covariates affect the point estimates? Of course, but they will also show up as larger standard errors and wider compatibility intervals, which should make you less sure about the results of your study. It does not, however, mean your study is unable to support causal inferences.
It’s also worth remembering that the standard analyses of clinical trials are robust in that they already account for these possible imbalances in both known and unknown covariates, much like the t-test is robust to violations of normality. When you have an imbalance in covariates, what do you get? Wide uncertainty intervals. And that’s okay; that’s what statistics is meant to do: allow you to gauge your uncertainty. Furthermore, if covariates were forced to be balanced between groups, the standard analyses used for RCTs would become inappropriate, which is why matched RCTs and ordinary RCTs are analyzed differently.
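That robustness claim can be checked with a small simulation (all effect sizes hypothetical). Below, smoking strongly affects the outcome and the analysis ignores it entirely, yet the ordinary unadjusted 95% interval still covers the true treatment effect about 95% of the time over repeated randomizations, because the standard error already prices in the possibility of chance imbalance:

```python
# Coverage of the unadjusted interval under a strong, ignored covariate.
import math
import random

TRUE_EFFECT = -1.0    # hypothetical: statin lowers the outcome by 1 unit
SMOKING_EFFECT = 2.0  # hypothetical: smoking worsens the outcome

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def coverage(n_trials=2000, n_per_arm=100, seed=7):
    """Fraction of unadjusted 95% intervals covering the true effect."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_trials):
        # each participant smokes with probability 0.4; randomization
        # leaves chance imbalances in how smokers split across arms
        smoke_t = [rng.random() < 0.4 for _ in range(n_per_arm)]
        smoke_c = [rng.random() < 0.4 for _ in range(n_per_arm)]
        y_t = [TRUE_EFFECT + SMOKING_EFFECT * s + rng.gauss(0, 1) for s in smoke_t]
        y_c = [SMOKING_EFFECT * s + rng.gauss(0, 1) for s in smoke_c]
        diff = mean(y_t) - mean(y_c)
        se = math.sqrt(sample_var(y_t) / n_per_arm + sample_var(y_c) / n_per_arm)
        if diff - 1.96 * se <= TRUE_EFFECT <= diff + 1.96 * se:
            covered += 1
    return covered / n_trials

print(coverage())  # close to 0.95, despite ignoring smoking entirely
```

Any single trial with a bad smoking split will have a point estimate pulled off target, but the interval is wide enough, on average, to own up to that.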
But again, this doesn’t mean we shouldn’t think about covariates. There are several valid methods to account for known covariates, such as minimization, fitting covariates into a GLM (decisions like this are made in advance, not based on tests of significance), or stratifying by that covariate. In fact, stratifying after the fact is known as post-stratification and is often used by trialists.
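Minimization deserves a concrete illustration. In the spirit of the Pocock–Simon approach, each new participant is assigned to whichever arm would leave the covariate counts least imbalanced, with a coin flip breaking ties. This is a simplified sketch (covariate names and the deterministic tie-free case are illustrative; real implementations typically assign the “minimizing” arm only with high probability):

```python
# Minimization: assign each arrival to the arm that minimizes
# total covariate imbalance; break exact ties at random.
import random

def minimize_assign(counts, covariates, arms=("statin", "placebo"), rng=random):
    """counts[arm][covariate] tracks how many participants with that
    covariate are already in each arm; updates counts and returns the arm."""
    def imbalance_if(arm):
        total = 0
        for cov in covariates:
            hypothetical = {a: counts[a].get(cov, 0) for a in arms}
            hypothetical[arm] += 1  # pretend we assigned here
            total += max(hypothetical.values()) - min(hypothetical.values())
        return total
    scores = {a: imbalance_if(a) for a in arms}
    best = min(scores.values())
    choice = rng.choice([a for a in arms if scores[a] == best])
    for cov in covariates:
        counts[choice][cov] = counts[choice].get(cov, 0) + 1
    return choice

# usage: a stream of 10 smokers always ends up split 5/5,
# because every imbalance is corrected on the next assignment
rng = random.Random(1)
counts = {"statin": {}, "placebo": {}}
for _ in range(10):
    minimize_assign(counts, ["smoker"], rng=rng)
print(counts)  # {'statin': {'smoker': 5}, 'placebo': {'smoker': 5}}
```

Note that minimization, like matching, changes what the appropriate analysis is; the decision to use it belongs in the protocol, not after peeking at the data.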
So, in conclusion, you don’t achieve causality from a perfect balance of covariates in RCTs, and methodology is a lot more complicated than we often assume.
I’d also like to credit Stephen Senn’s work, which shaped most of this piece.