# Analytical Issues in That New Low-Carb/LDL Study

A randomized trial was recently published that investigated the effect of a low-carbohydrate diet on plasma low-density lipoprotein cholesterol (LDL-C) in young, healthy adults. The study was conducted in Norway between 2011 and the end of 2012. A total of 30 participants completed the study, each randomized to either a low-carbohydrate group (<20 grams/d) or a control group.

In short, the investigators found a difference between the groups:

“In the LCHF diet group (n=15), plasma LDL-C increased from (mean±SD) 2.2±0.4 mmol/l before intervention to 3.1±0.8 after, while in the control group (n=15), LDL-C remained unchanged: 2.5±0.8 mmol/l (p<0.001 between groups).

An LCHF diet for three weeks increased LDL-C with 44% versus controls. The individual response on LCHF varied profoundly.”

I don’t find any of these results even slightly surprising. However, several aspects of the trial concerned me.

First, I found the preregistration of the trial to be incredibly bizarre. The study was registered here and updated during recruitment, around 2012.

It’s not very thorough for a preregistration (no analysis plan is mentioned, for example), but that is to be expected given that it was registered and updated around 2011-2012, long before registrations became as comprehensive as they have been in the past few years.

This isn’t what concerned me. What did concern me was their chosen primary outcome, which was listed on their registration as:

“Percent change in mean LDL-cholesterol after intervention with low carbohydrate diet from baseline to study end.”

This is baffling to me. They’re running a randomized controlled trial, yet they’re indicating that the primary outcome is a within-group analysis looking at how LDL changes in the low-carbohydrate group over time.

These types of analyses can be pretty misleading because of things like nonspecific effects/interactions, regression to the mean, and the placebo effect. The latter two wouldn’t apply as much here because the study was done in healthy individuals and the primary outcome is a biomarker. However, I still don’t understand what the point of an RCT is if you’re not going to focus on the treatment contrasts, that is, the difference in final LDL-C between groups or the difference between groups in the change in LDL-C over time.

Given that the study was done in Norway, and began in 2011-2012, I’m hoping that it was just a fluke in registration wording, because otherwise, the investigators didn’t seem to understand at the time what the point of a randomized trial is.
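To make the worry about within-group outcomes concrete, here is a small simulation sketch in Python. Everything in it is made up for illustration: the `drift`, `effect`, sample size, and standard deviations are my assumptions, not values from the trial. Both arms share a nonspecific shift over time, and only the treated arm gets an additional diet effect.

```python
import random
import statistics


def simulate_trial(n_per_group=2000, drift=0.3, effect=0.6, seed=1):
    """Simulate a two-arm trial with a shared nonspecific drift.

    All parameters are hypothetical. `drift` is a shift that hits both
    arms (a stand-in for nonspecific effects), `effect` is the extra
    shift that only the treated arm receives.
    """
    rng = random.Random(seed)

    def changes_for_group(treated):
        changes = []
        for _ in range(n_per_group):
            baseline = rng.gauss(2.4, 0.6)  # assumed baseline LDL-C, mmol/l
            noise = rng.gauss(0.0, 0.4)     # assumed measurement noise
            final = baseline + drift + (effect if treated else 0.0) + noise
            changes.append(final - baseline)
        return changes

    treated_changes = changes_for_group(True)
    control_changes = changes_for_group(False)

    # Within-group estimate: mean change from baseline in the treated arm.
    within = statistics.mean(treated_changes)
    # Between-group estimate: difference in mean change between the arms.
    between = within - statistics.mean(control_changes)
    return within, between


within, between = simulate_trial()
print(f"within-group change (treated arm): {within:.2f}")
print(f"between-group difference in change: {between:.2f}")
```

Under these assumed numbers, the within-group change in the treated arm absorbs the shared drift on top of the diet effect, while the between-group difference in changes isolates the diet effect. That is the whole reason to randomize people to a control arm in the first place.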

Second, the investigators did end up doing between-group analyses for several of the variables (good on them, they’re using their control group), along with several within-group analyses. I still don’t know why they would run the latter, and they discuss these changes from baseline extensively in the conclusions, which is again misleading.

If the authors cared so much about baseline measures and changes from them over time, they should have either compared changes over time (change scores) between groups, which works well when the baseline and final values are highly correlated, or included the baseline value as a covariate in an ANCOVA to adjust for it and give a less biased estimate.

It would’ve been easy to construct such a model as:

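$y_{\text{final}} = \beta_{0} + \beta_{\text{group}} \, \text{group} + \beta_{\text{baseline}} \, y_{\text{baseline}} + \varepsilon$

Here $y_{\text{final}}$ is LDL-C at the end of the study, $\text{group}$ is an indicator (1 for LCHF, 0 for control), $y_{\text{baseline}}$ is LDL-C before the intervention, and $\beta_{\text{group}}$ is the baseline-adjusted difference between groups.

Yet, instead of doing this, they conducted several within-group analyses, which, again, are confounded by the very issues we use a control group to address. That is easily misleading.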
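For anyone who wants to see what such an ANCOVA looks like in practice, here is a minimal sketch in plain Python. It is not the authors' analysis and it does not use their data: the simulated values, the true coefficients (0.5, 0.6, 0.8), and the helper names `solve_linear_system` and `ancova` are all my own assumptions for illustration. It fits final LDL-C on an intercept, a group indicator, and baseline LDL-C by ordinary least squares.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ancova(final, group, baseline):
    """OLS fit of final ~ intercept + group + baseline.

    Returns [intercept, group_effect, baseline_slope], obtained by
    solving the normal equations (X'X) beta = X'y.
    """
    rows = [[1.0, g, b] for g, b in zip(group, baseline)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, final)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: 500 people per arm, assumed true group effect of 0.6.
rng = random.Random(42)
group = [1.0] * 500 + [0.0] * 500
baseline = [rng.gauss(2.4, 0.6) for _ in group]
final = [0.5 + 0.6 * g + 0.8 * b + rng.gauss(0.0, 0.3)
         for g, b in zip(group, baseline)]

intercept, group_effect, baseline_slope = ancova(final, group, baseline)
print(f"baseline-adjusted group effect: {group_effect:.2f}")
print(f"baseline slope: {baseline_slope:.2f}")
```

The coefficient on the group indicator is the quantity of interest: the difference between arms in final LDL-C after adjusting for where each person started. In real work you would reach for a statistics package that also reports standard errors and confidence intervals; the hand-rolled solver here only keeps the sketch free of dependencies.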

Here’s a graph they produced (I remade it in R and tried to recover similar numbers using WebPlotDigitizer) showing the change in LDL-C in individuals.

Authors’ notes on the graph: "Each bar represents one subject’s percent change in plasma lipids, as indicated by the text in each panel. The fasting plasma levels at baseline were compared to fasting plasma levels after using LCHF for 3 weeks expressed as percent change. Subjects were ranked in order after the percent increase in LDL-C, from least change to the left side in the panel and the largest change to the right side of the panel. All subjects were ranked in the same order according to the percent increase of LDL-C to visualize the association of individual change in LDL-C and changes in other lipid parameters."

All this graph shows is that people’s LDL-C changed from baseline. It doesn’t display the differences in means between groups at the end, or the differences in change scores between groups. This is not a very useful data visualization.

Third and finally, there are a large number of comparisons in this trial. As I’ve discussed before, the family-wise error rate skyrockets when you run so many statistical tests, and without any procedure to account for this, you’re going to run into many false positives within the Neyman-Pearson framework. Here’s a table of all the variables they measured with between-group analyses and within-group analyses.

Unfortunately, the lack of a solid preregistration with a clear analysis plan and the lack of correction for multiple comparisons concern me here.
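To put rough numbers on how quickly the family-wise error rate grows, here is a short Python sketch. It assumes the tests are independent, which is a simplification (lipid measures are correlated), and the test counts and p-values are hypothetical, not taken from the paper. The `holm_adjust` helper is my own illustration of the Holm step-down procedure, one common way to control the family-wise error rate.

```python
def family_wise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests,
    when every null hypothesis is true and each test uses level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, ...
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


for n_tests in (1, 5, 20, 50):  # hypothetical numbers of comparisons
    print(n_tests, round(family_wise_error_rate(n_tests), 3))

# Hypothetical raw p-values from three comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

With 20 independent tests at an alpha of 0.05, the chance of at least one false positive is already about 64%. A result with a raw p-value of 0.04 can look far less convincing once it has been adjusted for the other comparisons made alongside it.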

Luckily, they did acknowledge that their gene-expression studies, which included several more statistical comparisons, were not corrected for multiple comparisons and should be interpreted with care.

“However, our gene expression studies are hypothesis-driven and not correction for multiple testing and should thus be interpreted with care.”

That’s great. I just wish they had done that for all of the other variables shown above.

Overall, the outcomes and conclusions of the study are not surprising to me. There are a myriad of studies showing similar outcomes. However, I think we need to be more aware of issues in the design and analysis of studies, even if the results support our beliefs.
