When Can We Say That Something Doesn’t Work?

People don’t want to waste their time on things that don’t work. To avoid doing so, many will want to assess the scientific evidence. They may first look at the basic science (if the question can be studied at that level) and ask, “Does this have a clear molecular or biological mechanism?” or “Does it have a theoretical foundation?” Next, they may look at the human evidence (if there is any) and ask whether it worked in clinical trials or epidemiological data. Read More

Book Review: Fisher, Neyman, and the Creation of Classical Statistics

Erich Lehmann’s last book, published after his death, is a history of classical statistics and its creators. Specifically, it covers how his mentor Jerzy Neyman and Neyman’s adversary Ronald Fisher laid the foundations for the methods used today in several fields. This post is a general review/summary of the book, which I recommend to anyone and everyone interested in statistics and science. Read More

P-Values Are Tough And S-Values Can Help

The P-value doesn’t have many fans. There are those who don’t understand it, often treating it as a measure it’s not, whether that’s a posterior probability, the probability of getting results due to chance alone, or some other bizarre/incorrect interpretation. [1–3] Then there are those who dislike it for reasons such as believing that the concept is too difficult to understand or because they see it as a noisy statistic that provides something we’re not interested in. Read More
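
An S-value simply re-expresses a p-value as Shannon information, s = -log2(p), measured in bits of refutational evidence against the test model. Here is a minimal Python sketch (the p-values below are purely illustrative):

```python
import math

def s_value(p: float) -> float:
    """Convert a p-value to an S-value (Shannon surprisal, in bits)."""
    return -math.log2(p)

# p = 0.05 carries about 4.3 bits of information against the test model,
# roughly as surprising as getting 4 heads in a row from a fair coin.
for p in (0.05, 0.005, 0.25):
    print(f"p = {p:>5} -> s = {s_value(p):.2f} bits")
```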

We May Not Understand Control Groups

It’s well known that randomized trials are among the most efficient ways to make causal inferences and to determine how much an intervention differs from a comparator (some form of control). Random assignment helps achieve these goals by minimizing selection bias and making the distribution of prognostic factors between groups random (not balanced). [1] Discussions like the one above, praising the efficiency of randomized trials, are widespread; however, few of them take a close look at some of the common assumptions individuals hold regarding randomized trials. Read More
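
A minimal simulation sketch of what “random, not balanced” means in practice (the covariate, sample size, and seed are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200  # patients in one hypothetical trial

# One prognostic covariate (say, age) and a random 1:1 assignment.
age = rng.normal(60, 10, size=n)
treated = rng.permutation(n) < n // 2

# Randomization makes the group distributions random, not balanced:
# any single trial can still show a chance imbalance in age.
print(f"mean age, treated: {age[treated].mean():.1f}")
print(f"mean age, control: {age[~treated].mean():.1f}")
```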

Misplaced Confidence in Observed Power

Two months ago, a study came out in JAMA that compared the effectiveness of the antidepressant escitalopram with placebo for long-term major adverse cardiac events (MACE). In the methods section of their paper, the authors explained how they calculated their sample size and what differences they were looking for between groups. First, they used previously published data to get an idea of incidence rates: “Because previous studies in this field have shown conflicting results, there was no appropriate reference for power calculation within the designated sample size. Read More
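
For context, “observed power” plugs the observed effect back into the power formula, which makes it a one-to-one function of the p-value. A rough sketch for a two-sided z-test (my illustration, not the JAMA authors’ calculation):

```python
from scipy.stats import norm

def observed_power(z_obs: float, alpha: float = 0.05) -> float:
    """Post-hoc 'observed power' of a two-sided z-test, treating the
    observed effect as if it were the true effect."""
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(abs(z_obs) - z_crit) + norm.cdf(-abs(z_obs) - z_crit)

# A result sitting exactly at p = 0.05 always maps to ~50% observed power,
# so observed power adds nothing beyond the p-value itself.
print(f"{observed_power(1.96):.3f}")  # ~0.500
```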

The Bradford Hill Criteria Don't Hold Up

In 1965, the epidemiologist Austin Bradford Hill, who helped link smoking to lung cancer, gave a lecture in which he presented his viewpoints/criteria on how we can arrive at causation from correlation. The lecture was a bit of a game changer at the time, given that the tobacco industry was employing statisticians, medical doctors, and even popular science writers to push the idea that the relationship between smoking and lung cancer was merely a correlation, not a causal one. Read More

Misuse of Standard Error in Clinical Trials

Reporting effect sizes with their accompanying standard errors is necessary because it lets the reader interpret the magnitude of the treatment effect and the amount of uncertainty in that estimate. It is far better than providing no effect sizes at all and focusing only on statements of statistical significance. Although many authors provide standard errors with the intention of relaying the amount of uncertainty in the model, there are several misconceptions about when the standard error should be reported, and it is often misused. Read More
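
The distinction at issue: the standard deviation describes variability among individuals, while the standard error (SD / sqrt(n)) describes the precision of an estimate. A minimal sketch with made-up blood pressure data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(120, 15, size=100)  # hypothetical systolic BP readings

sd = x.std(ddof=1)          # spread of individual patients
se = sd / np.sqrt(len(x))   # uncertainty in the sample mean

print(f"SD = {sd:.1f} (variability among patients)")
print(f"SE = {se:.1f} (precision of the estimated mean)")
print(f"95% CI for the mean: {x.mean() - 1.96*se:.1f} to {x.mean() + 1.96*se:.1f}")
```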

High Statistical Power Can Be Deceiving

Even though many researchers are now acquainted with what power is and why we aim for high power in studies, several misconceptions about statistical power are still floating around. For example, if a study designed for 95% power fails to find a difference between two groups, does that offer more support for the null hypothesis? Many will answer yes, reasoning that if such a large study failed to find a difference between two groups, this provides evidence of no effect. Read More
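
One way to see why the answer can be "no" is to compare how likely the observed statistic is under the null versus under the design alternative. A rough likelihood-ratio sketch for a z-test designed with 95% power (my framing for illustration, not necessarily the post's):

```python
from scipy.stats import norm

# Two-sided alpha = 0.05 with 95% power puts the z-statistic's mean
# under the design alternative at about 1.96 + 1.64 = 3.60.
mu_alt = norm.ppf(0.975) + norm.ppf(0.95)

def lr_null_over_alt(z_obs: float) -> float:
    """Likelihood ratio of the null versus the design alternative."""
    return norm.pdf(z_obs) / norm.pdf(z_obs - mu_alt)

# Two nonsignificant results with very different evidential meanings:
print(f"z = 0.5: LR = {lr_null_over_alt(0.5):.0f}")   # ~109, strongly favors the null
print(f"z = 1.9: LR = {lr_null_over_alt(1.9):.2f}")   # ~0.70, slightly favors the alternative
```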

Vitamin E, Mortality, and the Bayesian Gloss

Bayesian data analysis is beginning to gain traction in several fields. Among the reasons: it allows individuals to represent uncertainty using probability distributions, and it helps them avoid losing the information that is typically lost with point estimates and dichotomization. Bayesian inference also allows relevant background information to be incorporated into a model in a more continuous way, rather than through binary decisions about what to include. Read More
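
As a toy example of folding background information into a model, here is a conjugate normal-normal sketch on the log relative-risk scale; the skeptical prior and the trial estimate are entirely hypothetical:

```python
import numpy as np

prior_mean, prior_sd = 0.0, 0.10   # background info: effects likely near the null
est, se = np.log(0.90), 0.08       # hypothetical trial: RR = 0.90

# Precision-weighted averaging of the prior and the likelihood.
w_prior, w_data = 1 / prior_sd**2, 1 / se**2
post_mean = (w_prior * prior_mean + w_data * est) / (w_prior + w_data)
post_sd = (w_prior + w_data) ** -0.5

lo, hi = post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd
print(f"posterior RR: {np.exp(post_mean):.2f} "
      f"(95% interval {np.exp(lo):.2f} to {np.exp(hi):.2f})")
```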

Problems with the Number Needed to Treat

The number needed to treat (NNT) is a popular statistic in medicine, and its use is even encouraged by groups like Cochrane and CONSORT. Why is it so popular? Most believe that the NNT is easier to understand than effect sizes like odds ratios and risk ratios, or statistics like the absolute risk reduction, and that it conveys more meaningful information. In this blog post, I am going to discuss: Read More
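
For reference, the NNT is just the reciprocal of the absolute risk reduction (ARR). A one-line sketch with hypothetical event risks:

```python
# Hypothetical risks: 10% of control patients and 7% of treated patients
# experience the event over the trial period.
risk_control, risk_treated = 0.10, 0.07

arr = risk_control - risk_treated   # absolute risk reduction
nnt = 1 / arr                       # patients treated to prevent one event

print(f"ARR = {arr:.2f}, NNT = {nnt:.0f}")  # ARR = 0.03, NNT = 33
```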