Statistical Science & Related Matters on Less Likely
https://lesslikely.com/
Recent content in Statistical Science & Related Matters on Less Likely

Simulation of a Two-Group Parallel-Arm RCT with Interim Analyses
https://lesslikely.com/statistics/interim-analysis/
Mon, 01 Feb 2021

Contents: Setting up the Function · A Simulated Example · Examining the Results · Statistical Package Citations · Environment · Article Citation

Recently, Andrew Althouse informed me that he was going to simulate a two-group parallel-arm randomized trial with interim analyses using the rpact R package, so I offered to help construct the R code to do so. He already has a number of R scripts on his GitHub repo for doing similar simulations, which can be viewed here, along with a number of tweets explaining these simulations. For this example, his goal was to simulate a trial where the outcome was binary and the probability of death in each group could be tuned, in addition to:
- the total number of participants
- the number of interim analyses
- the schedule of the interim analyses
- the group-sequential design used

along with the usual trial analysis parameters, such as:

- the \(\alpha\)-level
- the type of test (1-sided vs. 2-sided)

The goal was to produce a table of various statistics, such as:

- the odds ratio
- the confidence limits
- the \(P\)-value
- the number of successes

for each of the interim analyses specified.
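Before diving into the full function, here is a minimal, self-contained sketch of the same idea in base R. This is not the post's actual code: the efficacy-stopping thresholds below are illustrative stand-ins for the boundaries that rpact computes.

```r
# Minimal sketch of a two-group binary-outcome trial with efficacy-only interim looks.
# NOTE: the thresholds are illustrative (loosely O'Brien-Fleming-shaped), not rpact output.
sim_trial <- function(nPatients = 300, death1 = 0.40, death2 = 0.30,
                      looks = c(0.50, 0.75, 1.00),          # schedule of analyses
                      thresholds = c(0.005, 0.018, 0.044)) { # illustrative efficacy bounds
  g1 <- rbinom(nPatients / 2, 1, death1)  # outcomes under treatment 1
  g2 <- rbinom(nPatients / 2, 1, death2)  # outcomes under treatment 2
  for (i in seq_along(looks)) {
    n <- floor(nPatients / 2 * looks[i])  # patients per arm analyzed at this look
    p <- prop.test(c(sum(g1[1:n]), sum(g2[1:n])), c(n, n))$p.value
    if (p < thresholds[i]) return(list(look = i, p = p, success = TRUE))
  }
  list(look = length(looks), p = p, success = FALSE)
}

set.seed(1)
res <- replicate(500, sim_trial()$success)
mean(res)  # empirical probability of declaring efficacy
```

Setting death1 equal to death2 in the same loop would instead estimate the type 1 error rate under the chosen stopping rules.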
The function below reflects our efforts to do so; it also returns several plots from the rpact package for the chosen design, along with a plot comparing that design to other designs. To get similar results, you will need to load the R function first and then simply enter the proper inputs. While there may be more efficient ways to write the code, for example using lapply() instead of for loops, we have chosen not to do so. We have also tried to minimize the number of R packages necessary for the function to work, but the following will be required:
- rpact
- stringr
- ggplot2 (for the plots)

You can quickly install and load all three using:
```r
req_packs <- c("rpact", "stringr", "ggplot2")
install.packages(req_packs)
lapply(req_packs, library, character.only = TRUE)
```

Setting up the Function

```r
#' @title Simulation of a Two-Group Parallel-Arm Trial With Interim Analyses
#' @docType Custom function for simulation from the rpact package
#' @author Andrew Althouse with edits by Zad Rafi
#' NOTE: If you want to confirm "type 1 error" under different stopping rules,
#' make death1 = death2 in the two treatment arms (i.e., no treatment effect)
#' NOTE: I have set this one up to test the power for a treatment that would reduce
#' mortality from 40% in control group (1) to 30% in treatment group (2)
#' NOTE: Trial Design Parameters - Part 1
#' Here we will specify the basics: total N patients to enroll, and death rate for each treatment arm
#' NOTE: Trial Design Parameters - Part 2
#' Here we will define the interim analysis strategy and stopping rules
#' For this trial we will include provisions for efficacy stopping only (no pre-specified futility stopping)
#' We will use the rpact package to compute the stopping/success thresholds at the interim and final analyses
#' NOTE: Required packages: rpact and stringr
#' @param nSims The number of simulations; the default is 1000
#' @param nPatients The planned maximum number of patients included in each RCT
#' @param death1 The event rate for patients receiving 'treatment 1' in these trials
#' @param death2 The event rate for patients receiving 'treatment 2' in these trials
#' @param nLooks The number of looks that will take place (INCLUDING the final analysis)
#' @param analyses_scheduled The schedule of interim analyses
#' @param sided Whether the test is 1-sided or 2-sided
#' @param alpha The specified alpha level; the default is 0.
```

Tables, Graphs, and Computations from Rafi & Greenland (2020)
https://lesslikely.com/statistics/RG2020BMC/
Sat, 12 Dec 2020

The following post provides some of the code that was used to construct the figures and tables from Rafi & Greenland, 2020.1 An enhanced PDF version of the paper can be found here. For further discussion of the computations, see the appendix of the main paper, along with our technical supplement.2
Disclaimer: I am responsible for all the code and mistakes below, and none of them can be attributed to my coauthors or my fellow package developers.
In order to recreate the functions, I would recommend installing the latest version of concurve from CRAN, as it has patched some issues with graphing when the outcome is binary. Use the script below to get the latest version and load the R package. A number of other R packages are also used in this post, which are listed below.
```r
install.packages("concurve")
library("concurve")
```

Valid \(P\)-values Are Uniform Under the Null Model

Here we show that valid \(P\)-values have specific properties when the null model is true. We first generate two variables (\(Y\), \(X\)) that come from the same normal distribution with a \(\mu\) of 0 and \(\sigma\) of 1, each with a total of 1000 observations, and we assume that there is no relationship between the two. We run a simple t-test between \(Y\) and \(X\), iterate this 100,000 times to compute 100,000 \(P\)-values, and then plot the overall distribution of the \(P\)-values with a histogram.
```r
library("ggplot2")  # needed for the plots below

RNGkind(kind = "L'Ecuyer-CMRG")
set.seed(1031)  # set.seed() is a function call, not an assignment
n.sim <- 100000
t.sim <- numeric(n.sim)
n.samp <- 1000
for (i in 1:n.sim) {
  X <- rnorm(n.samp, mean = 0, sd = 1)
  Y <- rnorm(n.samp, mean = 0, sd = 1)
  t.sim[i] <- t.test(X, Y)$p.value  # extract the P-value by name
}

ggplot(NULL, aes(x = t.sim)) +
  geom_histogram(bins = 30, col = "black", fill = "#99c7c7", alpha = 0.25) +
  labs(title = "Distribution of P-values Under the Null", x = "P-value") +
  scale_x_continuous(breaks = seq(0, 1, 0.10)) +
  theme_bw()
```

This can also be shown using the TeachingDemos R package, which has a function dedicated to demonstrating this phenomenon.
```r
library("TeachingDemos")

RNGkind(kind = "L'Ecuyer-CMRG")
set.seed(1031)
obs_p <- Pvalue.norm.sim(n = 1000, mu = 0, mu0 = 0, sigma = 1, sigma0 = 1,
                         test = "t", alternative = "two.sided",
                         alpha = 0.05, B = 100000)

ggplot(NULL, aes(x = obs_p)) +
  geom_histogram(bins = 30, col = "black", fill = "#99c7c7", alpha = 0.25) +
  labs(title = "Distribution of P-values Under the Null", x = "P-value") +
  scale_x_continuous(breaks = seq(0, 1, 0.10)) +
  theme_bw()
```

As you can see, when the null model is true, the distribution of \(P\)-values is uniform. Valid \(P\)-values are uniform under the null hypothesis, and their corresponding \(S\)-values are exponentially distributed. We run the same simulation as before, but then convert the obtained \(P\)-values into \(S\)-values to see how they are distributed.
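A minimal self-contained sketch of that conversion (not necessarily the post's exact code): the \(S\)-value, or surprisal, is \(s = -\log_2(p)\), measured in bits.

```r
# Convert simulated null P-values into S-values (surprisals): s = -log2(p).
set.seed(1031)
p_vals <- replicate(10000, t.test(rnorm(100), rnorm(100))$p.value)
s_vals <- -log2(p_vals)

# Under the null, P is uniform, so S is exponential with mean 1/ln(2), about 1.44 bits.
mean(s_vals)

# A single p = 0.05 corresponds to about 4.3 bits of information against the test model.
-log2(0.05)
```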
```r
RNGkind(kind = "L'Ecuyer-CMRG")
set.
```

What Makes a Sensitivity Analysis?
https://lesslikely.com/statistics/sensitivity/
Sat, 12 Dec 2020

Contents: Cargo-Cult Uncertainty · Vagueness of 'Sensitivity' · Frequent Misconceptions · Randomness & Uncertainty · Portable Sensitivity Analyses · Regulators & Sensitivity Analyses · Estimands & Sensitivity · Utility of Subject-Matter Experts · Sensitivity in Clinical Trials · Exploratory Analyses · The Analysis Workflow · Monte Carlo Error Analysis · The Primary Analysis · SA I: Controlled MI · SA II: Selection Models · Supplementary Analyses · Full Analysis Set · Examining the Analyses · Acknowledgements · Statistical Environments · R Environment · Stata Environment · References

This post is split into two sections. In the first, I discuss the nature and difficulty of conducting sensitivity analyses and the quality of many published sensitivity analyses in various scientific fields, including medicine. I then discuss a general framework that has existed for a long time but has seen a major resurgence among regulatory agencies around the world, and that may help researchers conduct these analyses and other analyses in general.
In the second part of this post, I will use data from a historical parallel-arm RCT and analyze it from start to finish. The caveat is that there are extensive missing data, so I will have to utilize a number of principled (non-ad-hoc) sensitivity-analysis and missing-data techniques to better understand how these data were collected and how informative they are. Much of the code in this second part will be in R and Stata, but references will also be given for the specific techniques and theory.
Cargo-Cult Uncertainty

Sensitivity analyses are an important part of statistical science and many other disciplines when conducted in a principled and systematic manner. However, the published sensitivity-analysis literature contains many inconsistencies, misconceptions, and highly misleading findings. A giant obstacle to preventing these issues from recurring is that sensitivity-analysis techniques are quite difficult to learn and often involve advanced statistical methods that even many statisticians have difficulty with.
Yet the appearance of difficulty does not dissuade certain researchers from adopting them for their own work, so that they may give their colleagues and the stakeholders they report to the impression of rigor and methodological expertise. When this is done mindlessly, researchers will often skip over learning the details and theory and, as with many statistical procedures, rely on the default settings built into the statistical software.
While there are many suites, commands, functions, and libraries available to conduct such analyses, many of these procedures merely masquerade as meticulous sensitivity analyses; to their users they are often a formality to appease stakeholders, one that gives researchers a false sense of confidence about what they are doing even though they have little to no idea what they are actually doing. Thus, like many statistical procedures new and old, these too will inevitably be abused as they become more popular and as they are required in research reports.
Stark & Saltelli,1 along with many others such as Gigerenzer2 and Greenland,3 have written about this in the past. Below is an excerpt in which Stark describes how much of statistics is simply people masquerading as data analysts and being rewarded as experts despite not having a single clue what they are doing.

Quality Control in Statistical Analyses
https://lesslikely.com/statistics/statistics-quality-control/
Sat, 13 Jun 2020

Contents: Data Entry & Management · General Principle: Be Consistent With Everything · Setting Up a Principled Statistical Workflow · Setting up the Project Folder · Setting up the Functions Script · Setting up the Error Catching Script · Example of Data Loading & Inspection · Example of the Main Source Script · Setting the Seed · Example of a Multiple Imputation Script · Saving and Reading Data Carefully · Example Modeling Script · Example Table Script · Possible Drawbacks of this Approach · Helpful Resources · Environment · References

Experienced statisticians and data analysts are familiar with stories where a coding error has led to an entire conclusion changing, or even a retraction.1 It's the sort of stuff that keeps people up at night. Unfortunately, not many of us think about these sorts of scenarios until we realize it's very possible that it could happen to any of us.
To me, it seems that many of these issues could be avoided by having a principled data-management and statistical workflow, and by making it as transparent, open, and reproducible as possible. I'd like to quickly go over a few things that I've found helpful over the years; I'll start with data management and data entry and then move on to analysis workflows. I largely consider this to be a living document, and I'm sure many people who read this will have far better suggestions, so please leave them down below in the comments!
Before I go on, I want to emphasize that backing up your data, scripts, and using version control is extremely important. There is no debate about this. It’s necessary so that other collaborators/colleagues can inspect your work and catch potential mistakes or see overall progress, but more importantly, it will prevent you from losing your data in a disaster, and it’ll help you catch your own mistakes, since you’ll be the most familiar with the data and scripts.
Data Entry & Management

A nice paper that I'd like to review is the one by Broman & Woo (2018) on how to manage your data when working with spreadsheets.2 The sad reality is that even though spreadsheets like Microsoft Excel or Google Sheets are available everywhere and easy to use, there are many risks when working with them; just ask any statistician who works in genetics, or any bioinformatician.3
One particularly fatal error occurred recently when a group of researchers lost thousands of documented COVID cases because they entered the data for each case as a column instead of a row. Excel has a limit on how many rows and columns it can handle (1,048,576 rows and 16,384 columns, according to Microsoft), so most of these cases were lost, resulting in an enormous waste of resources due to a careless and ignorant mistake, and highlighting the dangers of recklessly inputting data and conducting statistical analyses. There is no doubt that reviewing the principles of good data management and workflow is essential for any data analyst. I'd like to touch on some of the most important points of the Broman & Woo (2018) paper before moving on to some other "principles" I'd like to share:

Book Review: Regression and Other Stories by Gelman, Hill, and Vehtari
https://lesslikely.com/statistics/regression-stories/
Tue, 11 Jun 2019

Over a decade ago, Andrew Gelman and Jennifer Hill gave applied researchers a comprehensive book (Data Analysis Using Regression and Multilevel/Hierarchical Models) on fitting simple and complex statistical models in R, both from a classical framework and a Bayesian one. Now they're back with an updated version and a new coauthor (Aki Vehtari).
Much has changed in applied statistics since 2006 (when the book was first released). The primary software used at the time and in the book to fit Bayesian models was BUGS (Bayesian inference Using Gibbs Sampling).
However, both BUGS and some of the R code in the first edition are now outdated. The new edition updates the R code and contains intuitive instructions on how to fit simple and complex models using the probabilistic programming language, Stan (also developed by Gelman and colleagues), which is now used in several fields (even for studying wine!).
Indeed, running a Bayesian regression model in R is now as simple as
```r
# I use the sample PlantGrowth dataset in R
library("rstanarm")
library("bayesplot")

pg <- PlantGrowth
model1 <- stan_glm(weight ~ group, data = pg, refresh = 0)
summary(model1)
#>
#> Model Info:
#>  function:     stan_glm
#>  family:       gaussian [identity]
#>  formula:      weight ~ group
#>  algorithm:    sampling
#>  sample:       4000 (posterior sample size)
#>  priors:       see help('prior_summary')
#>  observations: 30
#>  predictors:   3
#>
#> Estimates:
#>               mean   sd   10%   50%   90%
#> (Intercept)  5.0    0.2  4.8   5.0   5.3
#> grouptrt1   -0.4    0.3 -0.7  -0.4   0.0
#> grouptrt2    0.5    0.3  0.2   0.5   0.9
#> sigma        0.6    0.1  0.5   0.6   0.8
#>
#> Fit Diagnostics:
#>            mean   sd   10%   50%   90%
#> mean_PPD   5.1    0.2  4.9   5.1   5.3
#>
#> The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
#>
#> MCMC diagnostics
#>               mcse Rhat n_eff
#> (Intercept)   0.0  1.0  2866
#> grouptrt1     0.0  1.0  3054
#> grouptrt2     0.0  1.0  3192
#> sigma         0.0  1.0  2944
#> mean_PPD      0.0  1.0  3666
#> log-posterior 0.0  1.0  1688
#>
#> For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

plot(model1)
```

Another key difference between the first edition and the new edition is that the 2006 book attempted to cover several topics at once: it contained instructions on how to fit everything from simple models in a classical framework up to multilevel models in a Bayesian framework. The new edition attempts to reduce this information overload by splitting itself into two volumes.
The first volume (Regression and Other Stories) covers fitting simple and complex models using R and Stan, and is oriented towards the applied researcher or statistician, who wants a smooth introduction to fitting Bayesian models using Stan without diving into much theory or math.
A draft copy of the table of contents of the new edition can be found here, though it's very likely that the published edition will have some changes.

Book Review: Fisher, Neyman, and the Creation of Classical Statistics
https://lesslikely.com/statistics/classical-lehmann/
Sun, 30 Dec 2018

Contents: A Very Brief History of Classical Statistics · Where Are The Bayesians? · Fisher's Contributions · Neyman's Contributions · The Fallout Between The Creators Of Classical Statistics · There Is No One Neyman Nor One Fisher · References

Erich Lehmann's last book,(1) published after his death, is a history of classical statistics and its creators: specifically, of how his mentor, Jerzy Neyman, and Neyman's adversary, Ronald Fisher, helped lay the foundations for the methods that are used today in several fields.
A Very Brief History of Classical Statistics

This post is intended as a general review and summary of the book, which I recommend to anyone interested in statistics and science. The book clears up several misconceptions people have about how frequentist statistics came to be the dominant school of statistics. Thus, I want to go over four topics from Lehmann's book that I believe people should know more about:
How the founders of classical statistics viewed Bayesian inference
What they each developed
How they came to become so conflicted
And how their views changed over time
Where Are The Bayesians?

As Stephen Senn points out in his Fisher Memorial Lecture at the Royal Statistical Society, there is a common myth that everyone who practiced applied statistics before the early 20th century was using Bayesian inference and doing everything correctly; then Fisher came along and created significance testing, giving researchers a powerful tool to easily hack their data and produce publishable results, and that is supposedly why we now have several replication crises.
Of course, this is far from the truth, and any thorough investigation into the history of statistics will clear this up, among many other misconceptions.
As several individuals may know, it was Thomas Bayes who came up with Bayes' theorem, and it was Richard Price who disseminated most of his writings after Bayes's death. However, as many self-identified Bayesians will attest, using Bayes' theorem does not make one a Bayesian. It is actually quite hard to know how Bayes would react to modern Bayesian inference. The Bayesian inference that we are familiar with today can be attributed to Pierre-Simon Laplace, who popularized what is now known as "objective Bayes."
Back then it was not called "Bayesian inference" but "inverse probability," and it was a method used by many before the dominance of classical statistics. So this is one part the common myth gets right: inverse probability did indeed have its moment in history before the dominance of frequentist statistics.

Laplace and several others popularized such methods, but around the end of the 19th century the tides began to shift. Several mathematicians and statisticians began to discourage the use of inverse probability because they saw it as a nonrigorous method of data analysis.
This can be seen in the following passages about Fisher.
“His first publication on this new approach to inference was a 1930 paper “Inverse probability.” The paper begins with a critique of the inverse (Bayesian) method. This section ends with Fisher’s asking:

P-values Are Tough And S-values Can Help
https://lesslikely.com/statistics/s-values/
Sun, 11 Nov 2018

Contents: What is a P-value Anyway? · Some Definitions & Descriptions · Misleading Definitions · Auxiliary Assumptions · Probability of What? · Properties (Uniformity) · The Different Interpretations · The Decision-Theoretic Approach · Statistical Significance · The Inductive Approach · Null-Hypothesis Significance Testing · Measure of Compatibility · Common, Misleading Criticisms · Estimation and Intervals · Overstating the Evidence · Some Valid Issues · Mismatch With Direction · Difficulties Due to Scale · Resolution with Surprisals · Some Examples · S-value Calculator · References

The \(P\)-value doesn't have many fans. There are those who don't understand it, often treating it as a measure it's not, whether that's a posterior probability, the probability of getting the results due to chance alone, or some other bizarre or incorrect interpretation.1–3 Then there are those who dislike it because they think the concept is too difficult to understand, or because they see it as a noisy statistic we're not interested in.
However, the groups of people mentioned above aren’t mutually exclusive. Many who dislike and criticize the \(P\)-value also do not understand its properties and behavior. This is unfortunate, given how important and widely used they are. In this article, which could also have been titled, \(P\)-values: More Than You Ever Wanted to Know, I take on the task of explaining:
- what \(P\)-values are
- the assumptions behind them
- their properties and behavior
- different schools of interpretation
- misleading criticisms of \(P\)-values
- some valid issues in interpretation
- how these issues can be resolved

What is a P-value Anyway?

Some Definitions & Descriptions

The \(P\)-value is the probability of getting a result (specifically, a test statistic) at least as extreme as what was observed, if every model assumption used to compute it, in addition to the targeted test hypothesis (usually a null hypothesis), were correct.3–5
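As a quick numerical illustration of this tail-probability definition (a sketch, not from the original post): for a standard normal test statistic, the two-sided \(P\)-value is the probability, under the test model, of a result at least as extreme as the one observed.

```r
# Two-sided P-value for an observed standard-normal test statistic z:
# the probability, if every model assumption held, of seeing |Z| >= |z|.
z <- 2.0
p_two_sided <- 2 * pnorm(-abs(z))
p_two_sided  # about 0.046
```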
A simple, mathematically rigorous definition of a \(P\)-value (for those interested) is given by Stark (2015).
Let \(P\) be the probability distribution of the data \(X\), which takes values in the measurable space \(\mathcal{X}\). Let \(\left\{R_{\alpha}\right\}_{\alpha \in[0,1]}\) be a collection of \(P\)-measurable subsets of \(\mathcal{X}\) such that (1) \(P\left(R_{\alpha}\right)=\alpha\) and (2) if \(\alpha^{\prime}<\alpha\), then \(R_{\alpha^{\prime}} \subset R_{\alpha}\). Then the \(P\)-value of \(H_{0}\) for data \(X=x\) is \(\inf_{\alpha \in[0,1]}\left\{\alpha: x \in R_{\alpha}\right\}\).
A descriptive but technical definition is given by Sander Greenland below. The description can seem dense, so feel free to skip over it for now and revisit it after reading the rest of the post.
A single \(P\)-value \(p\) is the quantile location of a directional measure of divergence \(t = t(y;M)\) of the data point \(y\) (usually, the vector in \(n\)-space formed by the \(n\) individual observations) from a test model manifold \(M\) in the \(n\)-dimensional expectation space defined by the logical structure of the data generator ("experiment" or causal structure) that produced the data \(y\). \(M\) is the subset of the \(Y\)-space into which the conjunction of the model constraints (assumptions) forces the data expectation, or where it predicts \(y\) would be were there no 'random' variability. I also use \(M\) to denote the set of all the model constraints, as well as their conjunction.
With this logical set-up, the observed \(P\)-value is the quantile \(p\) for the observed value \(t\) of \(T = t(Y;M)\).

We May Not Understand Control Groups
https://lesslikely.com/statistics/control-group-effects/
Sun, 28 Oct 2018

Contents: The Placebo Effect Isn't What We Often Think · The Additive Model Is Too Simple · Possible Directions · References

It's well known that randomized trials are some of the most efficient ways to make causal inferences and to determine how much something (an intervention) differs from the comparator (some sort of control). Random assignment helps make these goals easier by minimizing selection bias and making the distribution of prognostic factors between groups random (not balanced).1
Discussions (similar to the one above) praising the efficiency of randomized trials are widespread, however, few of these discussions take a close look at some of the common assumptions that individuals hold regarding randomized trials. And unfortunately, these common assumptions may be based on outdated evidence and simplistic ideas.
The Placebo Effect Isn’t What We Often Think

For example, in placebo-controlled trials, many individuals make the assumption that much of the improvement seen in the control group over time is due to the placebo effect, as modeled by the graph below.
However, inquiries into this topic have yielded contradictory results, in that the placebo effect may not be as powerful as we assume.
One systematic review2 that looked at 130 clinical trials concluded the following,
“We found little evidence in general that placebos had powerful clinical effects. Although placebos had no significant effects on objective or binary outcomes, they had possible small benefits in studies with continuous subjective outcomes and for the treatment of pain. Outside the setting of clinical trials, there is no justification for the use of placebos.”
Thus, the placebo effect had some small effects in areas where it is difficult to objectively measure a phenomenon and where there is a higher likelihood of encountering measurement error. However, measurement error is not the only thing that could explain some of the improvements seen in control groups, which we often attribute to the placebo effect.
We simply need to reread two sections of a paper by the very anesthesiologist who popularized the placebo effect3 and who claimed that it had the ability to put patients’ conditions into remission.
Here, Henry Beecher (the anesthesiologist) claims that placebo effects are powerful:
“It is evident that placebos have a high degree of therapeutic effectiveness in treating subjective responses, decided improvement, interpreted under the unknowns technique as a real therapeutic effect, being produced in 35.2 ± 2.2% of cases….
Placebos have not only remarkable therapeutic power but also toxic effects. These are both subjective and objective. The reaction (psychological) component of suffering has power to produce gross physical change.”
Then he reports another observation from his patients.
“The evidence is that placebos are most effective when the stress is greatest.”
If you measured someone’s stress (let’s say objectively, via cortisol measurements) and they were extremely stressed and if you followed up after a certain period of time, it would be very likely that the next measurement of their stress (cortisol levels) would be less than the first measurement.
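This is regression to the mean, and it is easy to demonstrate by simulation (a sketch with hypothetical numbers, not data from any trial):

```r
# Regression to the mean: select subjects with extreme first measurements and
# remeasure them. True levels are stable; only the measurement noise changes.
set.seed(2018)
true_level <- rnorm(5000, mean = 50, sd = 5)  # stable underlying stress level
m1 <- true_level + rnorm(5000, sd = 10)       # first noisy measurement
m2 <- true_level + rnorm(5000, sd = 10)       # second noisy measurement

high <- m1 > quantile(m1, 0.90)               # the most 'stressed' at baseline
mean(m1[high]) - mean(m2[high])               # positive: the extreme group drifts back
```

The selected group looks better at follow-up even though nothing about them changed; only the luck in the noise did.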
The reverse situation would likely apply too: if you measured someone's happiness (in some theoretical, objective way) and they were extremely happy, it's very likely that their next measurement of happiness would not be as high.

Influential Errors | The Diet Heart Tale
https://lesslikely.com/nutrition/meta-analysis-pufa/
Thu, 11 Oct 2018

Contents: What Were the Results? · What Went Wrong? · Correcting the Error · Reanalysis Script · Major Update · Environment · References

Earlier this year, my colleagues and I were discussing the relationship between saturated fat and cardiovascular disease, and one of us was writing an article on a very unusual trial that is often included in meta-analyses on the topic.
That trial is the Finnish Mental Hospital Study,1 a crossover study that compared patients on a control diet with a certain amount of saturated fat to patients on an intervention diet that replaced the saturated fat with polyunsaturated fats.
Here is a summary of the trial,
“A controlled intervention trial, with the purpose of testing the hypothesis that the incidence of coronary heart disease (CHD) could be decreased by the use of serum-cholesterol-lowering (SCL) diet, was carried out in 2 mental hospitals near Helsinki in 1959–71.
The subjects were hospitalized middle-aged men. One of the hospitals received the SCL diet, i.e. a diet low in saturated fats and cholesterol and relatively high in polyunsaturated fats, while the other served as the control with a normal hospital diet. Six years later the diets were reversed, and the trial was continued another 6 years.”
The study didn’t just include men, it also included women and is discussed in a separate paper by the same research group.
In total, the “two studies” (really just one study) had a sample size of 818 participants (for hard CVD events), so they often carry quite a bit of weight in meta-analyses.
I’d like to bring attention to one particular meta-analysis published eight years ago by Mozaffarian, Micha, & Wallace (2010). It’s one of the most cited meta-analyses on this topic: Google Scholar indicates that it’s been cited by over 900 academic sources, and Web of Science indicates that it’s been cited by 466 papers at the time of writing this post.
Source: Web of Science
Clearly, it’s a well known study.
The meta-analysis of interest describes its inclusion and exclusion criteria as,
“We searched for all RCTs that randomized adults to increased total or n-6 PUFA consumption for at least 1 year without other major concomitant interventions (e.g., blood pressure or smoking control, other multiple dietary interventions, etc.), had an appropriate control group without this dietary intervention, and reported (or had obtainable from the authors) sufficient data to calculate risk estimates with standard errors for effects on occurrence of “hard” CHD events (myocardial infarction, CHD death, and/or sudden death). Studies were excluded if they were observational or otherwise nonrandomized.”
So the authors state that the included studies must be randomized trials that are at least a year long and that they are excluding studies that are non-randomized or observational.
Here’s a list of the studies they included. Note the design of the Finnish studies (Turpeinen, 1979 & Miettinen, 1983), which I’ll touch upon below.
What Were the Results?

Mozaffarian D, Micha R, Wallace S (2010):
“Combining all trials, the pooled risk reduction for CHD events was 19% (RR = 0.81, 95% CI 0.70–0.95, p = 0.

Misplaced Confidence in Observed Power
https://lesslikely.com/statistics/misplaced-power/
Sun, 30 Sep 2018

Two months ago, a study came out in JAMA which compared the effectiveness of the antidepressant escitalopram to placebo for long-term major adverse cardiac events (MACE).
The authors explained in the methods section of their paper how they calculated their sample size and what differences they were looking for between groups.
First, they used some previously published data to get an idea for incidence rates,
“Because previous studies in this field have shown conflicting results, there was no appropriate reference for power calculation within the designated sample size. The KAMIR study reported a 10.9% incidence of major adverse cardiac events (MACE) over 1 year… Therefore, approximately 50% MACE incidence was expected during a 5-year follow-up.”
Then, they calculated their sample size based on some differences they were interested in finding,
“Assuming 2-sided tests, α = .05, and a follow-up sample size of 300, the expected power was 70% and 96% for detecting 10% and 15% group differences, respectively.”
So far so good.
Then, we get to the results,
“A significant difference was found: composite MACE incidence was 40.9% (61/149) in the escitalopram group and 53.6% (81/151) in the placebo group (hazard ratio [HR], 0.69; 95% CI, 0.49-0.96; P = .03). The model assumption was met (Schoenfeld P = .48). The estimated statistical power to detect the observed difference in MACE incidence rates between the 2 groups was 89.7%.”
Ouch. This issue ended up bothering me so much that I wrote a letter to the editor (LTE) to point out the issue. Unfortunately, the LTE got rejected, but Andrew Althouse suggested that I discuss this over at DataMethods, so I did, and I also discussed it on Twitter but also wanted to publish the LTE on my blog. Here it is.
This letter has now been preprinted on arXiv.
In a similar tale, a group of surgeons published a methodological article advocating this practice of calculating observed power, which I further discuss here.

The Bradford Hill Criteria Don’t Hold Up
https://lesslikely.com/statistics/bradford-hill-criteria-dont-hold/
Thu, 06 Sep 2018 00:00:00 +0000https://lesslikely.com/statistics/bradford-hill-criteria-dont-hold/In 1965, the epidemiologist, Austin Bradford Hill, who helped link smoking to lung cancer, gave a speech where he presented his viewpoints on how we can arrive at causation from correlation.
This lecture was a bit of a game changer at the time given that the tobacco industry was employing statisticians, medical doctors, and even popular science writers to push the idea that the relationship between smoking and lung cancer was merely a correlation, not a causal one.
Both the tobacco industry and empiricists argued that the existing data was not very convincing because there were no human experiments showing that smoking causes lung cancer. Of course, this was a sticky situation because it was neither ethical nor practical to randomize people and force them to smoke cigarettes and compare rates of lung cancer to a control group.
Austin Bradford Hill and his co-investigator, Richard Doll, were able to argue with compelling data from the British Doctors’ Study that smoking cigarettes does indeed cause lung cancer, combining these data with several other lines of evidence (molecular, cellular) and referring to a set of viewpoints.
Since then, these viewpoints have been used as a sort of checklist/criteria in several papers and by several authors to assess causality when randomized trials weren’t possible. For example, Science-Based Medicine often refers to these viewpoints as criteria and recently discussed their usage in a blog post reviewing a book about hormonal therapy.
In this post, drawing heavily from Rothman, Greenland & Lash, 2008 (from here on, RGL), I want to discuss these viewpoints, and why many of them don’t often hold up when critically examined.
Hill’s nine viewpoints were the following:
Strength

Stronger associations, according to Hill, were more compelling for causal relationships than weaker associations because of the possibility of unmeasured confounding leading to weak associations between two phenomena. He uses the example of smoking and cancer and compares it to thrombosis in smokers,
“…prospective inquiries into smoking have shown that the death rate from cancer of the lung in cigarette smokers is nine to ten times the rate in non-smokers and the rate in heavy cigarette smokers is twenty to thirty times as great. On the other hand the death rate from coronary thrombosis in smokers is no more than twice, possibly less, the death rate in non-smokers.”
It does indeed sound compelling. Science-Based Medicine uses the same argument for alternative medicine,
“If acupuncture or homeopathy were 400 times superior to placebo, there would no discussion of its validity. Many medical therapies are not 400 times as effective as placebo, but the strength of the association between cause and effect is well above background noise.”
Why It Doesn’t Hold Up

Several causal relationships that we know of today, such as the relationship between smoking and cardiovascular disease, and environmental tobacco smoke and lung cancer, had weak associations. However, Hill didn’t discount weak associations, as seen here,
“In thus putting emphasis upon the strength of an association we must, nevertheless, look at the obverse of the coin.…

Misuse of Standard Error in Clinical Trials
https://lesslikely.com/statistics/standard-error-clinical-trials/
Tue, 07 Aug 2018 00:00:00 +0000https://lesslikely.com/statistics/standard-error-clinical-trials/Calculating Standard Deviation Calculating Standard Error Misuse of Standard Error References Reporting effect sizes with their accompanying standard errors is necessary because it lets the reader interpret the magnitude of the treatment effect and the amount of uncertainty in that estimate. It is far better than not providing any effect size at all and focusing only on statements of statistical significance.
Although many authors provide standard errors with the intention of relaying the amount of uncertainty in the estimate, there are several misconceptions about when the standard error should be reported, and so it is often misused.
I believe it is worth explaining what exactly the standard error is and how it differs from something like the standard deviation. Standard deviation is a descriptive statistic that tells us about the variability of the data. Let’s quickly take a look at how it’s calculated using random data.
Calculating Standard Deviation

\[4, 5, 10, 6, 7, 10, 10\]
\[4 + 5 + 10 + 6 + 7 + 10 + 10 = 52\]
The mean is: \(52 / 7 = 7.43\)
Next, we calculate the sum of squared errors, which can be thought of as the total deviance from the mean. We calculate this by taking each value in the data (\(4, 5, 10, 6, 7, 10, 10\)) and subtracting the mean from each value, which gives us the deviance.
\[(4 - 7.43) = -3.43\]
\[(5 - 7.43) = -2.43\]
\[(10 - 7.43) = 2.57\]
\[(6 - 7.43) = -1.43\]
\[( 7 - 7.43) = -0.43\]
\[(10 - 7.43) = 2.57\]
\[(10 - 7.43) = 2.57\]
All of the above values are deviances. We square each of them and add them together to get the sum of squared errors.
\[(-3.43)^2 + (-2.43)^2 + (2.57)^2 + (-1.43)^2 + (-0.43)^2 + (2.57)^2 + (2.57)^2 =\]
Sum of Squared Errors = \(39.71\)

We get the variance by taking the sum of squared errors and dividing by the number of data points minus 1.

\[39.71 / (7 - 1) = 6.62\]

Variance = \(6.62\)

Finally, we get the standard deviation by taking the square root of the variance, \(\sqrt{6.62}\)
Standard deviation = \(2.57\)
There we have it. Again, the standard deviation is a descriptive statistic which tells us about the variability of the data.
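The arithmetic above can be reproduced in a few lines of base R (a quick check of the calculation, nothing more):

```r
# The same seven data points used above
x <- c(4, 5, 10, 6, 7, 10, 10)

m <- mean(x)               # 52 / 7 = 7.43 (rounded)
ss <- sum((x - m)^2)       # sum of squared errors, 39.71 (rounded)
v <- ss / (length(x) - 1)  # variance, 6.62 (rounded)
s <- sqrt(v)               # standard deviation, 2.57 (rounded)

# Base R's sd() does all of this in one step
all.equal(s, sd(x))  # TRUE
```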
Calculating Standard Error

The standard error, on the other hand, is the standard deviation of the sampling distribution. It can be thought of as the standard deviation of several means. It is an analytic statistic that tells us about the uncertainty in our point estimate. It is calculated by taking the standard deviation and dividing it by the square root of the total number of data points.
\[2.57 / \sqrt{7} = 0.97\]
Standard error of the mean = \(0.97\)
This and the mean would typically be reported together as \(7.43 \pm 0.97\).
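Continuing in R with the same data, the standard error of the mean is one line:

```r
x <- c(4, 5, 10, 6, 7, 10, 10)

# Standard deviation divided by the square root of the sample size
se <- sd(x) / sqrt(length(x))  # 2.57 / sqrt(7) = 0.97 (rounded)
```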
Misuse of Standard Error

In clinical trials, participants are often recruited nonrandomly but are assigned to groups via random assignment. This random allocation allows us to make inferences about those treatment differences (Kempthorne, 1977; Senn & Auclair, 1990).

Vitamin E, Mortality, and the Bayesian Gloss
https://lesslikely.com/statistics/bayesian-vitamin-e/
Wed, 20 Jun 2018 00:00:00 +0000https://lesslikely.com/statistics/bayesian-vitamin-e/Bayesian data analysis is beginning to gain traction in several fields. Among the reasons: it allows individuals to represent uncertainty using probability distributions, and it helps them avoid the loss of information that typically comes with point estimates and dichotomization.
Bayesian inference also allows for relevant background information to be incorporated into a model using a more continuous approach rather than making binary decisions about what to include.
This approach generally plays out well when the background information being incorporated is based on empirical evidence. However, when a prior is poorly chosen and is given a lot of weight, it transfers into the posterior distribution and can yield results that are no more helpful than using a frequentist approach.
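A minimal conjugate normal-normal sketch (my own illustration, not the meta-analysis model discussed below) makes this concrete: the posterior mean is a precision-weighted average of the prior mean and the data estimate, so a tight, heavily weighted prior pulls the posterior toward itself regardless of what the data say.

```r
# Normal prior + normal likelihood: posterior mean is a
# precision-weighted average of the prior and the data.
posterior <- function(prior_mean, prior_sd, data_mean, data_se) {
  w_prior <- 1 / prior_sd^2  # prior precision (its "weight")
  w_data  <- 1 / data_se^2   # data precision
  c(mean = (w_prior * prior_mean + w_data * data_mean) / (w_prior + w_data),
    sd   = sqrt(1 / (w_prior + w_data)))
}

# Hypothetical numbers: the data suggest harm (log risk ratio 0.2),
# but a very tight prior insists on no effect
posterior(prior_mean = 0, prior_sd = 0.02, data_mean = 0.2, data_se = 0.1)
# the posterior mean stays near 0 -- the prior has overwhelmed the data
```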
Sander Greenland discusses this particular issue in depth in his response to a Bayesian meta-analysis that was done on Vitamin E and mortality.
To give some context, it was well known in the biochemical literature at the time that antioxidants reduced oxidative stress, so naturally, everyone was interested to see whether antioxidant supplements, such as Vitamin E could be used for any health outcomes.
Several randomized controlled trials had investigated the effects of vitamin E supplementation on mortality, and a few conventional meta-analyses that pooled these studies found that:
high doses of vitamin E supplementation increased the risk of mortality
low doses of vitamin E reduced the risk of mortality
studies that were at low risk of bias showed that vitamin E increased the risk of mortality
studies that were at high risk of bias showed that vitamin E decreased the risk of mortality.
The data weren’t all in agreement. But based on what we knew of mechanisms, combined with the human data, it made sense to believe that vitamin E, at specific dosages, may have helped certain people (people with deficiencies), been useless for a lot of people, and harmed others (induced toxicity). It doesn’t make much sense to conclude that vitamin E has no effect and to extrapolate this to the whole population.
Furthermore, many of the meta-analyses were pooling together trials that used the synthetic form of Vitamin E, dl-tocopheryl acetate, a racemic mixture comprising 50% of but one of 8 known active naturally occurring molecules with vitamin E activity and extrapolating these results to the entire class of molecules.
It would be the equivalent of studying only Vitamin K1 in trials, which is found in plants, finding little to no activity on various biomarkers, and extrapolating those results to the entire Vitamin K class with meta-analyses. So from the beginning, many of these meta-analyses and their conclusions were already displaying ignorance of biochemistry.
In 2009, a group of researchers wanted to end the confusion regarding vitamin E and mortality and decided to go with a Bayesian hierarchical meta-analysis, incorporating prior information into the model in the hope that it would be the solution to the mixed results found in the past. They explain their reasoning for using a Bayesian meta-analysis,
“This relationship should be modeled in a robust fashion and presented in a way that clarifies the strength of the relationship and, to the extent possible, resolves residual uncertainties about the relationship and its strength.…

Problems with the Number Needed to Treat
https://lesslikely.com/statistics/problems-with-nnt/
Sun, 27 May 2018 00:00:00 +0000https://lesslikely.com/statistics/problems-with-nnt/Calculations Incidence Proportions Absolute Risk Reduction Relative Risk Numbers Needed to Treat Major Problem: Missing Information Comparator Uncertainty Time Period Rounding NNTs References The number needed to treat (NNT) is a popular statistic used in medicine and its use is encouraged by groups like Cochrane and CONSORT. Even the famous statistician, Doug Altman was a big proponent of it. Why is it so popular? Most believe that the NNT is more understandable than measures like odds ratios or risk ratios or statistics like the absolute risk reduction. The NNT is also believed to convey more meaningful information.
In this blog post, I am going to discuss:
what the NNT is
how to calculate an NNT
some of the problems associated with the use of the NNT
solutions to the problems associated with the use of the NNT
The NNT is defined as the number of patients that need to be treated for one patient to experience a positive outcome or for one patient not to experience an adverse outcome. So, an NNT of 100 means that 100 patients need to be treated for one patient actually to benefit. The NNT is the inverse of the absolute risk reduction (ARR) (1/ARR). I will show how they are calculated with a sample contingency table from a controlled study comparing the drug Apixaban to placebo for recurrent thromboembolism (RT). Apixaban is the treatment for the first row and placebo is the treatment for the second row. Bolded numbers are the totals.
|              | Recurring Events | Non-Recurring Events | Total    |
|--------------|------------------|----------------------|----------|
| **Apixaban** | 14               | 826                  | **840**  |
| **Control**  | 73               | 756                  | **829**  |
| **Total**    | **87**           | **1582**             | **1669** |

Calculations

We will calculate four different measures: incidence proportions, absolute risk reduction, relative risk, and then the NNT.
Incidence Proportions

In the group that received the drug, 14 people experienced RT out of 840 people (14/840). So, about 1.7%. In the control group, 73 people experienced RT out of 829 people (73/829), which is 8.8%. These are incidence proportions.
Absolute Risk Reduction

We calculate the absolute risk reduction by finding the difference between the incidence proportions: 8.8% (control group) − 1.7% (treatment group) = 7.1%. So, the use of the drug Apixaban lowered the absolute risk of RT by 7.1%.
Relative Risk

Most people do not enjoy thinking in terms of absolute risk reduction, and they prefer to think in terms of relative risk. We calculate the relative risk by dividing one incidence proportion by the other. So, 8.8% / 1.7% = 5.2.
In words, patients who took placebo were 5.2 times more likely to develop RT than patients who took Apixaban.
We can also do 1.7% / 8.8% = 0.19, which means that people who were treated with Apixaban had 19% of the risk that the control group had in developing RT.
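All of these measures can be computed in a few lines of R from the contingency table above (the slight differences from the rounded percentages in the text are expected):

```r
# Contingency table from the Apixaban trial above
events_treat <- 14;  n_treat <- 840
events_ctrl  <- 73;  n_ctrl  <- 829

# Incidence proportions
p_treat <- events_treat / n_treat  # ~0.017 (1.7%)
p_ctrl  <- events_ctrl / n_ctrl    # ~0.088 (8.8%)

# Absolute risk reduction
arr <- p_ctrl - p_treat            # ~0.071 (7.1%)

# Relative risk, in both directions
rr_ctrl  <- p_ctrl / p_treat       # ~5.3 (5.2 from the rounded percentages)
rr_treat <- p_treat / p_ctrl       # ~0.19

# Number needed to treat: the inverse of the ARR
nnt <- 1 / arr                     # ~14
```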
Numbers Needed to Treat

Relative risk is more easily understood than absolute risk reduction, but relative risk can also be deceiving, and most in medicine prefer to work with absolute risk reduction, so they prefer to use the number needed to treat, which can be calculated by taking the inverse of the absolute risk reduction, which is 1 / 7.…

Calculating Observed Power Is Just Transforming Noise
https://lesslikely.com/statistics/observed-power-magic/
Fri, 04 May 2018 00:00:00 +0000https://lesslikely.com/statistics/observed-power-magic/“We Respectfully Disagree” Zombie Power The Frustration & Wrath of Statisticians My Response Script to Reproduce Simulations Environment Around April of last year, a group of surgeons published a paper in the Annals of Surgery (apparently one of the most read journals in surgical science) where they suggested that CONSORT and STROBE guidelines be modified to recommend calculations of post-hoc power, specifically observed power.
They write,
“But, as 80% power is difficult to achieve in surgical studies, we argue that the CONSORT and STROBE guidelines should be modified to include the disclosure of power—even if less than 80%—with the given sample size and effect size observed in that study.”
Some folks noticed, including Andrew Gelman, who decided to write a letter to the editor (LTE) explaining why this is a bad idea.
Gelman writes in the LTE,
“This would be a bad idea. The problem is that the (estimated) effect size observed in a study is noisy, especially so in the sorts of studies discussed by the authors. Using estimated effect size can give a terrible estimate of power, and in many cases can lead to drastic overestimates of power (thus, extreme overconfidence of the sort that is rightly deplored by Bababekov et al. in their article), with the problem becoming even worse for studies that happen to achieve statistical significance.
The problem is well known in the statistical and medical literatures; see, e.g., Lane and Dunlap (1978), Hedges (1984), Goodman and Berlin (1994), Senn (2002), and Lenth (2007). For some discussion of the systemic consequences of biased power calculations based on noisy estimates of effect size, see Button et al. (2013), and for an alternative approach to design and power analysis, see Gelman and Carlin (2014).”
“We Respectfully Disagree”

The authors replied, of course, and most of their argument boils down to,
“We respectfully disagree that it is wrong to report post hoc power in the surgical literature. We fully understand that P value and post hoc power based on observed effect size are mathematically redundant; however, we would point out that being redundant is not the same as being incorrect… We also respectfully disagree that knowing the power after the fact is not useful in surgical science.”
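That mathematical redundancy is easy to demonstrate. The sketch below is my own illustration (not the authors' or Gelman's code), assuming a two-sided z-test: "observed power" plugs the observed z-statistic back in as if it were the true effect, so it is a deterministic, one-to-one transform of the p-value and adds no information beyond p itself.

```r
# "Observed power" for a two-sided z-test, computed purely from
# the p-value: recover |z| from p, then treat it as the true effect.
observed_power <- function(p, alpha = 0.05) {
  z_obs  <- qnorm(1 - p / 2)       # observed |z| implied by the p-value
  z_crit <- qnorm(1 - alpha / 2)   # critical value for the test
  pnorm(z_obs - z_crit) + pnorm(-z_obs - z_crit)
}

observed_power(0.05)  # ~0.50: a p-value right at alpha maps to ~50% power
observed_power(0.20)  # a larger p-value maps to even lower "power"
```

Nothing here involves the data beyond the p-value, which is exactly why reporting observed power alongside p is redundant.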
Andrew Gelman noticed and happened to blog about it, and also got a chance to write a response to the response to the response to the original paper.
Guess it’s over now? Or so we thought!
Zombie Power

The same group of surgeons came back with a completely new paper where they conducted a review of the surgical literature, removed all significant results, and calculated the observed power of each nonsignificant study.
They summarize their findings,
“In this analysis of the current, high-impact surgical literature, we found the median power in studies concluding insignificant differences was 0.16, with only four studies reaching the standard of 0.8.”
They also draw some interesting parallels in their discussion section to show how change can be difficult but necessary,
“The continued inability of surgical studies to reach power of 0.…

Myth: Covariates Need to Be Balanced in RCTs
https://lesslikely.com/statistics/equal-covariates/
Tue, 10 Apr 2018 00:00:00 +0000https://lesslikely.com/statistics/equal-covariates/Before I get into the discussion, I want to remind everyone why we randomize in the first place. It’s to reduce selection bias and to get rid of systematic variation among groups, which allows us to come to more precise and efficient causal inferences. Many critics claim that we can’t make valid causal inferences if there’s an imbalance in covariates between the groups.
Here’s an example. Say, hypothetically, we had two groups, and we wanted to see the effect of a statin on all-cause mortality and compare it to placebo. We randomized our participants to both groups. Now, imagine our placebo group had more smokers in it than the statin group. We may think that this imbalance in smoking distribution puts us in trouble because there were already substantial between-group differences before the trial even started.
So, we haven’t even started the study, and we got a significant result from our test of homogeneity. Now we’re thinking, “oh no, I need to randomize again or abandon ship, or I’ll attempt to fit this covariate in a generalized linear model like an ANCOVA.”
Critics of RCTs will argue that because there’s also always the possibility of there being an imbalance of known or unknown covariates between groups, RCTs cannot make proper causal inferences, especially small RCTs that are “unable to distribute confounders effectively.”
Unfortunately, there are several problems with these beliefs and approaches.
Tests of homogeneity cannot tell you whether you’ve “completely randomized,” and they are also inappropriate (see the CONSORT statement).
It’s not always possible to rerandomize to achieve better balance in trials (even though the pioneer of randomization is believed to have once said to rerandomize when an undesirable configuration was obtained).
Larger trials do not necessarily produce a better balance than smaller trials
The balance of covariates has never been the goal of randomization
Fitting covariates to a GLM after a test of significance is inappropriate
RCTs were never designed to perfectly balance covariates or even have very similar distributions of them. The purpose of randomization is to distribute hidden covariates, not perfectly, but efficiently and randomly.
Ronald Fisher’s recommendation, when he originally wrote Statistical Methods for Research Workers in 1925, was to block known covariates and randomize the hidden ones, while looking at the effects of known covariates on both between-group variance and within-group variance (where everyone is getting the same treatment). The effects on the within-group variance would give you some idea as to the impact of the covariate.
Now, will imbalances of hidden covariates affect the point estimates? Of course, but they will also yield large standard errors and wide compatibility intervals, which should make you less sure about the results of your study. But it does not mean your study is unable to make causal inferences.
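A toy simulation (my own illustration, with made-up numbers) shows just how routinely pure chance produces such imbalances in a small trial — which is precisely what the standard analysis already accounts for:

```r
# Randomize 40 participants, 20 per arm, where 30% happen to smoke,
# and record the chance imbalance in smokers between the two arms.
set.seed(123)
imbalance <- replicate(10000, {
  smoker <- rbinom(40, 1, 0.3)            # smoking status, independent of arm
  arm <- sample(rep(0:1, each = 20))      # random allocation, 20 per arm
  abs(sum(smoker[arm == 1]) - sum(smoker[arm == 0]))
})

mean(imbalance >= 3)  # imbalances of 3+ smokers occur in a sizeable share of trials
```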
It’s also worth remembering that analyses of clinical trials are robust in that they account for these possible imbalances in both known and unknown covariates. When you have an imbalance in covariates, what do you get?

Prescript Assist Is a Risky Supplement
https://lesslikely.com/nutrition/prescript-assist/
Sat, 31 Mar 2018 00:00:00 +0000https://lesslikely.com/nutrition/prescript-assist/What is Prescript Assist? Changes to the Prescript Assist Formula What’s Actually Found in Prescript Assist? The Clinical Evidence on Probiotics The Dirt on Probiotics What is Prescript Assist? Prescript Assist is a popular probiotic supplement that is marketed as following ancestral principles. The claim is that this product contains microorganisms found in soil, which are “natural” and what our ancestors consumed with their unwashed produce. Hence, the claim is that we’ve evolved with these microorganisms. Some of the other claims of this product are the following:
It (the formulation) can survive the harsh conditions of the gastrointestinal tract
It can permanently colonize the gut
It has beneficial organisms in it
It has clinical evidence behind it
The product has prebiotics that will feed the bacteria in the product
Wow! The product seems like a miracle that addresses so many difficulties encountered by lactic-acid bacteria probiotics. The formulation has also changed in the past few years. The old Prescript Assist formula had 29 species of “soil-based organisms,” the new formula has 28, with several new species replacing old ones. Here are some of the microbes that are included in it:
Changes to the Prescript Assist Formula

| Old Formula | New Formula |
|---|---|
| Arthrobacter agilis | Arthobacter globiformis |
| Arthrobacter citreus | Azospirillum brasilense |
| Arthrobacter globiformis | Azosprillum lipoferum |
| Arthrobacter luteus | Azotobacter chroococcum |
| Arthrobacter simplex | Azotobacter paspali |
| Acinetobacter calcoaceticus | Azotobacter vinelandii |
| Azotobacter chroococcum | Bacillus amyloliquefaciens |
| Azotobacter paspali | Bacillus atrophaeus |
| Azospirillum brasiliense | Bacillus licheniformis |
| Azospirillum lipoferum | Bacillus megaterium |
| Bacillus brevis | Bacillus pumilus |
| Bacillus macerans | Bacillus subtilis |
| Bacillus pumilus | Bacillus thuringiensis |
| Bacillus polymyxa | Bacillus firmus |
| Bacillus subtilis | Brevibacillus brevis |
| Bacteroides lipolyticum | Cellulomonas fimi |
| Bacteriodes succinogene | Kurthia zopfii |
| Brevibacterium stationis | Micrococcus luteus |
| Kurthia zopfii | Nocardioides simplex |
| Myrothecium verrucaria | Pseudomonas fluorescens |
| Myrothecium verrucaria | Pseudomonas putida |
| Pseudomonas calcis | Rhodobacter sphaeroides |
| Pseudomonas dentrificans | Rhodococcus rhodochrous |
| Pseudomonas fluorescens | Rhodopseudomonas palustris |
| Pseudomonas glathei | Rhodospirillum rubrum |
| Phanerochaete chrysosporium | Streptomyces griseus |
| Streptomyces fradiae | Streptomyces griseoflavus |
| Streptomyces cellulosae | Streptomyces venezuelae |
| Streptomyces griseoflavus | |

Let’s first explore whether it makes any sense to include these strains in a probiotic supplement. Some questions to consider:
Are these microorganisms found in the human gastrointestinal tract?
Are these microbes found in foods that humans regularly eat?
How do these microbes interact with each other?
How do these microbes interact with human microbes?
Will they produce antibiotics that could kill human bacteria?
What effects will they have on the immune system? Are they safe?
Have they been tested in large human studies? What about animal models?
Have they been tested over long periods of time?
What’s Actually Found in Prescript Assist?

Let’s do some investigative research. Searches of MEDLINE, the database of studies searchable through PubMed, yield almost no human or animal results for most of these microorganisms.
A lot of these organisms are used for agricultural and industrial purposes. Also, it’s interesting that several of the species found in Prescript Assist are not found in human or animal studies, but are found in this fertilizer product by Tainio Biologicals. Look at the italicized ingredients…
So Prescript Assist is literally repackaged soil fertilizer!

Stan
https://lesslikely.com/statistics/stan/
Thu, 01 Jan 1970 00:33:38 +0000https://lesslikely.com/statistics/stan/R Markdown Stan Python R Stata Julia R Stata References Please don’t mind this post. I use this to try out various highlighting styles for my code and formats.
\begin{algorithm} \caption{Quicksort} \begin{algorithmic} \PROCEDURE{Quicksort}{$A, p, r$} \IF{$p < r$} \STATE $q = $ \CALL{Partition}{$A, p, r$} \STATE \CALL{Quicksort}{$A, p, q - 1$} \STATE \CALL{Quicksort}{$A, q + 1, r$} \ENDIF \ENDPROCEDURE \PROCEDURE{Partition}{$A, p, r$} \STATE $x = A[r]$ \STATE $i = p - 1$ \FOR{$j = p$ \TO $r - 1$} \IF{$A[j] < x$} \STATE $i = i + 1$ \STATE exchange $A[i]$ with $A[j]$ \ENDIF \ENDFOR \STATE exchange $A[i + 1]$ with $A[r]$ \RETURN $i + 1$ \ENDPROCEDURE \end{algorithmic} \end{algorithm}

\[Y^{\operatorname{Post}} = \beta_{0} + \beta_{1}^{\operatorname{Group}} + \beta_{2}^{\operatorname{Base}} + \beta_{3}^{\operatorname{Age}} + \beta_{4}^{\operatorname{Z}} + \beta_{5}^{\operatorname{R1}} + \beta_{6}^{\operatorname{R2}} + \epsilon\]
R Markdown This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
Stan

df1 <- read.csv("/Users/zad/Dropbox/LessLikely/ts.csv")
df1$y <- ts(df1$Sales)
df1$ds <- as.Date(df1$Time.Increment)
splits <- initial_time_split(df1, prop = 0.5)
train <- training(splits)
test <- testing(splits)
interactive <- TRUE

# Forecasting with auto.arima
library("forecast")
md <- auto.arima(train$y)
fc <- forecast(md, h = 12)

model_fit_arima_no_boost <- arima_reg() %>%
  set_engine(engine = "auto_arima") %>%
  fit(y ~ ds, data = training(splits))

# Model 2: arima_boost ----
model_fit_arima_boosted <- arima_boost(
  min_n = 2,
  learn_rate = 0.015
) %>%
  set_engine(engine = "auto_arima_xgboost") %>%
  fit(y ~ ds + as.numeric(ds) + factor(month(ds, label = TRUE), ordered = FALSE),
      data = training(splits))

# Model 3: ets ----
model_fit_ets <- exp_smoothing() %>%
  set_engine(engine = "ets") %>%
  fit(y ~ ds, data = training(splits))

model_fit_lm <- linear_reg() %>%
  set_engine("lm") %>%
  fit(y ~ as.numeric(ds) + factor(month(ds, label = TRUE), ordered = FALSE),
      data = training(splits))

# Model 4: prophet ----
model_fit_prophet <- prophet_reg() %>%
  set_engine(engine = "prophet") %>%
  fit(y ~ ds, data = training(splits))
#> Error in sampler$call_sampler(c(args, dotlist)): c++ exception (unknown reason)

# Model 6: earth ----
model_spec_mars <- mars(mode = "regression") %>%
  set_engine("earth")

recipe_spec <- recipe(y ~ ds, data = training(splits)) %>%
  step_date(ds, features = "month", ordinal = FALSE) %>%
  step_mutate(date_num = as.numeric(ds)) %>%
  step_normalize(date_num) %>%
  step_rm(ds)

wflw_fit_mars <- workflow() %>%
  add_recipe(recipe_spec) %>%
  add_model(model_spec_mars) %>%
  fit(training(splits))

models_tbl <- modeltime_table(
  model_fit_arima_no_boost,
  model_fit_arima_boosted,
  model_fit_ets,
  model_fit_prophet,
  model_fit_lm,
  wflw_fit_mars
)
#> Error in eval_tidy(xs[[j]], mask): object 'model_fit_prophet' not found

calibration_tbl <- models_tbl %>%
  modeltime_calibrate(new_data = testing(splits))
#> Error in modeltime_calibrate(., new_data = testing(splits)): object 'models_tbl' not found

calibration_tbl %>%
  modeltime_forecast(
    new_data = testing(splits),
    actual_data = df1
  ) %>%
  plot_modeltime_forecast(
    .legend_max_width = 25, # For mobile screens
    .interactive = interactive
  )
#> Error in modeltime_forecast(., new_data = testing(splits), actual_data = df1): object 'calibration_tbl' not found

calibration_tbl %>%
  modeltime_accuracy() %>%
  table_modeltime_accuracy(
    .interactive = FALSE
  )
#> Error in c(".type", ".calibration_data") %in% names(object): object 'calibration_tbl' not found

refit_tbl <- calibration_tbl %>%
  modeltime_refit(data = df1)
#> Error in modeltime_refit(.
About
https://lesslikely.com/about/
Mon, 01 Jan 0001 00:00:00 +0000https://lesslikely.com/about/ This website is about statistical science, which may seem like an odd phrase given that it’s rarely used. Unlike ‘statistics’, which may often refer to ‘mathematical statistics’ and ‘applied statistics’, statistical science goes beyond probability theory and mathematics, and incorporates good principles of design and scientific thinking to maximize quantitative inferences drawn from data in the real world. So, why not just call it ‘applied statistics’ to differentiate it from mathematical statistics?
As the great John Nelder argued,1
First ‘applied statistics’ becomes a tautology, for statistics is nothing without its applications. The phrase should be abandoned. It has arisen to distinguish it from ‘mathematical statistics’. However, this is also a misnomer, because it should be ‘statistical mathematics’, as A. C. Aitken entitled his book many years ago.
To make this change does not in any way diminish the importance of mathematics. Mathematics remains the source of our tools, but statistical science is not just a branch of mathematics; it is not a purely deductive system, because it is concerned with quantitative inferences from data obtained from the real world.
Highly influential statisticians besides Nelder have recognized the need for statistics to break away from mathematics and probability such as Cox, Reid, Efron, & Greenland.2–5 I too share their goals and hope to promote good statistical science on this website within the context of medicine and nutrition.
You can read past articles by clicking this Archives link or using the menu on the top right. You can also see other projects by clicking Projects, or by clicking the same link on the top-right menu. You can also find a list of other individuals in statistical science whose content I often read.
If you wish to follow this website, consider subscribing for updates whenever new articles are out. And if you wish to support the website, consider supporting the site here.
If you’re interested in knowing what this website is built on, you can read about it here.
If you’d like to get in touch with me, please use the contact form but also please feel free to message me on Twitter or interact with me there!
Thanks!
References

1. Nelder JA. (1999). ‘From Statistics to Statistical Science’. Journal of the Royal Statistical Society Series D (The Statistician). 48:257–269.
2. Cox DR. (2017). ‘Statistical science: A grammar for research’. European Journal of Epidemiology. 32:465–471. doi: 10.1007/s10654-017-0288-1.
3. Cox DR, Efron B. (2017). ‘Statistical thinking for 21st century scientists’. Science Advances. 3:e1700768. doi: 10.1126/sciadv.1700768. https://advances.sciencemag.org/content/3/6/e1700768.
4. Greenland S. (2017). ‘Invited commentary: The need for cognitive science in methodology’. American Journal of Epidemiology. 186:639–645. doi: 10.1093/aje/kwx259. https://doi.org/10.1093/aje/kwx259.
5. Reid N. (1994). ‘A conversation with Sir David Cox’. Statistical Science. 9:439–455. doi: 10.1214/ss/1177010394. https://projecteuclid.org/euclid.ss/1177010394.

Archives
https://lesslikely.com/archives/
Blog Roll
https://lesslikely.com/projects/blog-roll/
Here are some blogs, websites, and Twitter accounts I often follow.
My listing of these individuals does not indicate endorsement.
Individual | Website | Twitter
Sander Greenland | https://scholar.google.com/citations?user=HcvPl18AAAAJ&hl=en | https://twitter.com/Lester_Domes
David Spiegelhalter | http://www.statslab.cam.ac.uk/~david/ | https://twitter.com/d_spiegel
Stephen Senn | http://www.senns.uk/home.html | https://twitter.com/stephensenn
Philip Stark | https://www.stat.berkeley.edu/~stark/ | https://twitter.com/philipbstark
Jonathan Bartlett | https://thestatsgeek.com/ | https://twitter.com/TheStatsGeek
John Cook | https://www.johndcook.com/blog/ | https://twitter.com/JohnDCook
Frank Harrell | http://www.fharrell.com/ | https://twitter.com/f2harrell
Andrew Gelman | https://statmodeling.stat.columbia.edu/ | https://twitter.com/StatModeling
Jessica Hullman | https://t.co/6RHv7dwQ93 | https://twitter.com/JessicaHullman
Oliver Maclaren | https://omaclaren.com/ | https://twitter.com/omaclaren
Corey Yanofsky | https://itschancy.wordpress.com/ | https://twitter.com/Corey_Yanofsky
Darren Dahly | https://statsepi.substack.com/ | https://twitter.com/statsepi
Andrew Althouse | https://github.com/aalthous | https://twitter.com/ADAlthousePhD
Tim Morris | https://iris.ucl.ac.uk/iris/browse/profile?upi=TNMOR17 | https://twitter.com/tmorris_mrc
Brennan Kahan | https://scholar.google.com/citations?user=Lf0jsXAAAAAJ&hl=en | https://twitter.com/Brennan_Kahan
Kristoffer Magnusson | http://rpsychologist.com/ | https://twitter.com/krstoffr
Richard Morey | https://medium.com/@richarddmorey | https://twitter.com/richarddmorey
Alex Hayes | https://www.alexpghayes.com/ | https://twitter.com/alexpghayes
Isabella R. Ghement | ghement.ca | https://twitter.com/IsabellaGhement
R-Bloggers | https://www.r-bloggers.com/ | https://twitter.com/Rbloggers

Comment Policy
https://lesslikely.com/comment-policy/
All comments on this blog must be manually approved via Disqus (the comment management system). I’m quite confident that I won’t have to halt or delete comments from most individuals; however, I still suggest quickly skimming over the community guidelines.
General Guidelines Please read the article before commenting.
Please try to stay on topic. Your comment can just be a joke but at least make it relevant!
Criticism (even harsh criticism) is welcome, but please avoid attacking anyone’s character.
Not Allowed

Comments containing the following will not be approved, and repeatedly posting such content will likely get you banned from commenting and from the blog.
Spam, sales pitches, affiliate links.
Links to other content without any qualifying comments.
Harassment or bullying of any individuals (on the blog or individuals outside of the blog).
Hate speech against a group of people.
Not too hard to follow. Just be respectful, please. Thank you and happy commenting!

Community Code of Conduct
https://lesslikely.com/conduct/
The following code of conduct is adapted from the tidyverse community and, along with the comment policy and privacy policy, sets the expectations for conduct on this website.
Our Pledge We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
Our Standards Examples of behavior that contributes to a positive environment for our community include:
Demonstrating empathy and kindness toward other people
Being respectful of differing opinions, viewpoints, and experiences
Giving and gracefully accepting constructive feedback
Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
Focusing on what is best not just for us as individuals, but for the overall community

Examples of unacceptable behavior include:
The use of sexualized language or imagery, and sexual attention or advances of any kind
Trolling, insulting or derogatory comments, and personal or political attacks
Public or private harassment
Publishing others’ private information, such as a physical or email address, without their explicit permission
Other conduct which could reasonably be considered inappropriate in a professional setting

Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned with this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at the following page. All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
Enforcement Guidelines Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
1. Correction Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

Contact
https://lesslikely.com/contact/
Datasets
https://lesslikely.com/projects/datasets/
Below are some sample datasets that I have uploaded to the site for teaching purposes, along with descriptive tables about the characteristics of each dataset.
Dataset Summaries + Links Name Description Link 2010 PUFA Mozafarrian Dataset used in a 2010 meta-analysis in which ‘the researchers assess the impact of increased PUFA consumption as replacement for SFA on CHD events.’ https://lesslikely.com/datasets/PUFA.csv Beta Blocker Trial Data from an early randomized trial on beta blockers that looked at mortality. https://lesslikely.com/datasets/betablocker.csv Trial to Reduce Polyps Trial to Reduce Polyps https://lesslikely.com/datasets/polyp.csv Crossover Trial of BP Medication Crossover trial of a blood pressure medication https://lesslikely.com/datasets/crossoverbp.csv Vickers, A.J. Whose data set is it anyway? Sharing raw data from randomized trials. Trials 7, 15 (2006) Data set from acupuncture headache trial (XLS 254 KB) https://lesslikely.com/datasets/vickers-2006-trials.xls Sandercock, P.A., Niewada, M., Członkowska, A. et al. The International Stroke Trial database. Trials 12, 101 (2011). The International Stroke Trial (IST) was conducted between 1991 and 1996 (including the pilot phase between 1991 and 1993). It was a large, prospective, randomised controlled trial, with 100% complete baseline data and over 99% complete follow-up data. The aim of the trial was to establish whether early administration of aspirin, heparin, both or neither influenced the clinical course of acute ischaemic stroke. https://lesslikely.com/datasets/sandercock-2011-trials.csv PUFA Meta-Analysis #> 'data.frame': 6 obs. of 5 variables: #> $ Study_ID : chr "DARTS" "LA Veterans" "Minnesota CS" "MRC Soy" ... #> $ PUFA_Events : int 132 53 131 45 61 2 #> $ PUFA_Total : int 1018 424 4541 199 206 27 #> $ Control_Events: int 144 71 121 51 81 5 #> $ Control_Total : int 1015 422 4516 194 206 28 Study_ID PUFA_Events PUFA_Total Control_Events Control_Total DARTS 132 1018 144 1015 LA Veterans 53 424 71 422 Minnesota CS 131 4541 121 4516 MRC Soy 45 199 51 194 Oslo Diet Heart 61 206 81 206 STARS 2 27 5 28 Beta Blocker RCT #> 'data.frame': 44 obs. 
of 5 variables: #> $ ID : int 1 2 3 4 5 6 7 8 9 10 ... #> $ Deaths: int 3 14 11 127 27 6 152 48 37 188 ... #> $ Total : int 39 116 93 1520 365 52 939 471 282 1921 ... #> $ Center: int 1 2 3 4 5 6 7 8 9 10 ... #> $ TRT : chr "Control" "Control" "Control" "Control" ... ID Deaths Total Center TRT 1 3 39 1 Control 2 14 116 2 Control 3 11 93 3 Control 4 127 1520 4 Control 5 27 365 5 Control 6 6 52 6 Control 7 152 939 7 Control 8 48 471 8 Control 9 37 282 9 Control 10 188 1921 10 Control Polyp Randomized Trial #> 'data.frame': 20 obs. of 4 variables: #> $ subject: int 1 2 3 4 5 6 7 8 9 10 ... #> $ number : num 63 2 28 17 61 1 7 15 44 25 .Estimating Article Details
https://lesslikely.com/projects/reading-time/
All those numbers you see in the details section of articles (reading time, word count, figures, code blocks): I made them all up. Okay, not really; see below.
Word + Code Count

A lot of those numbers are calculated using Go template functions (Hugo, which builds this site, is written in Go) that count the number of words after separating them from code blocks. One may wonder how that is possible, given that code blocks are dispersed throughout every article and are themselves composed of words.
{{ $wordCount := replaceRE `(?s)<code class="hljs">.*?</code>` "" .Content | countwords }}
{{ $readingTime := div (float $wordCount) 600 | math.Ceil }}
{{ $codeCount := .Content | strings.Count "</code>" }}

These template functions detect code blocks and exclude them from the word count; the total word count is then divided by an assumed reading speed. See below.
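The same idea can be sketched outside of Hugo. Here is a minimal Python version (the function names are my own, not the site's code) that strips highlighted code blocks before counting words and counts code blocks by their closing tags:

```python
import re

def word_count(html: str) -> int:
    """Count whitespace-separated words after removing highlighted code
    blocks, mirroring the Hugo template above (a sketch, not the site's code)."""
    text = re.sub(r'(?s)<code class="hljs">.*?</code>', "", html)
    return len(text.split())

def code_block_count(html: str) -> int:
    """Count code blocks by counting closing </code> tags."""
    return html.count("</code>")
```

For example, `word_count('<p>two words</p> <code class="hljs">x = 1</code>')` returns 2, since the code block's contents never reach the counter.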
Reading Time

If we assume that the average reader reads approximately 600 words per minute, then the estimated reading time listed above each post is accurate: it is calculated by taking the word count and dividing it by 600.
However, no one truly reads this quickly, especially content dense in both substance and statistics, which requires careful reading. Such an estimated reading time should probably be multiplied by a factor of 4; in essence, the assumed reading speed becomes 150 words per minute, which may be slow but allows better comprehension.
My rough guess seems somewhat close to some inquiries into the matter, which find that around 200–230 words per minute is the average speed for reading with comprehension. Whether the results of those studies are trustworthy is another question, and perhaps something to look into one day.
Hence, the estimated reading time serves no purpose besides fooling the reader into thinking the content can be consumed faster than it actually can in the age of the internet.
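As a sketch (in Python, with my own function name), the calculation amounts to dividing the word count by the assumed speed and rounding up:

```python
import math

def reading_time(word_count: int, wpm: int = 600) -> int:
    """Estimated minutes to read, rounding up, as in the Hugo template."""
    return math.ceil(word_count / wpm)

# A 3,000-word article at the optimistic 600 wpm vs. the realistic 150 wpm:
optimistic = reading_time(3000)       # 5 minutes
realistic = reading_time(3000, 150)   # 20 minutes
```

The factor-of-4 correction discussed above is visible directly: the same article takes four times as long at 150 words per minute.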
Images

I use a Go template function to detect images throughout each article. However, this can be tricky: images may come in the form of <img> or <figure>, or something else entirely, which can give imprecise results.
{{ $imgCount := .Content | strings.Count "</figure>" }}

Glossary
https://lesslikely.com/glossary/
It's Looking Bad for Fish Oil and CVD
https://lesslikely.com/medicine/fish-oil-cvd/
Fish oil supplements have had a long and controversial history. There was a lot of inconsistency in the research around supplementation, and several organizations only added to this confusion. Incomplete reporting by media outlets has added to it as well.
Here, I’ll give a brief timeline of the evidence for fish oil and CVD outcomes.
1970s | A group of explorers noticed that people who consumed fish also had lower incidences of heart disease.
1980s-1990s | Early observational studies also saw a connection between fish consumption and heart disease risk.
1990s | A few randomized trials were conducted to see the effects of fish oil supplementation and CVD outcomes.
2002 | The American Heart Association put out a statement suggesting that fish oil supplementation was beneficial for preventing fatal outcomes.
2002 - 2016 | Several randomized trials and systematic reviews found inconsistent evidence for the effects of fish oil supplementation and CVD outcomes.
2016 | The American Heart Association created a scientific advisory group to review all the evidence; it concluded that despite the controversy, there was a small benefit from fish oil, and that supplementation could be worthwhile for people with a history of CVD events.
2018 | A large meta-analysis published in JAMA with 77,000+ participants found no statistically significant benefit of fish oil supplementation on hard clinical endpoints.
Let’s look at the evidence from the recent JAMA meta-analysis.
Methods

Primary outcomes:
fatal CHD
nonfatal heart attacks
stroke
major vascular events
all-cause mortality
Inclusion Criteria:
Must be a randomized controlled trial
Studies must have a minimum of 500 participants
Studies must be at least one year long
Studies must use fish oil supplements
Characteristics of Included Studies
10 randomized trials included
8 double-blind, 2 open-label
A total of 77,000+ participants
Average age: 64 years
Average study length: 4.4 years
Mainly used up to 1 g/day of fish oil (ranges: 226–1800 mg/day of EPA and 0–1700 mg/day of DHA)
Results: No statistically significant association between fish oil supplementation and any of the primary outcomes.
However, the compatibility (confidence) intervals include very small effects; a decrease of up to 7% in the rate ratio is compatible with the test model.
No significant differences amongst individual trials (worth remembering that these are exploratory analyses and could be underpowered; this should not be taken as evidence of no effect)
No significant associations found with subgroup analyses (same point as above)
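For intuition about where such interval estimates come from, here is a Python sketch with made-up counts (not the meta-analysis data): a risk/rate ratio with a conventional 95% interval computed on the log scale.

```python
import math

def risk_ratio_ci(a, n1, b, n2, z=1.96):
    """Risk ratio for a/n1 events vs. b/n2 events, with a conventional
    log-scale 95% interval. Illustrative counts only, not trial data."""
    rr = (a / n1) / (b / n2)
    se_log = math.sqrt(1/a - 1/n1 + 1/b - 1/n2)
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi

# Hypothetical: 90 events among 1000 treated vs. 100 among 1000 controls
rr, lo, hi = risk_ratio_ci(90, 1000, 100, 1000)
# rr = 0.9; the interval also covers ratios near 0.93 (a 7% decrease)
```

The point is that a "non-significant" result like this still has an interval covering small protective effects, which is why the intervals, not just the test result, are worth examining.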
Strengths:
High statistical power/increased precision for the main primary outcome
Only included studies of long duration
Included studies were mostly at low risk of bias
Limitations:
Used aggregated data instead of patient data
Did not include data from the REDUCE-IT trial (which I cover here) or the STRENGTH trial, both of which used much higher dosages (3–4 g/day of EPA)
Conclusions: Up to 1 g/day of fish oil may not have much to offer when it comes to preventing CVD events in individuals with a history of CVD.

Privacy Policy
https://lesslikely.com/privacy-policy/
What Personal Data We Collect And Why We Collect It

Comments

This website uses Disqus to manage comments. Disqus’s entire privacy policy can be found here, but here is what’s relevant to you, copied from part of their privacy policy:
THE DATA WE COLLECT ABOUT YOU
Personal data, or personal information, means any information about an individual from which that person can be identified. It does not include data where the identity has been removed (anonymous data).
We may collect, use, store and transfer different kinds of personal data about you which we have grouped together as follows:
Identity Data includes first name, last name, username or similar identifier, and date of birth.
Contact Data includes email address and telephone numbers, and mailing address.
Technical Data includes internet protocol (IP) address, unique Cookie ID, Device ID, your login data, browser type and version, time zone setting and location, browser plug-in types and versions, operating system and platform and other technology on the devices you use to access the Service.
Profile Data includes your username and password, your interests, preferences, feedback and survey responses, marital status, gender, title.
Here’s how they use the data that is collected:
HOW WE USE YOUR PERSONAL DATA
We will only use your personal data when the law allows us to. Most commonly, we will use your personal data in the following circumstances: Where we need to perform the contract we are about to enter into or have entered into with you.
Where it is necessary for our legitimate interests (or those of a third party) and your interests and fundamental rights do not override those interests.
Where we need to comply with a legal or regulatory obligation.
Where you have provided your consent to such use.
We have set out below, in a table format, a description of all the ways we plan to use your personal data, and which of the legal bases we rely on to do so. We have also identified what our legitimate interests are where appropriate.
Note that we may process your personal data for more than one lawful ground depending on the specific purpose for which we are using your data. Please contact us if you need details about the specific legal ground we are relying on to process your personal data where more than one ground has been set out in the table below.
Media If you upload images to the website, you should avoid uploading images with embedded location data (EXIF GPS) included. Visitors to the website can download and extract any location data from images on the website.
Contact Forms

Cookies

If you leave a comment on our site, you may opt in to saving your name, email address, and website in cookies.

Projects
https://lesslikely.com/projects/
About - Page that describes the purpose of this blog, what I usually write about, and where to go on the blog to find something.

Statistical Tools I’ve Developed

concurve - An R and Stata package that constructs confidence distributions, surprisal functions, and likelihood functions. You can download a recent copy from CRAN, or as a zip file from here, or a tar ball from here.

\(S\)-value Calculator - A Shiny calculator that converts observed \(P\)-values into \(S\)-values and offers an intuitive way to think about the observed data.

Miscellaneous Content

Blog Roll - A list of blogs/websites that I frequently read and subscribe to.

Build - Details about my experience building this website and what it currently uses to stay afloat and serve content.

Datasets - Some sample datasets that I have uploaded to the site for teaching purposes.

Seductive Surrogates Can Be Deadly
https://lesslikely.com/medicine/seductive-surrogates/
In clinical trials, it’s not always possible to measure hard endpoints like cardiovascular disease events and cancer remission rates. Studies that use clinical outcomes often dichotomize these variables, and as a result they need a large number of participants and long durations to detect differences between groups.
Again, this type of research is expensive and not always feasible. In many scenarios, a more practical alternative is to focus on intermediate markers: biomarkers associated with a clinical outcome [hulley2007].
For example, C-reactive protein (CRP) is a molecule that is strongly associated with inflammation; CRP levels often increase as inflammation increases [bray2016w; chew2012jets; pepys2003jci]. Because inflammation is associated with coronary heart disease (CHD) [Frangogiannis2012-fo; Hansson2005-wz; Libby2006-ch; Ruparelia2017-nf], an investigator may choose to focus a clinical trial on decreasing levels of inflammation (measured by CRP), rather than on how many CHD-related deaths the intervention prevents.
If changes in an intermediate marker can robustly predict changes in a hard endpoint, and if the marker is part of the primary pathway of the clinical outcome, then the biomarker can be considered a surrogate marker for that clinical outcome [hulley2007]. (I’m not sure whether CRP is a good surrogate for CHD events; it is just used here as an example.)
Low-density lipoprotein (LDL) is considered an excellent surrogate marker for CHD because reducing LDL levels also seems to reduce the number of CHD events [Cholesterol_Treatment_Trialists_CTT_Collaboration2015-ni; Taylor2013-te]. Unfortunately, this can all go wrong if the intermediate marker is associated with a clinical outcome, but is not involved in the causal pathway of the outcome and is confounded by other phenomena.
Torcetrapib

A great example of this is the story of high-density lipoprotein (HDL) and myocardial infarction (heart attacks). Several studies had found associations between low levels of HDL, often considered “good cholesterol,” and heart attacks [Assmann1996-jd; Curb2004-ee; Gordon1977-le; Gordon1989-mc; Rahilly-Tierney2011-ox; Sharrett2001-su; Turner1998-gv].
So, it shouldn’t come as a surprise that a drug (torcetrapib) was produced by Pfizer that attempted to increase the amount of HDL with the hopes that it would reduce the number of CVD events.
A group of researchers administered the drug to thousands of patients.1 The drug was successful in changing the lipids of the participants to more favorable numbers. Patients who received the drug had a 24.9% decrease in LDL and a 72.1% increase in HDL. Seems pretty great. However, the number of deaths increased by 58% and the number of heart attacks increased by 21%.
A systematic review and meta-analysis [Briel2009-yo] published in the BMJ a few years later pooled studies of interventions that primarily increased HDL and interventions that primarily decreased LDL, and concluded the following:
“Available data suggest that simply increasing the amount of circulating high-density lipoprotein cholesterol does not reduce the risk of coronary heart disease events, coronary heart disease deaths, or total deaths. The results support reduction in low-density lipoprotein cholesterol as the primary goal for lipid modifying interventions.”
Svensson [Svensson2013-vq] provides a lovely table showing other scenarios in which a drug had a favorable effect on a surrogate marker but a negative impact on the clinical outcome.

Stat Cat Photos
https://lesslikely.com/cats/
The following albums are pictures of my friend’s cats, Rex and Rocky, along with photos of my cat, Ace. They can often be found on Andrew Gelman’s blog, as the community there are huge fans of them.
Rex & Rocky II 10/19/2019
Rex & Rocky III 10/19/2019

support
https://lesslikely.com/support/
If you are a fan of this website and what it does, you can become a patron (click the image below) and help support it every month.
If you’d like to help the website just this once, you can buy it a coffee (click the image below). I really do appreciate any and all help.

Using Stata
https://lesslikely.com/statistics/stata/
Although concurve was originally designed to be used in R, it is possible to achieve very similar results in Stata. We can use some datasets built into Stata to show how. I’ll use the Statamarkdown R package so that I can obtain Stata outputs in RMarkdown via my Stata 16 installation.
First, let’s load the auto2 dataset which contains data about cars and their characteristics.
sysuse auto2
#> . sysuse auto2
#> (1978 automobile data)

Browse the data set in your data browser to get more familiar with some of the variables. Let’s say we’re interested in the relationship between miles per gallon and price. We could fit a very simple linear model to assess that relationship.
First, let’s visualize the data with a scatter plot.
sysuse auto2
#> . sysuse auto2
#> (1978 automobile data)
#>
#> . scatter price mpg, mcolor(dkorange) scale(0.70)
#>
#> . graph export "scatter.svg", replace
#> file scatter.svg saved as SVG format

[scatter plot of price against mpg]
That’s what our data looks like. Clearly there seems to be an inverse relationship between miles per gallon and price.
Now we could fit a very simple linear model with miles per gallon being the predictor and price being the outcome and get some estimates of the relationship.
sysuse auto2
#> . sysuse auto2
#> (1978 automobile data)
#>
#> . regress price mpg
#>
#>       Source |       SS           df       MS      Number of obs   =        74
#> -------------+----------------------------------   F(1, 72)        =     20.26
#>        Model |   139449474         1   139449474   Prob > F        =    0.0000
#>     Residual |   495615923        72  6883554.48   R-squared       =    0.2196
#> -------------+----------------------------------   Adj R-squared   =    0.2087
#>        Total |   635065396        73  8699525.97   Root MSE        =    2623.7
#>
#> ------------------------------------------------------------------------------
#>        price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
#> -------------+----------------------------------------------------------------
#>          mpg |  -238.8943   53.07669    -4.50   0.000    -344.7008   -133.0879
#>        _cons |   11253.06   1170.813     9.61   0.000     8919.088    13587.03
#> ------------------------------------------------------------------------------

That’s what our output looks like.
Our output also gives us 95% consonance (confidence) intervals by default. But suppose we wished to fit a fractional polynomial model, graph it, and obtain the confidence bands; here’s what we would do.
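For intuition, the slope, intercept, and a conventional interval like those reported by regress can be sketched in a few lines of Python. This is a toy implementation, not Stata's, and it uses 1.96 as a normal approximation to the t quantile:

```python
import math

def simple_ols(x, y, t_crit=1.96):
    """Ordinary least squares of y on x, returning the slope, intercept,
    and a conventional confidence interval for the slope.
    A sketch of what `regress` reports, not Stata's implementation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    se = math.sqrt(s2 / sxx)                   # standard error of slope
    return slope, intercept, (slope - t_crit * se, slope + t_crit * se)
```

On data lying exactly on a line, such as y = 2x + 3, the function recovers slope 2 and intercept 3 with a zero-width interval; on noisy data the interval widens with the residual variance.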
sysuse auto2
#> . sysuse auto2
#> (1978 automobile data)
#>
#> . mfp: glm price mpg
#>
#> Deviance for model with all terms untransformed = 1373.079, 74 observations
#>
#> Variable    Model (vs.)   Deviance   Dev diff.   P        Powers (vs.)
#> ----------------------------------------------------------------------
#> mpg         Lin.  FP2     1373.079     19.565    0.000+   1      -2 -2
#>             FP1           1356.927      3.413    0.182    -2
#>             Final         1356.927                        -2
#>
#> Transformations of covariates:
#>
#> -> gen double Impg__1 = X^-2-.2204707671 if e(sample)
#>    (where: X = mpg/10)
#>
#> Final multivariable fractional polynomial model for price
#> --------------------------------------------------------------------
#> Variable | -----Initial-----           -----Final-----
#>          | df   Select   Alpha   Status   df   Powers
#> ---------+----------------------------------------------------------
#> mpg      | 4    1.0000   0.0500  in       2    -2
#> --------------------------------------------------------------------
#>
#> Generalized linear models        Number of obs   = 74
#> Optimization : ML                Residual df     = 72
#>                                  Scale parameter = 5533697
#> Deviance = 398426217.

Website Design
https://lesslikely.com/build/
This website is a passion of mine. I have spent a great deal of time customizing all aspects of it, from the stylesheets to the functionality of buttons and menus, both to make it as easy to read and navigate as possible and to make it pleasing to look at. However, my main priority has always been designing it to handle code and analyses efficiently and to display those results to colleagues. This has led me to a lot of experimentation and numerous changes to my workflow over the years; I still believe it is evolving, but it has matured substantially, so I shall describe it as succinctly as possible.
Current Status

This website is built on Hugo, and the theme is completely custom-built, as are many of the Go template functions I use. The site is compiled using the R package blogdown and relies heavily on Pandoc arguments. Almost all of my writing and editing is done in RStudio, and I host the website on Amazon Web Services using AWS Amplify.
I used to use AWS S3 Buckets in conjunction with CloudFront as the CDN, however, that was quite a hassle for things like website redirects, and it required a lot of work for not much in return. Amplify still uses CloudFront as the CDN and uses S3 buckets, but takes care of those on its own, so I am fairly content with the results at the moment.
My workflow basically involves writing in RMarkdown, compiling it via blogdown/hugo, checking to see what the results look like, and if everything is okay, I push it to GitHub, which triggers AWS Amplify to start building the site.
I generally use tools like Zotero for references, MathJax for rendering \(\LaTeX\) math, and Algolia for my search engine. If you want to know how I got to this point, see the history below of when I first started this website and what prompted me to move to a static site generator like Hugo rather than stay with something like WordPress.
Some History The WordPress Site Origins This website was originally built on WordPress and hosted on SiteGround. I had a lot of success with that configuration for a while after trying several other hosts (I’ve used Bluehost, DreamHost, and HostGator, which was the worst host I’ve ever used).
However, despite success with the WordPress & SiteGround configuration, it was frustrating to maintain subscriptions for so many WordPress plugins (my theme, a caching plugin, an image optimizer, an SEO plugin, and a page builder) on top of hosting.
I also had to frequently deal with updates and database errors, and the backend was often slow. The other frustration stemmed from the fact that such an environment was clearly not built for hosting content rich in \(\LaTeX\) and statistical outputs.
Migrating to Hugo and Static Sites So at the end of last year, I migrated to Hugo, a static website builder that can build an entire website in milliseconds.