SPRINT invalid instrument simulation

library(readr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(timereg)

## Loading required package: survival

set.seed(32343)

baseline <- read_csv(file = "~/phd/sfg/nejm-sprint/data/baseline.csv") %>%
  mutate(SMOKER=SMOKE_3CAT >= 2 & SMOKE_3CAT <= 4)

## Parsed with column specification:
## cols(
##   .default = col_integer(),
##   MASKID = col_character(),
##   RISK10YRS = col_double(),
##   EGFR = col_double(),
##   SCREAT = col_double(),
##   RACE4 = col_character(),
##   UMALCR = col_double(),
##   BMI = col_double()
## )

## See spec(...) for full column specifications.

outcomes <- read_csv(file = "~/phd/sfg/nejm-sprint/data/outcomes.csv")

## Parsed with column specification:
## cols(
##   .default = col_integer(),
##   MASKID = col_character()
## )
## See spec(...) for full column specifications.

bp <- read_csv(file = "~/phd/sfg/nejm-sprint/data/bp.csv") %>%
  mutate(VISITTIME = ifelse(VISITCODE=="RZ", 0, strtoi(substr(VISITCODE, 1, nchar(VISITCODE)-1)))) %>%
  arrange(MASKID, desc(VISITTIME)) %>%
  group_by(MASKID) %>%
  filter(row_number()==1) %>%
  ungroup()

## Parsed with column specification:
## cols(
##   MASKID = col_character(),
##   VISITCODE = col_character(),
##   SBP = col_integer(),
##   DBP = col_integer(),
##   N_BPCLASSES = col_integer()
## )

data <- inner_join(x=baseline, y=bp, by="MASKID")
data <- inner_join(x=data, y=outcomes, by="MASKID")

bp_at_start <- read_csv(file = "~/phd/sfg/nejm-sprint/data/bp.csv") %>%
  filter(VISITCODE=="RZ")

## Parsed with column specification:
## cols(
##   MASKID = col_character(),
##   VISITCODE = col_character(),
##   SBP = col_integer(),
##   DBP = col_integer(),
##   N_BPCLASSES = col_integer()
## )

data_at_start <- inner_join(x=baseline, y=bp_at_start, by="MASKID")
data_at_start <- inner_join(x=data_at_start, y=outcomes, by="MASKID")

First, let’s see what an Aalen additive hazard model’s coefficients would look like on the true outcomes.

model_if_just_sbp <- aareg(
  Surv(T_PRIMARY, EVENT_PRIMARY) ~ SBP.y + AGE +
    SMOKER + SCREAT + STATIN + HDL,
  data=data)
plot(model_if_just_sbp)

coefs <- summary(model_if_just_sbp, maxtime=1100)$table[,"slope"]
coefs <- split(coefs, names(coefs))

Their methods section says they adjusted for

age, sex, race, body mass index (BMI), smoking, high density lipoprotein (HDL), serum creatinine, use of statin drugs, history of cardiovascular disease, and history of chronic kidney disease

Their full methods for CVD risk:

We estimated a recently validated two-stage additive hazards model to examine the incremental effect of antihypertensive drugs on major cardiovascular events and serious adverse events. In the first stage, we estimated the predicted number of antihypertensive drug classes as a linear function of randomization status and covariates. In the second stage, we estimated cardiovascular or adverse risk as a function of the predicted number of antihypertensive drug classes (from the first stage) and covariates. We estimated Aalen additive hazards models in the second stage to account for the rightcensored nature of survival outcomes. To account for the combined statistical uncertainty of the two stages, we implemented a bootstrap estimator and based statistical inference on 95% confidence intervals derived from 2000 nonparametric bootstraps.

We estimated fully non-parametric additive hazards models and based parameter estimates on the weighted linear regression of the cumulative estimates plot. We did so to provide a useful measure of the overall size of the effect.

Let’s give it a try, first using the true event data:

first_stage <- lm(
  N_BPCLASSES ~ INTENSIVE + AGE +
    FEMALE + BMI + SMOKE_3CAT + SCREAT +
    STATIN + SUB_CKD + SUB_CVD + HDL,
  data = data)

second_stage_data <- data %>%
  mutate(predicted_classes = predict(first_stage, data))

second_stage <- aareg(
  Surv(T_PRIMARY / 365.25, EVENT_PRIMARY) ~ predicted_classes +
  AGE + FEMALE + BMI + SMOKER + SCREAT + STATIN + SUB_CKD + SUB_CVD + HDL,
  data = second_stage_data)

summary(second_stage, maxtime = 4e8)$table["predicted_classes", "slope"] * 1000

## [1] -6.666611

This effect size is similar to what is reported in the paper. It’s not clear from the documentation why one should use the slope instead of the coef, but it is consistent with what they do in their analysis.

Next, let’s consider what happens if we simulate event times, according to the additive hazards model that we built using the real data. We will adjust the coefficients a little bit, to deal with overfitting, so that the simulation matches the data more closely in terms of its SBP dependence.

Print the coefficients to include in our response:

coefs$Intercept- 0.1 * coefs$SBP.y * 130

##     Intercept 
## -0.0002172743

1.1 * coefs$SBP.y

##        SBP.y 
## 3.334069e-07

coefs$AGE

##         AGE 
## 3.02478e-06

coefs$SMOKERTRUE

##   SMOKERTRUE 
## 2.107987e-05

coefs$SCREAT

##       SCREAT 
## 4.989049e-05

coefs$STATIN

##       STATIN 
## 1.181723e-05

coefs$HDL

##           HDL 
## -6.505417e-07

Now, simulate the outcome, and confirm that the coefficients of the model refit to this data are similar to what was fit to the original SPRINT data, to make sure our correction for finite sample bias was appropriate.

simulate_outcome_time <- function(X) {
  n <- nrow(X)
  # pmax prevents negative hazards, which in the simulation
  # could happen due to chance. Obviously a hazard cannot be
  # negative, in reality.
  h <- pmax(coefs$Intercept - 0.1 * coefs$SBP.y * 130 +
              1.1 * X$SBP.y * coefs$SBP.y + X$AGE * coefs$AGE +
              X$SMOKER * coefs$SMOKERTRUE + X$SCREAT * coefs$SCREAT +
              X$STATIN * coefs$STATIN + X$HDL * coefs$HDL,
            1/(60*365.25))
  v <- runif(n)
  H <- - log(v)
  t <- H / h
  return(t)
}

second_stage_data <- second_stage_data %>%
  mutate(survival.time=simulate_outcome_time(.))

model_if_just_sbp_check <- aareg(
  Surv(survival.time) ~ SBP.y + AGE +
    SMOKER + SCREAT + STATIN + HDL,
  data=second_stage_data)

summary(model_if_just_sbp, maxtime = 8e4)$table[,"coef"]

##     Intercept         SBP.y           AGE    SMOKERTRUE        SCREAT 
## -4.150633e-04  5.881951e-07  5.691974e-06  4.407906e-05  1.097293e-04 
##        STATIN           HDL 
##  1.570840e-05 -1.346466e-06

summary(model_if_just_sbp_check, maxtime = 8e4)$table[,"coef"]

##     Intercept         SBP.y           AGE    SMOKERTRUE        SCREAT 
## -3.628722e-04  7.245109e-07  5.999746e-06  4.288925e-05  9.989580e-05 
##        STATIN           HDL 
##  2.464332e-05 -1.310953e-06

plot(model_if_just_sbp_check)

Due to potential model mis-specification, the simulated data set has higher impact of SBP on the outcome, which is why the coefficient has been inflated slightly.

What happens to the 2 stage least squares analysis when we use our simulated event times, which we know have nothing to do with the number of BP classes used, and only has to do with the SBP after 1 year.

second_stage <- survival::aareg(
  Surv(survival.time/365.25) ~ predicted_classes +
    AGE + FEMALE + BMI + SMOKER + SCREAT + STATIN +
    SUB_CKD + SUB_CVD + HDL,
  data = second_stage_data)

summary(second_stage, maxtime=4e8)$table["predicted_classes","slope"] * 1000

## [1] -5.222114

This is of a similar order of magnitude, and -6.67 versus -5.22 could be a matter of how well the relationship between SBP and outcome in the simulation matches the real outcomes in the SPRINT trial. Of course, this does not show that having multiple classes of drugs has no effect– to be sure, if lower SBP is achieved using multiple drugs, then this result is consistent with that being the case. However, it suggests that the IV is not meaningful in disentangling whether multiple classes of drugs is better than a single drug at a higher dosage.

Just to be sure, let’s check CIs with a nonparametric bootstrap.

library(boot)

## 
## Attaching package: 'boot'

## The following object is masked from 'package:survival':
## 
##     aml

boot_fn <- function(formula, data, indices) {
  data <- data[indices, ]
  stage1 <- lm(N_BPCLASSES ~ INTENSIVE + AGE +
                 FEMALE + BMI + SMOKE_3CAT + SCREAT +
                 STATIN + SUB_CKD + SUB_CVD + HDL,
               data = data)
  stage2_data <- data
  stage2_data$predicted_classes <- predict(first_stage, data)
  stage2 <- survival::aareg(formula, data=stage2_data)
  return(summary(stage2, maxtime=4e8)$table["predicted_classes","slope"] * 1000)
}

boot_obj <- boot(
  second_stage_data,
  boot_fn,
  # matches the number of available cores I had to
  # run this in a reasonable amount of time.
  21,
  formula = formula(
    Surv(survival.time/365.25) ~ predicted_classes +
      AGE + FEMALE + BMI + SMOKE_3CAT + SCREAT + STATIN +
      SUB_CKD + SUB_CVD + HDL),
  parallel = "multicore",
  ncpus=3)

boot.ci(boot_obj, type="norm")

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 21 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_obj, type = "norm")
## 
## Intervals : 
## Level      Normal        
## 95%   (-9.386, -1.528 )  
## Calculations and Intervals on Original Scale

This is significant, suggesting that one can get significant P-values to reject the null, even under the null, due to the fact that the IV is invalid.

Our concern with the plausiblity is that there may be other effects of randomization besides increased number of hypertension medications on cardiovascular outcomes through other approaches to reducing systolic blood pressure. If this were the case, one would expect to see differences between the mean blood pressure conditional on the number of classes of hypertension taken, however this does not seem to be the case,

data %>% group_by(N_BPCLASSES, INTENSIVE) %>%
  summarize(average_sbp=mean(SBP.y, na.rm = T), se=sd(SBP.y, na.rm = T)/sqrt(n())) %>%
  mutate(average_sbp = ifelse(INTENSIVE==1, average_sbp, -average_sbp)) %>%
  group_by(N_BPCLASSES) %>% summarize(diff_sbp = sum(average_sbp),
                                      se = sqrt(sum(se ^ 2))) %>%
  # Bonferroni corrected CIs
  mutate(ci.lower = diff_sbp - qnorm(1 - 0.025/8)*se, ci.upper = diff_sbp + qnorm(1 - 0.025/8)*se)

Conditional on the change in number of classes of blood pressure medication, the change in final systolic blood pressure is

inner_join(x=data, y=data_at_start, by="MASKID") %>%
  mutate(delta_n_bp = N_BPCLASSES.x - N_BPCLASSES.y, delta_bp = SBP.y.x - SBP.y.y) %>% 
  group_by(delta_n_bp, INTENSIVE.x) %>%
  summarize(average_sbp=mean(delta_bp, na.rm = T), se=sd(delta_bp, na.rm = T)/sqrt(n())) %>%
  mutate(average_sbp = ifelse(INTENSIVE.x==1, average_sbp, -average_sbp)) %>%
  group_by(delta_n_bp) %>% summarize(diff_sbp = sum(average_sbp),
                                      se = sqrt(sum(se ^ 2))) %>%
  # Bonferroni corrected CIs
  mutate(ci.lower = diff_sbp - qnorm(1 - 0.025/8)*se, ci.upper = diff_sbp + qnorm(1 - 0.025/8)*se)

In all groups, these confidence intervals include -14.4 mmHg. Let’s assume that participants in the intensive arm have a constant change of 14.4 mmHg in systolic blood pressure lower than baseline and assume that the change in number of blood pressure medication classes has no effect,

simulate_outcome_time_unexplained <- function(X) {
  n <- nrow(X)
  # pmax prevents negative hazards, which in the simulation
  # could happen due to chance. Obviously a hazard cannot be
  # negative, in reality.
  h <- pmax(coefs$Intercept - 0.1 * coefs$SBP.y * 130 +
              1.1 * (X$SBP.x - 14.4 * X$INTENSIVE) * coefs$SBP.y + X$AGE * coefs$AGE +
              X$SMOKER * coefs$SMOKERTRUE + X$SCREAT * coefs$SCREAT +
              X$STATIN * coefs$STATIN + X$HDL * coefs$HDL,
            1/(60*365.25))
  v <- runif(n)
  H <- - log(v)
  t <- H / h
  return(t)
}

second_stage_data_unexplained <- second_stage_data %>%
  mutate(survival.time=simulate_outcome_time_unexplained(.))

second_stage_unexplained <- survival::aareg(
  Surv(survival.time/365.25) ~ predicted_classes +
    AGE + FEMALE + BMI + SMOKER + SCREAT + STATIN +
    SUB_CKD + SUB_CVD + HDL,
  data = second_stage_data_unexplained)

summary(second_stage_unexplained, maxtime=4e8)$table["predicted_classes","slope"] * 1000

## [1] -2.825616

boot_obj_unexplained <- boot(
  second_stage_data_unexplained,
  boot_fn,
  # matches the number of available cores I had to
  # run this in a reasonable amount of time.
  21,
  formula = formula(
    Surv(survival.time/365.25) ~ predicted_classes +
      AGE + FEMALE + BMI + SMOKE_3CAT + SCREAT + STATIN +
      SUB_CKD + SUB_CVD + HDL),
  parallel = "multicore",
  ncpus=3)

boot.ci(boot_obj_unexplained, type="norm")

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 21 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_obj_unexplained, type = "norm")
## 
## Intervals : 
## Level      Normal        
## 95%   (-5.612, -0.597 )  
## Calculations and Intervals on Original Scale

The IV analysis is still confident that there was an effect of the number of blood pressure medication classes, even in a simulation that has removed potential such effects. It’s interesting to note that this constant 14 mmHg difference, independent of number of blood pressure treatment classes used, is consistent with most of the data and summary statistics presented here, except for rows 3 and 4 of the first table of differences in systolic BP for participants taking 3 and 4 classes of blood pressure treatment, respectively. For these two groups, it’s actually a little bit lower than the observed difference.