Inference Overview

What do we want to do?

  • Estimation \(\Rightarrow\) Point estimate and confidence interval
  • Decision \(\Rightarrow\) Hypothesis test

Always ask

  • How many variables?
  • What types of variables?
  • What is the research question?

Different Options Inside generate()

The generate() function accepts three options for its type argument: bootstrap, draw, and permute. The help page of generate() discusses each type.

  • type = permute: shuffle the data without replacement
    • for hypothesis testing (HT) on a difference in the outcome between groups
    • example: HT for a difference in the proportion of yawners between the treatment and control groups
  • type = draw: sample from a theoretical distribution
    • only for HT on a single proportion
    • example: HT for the proportion of heads in coin flips
  • type = bootstrap: re-sample the original data with replacement
    • for confidence intervals (CI) or for HT on a single mean / median
    • example: CI and HT for the true mean rent of one-bedroom apartments in Manhattan
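
To make the three options concrete, here is a minimal base R sketch of what each kind of resampling does. It does not use generate() itself; the toy vectors outcome and group and the seed are made up for illustration.

```r
set.seed(199)  # arbitrary seed, fixed only so the sketch is reproducible

# Hypothetical toy data: six outcomes, three per group
outcome <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
group   <- c("a", "a", "a", "b", "b", "b")

# permute: shuffle the group labels without replacement,
# which breaks any link between group and outcome
permuted_group <- sample(group)

# bootstrap: re-sample the observed outcomes with replacement,
# keeping the original sample size
boot_outcome <- sample(outcome, size = length(outcome), replace = TRUE)

# draw: sample from a theoretical distribution,
# here 10 flips of a fair coin
flips <- sample(c("heads", "tails"), size = 10, replace = TRUE,
                prob = c(0.5, 0.5))
```

A permuted sample always contains exactly the original labels in a new order, while a bootstrap sample may repeat some observations and omit others.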

Push Ups and Pull Ups

First load the relevant packages:
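
A minimal setup, assuming the tidyverse supplies read_csv(), %>%, and mutate() and infer supplies specify(), hypothesize(), generate(), and calculate() (loading tidymodels instead of infer should also work, since it attaches infer):

```r
library(tidyverse)  # read_csv(), %>%, mutate(), group_by(), summarize()
library(infer)      # specify(), hypothesize(), generate(), calculate()
```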


Today’s dataset, push_pull, comes from a “mini study” by the Mountain Tactical Institute.

push_pull <- read_csv("data/push_pull.csv")
push_pull %>%
  slice(1:3, 24:26)
## # A tibble: 6 × 7
##   participant_id   age push1 push2 pull1 pull2 training
##            <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>   
## 1              1    41    41    45    16    17 density 
## 2              2    32    35    44     9    11 density 
## 3              3    44    33    38    10    11 density 
## 4             24    36    31    60     9    15 gtg     
## 5             25    50    35    42     9    12 gtg     
## 6             26    34    23    39     9    13 gtg

26 individuals completed one of two exercise regimens for 3.5 weeks to increase their push-ups and pull-ups. See the codebook below:

  • participant_id: unique identifier for each participant
  • age: age of participant
  • push1 / push2: push-ups at beginning and end of program, respectively
  • pull1 / pull2: pull-ups at beginning and end of program, respectively
  • training: which training protocol the individual participated in - either “density” or “gtg” (grease-the-groove)

We create new variables for relative change in push-ups and pull-ups before and after the training. Recall, relative change = (new - old)/old.

push_pull <- push_pull %>%
  mutate(rel_change_push = (push2 - push1)/push1, 
         rel_change_pull = (pull2 - pull1)/pull1)
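
As a quick check of the formula, the sketch below applies the same mutate() to the six rows printed above. The tibble shown_rows is a hand-typed copy of that output (without age), not the full dataset.

```r
library(dplyr)

# Hand-typed copy of the six rows displayed earlier
shown_rows <- tibble(
  participant_id = c(1, 2, 3, 24, 25, 26),
  push1    = c(41, 35, 33, 31, 35, 23),
  push2    = c(45, 44, 38, 60, 42, 39),
  pull1    = c(16, 9, 10, 9, 9, 9),
  pull2    = c(17, 11, 11, 15, 12, 13),
  training = c("density", "density", "density", "gtg", "gtg", "gtg")
)

# Same relative-change formula: (new - old) / old
shown_rows <- shown_rows %>%
  mutate(rel_change_push = (push2 - push1) / push1,
         rel_change_pull = (pull2 - pull1) / pull1)

# Participant 1 went from 16 to 17 pull-ups: (17 - 16) / 16 = 0.0625
shown_rows %>%
  select(participant_id, training, rel_change_pull)
```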

Part 1: Is the average relative change in pull-ups of a gtg trainee significantly greater than that of a density trainee?

In other words, we wonder if the group variable training affects the average relative increase in pull-ups. Let \(\mu_{den}\) and \(\mu_{gtg}\) be the true average relative change in pull-ups among density trainees and gtg trainees, respectively.

Let’s perform a hypothesis test.

Q - Step 1: State the null hypothesis and the alternative hypothesis both in words and math.

Q - Step 2: Find the relevant statistic from the data.

mu_hat_diff <- push_pull %>% 
  group_by(____) %>% 
  summarize(mu_hat = ____) %>% 
  pull(mu_hat) %>% 
  diff()

Step 3: Simulate from the null distribution and compute the p-value.

Q - Which of the type options in the generate() function is the most appropriate in this case? Why?

Q - Complete the code chunk below to simulate 10,000 sample statistics under the null hypothesis. Hint: check the help page of calculate() for its stat argument.


null_dist <- push_pull %>% 
  specify() %>%
  hypothesize() %>% 
  generate() %>%
  calculate()
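
For reference, here is a sketch of how the four verbs fit together for a difference in means between two groups. It runs on a made-up tibble toy with hypothetical columns group and outcome, not on push_pull, and uses 1,000 reps to keep it fast.

```r
library(dplyr)
library(infer)

set.seed(199)  # arbitrary seed for reproducibility

# Hypothetical toy data: a numeric outcome in two groups "a" and "b"
toy <- tibble(
  group   = factor(rep(c("a", "b"), each = 8)),
  outcome = c(0.10, 0.05, 0.20, 0.15, 0.00, 0.25, 0.10, 0.05,
              0.30, 0.20, 0.45, 0.25, 0.35, 0.15, 0.40, 0.30)
)

# Observed statistic: mean of group b minus mean of group a
obs_diff <- toy %>%
  specify(outcome ~ group) %>%
  calculate(stat = "diff in means", order = c("b", "a"))

# Null distribution: shuffle outcomes across groups, then
# recompute the same statistic for each shuffled dataset
null_toy <- toy %>%
  specify(outcome ~ group) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("b", "a"))
```

The order argument fixes the direction of the subtraction, so it should match the direction of the alternative hypothesis.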

Q - Visualize the p-value region under the null hypothesis with informative labels and compute the p-value.

visualize(null_dist) +
  labs(x = "_____", 
       y = "Count") +
  shade_pvalue(obs_stat = _____, direction = _____) 
pvalue1 <- null_dist %>%
  get_pvalue(obs_stat = ______, direction = _____) 
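
A one-sided simulation-based p-value is just a tail proportion of the null distribution. The base R sketch below shows the arithmetic for a "greater" alternative; the ten null statistics and the observed statistic are made-up numbers. As far as we understand, this is the same quantity get_pvalue() reports for direction = "greater".

```r
# Hypothetical simulated statistics under the null (made-up values)
null_stats <- c(-0.20, -0.05, 0.00, 0.10, 0.15,
                 0.30, -0.10, 0.05, 0.25, -0.15)

# Hypothetical observed statistic
obs_stat <- 0.15

# Proportion of null statistics at least as large as the observed one:
# 0.15, 0.30, and 0.25 qualify, so 3 / 10 = 0.3
p_value <- mean(null_stats >= obs_stat)

# Reject the null hypothesis only if the p-value is below alpha
alpha <- 0.01
reject_null <- p_value < alpha
```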

Q - Step 4: State your conclusion with \(\alpha = 0.01\).

Part 2: Most people who train consistently will see at least a 15% increase in push-ups

Q - State the null hypothesis and the alternative hypothesis.

Q - Create a binary outcome over15 that is TRUE if rel_change_push is larger than 0.15 and FALSE otherwise.

Q - Find the sample proportion \(\hat{p}\).


Q - Which of the type options in the generate() function is the most appropriate in this case? Why?

Q - Perform a hypothesis test, compute the p-value, and state your conclusion with \(\alpha = 0.05\).


null2_dist <- push_pull %>% 
  specify() %>%
  hypothesize() %>% 
  generate() %>%
  calculate()

visualize(null2_dist) +
  labs(x = "Sample proportion of people with a 15% increase in push-ups", 
       y = "Count") +
  shade_pvalue(obs_stat = _______, direction = _______) 
pvalue2 <- null2_dist %>%
  get_pvalue(obs_stat = _______, direction = _______) 
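
For comparison, here is a sketch of a simulation-based test for a single proportion with a point null. The tibble toy_prop, its column improved, the null value p = 0.5, and the 26-out-of-40 split are all made up for illustration.

```r
library(dplyr)
library(infer)

set.seed(199)  # arbitrary seed for reproducibility

# Hypothetical toy data: 26 "yes" and 14 "no" out of 40
toy_prop <- tibble(
  improved = factor(c(rep("yes", 26), rep("no", 14)))
)

# Observed sample proportion of "yes": 26 / 40 = 0.65
p_hat_toy <- toy_prop %>%
  specify(response = improved, success = "yes") %>%
  calculate(stat = "prop")

# Null distribution: draw samples of the same size from a
# population in which the true proportion of "yes" is 0.5
null_prop <- toy_prop %>%
  specify(response = improved, success = "yes") %>%
  hypothesize(null = "point", p = 0.5) %>%
  generate(reps = 1000, type = "draw") %>%
  calculate(stat = "prop")

# One-sided p-value for the alternative "proportion greater than 0.5"
p_value_toy <- null_prop %>%
  get_p_value(obs_stat = p_hat_toy, direction = "greater")
```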

Now we will construct a confidence interval to evaluate the hypotheses above.

Q - Which of the type options in the generate() function is the most appropriate in this case? Why?

Q - Simulate from a bootstrap distribution with reps = 10000 and visualize the distribution. What is it centered at?


Q - We want to construct a confidence interval at a confidence level equivalent to the significance level of \(\alpha = 0.05\). What do you think the confidence level should be? Hint: The alternative hypothesis is one-sided.

Q - Construct a confidence interval with the confidence level equivalent to \(\alpha = 0.05\). Interpret the confidence interval. Is the conclusion drawn from the confidence interval consistent with the conclusion from the hypothesis test?
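
The mechanics of a percentile bootstrap interval for a proportion can be sketched as follows. The tibble toy_prop and its column improved are made up, and level = 0.95 is only a placeholder; choosing the level that matches the test above is part of the exercise.

```r
library(dplyr)
library(infer)

set.seed(199)  # arbitrary seed for reproducibility

# Hypothetical toy data: 26 "yes" and 14 "no" out of 40
toy_prop <- tibble(
  improved = factor(c(rep("yes", 26), rep("no", 14)))
)

# Bootstrap distribution: re-sample the data with replacement and
# recompute the sample proportion each time (no hypothesize() step)
boot_dist <- toy_prop %>%
  specify(response = improved, success = "yes") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop")

# Percentile interval: the middle 95% of the bootstrap statistics
ci_toy <- boot_dist %>%
  get_confidence_interval(level = 0.95, type = "percentile")
```

The result is a one-row tibble with the columns lower_ci and upper_ci.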

