class: center, middle, inverse, title-slide

# Introduction to Linear Models

### Bora Jin

---

layout: true

<div class="my-footer">
<span>
<a href="https://introds.org" target="_blank">introds.org</a>
</span>
</div>

---

## Material

🎥 Watch [The Language of Models](https://youtu.be/MWkkvDopBKc)

- [Slides](https://rstudio-education.github.io/datascience-box/course-materials/slides/u4-d01-language-of-models/u4-d01-language-of-models.html#1)

🎥 Watch [Fitting and Interpreting Models](https://youtu.be/69U92Q3pwnA)

- [Slides](https://rstudio-education.github.io/datascience-box/course-materials/slides/u4-d02-fitting-interpreting-models/u4-d02-fitting-interpreting-models.html#1)

---

## Today's Goal

- Understand the language and notation of linear modeling
- Use the `tidymodels` and `stats` packages to make inferences under a linear regression model

---

## Quiz

**Q - What do we need models for?**

--

- Explain the relationship between variables
- Make predictions (e.g., Amazon / Netflix recommendations)
- We will focus on **linear** models (straight lines!), but there are many other types.

---

## Quiz

**Q - What is a predicted value?**

--

- The output of a model function
- The typical or expected value of the response variable, conditional on the value of the explanatory variable
- `\(\hat{y} = f(x)\)`

---

## Quiz

**Q - What is a predicted value?**

<img src="20-introtomodel_BJ_files/figure-html/unnamed-chunk-2-1.png" width="70%" style="display: block; margin: auto;" />

---

## Quiz

**Q - What is a predicted value?**

<img src="20-introtomodel_BJ_files/figure-html/unnamed-chunk-3-1.png" width="70%" style="display: block; margin: auto;" />

---

## Quiz

**Q - What is a residual?**

--

- A measure of how far each observation is from its predicted value
- residual = observed value - predicted value
- `\(e = y - \hat{y}\)`

---

## Quiz

**Q - What is a residual?**

<img src="20-introtomodel_BJ_files/figure-html/unnamed-chunk-4-1.png" width="70%" style="display: block; margin: auto;" />

---

## Quiz

**Q - What is a residual?**

- Where are the paintings with positive / negative residuals relative to the fitted line?

<img src="20-introtomodel_BJ_files/figure-html/unnamed-chunk-5-1.png" width="60%" style="display: block; margin: auto;" />

---

## Quiz

**Q - What is a residual?**

- Where are the paintings with positive / negative residuals relative to the fitted line?

<img src="20-introtomodel_BJ_files/figure-html/unnamed-chunk-6-1.png" width="50%" style="display: block; margin: auto;" />

--

- What does a negative residual mean?

--

The predicted value is greater than the observed value.

---

## Quiz

**Q - What are some upsides and downsides of models?**

--

- Upsides:
  - Can sometimes reveal patterns that are not evident in visualizations.

--

- Downsides:
  - Might impose structure that is not really there.
  - Be skeptical about modeling assumptions!

---

## Quiz

**Q - Models always entail uncertainty. Which parts of the following visualization and table are relevant to uncertainty?**

.pull-left[
<img src="20-introtomodel_BJ_files/figure-html/unnamed-chunk-7-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
```
## # A tibble: 2 × 3
##   term        estimate std.error
##   <chr>          <dbl>     <dbl>
## 1 (Intercept)    3.62    0.254  
## 2 Width_in       0.781   0.00950
```
]

--

- Uncertainty is as important as the fitted line, if not more.
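---

## Predicted Values and Residuals in Code

A minimal sketch of how predicted values and residuals could be computed with `tidymodels`, assuming the `pp` paintings data (with `Height_in` and `Width_in`) used in the models later in these slides is loaded; `hw_fit` is just an illustrative name:

```r
library(tidymodels)

# fit a linear model of height on width
hw_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ Width_in, data = pp)

# predicted value: y-hat = f(x)
predict(hw_fit, new_data = pp) %>%
  bind_cols(pp) %>%
  # residual = observed value - predicted value
  mutate(residual = Height_in - .pred) %>%
  select(Width_in, Height_in, .pred, residual)
```

A negative `residual` here means the fitted line over-predicts that painting's height, matching the quiz answer above.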
---

## Linear Model with a Single Predictor

We are interested in `\(\beta_0\)` and `\(\beta_1\)` in the following model:

$$ y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

- `\(y_i\)`: response variable value for the `\(i\)`th observation
- `\(\beta_0\)`: population parameter for the intercept
- `\(\beta_1\)`: population parameter for the slope
- `\(x_i\)`: independent variable (or predictor, covariate) value for the `\(i\)`th observation
  - can be numeric or categorical
- `\(\epsilon_i\)`: random error for the `\(i\)`th observation

---

## Linear Model with a Single Predictor

As usual, we have to estimate the true parameters with sample statistics:

$$ \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i $$

- `\(\hat{y}_i\)`: predicted value for the `\(i\)`th observation
- `\(\hat{\beta}_0\)`: estimate of `\(\beta_0\)`
- `\(\hat{\beta}_1\)`: estimate of `\(\beta_1\)`

---

## Least Squares Regression

In least squares regression, the estimates are chosen to minimize the sum of squared residuals. In other words, if I have `\(n\)` observations and the `\(i\)`th residual is `\(e_i = y_i - \hat{y}_i\)`, then the fitted regression line minimizes `\(\sum_{i=1}^n e_i^2\)`.

**Q - Why do we minimize the "squares" of the residuals?**

--

- Some residuals are positive and others are negative. If we naively summed them, they would cancel each other out.
- Residuals in either direction are equally undesirable, and we especially want to penalize residuals that are large in absolute magnitude.

--

[Click](https://seeing-theory.brown.edu/regression-analysis/index.html#section1) to play with least squares regression!

---

## Quiz

**Q - What are some properties of the least squares regression line?**

--

- The fitted line always goes through `\((\bar{x}, \bar{y})\)`:

$$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} $$

- The fitted line has a positive slope ( `\(\hat{\beta}_1 > 0\)` ) if `\(x\)` and `\(y\)` are positively correlated, a negative slope if they are negatively correlated, and a slope of 0 if they are uncorrelated.
- The sum of the residuals is always zero: `\(\sum_{i=1}^n e_i = 0\)`.
- The residuals and the `\(x\)` values are uncorrelated.

---

## Quiz

**Q - Based on the code and output below, write a model formula with parameter estimates.**

```r
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ Width_in, data = pp) %>%
  tidy()
```

```
## # A tibble: 2 × 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)    3.62    0.254        14.3 8.82e-45
## 2 Width_in       0.781   0.00950      82.1 0       
```

--

`$$\widehat{height}_i = 3.62 + 0.781 \times width_i$$`

---

## Quiz

**Q - Interpret the slope and intercept estimates in the context of the data.**

`$$\widehat{height}_i = 3.62 + 0.781 \times width_i$$`

--

- Slope: For each additional inch of width, a painting is **expected** to be taller, **on average**, by 0.781 inches.
  - Always remember, the slope is about correlation, not causation. We are not saying that increasing a painting's width by one inch will cause it to become 0.781 inches taller.
- Intercept: Paintings that are 0 inches wide are **expected** to be 3.62 inches tall, **on average**.
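---

## Checking the Least Squares Properties in Code

A quick sketch that verifies the least squares properties listed earlier, again assuming the `pp` data. It uses base R's `lm()` with `broom::augment()`, which returns the rows actually used in the fit along with `.fitted` and `.resid`:

```r
library(tidymodels) # loads dplyr and broom

hw_lm <- lm(Height_in ~ Width_in, data = pp)

augment(hw_lm) %>%
  summarise(
    resid_sum    = sum(.resid),            # sum of residuals: ~ 0
    resid_x_corr = cor(.resid, Width_in),  # residuals uncorrelated with x: ~ 0
    ybar         = mean(Height_in),
    # the fitted line evaluated at x-bar should equal y-bar
    line_at_xbar = coef(hw_lm)[1] + coef(hw_lm)[2] * mean(Width_in)
  )
```

Up to floating point error, `resid_sum` and `resid_x_corr` are 0, and `line_at_xbar` equals `ybar`.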
---

## Quiz

**Q - Explain the code chunk below. Based on its output, write a model formula with parameter estimates.**

.small[
`landsALL` is a categorical variable with the following two levels:

- `0`: no landscape features
- `1`: some landscape features
]

```r
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ factor(landsALL), data = pp) %>%
  tidy()
```

```
## # A tibble: 2 × 5
##   term              estimate std.error statistic  p.value
##   <chr>                <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)          22.7      0.328      69.1 0       
## 2 factor(landsALL)1    -5.65     0.532     -10.6 7.97e-26
```

--

`$$\widehat{height}_i = 22.7 - 5.65 \times landsALL_i$$`

---

## Quiz

**Q - Interpret the slope and intercept estimates in the context of the data.**

`$$\widehat{height}_i = 22.7 - 5.65 \times landsALL_i$$`

--

- Slope: Paintings with landscape features are **expected**, **on average**, to be 5.65 inches shorter than paintings without landscape features.
  - Compares the baseline level (`landsALL = 0`) to the other level (`landsALL = 1`)
  - **Q -** How do you know which one is the baseline level?
- Intercept: Paintings that don't have landscape features are **expected**, **on average**, to be 22.7 inches tall.

---

## Quiz

**Q - What happens in model fitting if a categorical variable has more than two levels?**

--

- The levels are automatically encoded as **dummy variables** (e.g., 6 dummies for 7 levels).
- Each coefficient describes the expected difference between that level and **the baseline level**.

---

## Quiz

**Q - Explain the code chunk below.**

.small[
`school_pntg` is a categorical variable for the school of a painting, with 7 levels:

`A`: Austrian, `D/FL`: Dutch/Flemish, `F`: French, `G`: German, `I`: Italian, `S`: Spanish, `X`: Unknown
]

```r
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ school_pntg, data = pp) %>%
  tidy()
```

```
## # A tibble: 7 × 5
##   term            estimate std.error statistic p.value
##   <chr>              <dbl>     <dbl>     <dbl>   <dbl>
## 1 (Intercept)        14.0      10.0      1.40  0.162  
## 2 school_pntgD/FL     2.33     10.0      0.232 0.816  
## 3 school_pntgF       10.2      10.0      1.02  0.309  
## 4 school_pntgG        1.65     11.9      0.139 0.889  
## 5 school_pntgI       10.3      10.0      1.02  0.306  
## 6 school_pntgS       30.4      11.4      2.68  0.00744
## # … with 1 more row
```

---

## Quiz

**Q - Interpret the slope and intercept estimates in the context of the data.**

.small[
`school_pntg` is a categorical variable for the school of a painting, with 7 levels:

`A`: Austrian, `D/FL`: Dutch/Flemish, `F`: French, `G`: German, `I`: Italian, `S`: Spanish, `X`: Unknown

```
## # A tibble: 7 × 5
##   term            estimate std.error statistic p.value
##   <chr>              <dbl>     <dbl>     <dbl>   <dbl>
## 1 (Intercept)        14.0      10.0      1.40  0.162  
## 2 school_pntgD/FL     2.33     10.0      0.232 0.816  
## 3 school_pntgF       10.2      10.0      1.02  0.309  
## 4 school_pntgG        1.65     11.9      0.139 0.889  
## 5 school_pntgI       10.3      10.0      1.02  0.306  
## 6 school_pntgS       30.4      11.4      2.68  0.00744
## # … with 1 more row
```
]

--

- Intercept: Austrian school (`A`) paintings are expected, on average, to be 14 inches tall.
- Slope: French school (`F`) paintings are expected, on average, to be 10.2 inches **taller than Austrian school paintings**.
- **Q -** How do you know which one is the baseline level?

---

class: middle, center

# Questions?

---

## Let's Practice Together!

Go to [AE 20: Introduction to Linear Models](https://sta199-summer22.netlify.app/appex/ae20_BJ.html)

---

## Bulletin

- Watch videos for [Prepare: June 10](https://sta199-summer22.netlify.app/prepare/week05_jun10_BJ.html)
- Lab07 due Friday, June 10 at 11:59pm
- HW04 released
- Don't forget HW02! It's due Thursday, June 16 at 11:59pm.
- Project draft due Monday, June 13 at 11:59pm
- Submit `ae20` (~ Part 2 Question 2)
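---

## Appendix: Finding the Baseline Level in Code

For the "how do you know which one is the baseline level?" questions above: R takes the first level of the factor (alphabetical by default) as the baseline, so it gets no dummy variable. A short sketch, assuming the same `pp` data:

```r
# the first level listed is the baseline: "A" (Austrian)
levels(factor(pp$school_pntg))

# model.matrix() shows the dummy variables created behind the scenes:
# 6 dummy columns for the 7 levels, and no column for the baseline "A"
model.matrix(~ school_pntg, data = pp) %>%
  head()
```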