# Multiple Linear Regression

STAT 20: Introduction to Probability and Statistics

# Concept Questions

## Estimate the coefficient

```r
m1 <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)
```

What will be the sign of the coefficient for bill_length_mm?

```r
m1
```

```
Call:
lm(formula = bill_depth_mm ~ bill_length_mm, data = penguins)

Coefficients:
   (Intercept)  bill_length_mm  
      20.78665        -0.08233  
```
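
As a reading aid, the printed coefficients can be written out as a fitted line. The shorthand $\widehat{\text{depth}}$ (predicted bill depth in mm) and $\text{length}$ (bill length in mm) is notation assumed here, not taken from the slides:

$$
\widehat{\text{depth}} = 20.78665 - 0.08233 \cdot \text{length}
$$

Under this fit, each additional millimeter of bill length is associated with a predicted bill depth that is about 0.08 mm lower.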

## Estimate the coefficient, take two

```r
m2 <- lm(bill_depth_mm ~ bill_length_mm + species, penguins)
```

What will be the sign of the coefficient for bill_length_mm? How many coefficients will be in this linear model?

```r
m2
```

```
Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
     (Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo  
         10.5653            0.2004           -1.9331           -5.1033  
```
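
Written as an equation, this output has four coefficients. The notation is an assumption for illustration: $\widehat{\text{depth}}$ and $\text{length}$ stand for predicted bill depth and bill length in mm, and $\mathbb{1}[\cdot]$ is 1 when the penguin belongs to the named species and 0 otherwise. Adelie has no term of its own because it is the reference level.

$$
\widehat{\text{depth}} = 10.5653 + 0.2004 \cdot \text{length} - 1.9331 \cdot \mathbb{1}[\text{Chinstrap}] - 5.1033 \cdot \mathbb{1}[\text{Gentoo}]
$$

Note that the coefficient for bill length is now positive, whereas it was negative in the model without species.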

Dummy Variable

A variable that is 1 if an observation takes a particular level of a categorical variable and 0 otherwise. A categorical variable with $k$ levels can be encoded using $k - 1$ dummy variables; the remaining level serves as the reference level, which is absorbed into the intercept.
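
The encoding can be inspected directly in R. The following is a minimal sketch, assuming a small made-up data frame named `toy` (hypothetical, not rows from `penguins`); it uses base R's `model.matrix()` to show the columns that a formula containing `species` would generate.

```r
# Hypothetical toy data: four made-up observations, species labels only.
toy <- data.frame(
  species = factor(c("Adelie", "Chinstrap", "Gentoo", "Adelie"))
)

# model.matrix() builds the design matrix that lm() would use for this formula.
X <- model.matrix(~ species, data = toy)
print(X)

# k = 3 levels, so k - 1 = 2 dummy columns (plus the intercept column).
k <- nlevels(toy$species)
n_dummies <- ncol(X) - 1
```

The Adelie rows have 0 in both dummy columns, which is how the reference level is represented.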

## Interpreting coefficients

```r
m2
```

```
Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
     (Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo  
         10.5653            0.2004           -1.9331           -5.1033  
```

Which is the correct interpretation of the coefficient for speciesGentoo?
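
One way to reason about this is to compare the model's predictions for two penguins with the same bill length but different species. The bill length of 45 mm below is an arbitrary value chosen for illustration, and $\widehat{\text{depth}}$ is assumed shorthand for predicted bill depth in mm:

$$
\begin{aligned}
\widehat{\text{depth}}_{\text{Adelie}} &= 10.5653 + 0.2004 \cdot 45 = 19.5833 \\
\widehat{\text{depth}}_{\text{Gentoo}} &= 10.5653 + 0.2004 \cdot 45 - 5.1033 = 14.4800
\end{aligned}
$$

The two predictions differ by exactly 5.1033 mm, and the same difference appears at any other bill length because the bill length term cancels.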