Multiple Linear Regression

STAT 20: Introduction to Probability and Statistics

Concept Questions

Estimate the coefficient


m1 <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)




What will be the sign of the coefficient for bill_length_mm?

m1

Call:
lm(formula = bill_depth_mm ~ bill_length_mm, data = penguins)

Coefficients:
   (Intercept)  bill_length_mm  
      20.78665        -0.08233  
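
To make the negative slope concrete, here is a minimal sketch that plugs two bill lengths into the fitted line by hand. The coefficients are copied from the printed output above; the bill lengths of 40 mm and 50 mm are hypothetical values chosen only for illustration.

```r
# Coefficients copied from the printed output of m1
intercept <- 20.78665
slope     <- -0.08233

# Hypothetical bill lengths (mm), chosen only for illustration
bill_length_mm <- c(40, 50)

# Predicted bill depth (mm) from the fitted line
predicted_depth <- intercept + slope * bill_length_mm
predicted_depth

# A 10 mm increase in bill length changes the prediction by 10 * slope
diff(predicted_depth)
```

With the model object available, `predict(m1, newdata = data.frame(bill_length_mm = c(40, 50)))` should give matching values up to the rounding of the printed coefficients.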

Estimate the coefficient, take two


m2 <- lm(bill_depth_mm ~ bill_length_mm + species, data = penguins)




What will be the sign of the coefficient for bill_length_mm? How many coefficients will be in this linear model?

m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
     (Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo  
         10.5653            0.2004           -1.9331           -5.1033  

Dummy Variable

A variable that equals 1 if an observation takes a particular level of a categorical variable and 0 otherwise. A categorical variable with \(k\) levels can be encoded using \(k - 1\) dummy variables; the remaining level serves as the reference level and is absorbed into the intercept.
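
The encoding can be inspected directly in R. The sketch below uses a small hypothetical factor (not the penguins data itself) with three levels and asks `model.matrix()` for the design matrix that `lm()` would build under R's default treatment contrasts.

```r
# Hypothetical toy factor with k = 3 levels
species <- factor(c("Adelie", "Chinstrap", "Gentoo", "Adelie"))

# model.matrix() expands the factor into an intercept column
# plus k - 1 = 2 dummy columns; the first level gets no column
X <- model.matrix(~ species)
X

# Column names follow the same pattern as the coefficient names in lm() output
colnames(X)
```

Rows for the first level have 0 in both dummy columns, so that level is represented by the intercept alone.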

Interpreting coefficients


m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
     (Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo  
         10.5653            0.2004           -1.9331           -5.1033  
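
As a reading aid, the printed coefficients can be written out as a single fitted equation. The indicator notation \(\mathbb{1}\{\cdot\}\) is an assumption of this sketch rather than notation from the slides: it denotes a dummy variable that is 1 when the penguin belongs to the named species and 0 otherwise.

\[
\widehat{\text{bill\_depth\_mm}} = 10.5653 + 0.2004 \cdot \text{bill\_length\_mm} - 1.9331 \cdot \mathbb{1}\{\text{Chinstrap}\} - 5.1033 \cdot \mathbb{1}\{\text{Gentoo}\}
\]

The species with no coefficient of its own (presumably Adelie, the first level alphabetically) has both indicators equal to 0, so its line is \(10.5653 + 0.2004 \cdot \text{bill\_length\_mm}\). Each species coefficient shifts that line up or down while the slope stays the same.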

Which is the correct interpretation of the coefficient for speciesGentoo?