Multiple Linear Regression

STAT 20: Introduction to Probability and Statistics

Concept Questions

Estimate the coefficient




m1 <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)




What will be the sign of the coefficient for bill_length_mm?

m1

Call:
lm(formula = bill_depth_mm ~ bill_length_mm, data = penguins)

Coefficients:
   (Intercept)  bill_length_mm  
      20.78665        -0.08233  

Estimate the coefficient, take two



m2 <- lm(bill_depth_mm ~ bill_length_mm + species, data = penguins)




What will be the sign of the coefficient for bill_length_mm? How many coefficients will be in this linear model?

m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
     (Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo  
         10.5653            0.2004           -1.9331           -5.1033  
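The sign flip between m1 and m2 can be reproduced on a small made-up dataset. The sketch below is only an illustration: the data frame toy, its columns x, y, and group, and the offsets are all invented, chosen so that y rises with x inside every group while groups with larger x sit lower overall.

```r
# Made-up data: within each group y rises with x,
# but groups with larger x values have lower y overall
toy <- data.frame(
  x = 1:15,
  group = rep(c("A", "B", "C"), each = 5)
)
offsets <- c(A = 10, B = 5, C = 0)
toy$y <- offsets[toy$group] + 0.5 * toy$x

# Ignoring group: the fitted slope on x is negative
m_simple <- lm(y ~ x, data = toy)

# Adjusting for group: the slope on x is the within-group slope, which is positive
m_group <- lm(y ~ x + group, data = toy)

coef(m_simple)["x"]
coef(m_group)["x"]
```

The second model also has four coefficients: an intercept, one slope, and one dummy variable for each of the two non-reference groups.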

Dummy Variable

A variable that is 1 if an observation takes a particular level of a categorical variable and 0 otherwise. A categorical variable with \(k\) levels can be encoded using \(k - 1\) dummy variables; the remaining level serves as the reference level.
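As a minimal sketch of this encoding, base R's model.matrix() shows the kind of dummy columns that lm() constructs from a factor. The short factor below is made up for illustration; only its level names mirror the penguin species.

```r
# Toy categorical variable with k = 3 levels (values invented for illustration)
species <- factor(c("Adelie", "Chinstrap", "Gentoo", "Adelie"))

# model.matrix() expands the factor into an intercept plus k - 1 dummy columns;
# the first level (Adelie) is the reference level and gets no column of its own
X <- model.matrix(~ species)
X
```

Each row has at most one dummy equal to 1, and a row with all dummies equal to 0 belongs to the reference level.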

Interpreting coefficients



m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
     (Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo  
         10.5653            0.2004           -1.9331           -5.1033  

Which is the correct interpretation of the coefficient in front of Gentoo?
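One way to reason about this question is to write out the fitted line for each species using the estimates above. This assumes Adelie is the reference level, since it is the only species without its own coefficient in the output.

\[
\begin{aligned}
\text{Adelie:} \quad & \widehat{\text{bill\_depth\_mm}} = 10.5653 + 0.2004 \cdot \text{bill\_length\_mm} \\
\text{Chinstrap:} \quad & \widehat{\text{bill\_depth\_mm}} = (10.5653 - 1.9331) + 0.2004 \cdot \text{bill\_length\_mm} = 8.6322 + 0.2004 \cdot \text{bill\_length\_mm} \\
\text{Gentoo:} \quad & \widehat{\text{bill\_depth\_mm}} = (10.5653 - 5.1033) + 0.2004 \cdot \text{bill\_length\_mm} = 5.4620 + 0.2004 \cdot \text{bill\_length\_mm}
\end{aligned}
\]

The three lines are parallel: they share the slope 0.2004 and differ only in their intercepts.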

Concept Question 4

Consider the following linear regression output where the variable school is categorical and the variable hours_studied is numerical.

lm(GPA ~ hours_studied + school, data = edu)

 
 

Coefficients      Estimate
(Intercept)            2.5
hours_studied          0.2
schoolCal              1
schoolStanford        -1
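To see how these estimates combine, the sketch below computes predicted GPA by hand for a student who studies 3 hours at each kind of school. The helper predict_gpa() and the choice of 3 hours are hypothetical; only the four estimates come from the output above.

```r
# Coefficient estimates copied from the output above
b <- c(intercept = 2.5, hours_studied = 0.2, schoolCal = 1, schoolStanford = -1)

# Hypothetical helper: predicted GPA from hours studied and the two dummy values
predict_gpa <- function(hours, is_cal = 0, is_stanford = 0) {
  unname(b["intercept"] + b["hours_studied"] * hours +
    b["schoolCal"] * is_cal + b["schoolStanford"] * is_stanford)
}

predict_gpa(3)                   # reference-level school (both dummies are 0)
predict_gpa(3, is_cal = 1)       # Cal
predict_gpa(3, is_stanford = 1)  # Stanford
```

For a fixed number of hours, the Cal and Stanford predictions differ from the reference-level prediction by exactly their school coefficients.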

Concept Question 4 (cont.)

  • Suppose I want to create a data frame from the original edu data frame that contains the minimum, median, and IQR of hours_studied for each school. To do this, I use group_by() followed by summarize(), and I save the result in an object called GPA_summary.

What are the dimensions of GPA_summary?
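The steps described above can be sketched with dplyr on a stand-in dataset. Everything in this edu data frame is invented for illustration: the regression output has two school dummy variables, which under the \(k - 1\) rule implies three schools, but the third school's name ("Other" here) and all hours_studied values are assumptions, as are the summary column names.

```r
library(dplyr)

# Hypothetical stand-in for edu: three schools, values invented for illustration
edu <- data.frame(
  school = rep(c("Cal", "Stanford", "Other"), each = 4),
  hours_studied = c(2, 4, 6, 8, 1, 3, 5, 7, 2, 3, 4, 5)
)

# One row per school; one column for the grouping variable plus one per summary
GPA_summary <- edu |>
  group_by(school) |>
  summarize(
    min_hours = min(hours_studied),
    median_hours = median(hours_studied),
    iqr_hours = IQR(hours_studied)
  )

dim(GPA_summary)
```

The number of rows equals the number of groups, and the number of columns equals the number of grouping variables plus the number of summaries requested.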
