# Multiple Linear Regression

STAT 20: Introduction to Probability and Statistics

# Concept Questions

## Estimate the coefficient

m1 <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)

What will be the sign of the coefficient for bill_length_mm?

m1

Call:
lm(formula = bill_depth_mm ~ bill_length_mm, data = penguins)

Coefficients:
(Intercept)  bill_length_mm
20.78665        -0.08233  
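
As a worked reading of this output, the fitted line can be written out explicitly. The symbols $\hat{y}$ (predicted bill depth in mm) and $x$ (bill length in mm) are notation assumed here for illustration and do not appear in the original output.

$$
\hat{y} = 20.78665 - 0.08233\,x
$$

Under this fit, each additional millimeter of bill length is associated with a predicted bill depth that is about 0.08 mm lower.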

## Estimate the coefficient, take two

m2 <- lm(bill_depth_mm ~ bill_length_mm + species, data = penguins)

What will be the sign of the coefficient for bill_length_mm? How many coefficients will be in this linear model?

m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
(Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo
10.5653            0.2004           -1.9331           -5.1033  

Dummy Variable

A variable that is 1 if an observation takes a particular level of a categorical variable and 0 otherwise. A categorical variable with $k$ levels can be encoded using $k - 1$ dummy variables; the remaining level serves as the reference level and is absorbed into the intercept.
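
As a sketch of how this encoding appears in the m2 output above, the fitted model can be written as one equation. The notation is assumed for illustration: $\hat{y}$ is predicted bill depth (mm), $x$ is bill length (mm), and $d_{\text{Chinstrap}}$ and $d_{\text{Gentoo}}$ are dummy variables equal to 1 for penguins of that species and 0 otherwise. Adelie, the species without its own coefficient, has both dummies equal to 0.

$$
\hat{y} = 10.5653 + 0.2004\,x - 1.9331\,d_{\text{Chinstrap}} - 5.1033\,d_{\text{Gentoo}}
$$

Three species levels are encoded with two dummy variables, matching the $k - 1$ rule.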

## Interpreting coefficients

m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + species, data = penguins)

Coefficients:
(Intercept)    bill_length_mm  speciesChinstrap     speciesGentoo
10.5653            0.2004           -1.9331           -5.1033  

Which is the correct interpretation of the speciesGentoo coefficient?

## Concept Question 4

Consider the following linear regression output where the variable school is categorical and the variable hours_studied is numerical.

lm(GPA ~ hours_studied + school, data = edu)

| Coefficient | Estimate |
|---|---|
| (Intercept) | 2.5 |
| hours_studied | 0.2 |
| schoolCal | 1 |
| schoolStanford | -1 |
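
Written as an equation, this output corresponds to the fitted model below. The notation is assumed for illustration: $h$ stands for hours_studied, and $d_{\text{Cal}}$ and $d_{\text{Stanford}}$ are dummy variables equal to 1 for students at that school and 0 otherwise. The reference school, for which both dummies are 0, is not named in the output.

$$
\widehat{\text{GPA}} = 2.5 + 0.2\,h + 1 \cdot d_{\text{Cal}} - 1 \cdot d_{\text{Stanford}}
$$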

## Concept Question 4 (cont.)

Suppose I want to create a data frame from the original edu data frame that contains the minimum, median, and IQR of hours_studied for each school. To do this, I use group_by() followed by summarize() and save the result into an object called GPA_summary.

What are the dimensions of GPA_summary?
