STAT 20: Introduction to Probability and Statistics

- Announcements
- Multiple Linear Regression Refresher
- Quiz Review (this week’s notes)
- Break
- Lab 2.2 (extended)

- Quiz 1 is Monday, in class, and covers all lectures from the beginning of the semester through today.

- Lab 2.2, Problem Set 4, and Problem Set 5 are due Tuesday at 9am
- Make sure you follow Lab Submission Guidelines on Ed

- RQ: Introducing Probability is due on Monday/Tuesday at 11:59pm; the Probability unit begins next week

*Extra Practice* for Multiple Linear Regression has been added to the resources tab on the course home page.

- Head to `pollev.com` for a set of rapid-fire questions on last night’s notes.

- Head to `pollev.com` for a set of quiz-level questions pertaining to *Summarizing Numerical Associations* and *Multiple Linear Regression*.

Consider the following multiple linear regression model, which will be the subject of the next three review questions.

```
Call:
lm(formula = bill_depth_mm ~ bill_length_mm + body_mass_g + species,
    data = penguins)

Coefficients:
     (Intercept)    bill_length_mm       body_mass_g  speciesChinstrap
        10.33083           0.09484           0.00117          -0.90748
   speciesGentoo
        -5.80117
```
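
Written out as an equation, the output above corresponds to the fitted model below. This is a sketch of how to read the coefficients, not part of the original output. The indicator notation $\mathbf{1}\{\cdot\}$ is introduced here for illustration, and it assumes R's default coding: each species indicator is 1 for that species and 0 otherwise, and the species with no coefficient of its own serves as the reference level.

$$
\widehat{\texttt{bill\_depth\_mm}} = 10.33083 + 0.09484\,\texttt{bill\_length\_mm} + 0.00117\,\texttt{body\_mass\_g} - 0.90748\,\mathbf{1}\{\text{Chinstrap}\} - 5.80117\,\mathbf{1}\{\text{Gentoo}\}
$$

For a hypothetical Gentoo penguin with a 45 mm bill length and a 5000 g body mass (values chosen only for illustration), the predicted bill depth would be:

$$
10.33083 + 0.09484(45) + 0.00117(5000) - 5.80117 = 14.64746 \text{ mm}
$$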

Which is the correct interpretation of the coefficient in front of **bill length**? *Select all that apply*.

Which is the correct interpretation of the coefficient in front of **Gentoo**?

How would this linear model best be visualized?

Consider the following linear regression output, where the variable `school` is categorical and the variable `hours_studied` is numerical.

| Coefficients | Estimate |
|---|---|
| `(Intercept)` | 2.5 |
| `hours_studied` | 0.2 |
| `schoolCal` | 1 |
| `schoolStanford` | -1 |
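
As a reading aid, the estimates above can be written as a single fitted equation. This is a sketch under two assumptions: the response variable is not named in the output, so it appears here as a generic $\hat{y}$, and the indicator notation $\mathbf{1}\{\cdot\}$ (1 when a student attends that school, 0 otherwise) is introduced only for illustration.

$$
\hat{y} = 2.5 + 0.2\,\texttt{hours\_studied} + 1\cdot\mathbf{1}\{\texttt{school} = \text{Cal}\} - 1\cdot\mathbf{1}\{\texttt{school} = \text{Stanford}\}
$$

When both indicators are 0, the prediction reduces to the baseline line $\hat{y} = 2.5 + 0.2\,\texttt{hours\_studied}$. Each school coefficient shifts that line up or down without changing its slope.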

- Say I wanted to create a data frame from the original `edu` data frame that contains the minimum, median, and IQR of `hours_studied` within each school. To do this, I use `group_by()` followed by `summarize()`. I save this data frame into an object called `GPA_summary`.

What are the dimensions of `GPA_summary`?
