STAT 20: Introduction to Probability and Statistics

- Announcements
- Concept Questions
- Problem Set: Overfitting
- Lab: Cancer Diagnosis

- Problem Sets:
- PS 18: *Overfitting* releases Tuesday and is due next Tuesday at 9am
- Extra Practice: *Logistic Regression* releases Thursday (not turned in)

- Lab 6:
- Lab 6.1 releases Tuesday and is due next Tuesday at 9am
- Lab 6.2 releases Thursday and is due next Tuesday at 9am

- Quiz 4:
- next Monday in-class.
- covers *Wrong By Design* through *Logistic Regression* (Thu/Fri)

Which one of these (open `pollev.com`) is not an example of overfitting (either in real life or in statistics)?

Suppose I overfit my model to the training data. In which scenario (for which training data) would I expect the test set performance to be significantly worse? Assume that the testing sets A and B look like their corresponding training sets.
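One way to explore this question is with a small simulation. The sketch below is only an illustration under stated assumptions: the slide's training sets A and B are not reproduced here, so it invents two hypothetical datasets that share a straight-line signal and differ only in noise level. The helper `mean_gap`, the design points, and the polynomial degrees are all made up for this example. It compares the average gap between test and training error for a simple fit (a line) and a deliberately overfit one (a degree-8 polynomial on 12 points).

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 12)   # hypothetical shared design points
signal = 2 * x               # assumed true relationship: a straight line


def mean_gap(noise_sd, degree, trials=200):
    """Average (test MSE - train MSE) over repeated simulated datasets."""
    gaps = []
    for _ in range(trials):
        # The test set "looks like" the training set: same signal, same noise level.
        y_train = signal + rng.normal(0, noise_sd, x.size)
        y_test = signal + rng.normal(0, noise_sd, x.size)
        fit = Polynomial.fit(x, y_train, degree)
        train_mse = np.mean((y_train - fit(x)) ** 2)
        test_mse = np.mean((y_test - fit(x)) ** 2)
        gaps.append(test_mse - train_mse)
    return float(np.mean(gaps))


for sd in (0.1, 1.0):
    print(
        f"noise sd={sd}: "
        f"line gap={mean_gap(sd, 1):.3f}, "
        f"degree-8 gap={mean_gap(sd, 8):.3f}"
    )
```

In this setup the overfit model's training error understates its test error, and the shortfall grows with the amount of noise the model was able to memorize. Whether that matches the slide's scenarios depends on what A and B actually look like.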

