# Method of Least Squares

STAT 20: Introduction to Probability and Statistics

## Agenda

• Linear Models Review
• Concept Questions
• Optimization with Algorithms
• Problem Set 6.1: Method of Least Squares

# Linear Models Review

Go to pollev.com and get ready for a Kahoot.

# Concept Questions

## Scenario 1

An engineer on the Waymo self-driving car team is working to solve a problem: when it rains, reflections of other cars in puddles can disorient the self-driving car. Their team is building a model to determine whether the car is seeing a reflection of a car or a real car.

Please identify A. the response, B. the predictor(s), and C. whether it is regression or classification (poll only has C.)


## Scenario 2

An analyst working for the UC Berkeley Admissions Office is working to help the university decide how many students to send offer letters to. They have a target class size (that fits within the budget and residence halls), but they’re not sure how many students will accept the offer. How many should they admit?

Please identify A. the response, B. the predictor(s), and C. whether it is regression or classification (poll only has C.)


# Fitting Predictive Models via Optimization

## Two approaches

#### Calculus

Certain models (like least squares) can be fit simply by taking partial derivatives, setting to 0, and solving.
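For the simple function $f(x) = \left(x + .5 \right)^2$ used later in these slides, the calculus route is one line: differentiate, set to zero, solve.

$$
f(x) = \left(x + .5\right)^2, \qquad f'(x) = 2\left(x + .5\right) = 0 \quad \Longrightarrow \quad x = -.5
$$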

#### Algorithms

There are many iterative algorithms that accomplish the same task, some better than others. Two examples:

• Gradient Descent: the most-used algorithm currently. Used to fit deep learning models.
• Nelder-Mead: an older and more general (and generally not as reliable!) algorithm.
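As a rough sketch of the gradient descent idea on the simple function $f(x) = \left(x + .5 \right)^2$ used later in these slides (the derivative is worked out by hand, and the step size 0.1 and 100 iterations are arbitrary choices, not defaults of any particular library):

```r
# Gradient descent sketch for f(x) = (x + .5)^2.
# Its derivative, worked out by hand, is f'(x) = 2 * (x + .5).
f_prime <- function(x) 2 * (x + .5)

x <- .5                        # starting point
for (i in 1:100) {
  x <- x - 0.1 * f_prime(x)    # take a small step downhill
}
x                              # very close to the true minimizer, -0.5
```

Each iteration moves `x` a little way in the direction that decreases $f$; deep learning libraries use the same idea with millions of parameters.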

> The downhill simplex method now takes a series of steps, most steps just moving the point of the simplex where the function is largest (“highest point”) through the opposite face of the simplex to a lower point. These steps are called reflections, and they are constructed to conserve the volume of the simplex (and hence maintain its nondegeneracy). When it can do so, the method expands the simplex in one or another direction to take larger steps. When it reaches a “valley floor”, the method contracts itself in the transverse direction and tries to ooze down the valley. If there is a situation where the simplex is trying to “pass through the eye of a needle”, it contracts itself in all directions, pulling itself in around its lowest (best) point. (from Wikipedia)

## Nelder-Mead on a simple function

Can we use Nelder-Mead to find the minimum value of this function (with zero calculus)?

$f(x) = \left(x + .5 \right)^2$

## Writing a new function in R

```r
f <- function(x) {
  (x + .5)^2
}
```

1. Functions are created with function() and assigned to an object (here, our new function is f())
2. The arguments go inside the parens of function()
3. The guts of the function go between {}.
4. Once you run this code once, you’ll have access to f() in your environment.
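Because arithmetic in R is vectorized, f() can also evaluate many inputs at once - handy for plotting the curve:

```r
f <- function(x) {
  (x + .5)^2
}

# Evaluate the function at three points in one call
f(c(-1, 0, 1))
# 0.25 0.25 2.25
```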

## Finding values of $f(x)$

$f(x) = \left(x + .5 \right)^2$

```r
f <- function(x) {
  (x + .5)^2
}

f(x = 0)
```

```
0.25
```

## Finding values of $f(x)$

$f(x) = \left(x + .5 \right)^2$

```r
f <- function(x) {
  (x + .5)^2
}

f(x = .75)
```

```
1.5625
```

## Finding minimum value of $f(x)$

$f(x) = \left(x + .5 \right)^2$

```r
f <- function(x) {
  (x + .5)^2
}

optim(par = .5, fn = f)
```

```
$par
[1] -0.4

$value
[1] 0.01

$counts
function gradient 
      12       NA 

$convergence
[1] 0

$message
NULL
```

## Notes on optim()

```r
optim(par = .5, fn = f)
```

• The function to optimize is passed to fn. You provide a starting point for the algorithm with par (which must be a scalar or a vector).
• The algorithm stops once it gets close enough, so its answer is approximate - a different starting point can give a (slightly) different answer.
• The best guess of the algorithm will be returned as $par.

optim’s guess: -0.4
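The same optim() pattern extends to the method of least squares itself: write a function that computes the sum of squared residuals for a candidate intercept and slope, then hand it to optim(). The data below are made up purely for illustration.

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)

# Sum of squared residuals for a candidate (intercept, slope)
rss <- function(par) {
  b0 <- par[1]
  b1 <- par[2]
  sum((y - (b0 + b1 * x))^2)
}

fit <- optim(par = c(0, 0), fn = rss)
fit$par   # approximate least squares estimates of (intercept, slope)
```

Here par is a vector of length two, so optim() searches over both coefficients at once; fit$par should land near the exact least squares solution you would get from calculus.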
