Which of these variables do you expect to be uniformly distributed?
It depends on your desiderata: the nature of your data and what you seek to capture in your summary.
Get out a piece of paper. You’ll be watching a 3-minute video that discusses characteristics of a typical human. Note which numerical summaries are used and what each one is used for.
But there are other notions of typical…
Two new food delivery services open in Berkeley: Oski Eats and Cal Cravings. A friend of yours who took Stat 20 collected data on each and noted that Oski Eats has a mean delivery time of 29 minutes and Cal Cravings a mean delivery time of 27 minutes. Which would you rather order from?
Would you still prefer to order from Cal Cravings?
You can construct a statistical graphic to show the shape, which you can describe in terms of modality and skew… you can calculate a measure of center to convey a sense of a typical observation… and you can calculate a measure of spread to capture how much variability there is in the data.
We construct tools (statistics, graphics) that produce useful summaries of raw data.
How can we express the variability in this data set using a single number?
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Desiderata
\[ {\Large 6} \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad {\Large 11}\]
\[\textrm{range:} \quad \max - \min\]
\[ 11 - 6 = 5\]
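The range calculation above can be checked with a minimal Python sketch. The variable names `data` and `data_range` are illustrative choices, not part of the slides.

```python
# The eleven observations from the slide
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

# Range: largest observation minus smallest observation
data_range = max(data) - min(data)
print(data_range)  # 5
```

Only the two extreme observations enter the calculation, so changing any of the nine values between them leaves the range unchanged.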
Characteristics
\[ 6 \quad 7 \quad {\Large 7 \quad 7} \quad 8 \quad {\large 8} \quad 9 \quad {\Large 9 \quad 10} \quad 11 \quad 11\]
The difference between the median of the larger half of the sorted data set, \(Q_3\), and the median of the smaller half, \(Q_1\).
\[\textrm{IQR:} \quad Q_3 - Q_1\]
\[ 9.5 - 7 = 2.5 \]
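As a rough check of the IQR above, here is a short Python sketch of the "median of each half" definition. It assumes that, when the number of observations is odd, the overall median is included in both halves; that convention reproduces the values on the slide, but other quartile conventions exist and can give slightly different results. The names `lower`, `upper`, `q1`, `q3`, and `iqr` are illustrative.

```python
from statistics import median

data = sorted([6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11])
n = len(data)

# Assumption: with an odd number of observations, the middle value
# is counted in both the smaller half and the larger half.
half = (n + 1) // 2
lower = data[:half]       # smaller half of the sorted data
upper = data[n - half:]   # larger half of the sorted data

q1 = median(lower)
q3 = median(upper)
iqr = q3 - q1
print(q1, q3, iqr)  # 7.0 9.5 2.5
```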
Characteristics
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Take the differences from each observation, \(x_i\), to the sample mean, \(\bar{x}\), take their absolute values, add them up, and divide by \(n\).
\[\textrm{MAD:} \quad \frac{1}{n}\sum_{i = 1}^n |x_i - \bar{x}| \]
\[ \textrm{MAD} = 1.4 \]
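The recipe for the mean absolute deviation can be sketched step by step in Python. This is a minimal illustration using the standard library; the names `xbar`, `abs_devs`, and `mad` are assumptions made for the example.

```python
from statistics import fmean

data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]
n = len(data)

xbar = fmean(data)                      # sample mean, 93 / 11
abs_devs = [abs(x - xbar) for x in data]  # absolute deviation of each observation
mad = sum(abs_devs) / n                 # average absolute deviation
print(round(mad, 1))  # 1.4
```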
Characteristics
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Take the differences from each observation, \(x_i\), to the sample mean, \(\bar{x}\), square them, add them up, and divide by \(n - 1\).
\[s^2: \quad \frac{1}{n - 1}\sum_{i = 1}^n (x_i - \bar{x})^2 \]
\[ s^2 = 2.87 \]
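The sample variance can be computed the same way, following the recipe literally and then comparing against the standard library's `statistics.variance`, which also divides by \(n - 1\). The names `sq_devs` and `s2` are illustrative.

```python
from statistics import fmean, variance

data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]
n = len(data)

xbar = fmean(data)                        # sample mean
sq_devs = [(x - xbar) ** 2 for x in data]   # squared deviation of each observation
s2 = sum(sq_devs) / (n - 1)               # divide by n - 1, not n
print(round(s2, 2))  # 2.87

# The library function agrees up to floating-point rounding
print(abs(s2 - variance(data)) < 1e-9)  # True
```

Because the deviations are squared, the result is in squared units of the original data.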
Characteristics
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Take the differences from each observation, \(x_i\), to the sample mean, \(\bar{x}\), square them, add them up, divide by \(n - 1\), then take the square root.
\[s: \quad \sqrt{\frac{1}{n - 1}\sum_{i = 1}^n (x_i - \bar{x})^2} \]
\[ s = 1.69 \]
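A short Python sketch of the standard deviation, computed as the square root of the sample variance and compared with the standard library's `statistics.stdev`. The name `s` mirrors the notation above; the rest is illustrative.

```python
from math import sqrt
from statistics import stdev, variance

data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

s = sqrt(variance(data))   # square root of the sample variance (n - 1 divisor)
print(round(s, 3))  # 1.695

# stdev() performs the same calculation in one call
print(abs(s - stdev(data)) < 1e-9)  # True
```

Taking the square root returns the measure to the original units of the data, which makes it easier to interpret than the variance.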
Characteristics
service | range | IQR | var | sd |
---|---|---|---|---|
cal | 37.4 | 9.9 | 62.9 | 7.9 |
oski | 6.5 | 3.9 | 4.3 | 2.1 |
Desiderata