STAT 20: Introduction to Probability and Statistics

- Concept Questions
- Practice Problems

The following two questions are based on the diagram below. The diagram corresponds to a one-sided hypothesis test that uses the following null hypothesis: \(H_0: \mu = 1\)

If \(H_A\) represents a specific version of the alternative hypothesis that is true, what is the power of this test?

What is the p-value of this test?

Instead of constructing a confidence interval to learn about the parameter, we could assert the value of a parameter and see whether it is consistent with the data using a hypothesis test. Say you are interested in testing whether there is a clear majority opinion of support for or opposition to the project.

What are the null and alternative hypotheses?

```
library(tidyverse)
library(infer)
library(stat20data)

# TRUE when the Q18 response expresses any level of support
ppk <- ppk %>%
  mutate(support_before = Q18_words %in% c("Somewhat support",
                                           "Strongly support",
                                           "Very strongly support"))

# Observed test statistic: the sample proportion of supporters
obs_stat <- ppk %>%
  specify(response = support_before,
          success = "TRUE") %>%
  calculate(stat = "prop")
obs_stat
```

```
Response: support_before (factor)
# A tibble: 1 × 1
   stat
  <dbl>
1 0.339
```

```
# Null distribution: 500 simulated sample proportions from a world where p = 0.5
null <- ppk %>%
  specify(response = support_before,
          success = "TRUE") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 500, type = "draw") %>%
  calculate(stat = "prop")
null
```

```
Response: support_before (factor)
Null Hypothesis: point
# A tibble: 500 × 2
   replicate  stat
   <fct>     <dbl>
 1 1         0.490
 2 2         0.528
 3 3         0.483
 4 4         0.513
 5 5         0.499
 6 6         0.502
 7 7         0.498
 8 8         0.495
 9 9         0.511
10 10        0.491
# ℹ 490 more rows
```
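The null distribution above can be turned into a p-value by asking how often a simulated proportion lands at least as far from 0.5 as the observed one does. Here is a minimal base-R sketch of that idea. It assumes a hypothetical sample size `n_hyp = 1000` (the real size of `ppk` is not shown here) and reuses the observed proportion of 0.339; all variable names are illustrative.

```r
set.seed(20)                      # for reproducibility

n_hyp  <- 1000                    # hypothetical sample size (assumption)
p_null <- 0.5                     # value asserted by the null hypothesis
p_obs  <- 0.339                   # observed proportion reported above
reps   <- 500                     # same number of replicates as above

# Each replicate: draw n_hyp responses from a world where p = 0.5,
# then record the sample proportion.
null_stats <- rbinom(reps, size = n_hyp, prob = p_null) / n_hyp

# Two-sided p-value: share of simulated proportions at least as far
# from p_null as the observed proportion is.
p_value <- mean(abs(null_stats - p_null) >= abs(p_obs - p_null))
p_value
```

With the infer objects created earlier, `get_p_value(null, obs_stat = obs_stat, direction = "both")` should perform the analogous calculation on the actual data.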

What would a Type I error be in this context?

What would a Type II error be in this context?

Learn why we don’t accept the null hypothesis.

Hypothesis tests have been shown to be valuable contributors to science (p < .05) but are sometimes abused (p < .05).

- Used to assess the degree to which data is consistent with a particular model.
- The most widely used tool in statistical inference.

Lay out your model(s).

\(H_0\): null model, business as usual

\(H_A\): alternative model, business not as usual

- Hypotheses are statements about the TRUE STATE of the world and should involve *parameters*, not *statistics*.
- Hypotheses should suggest a *test statistic* that has some bearing on the claim.
- The nature of \(H_A\) determines one- or two-sided tests; default to two.

Select a test statistic that bears on the null hypothesis.

- \(\bar{x}\)
- \(\hat{p}\)
- \(m\)
- \(r\)
- \(b_1\)

- \(\bar{x}_1 - \bar{x}_2\)
- \(\hat{p}_1 - \hat{p}_2\)
- \(m_1 - m_2\)
- \(\chi^2\)
*The list goes on…*

Construct the appropriate null distribution.

- Permutation (when `null = "independence"`)
- Simulation (when `null = "point"`)
- Normal Approximation
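The three approaches can be sketched in base R, without infer. Everything in this sketch is illustrative: the two-group data are randomly generated, and the group sizes, sample size of 100, and null value of 0.5 are assumptions chosen only to show the mechanics.

```r
set.seed(20)

# Hypothetical data (assumption): an outcome measured in two groups of 30.
group   <- rep(c("a", "b"), each = 30)
outcome <- c(rnorm(30, mean = 10), rnorm(30, mean = 11))

# Test statistic: difference in group means.
diff_means <- function(y, g) mean(y[g == "b"]) - mean(y[g == "a"])
obs_diff   <- diff_means(outcome, group)

# Permutation: shuffling the labels breaks any link between group and
# outcome, which is what an independence null asserts.
perm_null <- replicate(1000, diff_means(outcome, sample(group)))

# Simulation: for a point null on a proportion, draw samples from the
# asserted value (here p = 0.5 with n = 100) and record each proportion.
sim_null <- rbinom(1000, size = 100, prob = 0.5) / 100

# Normal approximation: the null distribution of a sample proportion is
# roughly normal with this standard error.
se_normal <- sqrt(0.5 * (1 - 0.5) / 100)

c(simulated_sd = sd(sim_null), normal_se = se_normal)
```

The last line compares the spread of the simulated null distribution with the standard error from the normal approximation; the two should be close when the sample size is reasonably large.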

Calculate a measure of consistency between the observed test statistic (the data) and the null distribution (i.e., a p-value).

- If your observed test stat is in the tails
  - low p-val
  - data is inconsistent with the null hypothesis
  - “reject the null hypothesis”.

- If your observed test stat is in the body
  - high p-val
  - data is consistent with the null hypothesis
  - “fail to reject the null hypothesis”.

What can go wrong?

What geometries are in use in this graphic?

**A simplified model**

UHS tests a sample of the Cal community every week and monitors the positivity rate (proportion of tests that are positive). Assume this is a random sample of constant size and that the test is perfectly accurate. Let \(p\) be the positivity rate.

\(H_0\) \(\quad p = 3\%\)

The incidence of COVID at Cal is at a manageable level.

\(H_A\) \(\quad p > 3\%\)

The incidence of COVID at Cal is at an elevated level.

Decision protocol: if there is a big enough spike in a given week, shift classes to remote.
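One way to make "a big enough spike" concrete is a one-sided exact binomial test against \(p = 3\%\). The sketch below uses base R's `binom.test()`. The weekly sample of 2,000 tests, the 70 positives, and the significance level of 0.05 are all hypothetical values chosen for illustration, not UHS figures.

```r
n_tests    <- 2000   # hypothetical weekly sample size (assumption)
n_positive <- 70     # hypothetical positive count, i.e. a 3.5% positivity rate
alpha      <- 0.05   # hypothetical significance level (assumption)

# One-sided exact test of H0: p = 0.03 against HA: p > 0.03.
test <- binom.test(n_positive, n_tests, p = 0.03, alternative = "greater")

# Probability of a count this high or higher if H0 is true.
test$p.value

# Decision protocol: shift to remote only if H0 is rejected.
go_remote <- test$p.value < alpha
go_remote
```

The choice of `alpha` encodes how large a spike counts as "big enough": a smaller value demands stronger evidence before classes move to remote.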

- **Sample size, \(n\)**: with increasing \(n\), the variability of the null distribution will decrease.
- **Changing \(\alpha\)**: decreasing \(\alpha\) will decrease type I error but increase type II error.
- **Increasing *effect size***: change the data collection process to separate the distribution under \(H_A\) and decrease type II error.
  - Ex: If you’re testing whether a pain medicine provides pain relief, only conduct the test if using a medicine that you expect to cause a dramatic decrease in pain.
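The effects of \(n\) and \(\alpha\) can be checked numerically for the positivity-rate test. This is a sketch under stated assumptions: the helper `error_rates()` is hypothetical, the "elevated" rate `p_alt = 0.05` is an invented specific alternative, and the test rejects when the positive count exceeds an exact binomial cutoff.

```r
# Error rates of the one-sided test H0: p = p0 vs HA: p > p0.
# p_alt = 0.05 is a hypothetical "elevated" rate (assumption).
error_rates <- function(n, alpha, p0 = 0.03, p_alt = 0.05) {
  cutoff <- qbinom(1 - alpha, size = n, prob = p0)   # reject if count > cutoff
  c(n = n, alpha = alpha,
    null_sd = sqrt(p0 * (1 - p0) / n),                 # SD of p-hat under H0
    type_1  = 1 - pbinom(cutoff, size = n, prob = p0), # reject when H0 is true
    type_2  = pbinom(cutoff, size = n, prob = p_alt))  # miss when p = p_alt
}

rates <- rbind(
  error_rates(n = 500,  alpha = 0.05),   # baseline
  error_rates(n = 500,  alpha = 0.01),   # smaller alpha
  error_rates(n = 2000, alpha = 0.05)    # larger sample
)
rates
```

Comparing rows one and two isolates the effect of \(\alpha\); comparing rows one and three isolates the effect of \(n\). Because counts are discrete, the actual type I error rate sits at or below the nominal \(\alpha\) rather than exactly on it.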

Consider a setting where the Cal UHS testing system observes a positivity rate of 3.5% in a one-week interval, double the previous week's rate. Administration needs to decide whether or not to move to remote learning. Which error would be worse?

A. Moving to remote instruction when in fact the true number of cases on campus is still low.

B. Failing to move to remote instruction when in fact the true number of cases on campus is elevated.

**Power** is the probability that you will reject the null hypothesis if it is in fact false.

\[ P(\textrm{reject } H_0 | H_0 \textrm{ is false}) \]

The more power a test has, the higher the probability of detecting an effect that really exists.
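Power can be estimated by simulation: generate many data sets from a world where \(H_0\) is false, run the test on each, and record how often it rejects. The sketch below applies this to a one-sided test of \(p = 3\%\). The sample size, significance level, and the list of "true" positivity rates are hypothetical, and `sim_power()` is an illustrative helper.

```r
set.seed(20)

n      <- 1000    # hypothetical weekly sample size (assumption)
p0     <- 0.03    # null value of the positivity rate
alpha  <- 0.05    # hypothetical significance level (assumption)
cutoff <- qbinom(1 - alpha, size = n, prob = p0)   # reject H0 if count > cutoff

# Estimate P(reject H0 | true rate = p_true) by simulating many weeks.
sim_power <- function(p_true, reps = 5000) {
  counts <- rbinom(reps, size = n, prob = p_true)
  mean(counts > cutoff)
}

# Hypothetical true positivity rates, from a small effect to a large one.
p_true <- c(0.035, 0.04, 0.05, 0.06)
power  <- sapply(p_true, sim_power)
data.frame(p_true, power)
```

Each row is the power against one specific alternative, which is why power is always stated relative to a particular version of \(H_A\).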

https://upload.wikimedia.org/wikipedia/commons/6/66/PeopleBirding.JPG
