# Hypothesis Testing

STAT 20: Introduction to Probability and Statistics

## Agenda

• Concept Questions
• Problem Set: Is Yawning Contagious?
• Appendix: Hypothesis Testing with infer

# Concept Questions

Which of the following statements represent claims that correspond to a null hypothesis (as opposed to an alternative hypothesis)?

Hint: try writing each statement in terms of parameters (means, proportions, etc.).

A. King cheetahs on average run the same speed as standard spotted cheetahs.

B. For a particular student, the probability of correctly answering a 5-option multiple choice test is larger than 0.2 (i.e., better than guessing).

C. The mean length of African elephant tusks has changed over the last 100 years.

D. The risk of facial clefts is equal for babies born to mothers who take folic acid supplements compared with those from mothers who do not.

E. Mean birth weight of newborns is dependent on caffeine intake during pregnancy.

F. The probability of getting in a car accident is the same when using a cell phone as when not using one.

We want to understand whether blood thinners are helpful or harmful. We’ll consider both of these possibilities using a two-sided hypothesis test.

Null: Blood thinners do not have an overall survival effect, i.e., the survival proportions are the same in each group.

Alternative: Blood thinners have an impact on survival, either positive or negative, but not zero.
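
Following the earlier hint about parameters, one way to write these two hypotheses is sketched below. The symbols $p_T$ and $p_C$ are notation assumed here (they do not appear in the slides) for the survival proportions in the blood thinner group and the control group.

$$
H_0: p_T - p_C = 0 \qquad \text{versus} \qquad H_A: p_T - p_C \neq 0
$$

The $\neq$ in the alternative is what makes the test two-sided: a difference in either direction counts as evidence against the null.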

What is your guess at the p-value?

A pharmaceutical company developed a new treatment for eczema and performed a hypothesis test to see if it worked better than the company’s old treatment. The P-value for the test was $10\%$. Which of the following statements are true?

A. The probability that the null hypothesis is false is $10\%$.

B. The probability that the null hypothesis is false is $90\%$.

C. The P-value of about $10\%$ was computed assuming that the null hypothesis was true.

D. The new drug is significantly better than the old.

E. The alternative hypothesis is 10 times more likely than the null.

# Problem Set: Is Yawning Contagious?

# Example: Class Survey

Question: Do beach lovers prefer the warm seasons more than mountain lovers?

Load packages and data.

library(tidyverse)
library(stat20data)

Create new column with just two levels and drop NAs.

class_survey <- class_survey %>%
  mutate(warm_fav = season %in% c("Summer", "Fall")) %>%
  drop_na(beach_or_mtns, warm_fav)
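
One detail of the step above that may be worth checking is how `%in%` treats missing values. The sketch below uses a small made-up vector (hypothetical, not the real `class_survey` data) to show that `%in%` returns `FALSE` for an `NA`, never `NA`.

```r
# Hypothetical toy responses, not the real class_survey data
season <- c("Summer", "Winter", "Fall", NA, "Spring")

# %in% checks each element for membership in the set;
# an NA is simply "not in the set", so the result is FALSE rather than NA
warm_fav <- season %in% c("Summer", "Fall")

warm_fav
anyNA(warm_fav)  # FALSE: the new column has no missing values
```

If that behavior carries over to the survey data, a missing `season` would be coded as `warm_fav = FALSE`, and `drop_na()` would only remove rows with a missing `beach_or_mtns`.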

## Visualizing the data

What sort of visualization can we use to see the association between these two variables?

class_survey %>%
  select(beach_or_mtns, warm_fav)
# A tibble: 619 × 2
beach_or_mtns    warm_fav
<chr>            <lgl>
1 At the beach     TRUE
2 At the beach     TRUE
3 In the mountains FALSE
4 At the beach     TRUE
5 In the mountains FALSE
6 At the beach     TRUE
7 In the mountains FALSE
8 At the beach     FALSE
9 At the beach     FALSE
10 At the beach     TRUE
# ℹ 609 more rows

ggplot(class_survey, aes(x = beach_or_mtns,
                         fill = warm_fav)) +
  geom_bar(position = "fill")
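
A filled bar chart displays, for each group, the proportion of responses in each category. As a rough sketch of the numbers behind such a plot, the base R code below tabulates a made-up data set (the `toy` data frame is hypothetical, not the class survey) and converts counts to within-group proportions.

```r
# Hypothetical toy data: 6 beach respondents, 4 mountain respondents
toy <- data.frame(
  beach_or_mtns = rep(c("At the beach", "In the mountains"), times = c(6, 4)),
  warm_fav = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE,
               TRUE, FALSE, FALSE, FALSE)
)

# Two-way table of counts: rows are groups, columns are warm_fav
counts <- table(toy$beach_or_mtns, toy$warm_fav)

# Divide each row by its total, so every row sums to 1
props <- prop.table(counts, margin = 1)
props
```

Each row of `props` corresponds to one bar in the filled chart, and the `TRUE` column gives the height of the warm-season segment.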

## Answering with a statistic

Question: Do beach lovers prefer the warm seasons more than mountain lovers?

library(infer)
obs_stat <- class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  calculate(stat = "diff in props")
obs_stat
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
# A tibble: 1 × 1
stat
<dbl>
1 0.231

The observed difference is nonzero, but could it just be a product of the particular sample of data that we have?
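
To make the statistic concrete, here is a minimal base R sketch of a difference in proportions computed by hand. It uses made-up data (the vectors below are hypothetical, not the class survey) and assumes the subtraction order is beach minus mountains. In `infer`, that order can be set explicitly with the `order` argument of `calculate()`.

```r
# Hypothetical toy data: 6 beach respondents, 4 mountain respondents
group    <- rep(c("At the beach", "In the mountains"), times = c(6, 4))
warm_fav <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE,
              TRUE, FALSE, FALSE, FALSE)

# The mean of a logical vector is the proportion of TRUEs
p_beach <- mean(warm_fav[group == "At the beach"])      # 4 of 6
p_mtns  <- mean(warm_fav[group == "In the mountains"])  # 1 of 4

# Difference in proportions, beach minus mountains (assumed order)
diff_in_props <- p_beach - p_mtns
diff_in_props
```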

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE")
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
# A tibble: 619 × 2
warm_fav beach_or_mtns
<fct>    <fct>
1 TRUE     At the beach
2 TRUE     At the beach
3 FALSE    In the mountains
4 TRUE     At the beach
5 FALSE    In the mountains
6 TRUE     At the beach
7 FALSE    In the mountains
8 FALSE    At the beach
9 FALSE    At the beach
10 TRUE     At the beach
# ℹ 609 more rows

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence")
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
Null Hypothesis: independence
# A tibble: 619 × 2
warm_fav beach_or_mtns
<fct>    <fct>
1 TRUE     At the beach
2 TRUE     At the beach
3 FALSE    In the mountains
4 TRUE     At the beach
5 FALSE    In the mountains
6 TRUE     At the beach
7 FALSE    In the mountains
8 FALSE    At the beach
9 FALSE    At the beach
10 TRUE     At the beach
# ℹ 609 more rows

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1,
           type = "permute")
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
Null Hypothesis: independence
# A tibble: 619 × 3
# Groups:   replicate [1]
warm_fav beach_or_mtns    replicate
<fct>    <fct>                <int>
1 TRUE     At the beach             1
2 TRUE     At the beach             1
3 FALSE    In the mountains         1
4 FALSE    At the beach             1
5 TRUE     In the mountains         1
6 FALSE    At the beach             1
7 FALSE    In the mountains         1
8 TRUE     At the beach             1
9 TRUE     At the beach             1
10 TRUE     At the beach             1
# ℹ 609 more rows

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1,
           type = "permute") # a second shuffle
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
Null Hypothesis: independence
# A tibble: 619 × 3
# Groups:   replicate [1]
warm_fav beach_or_mtns    replicate
<fct>    <fct>                <int>
1 FALSE    At the beach             1
2 TRUE     At the beach             1
3 FALSE    In the mountains         1
4 TRUE     At the beach             1
5 FALSE    In the mountains         1
6 FALSE    At the beach             1
7 TRUE     In the mountains         1
8 TRUE     At the beach             1
9 TRUE     At the beach             1
10 FALSE    At the beach             1
# ℹ 609 more rows

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1,
           type = "permute") # a third shuffle
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
Null Hypothesis: independence
# A tibble: 619 × 3
# Groups:   replicate [1]
warm_fav beach_or_mtns    replicate
<fct>    <fct>                <int>
1 TRUE     At the beach             1
2 TRUE     At the beach             1
3 FALSE    In the mountains         1
4 TRUE     At the beach             1
5 FALSE    In the mountains         1
6 FALSE    At the beach             1
7 TRUE     In the mountains         1
8 TRUE     At the beach             1
9 FALSE    At the beach             1
10 FALSE    At the beach             1
# ℹ 609 more rows
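
In the three shuffles above, the `beach_or_mtns` column stays in place while the `warm_fav` values move between rows. A rough base R analogue of a single permutation, on made-up data (hypothetical, not the class survey), might look like this. It is a sketch of the idea, not a description of how `infer` is implemented.

```r
set.seed(20)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data: 6 beach respondents, 4 mountain respondents
toy <- data.frame(
  beach_or_mtns = rep(c("At the beach", "In the mountains"), times = c(6, 4)),
  warm_fav = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE,
               TRUE, FALSE, FALSE, FALSE)
)

# Shuffle the response while leaving the group labels untouched.
# This breaks any association between the two columns, which is
# exactly what the independence null hypothesis describes.
shuffled <- toy
shuffled$warm_fav <- sample(toy$warm_fav)

# The overall number of TRUEs is unchanged; only the pairing changes
sum(toy$warm_fav)
sum(shuffled$warm_fav)
```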

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500,
           type = "permute")
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
Null Hypothesis: independence
# A tibble: 309,500 × 3
# Groups:   replicate [500]
warm_fav beach_or_mtns    replicate
<fct>    <fct>                <int>
1 FALSE    At the beach             1
2 FALSE    At the beach             1
3 TRUE     In the mountains         1
4 TRUE     At the beach             1
5 TRUE     In the mountains         1
6 FALSE    At the beach             1
7 FALSE    In the mountains         1
8 TRUE     At the beach             1
9 FALSE    At the beach             1
10 TRUE     At the beach             1
# ℹ 309,490 more rows

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500,
           type = "permute") %>%
  calculate(stat = "diff in props")
Response: warm_fav (factor)
Explanatory: beach_or_mtns (factor)
Null Hypothesis: independence
# A tibble: 500 × 2
replicate     stat
<int>    <dbl>
1         1 -0.0245
2         2  0.00394
3         3 -0.0174
4         4 -0.0316
5         5 -0.0103
6         6  0.0111
7         7 -0.0103
8         8 -0.0103
9         9  0.0537
10        10  0.00394
# ℹ 490 more rows
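
The 500 statistics above form a simulated null distribution. The same idea can be sketched in base R with `replicate()`, again on made-up data. The data and the helper `diff_in_props()` are hypothetical illustrations, not part of `infer`.

```r
set.seed(20)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data: 6 beach respondents, 4 mountain respondents
group    <- rep(c("At the beach", "In the mountains"), times = c(6, 4))
warm_fav <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE,
              TRUE, FALSE, FALSE, FALSE)

# Hypothetical helper: beach proportion minus mountain proportion
diff_in_props <- function(response, group) {
  mean(response[group == "At the beach"]) -
    mean(response[group == "In the mountains"])
}

# Shuffle the response 500 times, recomputing the statistic each time
null_stats <- replicate(500, diff_in_props(sample(warm_fav), group))

length(null_stats)  # one statistic per shuffle
mean(null_stats)    # centered near 0, as the null hypothesis implies
```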

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500,
           type = "permute") %>%
  calculate(stat = "diff in props") %>%
  visualize()

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500,
           type = "permute") %>%
  calculate(stat = "diff in props") %>%
  visualize() +
  shade_p_value(obs_stat = obs_stat,
                direction = "both")

## Hypothesis Test Pipeline

class_survey %>%
  specify(response = warm_fav,
          explanatory = beach_or_mtns,
          success = "TRUE") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500,
           type = "permute") %>%
  calculate(stat = "diff in props") %>%
  get_p_value(obs_stat = obs_stat,
              direction = "both")
# A tibble: 1 × 1
p_value
<dbl>
1       0
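
A reported p-value of 0 means that none of the 500 shuffled statistics was as extreme as the observed one. It does not mean the true p-value is exactly zero. The base R sketch below, on made-up data, computes a two-sided p-value by hand and shows two quantities that are sometimes reported instead of a literal 0. The helper, the data, and the `(count + 1) / (reps + 1)` adjustment are assumptions for illustration. `infer` may define its two-sided p-value differently (for example, by doubling the smaller tail).

```r
set.seed(20)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data: 6 beach respondents, 4 mountain respondents
group    <- rep(c("At the beach", "In the mountains"), times = c(6, 4))
warm_fav <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE,
              TRUE, FALSE, FALSE, FALSE)

# Hypothetical helper: beach proportion minus mountain proportion
diff_in_props <- function(response, group) {
  mean(response[group == "At the beach"]) -
    mean(response[group == "In the mountains"])
}

reps <- 500
obs <- diff_in_props(warm_fav, group)
null_stats <- replicate(reps, diff_in_props(sample(warm_fav), group))

# Count shuffles at least as far from 0 as the observed statistic
# (the small tolerance guards against floating-point ties)
n_extreme <- sum(abs(null_stats) >= abs(obs) - 1e-12)

p_value    <- n_extreme / reps              # can be exactly 0
p_adjusted <- (n_extreme + 1) / (reps + 1)  # one common adjustment; never 0
resolution <- 1 / reps                      # smallest nonzero value p_value can take

c(p_value = p_value, p_adjusted = p_adjusted, resolution = resolution)
```

With 500 repetitions, a simulated p-value of 0 is perhaps better read as "less than 1/500 = 0.002"; increasing `reps` sharpens that bound.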