infer
To demonstrate how to conduct a hypothesis test through simulation, we will collect data from this class using a poll.
You will have only 15 seconds to answer the following multiple-choice question, so please get ready at pollev.com.
…
The two shapes above have simple first names:
Which of the two names belongs to the shape on the left?
00:15
What is a statement of the null hypothesis that corresponds to the notion that the link between names and shapes is arbitrary?
01:00
\[\hat{p}_k = \frac{\textrm{Number who chose "Kiki"}}{\textrm{Total number of people}}\]
Note: you could also simply use \(n_k\), the number of people who chose “Kiki”, as the test statistic.
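As a quick worked example of the statistic, assume the placeholder counts used in the code later in this section (40 people choosing “Kiki” and 20 choosing “Booba”); these are illustrative values, not actual poll results:

\[\hat{p}_k = \frac{40}{40 + 20} = \frac{40}{60} \approx 0.667\]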
Our technique: simulate data from a world in which the null is true, then calculate the test statistic on the simulated data.
Which simulation method(s) align with the null hypothesis and our data collection process?
01:00
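The technique can be sketched in base R before turning to a package. This is a minimal sketch, assuming the null means each respondent picks “Kiki” with probability 0.5, like a fair coin flip. The class size `n` is a placeholder, and the names `simulate_p_hat` and `null_props` are made up for illustration.

```r
set.seed(20)  # arbitrary seed so the sketch is reproducible

n <- 60  # placeholder class size; replace with the number of poll responses

# Simulate one poll under the null: each person picks a name at random,
# so "Kiki" is chosen with probability 0.5
simulate_p_hat <- function(n) {
  choices <- sample(c("Kiki", "Booba"), size = n, replace = TRUE)
  mean(choices == "Kiki")  # test statistic: proportion who chose "Kiki"
}

# Repeat many times to approximate the null distribution of the statistic
null_props <- replicate(500, simulate_p_hat(n))

summary(null_props)
```

Each call to `simulate_p_hat()` plays the role of one imaginary class in a world where names and shapes are unrelated; the 500 resulting proportions show how much the statistic varies by chance alone.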
infer
library(tidyverse)
library(infer)

# update these based on the poll
n_k <- 40  # number who chose "Kiki"
n_b <- 20  # number who chose "Booba"

# one row per respondent
shapes <- data.frame(name = c(rep("Kiki", n_k),
                              rep("Booba", n_b)))
# simulate a single data set under the null (p = .5) and compute its proportion
shapes %>%
  specify(response = name,
          success = "Kiki") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 1, type = "draw") %>%
  calculate(stat = "prop")
# repeat 500 times to build the null distribution of the proportion
null <- shapes %>%
  specify(response = name,
          success = "Kiki") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 500, type = "draw") %>%
  calculate(stat = "prop")
# observed statistic: same pipeline without the hypothesize and generate steps
obs_p_hat <- shapes %>%
  specify(response = name,
          success = "Kiki") %>%
  # hypothesize(null = "point", p = .5) %>%
  # generate(reps = 500, type = "draw") %>%
  calculate(stat = "prop")
# plot the null distribution and shade the two-sided p-value region
null %>%
  visualise() +
  shade_pvalue(obs_p_hat, direction = "both")
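The shaded region in the plot can also be reported as a number. The sketch below uses infer's `get_p_value()` with the same direction as the plot. It repeats the setup so that it runs on its own, and the counts and seed are placeholders rather than actual poll results.

```r
library(infer)

set.seed(20)  # arbitrary seed so the simulated p-value is reproducible

# placeholder counts; update these based on the poll
shapes <- data.frame(name = c(rep("Kiki", 40),
                              rep("Booba", 20)))

# null distribution of the proportion under p = .5
null <- shapes %>%
  specify(response = name,
          success = "Kiki") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 500, type = "draw") %>%
  calculate(stat = "prop")

# observed proportion who chose "Kiki"
obs_p_hat <- shapes %>%
  specify(response = name,
          success = "Kiki") %>%
  calculate(stat = "prop")

# two-sided p-value computed from the simulated null distribution
p_value <- null %>%
  get_p_value(obs_stat = obs_p_hat, direction = "both")

p_value
```

Because the p-value comes from only 500 simulated data sets, it will change slightly from run to run unless a seed is set, and infer may warn that a reported value of 0 is only an approximation.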
What is the proper interpretation of this p-value?
01:00