What is the distribution of city/town populations in all cities and towns in California?
What is the distribution of the first digit of city/town populations in all cities and towns in California?
Observation: many naturally occurring numerical variables have a recurring pattern in the distribution of the first digit.
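Benford’s law states that the first digits of such variables follow a specific distribution in which smaller digits are more likely than larger ones.
Let \(X\) be the first digit of a randomly selected number. \(X \sim \text{Benford}()\) if, for \(x = 1, 2, \ldots, 9\),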
\[P(X = x) = \log_{10}\left(1 + 1/x \right)\]
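To get a feel for the formula, the nine probabilities can be computed directly. This is a minimal sketch in base R; the object names `x` and `benford` are chosen here for illustration.

```r
# Possible first digits
x <- 1:9

# Benford probabilities: P(X = x) = log10(1 + 1/x)
benford <- log10(1 + 1 / x)
names(benford) <- x

# Probabilities rounded to three decimals
round(benford, 3)

# The nine probabilities sum to 1
sum(benford)
```

Under this distribution a first digit of 1 has probability \(\log_{10}(2) \approx 0.301\), while a first digit of 9 has probability \(\log_{10}(10/9) \approx 0.046\).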
In Iran’s 2009 presidential election, Ahmadinejad won with 62.6% of the votes cast, while Mousavi received 33.75%.
Was the election fraudulent?
In a normally occurring, fair election, the first digits of the county-by-county vote counts should follow Benford’s Law. If they do not, that might suggest that the vote counts have been manually altered.
This theory was brought to bear to determine whether the 2009 presidential election in Iran showed irregularities.¹
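One way such a comparison might be set up is sketched below in base R. The helper `first_digit()` is a hypothetical name introduced here, and the data are a stand-in (powers of two), not election results. The idea is simply to tabulate observed first digits and place them next to the Benford probabilities.

```r
# Hypothetical helper: first digit of positive numbers that are at least 1
first_digit <- function(v) as.integer(substr(as.character(v), 1, 1))

# Stand-in data (NOT vote counts): the powers of two 2^1, ..., 2^100
counts <- 2^(1:100)

# Observed proportion of each first digit, keeping all nine digits as levels
digits <- factor(first_digit(counts), levels = 1:9)
observed <- as.numeric(prop.table(table(digits)))

# Proportions expected under Benford's law
expected <- log10(1 + 1 / (1:9))

# Side-by-side comparison
data.frame(digit = 1:9, observed = observed, expected = round(expected, 3))
```

With real vote counts, `counts` would be replaced by the column of county-level totals. A large gap between the two columns would be a reason to look more closely, not proof of fraud.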
get_first()
slice_sample()
pull()
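A short sketch of how these pieces might fit together, assuming the dplyr package is installed. The exact signature of `get_first()` is not shown here, so the code uses a hypothetical stand-in called `first_digit()`. The data come from R's built-in `state.x77` matrix rather than the California populations.

```r
library(dplyr)

# Hypothetical stand-in for a first-digit helper; assumes positive values >= 1
first_digit <- function(v) as.integer(substr(as.character(v), 1, 1))

# Built-in data: state populations from the state.x77 matrix
states <- data.frame(
  state = rownames(state.x77),
  population = unname(state.x77[, "Population"])
)

set.seed(20)  # make the random sample reproducible

sampled <- states %>%
  slice_sample(n = 10) %>%  # draw 10 rows at random
  pull(population)          # extract the column as a plain vector

# First digit of each sampled population
first_digit(sampled)
```

`slice_sample()` returns a data frame of randomly chosen rows, and `pull()` turns one of its columns into an ordinary vector that a helper function can work on.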
Statisticians, scientists, and engineers work on projects that include code, data, figures, and text. For large-scale or long-running projects, we need a system to track and share everything.
The stat20data package has its code and data stored on GitHub.
Data from GitHub or other websites can be loaded into R directly from a URL.
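As a rough sketch of the mechanism: `read.csv()` accepts a URL in place of a file path. The GitHub path in the comment below is a placeholder pattern, not a real file, and the runnable part uses a toy CSV written to a temporary file (made-up rows, Unix-style path assumed) so the example works without a network connection.

```r
# General shape of a raw GitHub file URL (placeholders, not a real path):
#   https://raw.githubusercontent.com/OWNER/REPO/BRANCH/path/to/file.csv

# Runnable stand-in: write a toy CSV locally, then read it back through a URL
tmp <- tempfile(fileext = ".csv")
writeLines(c("city,population", "A,1200", "B,350", "C,98000"), tmp)

toy_url <- paste0("file://", tmp)  # a file:// URL pointing at the toy CSV
dat <- read.csv(toy_url)           # same call works with an https:// URL
dat
```

For data hosted on GitHub, the URL passed to `read.csv()` should point to the raw file rather than to the rendered page that displays it.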
So where to find the link to the raw data?