# Taxonomy of Data

STAT 20: Introduction to Probability and Statistics

## Agenda

• Concept Questions: Taxonomy of Data
• Worksheet on Paper
• Break
• Worksheet via RStudio

# Concept Questions

## Types of Variables

A variable can refer to any of three things:

1. A phenomenon
2. How the phenomenon is recorded or measured as data
   • What values can it take? (This is often an intent- or value-laden exercise!)
   • For numerical variables, what unit should we express it in?
3. How the recorded data is analyzed
   • Binning/discretizing income data
   • Using a histogram when a bar chart would have too many bars
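As a rough illustration of the third point, the sketch below bins a numerical income variable into ordered categories. The income values, the cut points, and the labels are all made up for this example; they are assumptions, not part of the course material.

```python
from bisect import bisect_right

# Hypothetical incomes in dollars (made-up values for illustration only).
incomes = [12_000, 35_500, 58_000, 91_250, 150_000]

# Assumed bin edges and labels; any other cut points could be chosen instead.
edges = [25_000, 75_000]
labels = ["low", "middle", "high"]


def bin_income(income):
    """Return the label of the bin that an income falls into."""
    # bisect_right counts how many edges are <= income, which is the bin index.
    return labels[bisect_right(edges, income)]


binned = [bin_income(x) for x in incomes]
print(binned)
```

The same phenomenon (income) is numerical as recorded but is treated as an ordinal categorical variable once binned, which is why the type of a variable can depend on the stage of analysis.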

What type of variable is age?

Answer at pollev.com/<name>


## Images as data

• Images are composed of pixels (this image is 1020 by 1520 pixels)

• The color of each pixel is encoded in RGB (red, green, and blue bands)

• Each band takes a value from 0 to 255

• This image is data with 1020 x 1520 x 3 values.

## Grayscale

• Grayscale images have only one band
• 0 is black, 255 is white
• This image is data with 1020 x 1520 x 1 values.
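The value counts for the color and grayscale versions can be checked with a little arithmetic. This is a minimal sketch that uses the 1020 x 1520 dimensions quoted above; the function name `count_values` is hypothetical.

```python
# Pixel dimensions as quoted on the slides.
dims = (1020, 1520)


def count_values(dims, bands):
    """Number of stored values: one per pixel per band."""
    return dims[0] * dims[1] * bands


rgb_values = count_values(dims, bands=3)   # red, green, and blue bands
gray_values = count_values(dims, bands=1)  # a single intensity band

print(rgb_values)   # 4651200
print(gray_values)  # 1550400
```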

To simplify, assume our photos are 8 x 8 grayscale images.

## Images in a Data Frame

If you were to put the data from these (8 x 8 grayscale) images into a data frame, what would the dimensions of that data frame be in rows x columns?
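One possible layout, sketched below, treats each image as one observation (a row) and each pixel as one variable (a column). This is an assumption about how the table is organized, not the only valid answer. The number of images (`n_images = 3`) and the random pixel values are hypothetical placeholders.

```python
import random

random.seed(0)   # fixed seed so the made-up pixel values are reproducible
n_images = 3     # hypothetical number of photos; the slides do not state it
side = 8         # images are assumed to be 8 x 8 grayscale

# Each image is a list of 8 rows, each holding 8 intensities from 0 to 255.
images = [
    [[random.randint(0, 255) for _ in range(side)] for _ in range(side)]
    for _ in range(n_images)
]


def flatten(image):
    """Unroll an image row by row into a single list of pixel values."""
    return [pixel for row in image for pixel in row]


# One row per image, one column per pixel.
table = [flatten(image) for image in images]
n_rows, n_cols = len(table), len(table[0])
print(n_rows, n_cols)  # 3 64
```

Under this layout the number of rows equals the number of images, and the number of columns is 8 x 8 = 64.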

