class: center, middle, inverse, title-slide

# Data Visualization 2
### Bora Jin

---

layout: true

<div class="my-footer">
<span>
<a href="https://introds.org" target="_blank">introds.org</a>
</span>
</div>

---

## Material

🎥 Watch [Visualizing Numerical Data](https://youtu.be/waBabVTI8ec) - [Slides](https://rstudio-education.github.io/datascience-box/course-materials/slides/u2-d03-viz-num/u2-d03-viz-num.html#1)

🎥 Watch [Visualizing Categorical Data](https://youtu.be/21h3rEO8k2E) - [Slides](https://rstudio-education.github.io/datascience-box/course-materials/slides/u2-d04-viz-cat/u2-d04-viz-cat.html#1)

---

## Today's Goal

- Explain continuous, discrete, and categorical variables
- Understand how to make visualizations and summarize variables according to their type
- Develop a faceted plot

---

## Quiz

**Q - (Numerical / Categorical) variables can be classified as either continuous or discrete.**

--

Numerical

--

**Q - An (ordinal / nominal) categorical variable has a natural ordering.**

--

Ordinal

--

**Q - Classify the following variables:**

- monthly expenses
--
: numerical, continuous
--

- number of shoes
--
: numerical, discrete
--

- course satisfaction rating (“extremely dislike”, “dislike”, “neutral”, “like”, “extremely like”)
--
: categorical, ordinal
--

- eye color
--
: categorical, nominal

---

## Quiz

**Q - Describe the shape of the following distribution of a numerical variable with respect to skewness and modality.**

<img src="04-visual2_BJ_files/figure-html/left-skewed-1.png" width="50%" style="display: block; margin: auto;" />

--

left-skewed, unimodal

---

## Quiz

**Q - Describe the shape of the following distribution of a numerical variable with respect to skewness and modality.**

<img src="04-visual2_BJ_files/figure-html/uniform-1.png" width="50%" style="display: block; margin: auto;" />

--

symmetric, uniform

---

## Quiz

**Q - Describe the shape of the following distribution of a numerical variable with respect to
skewness and modality.**

<img src="04-visual2_BJ_files/figure-html/bimodal-1.png" width="50%" style="display: block; margin: auto;" />

--

bimodal

---

## Quiz

**Q - Fill in the blanks with appropriate** `R` **functions**

- center: mean (`___`), median (`___`)
- spread: range (`range`), standard deviation (`___`), interquartile range (`IQR`)

---

## Quiz

**Q - Fill in the blanks with appropriate** `R` **functions**

- center: mean (`mean`), median (`median`)
- spread: range (`range`), standard deviation (`sd`), interquartile range (`IQR`)

--

**Q - What plot might you draw if you want to detect potential outliers?**

--

Box plot

---

## Quiz

**Q - Which of these commands are inappropriate for visualizing the distribution of a single numerical variable?**

a. `geom_histogram()`

b. `geom_point()`

c. `geom_density()`

d. `geom_boxplot()`

e. `geom_hex()`

---

## Quiz

**Q - Which of these commands are inappropriate for visualizing the distribution of a single numerical variable?**

a. `geom_histogram()`

**b. `geom_point()` - visualizes relationships between two numerical variables**

c. `geom_density()`

d. `geom_boxplot()`

**e. `geom_hex()` - visualizes relationships between two numerical variables through binning**

---

## Quiz

**Q - Which of these commands are inappropriate for visualizing relationships between numerical and categorical variables?**

a. `geom_boxplot()`

b. `geom_violin()`

c. `geom_density_ridges()`

d. `geom_bar()`

---

## Quiz

**Q - Which of these commands are inappropriate for visualizing relationships between numerical and categorical variables?**

a. `geom_boxplot()`

b. `geom_violin()`

c. `geom_density_ridges()`

**d. `geom_bar()` - visualizes the distribution of a categorical variable or the relationship between categorical variables**

---

## Quiz

**Q - Which of these is most relevant to the difference between the two bar plots?**

.pull-left[
<img src="04-visual2_BJ_files/figure-html/segmented-bar-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[

a.
`aes(x = homeownership, fill = grade)`

b. `position = "fill"`

c. `labs()`
]

---

## Quiz

**Q - Which of these is most relevant to the difference between the two bar plots?**

.pull-left[
<img src="04-visual2_BJ_files/figure-html/segmented-bar2-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[

a. `aes(x = homeownership, fill = grade)`

**b. `position = "fill"` - shows relative frequencies within each level of x**

c. `labs()`
]

---

class: middle, center

# Questions?

---

## Let's Practice Together!

Go to [AE 04: Data Visualization 2](https://sta199-summer22.netlify.app/appex/ae04_BJ.html)

---

## Bulletin

- Lab 01 is due today at 11:59pm
- Watch videos for [Prepare: May 17](https://sta199-summer22.netlify.app/prepare/week02_may17_BJ.html)
- Complete Part 4 and Practice of `ae03`
- Complete Parts 1-2 of `ae04`