Complete this survey!
Wait until we finish watching a video
Open the .Rmd file and replace “Your Name” with your name.
Today we load knitr to use the kable() function, which neatly displays tables:
library(tidyverse)
library(knitr)
We will also use a new function, pivot_wider(), from the tidyr package (\(\in\) tidyverse).
For this Application Exercise, we will look at our newly collected data. Remove eval = FALSE from the code chunk to read the data.
sta199 <- read_csv("data/sta199-ae10.csv")
The dataset includes:
year: Year in school
animal: Whether you prefer cats or dogs
movie: Favorite movie genre
stat_major: Statistical science major or not
Give two examples of an event from the dataset.
Let’s take a look at favorite movie genre. Note that we have categorized genres so that each person can only have one favorite genre.
Q- What is the sample space for favorite movie genre? You can use code to identify the sample space.
# code here
Q- How large is the sample space of any individual’s response?
# code here
The sample space for the four survey questions contains \(4 \times 2 \times 4 \times 2 = 64\) different outcomes.
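The count of 64 can be checked by listing every possible combination of answers. The sketch below is only illustrative: the level names are assumptions pieced together from the questions in this exercise ("sophomore" and the "yes"/"no" coding of stat_major in particular are guesses), not values read from the survey data.

```r
library(tidyr)

# Assumed answer options for each survey question (not read from the data).
sample_space <- expand_grid(
  year       = c("freshman", "sophomore", "junior", "senior"),
  animal     = c("cats", "dogs"),
  movie      = c("action", "comedy", "drama", "sci-fi"),
  stat_major = c("yes", "no")
)

# One row per possible outcome: 4 * 2 * 4 * 2 rows in total.
nrow(sample_space)
```

Each row of sample_space is one outcome, so the number of rows is the size of the sample space.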
Let’s make a table that includes year, the number of students in each year, and the associated probabilities.
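One possible way to build such a table is to count the rows in each group and divide by the total. The sketch below uses a small made-up data frame called toy (a hypothetical name, not the class data) so that it runs on its own; the same pattern should carry over to sta199.

```r
library(dplyr)

# Made-up responses for illustration only; this is NOT the class survey data.
toy <- tibble(
  year = c("freshman", "freshman", "sophomore", "junior",
           "senior", "senior", "senior", "freshman")
)

year_probs <- toy %>%
  count(year) %>%               # n = number of students in each year
  mutate(prob = n / sum(n))     # empirical probability of each year

year_probs
```

Piping year_probs into kable() would display it as a formatted table in the knitted document.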
Q- What is the probability a randomly selected STA199 student is a freshman?
# code here
Q- What is the probability a randomly selected STA199 student favors cats? Answer it with a table that includes animal, the number of students who prefer each, and the associated probabilities.
# code here
Q- What is the probability a randomly selected STA199 student is not a senior and prefers dogs?
Let \(A\) be the event that someone is not a senior and prefers dogs.
# code here
Q- What is the probability a randomly selected STA199 student likes either action movies or comedy and is a statsci major?
Let \(B\) be the event someone is an action or comedy movie lover and a statsci major.
# code here
Now we examine the relationship between favorite animal and favorite movie. Let’s make a table of the number of students for every combination of favorite animals and movie genres.
# code here
Using pivot_wider(), we’ll reformat the data into a contingency table, a table frequently used to study the association between two categorical variables. In this contingency table, each row will represent an animal, each column will represent a movie genre, and each cell will be the number of students who have a particular combination of animal and movie genre.
# code here
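As a rough illustration of the reshaping step, the sketch below counts each animal and movie combination in a made-up data frame (toy is a hypothetical name and the responses are invented) and then spreads the movie genres across columns. Setting values_fill = 0 is a design choice assumed here so that combinations nobody picked show 0 rather than NA.

```r
library(dplyr)
library(tidyr)

# Made-up responses for illustration only; this is NOT the class survey data.
toy <- tibble(
  animal = c("cats", "dogs", "dogs", "cats", "dogs", "dogs"),
  movie  = c("drama", "drama", "sci-fi", "action", "drama", "comedy")
)

contingency <- toy %>%
  count(animal, movie) %>%      # long format: one row per observed combination
  pivot_wider(
    names_from  = movie,        # each movie genre becomes a column
    values_from = n,            # cells hold the counts
    values_fill = 0             # unobserved combinations become 0, not NA
  )

contingency
```

Reading across a row gives the genre counts for one animal; summing a column gives the total for one genre.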
Q - How many students in STA199 like sci-fi movies?
Q - How many students in STA199 like dogs and dramas?
For each of the following exercises, use the sta199 data frame and dplyr functions.
year and major
Q - What is the probability a randomly selected STA199 student is a junior or not a statsci major?
# code here
# code here
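An "or" probability like this one can be computed directly, or with the addition rule \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\). The sketch below compares the two on a made-up data frame (toy is a hypothetical name, the responses are invented, and the "yes"/"no" coding of stat_major is an assumption).

```r
library(dplyr)

# Made-up responses for illustration only; this is NOT the class survey data.
toy <- tibble(
  year       = c("junior", "junior", "senior", "freshman", "junior", "sophomore"),
  stat_major = c("yes", "no", "no", "yes", "yes", "no")
)

check <- toy %>%
  summarise(
    p_a       = mean(year == "junior"),                       # P(A)
    p_b       = mean(stat_major == "no"),                     # P(B)
    p_a_and_b = mean(year == "junior" & stat_major == "no"),  # P(A and B)
    p_a_or_b  = mean(year == "junior" | stat_major == "no"),  # P(A or B), direct
    via_rule  = p_a + p_b - p_a_and_b                         # addition rule
  )

check
```

The mean of a logical vector is the proportion of TRUE values, which is why mean() gives an empirical probability here. The last two columns should agree.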
Q - What is the probability a randomly selected STA199 student is a statsci major?
# code here
animal and major
Q - What is the probability a randomly selected STA199 student likes dogs and is a statsci major?
# code here
# code here
year and movie
Q - What is the probability a randomly selected STA199 student is a senior or does not pick sci-fi as their favorite movie genre?
# code here
# code here