AE 10: Introduction to Probability

Getting Started

- Complete this survey!
- Wait until we finish watching a video.
- Clone the repository named “ae10-GitHubUsername” from the course GitHub organization page into RStudio.
- Open the .Rmd file and replace “Your Name” with your name.

Introduction

Today we load knitr to use the kable() function, which displays tables neatly:

library(tidyverse)
library(knitr)

We will also use a new function, pivot_wider(), from the tidyr package (part of the tidyverse):

🎥 Video and Slides

For this Application Exercise, we will look at our newly collected data. Remove eval = FALSE from the code chunk below so that the data are read in when you knit.

sta199 <- read_csv("data/sta199-ae10.csv")

The dataset includes the following variables:

year: Year in school
animal: Whether you prefer cats or dogs
movie: Favorite movie genre
stat_major: Whether you are a statistical science major

Events, Sample Space

Part 1

Give two examples of an event from the dataset.

Part 2

Let’s take a look at favorite movie genre. Note that we have categorized genres so that each person can only have one favorite genre.

Q- What is the sample space for favorite movie genre? You can use code to identify the sample space.

# code here
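As a minimal sketch of one way to list a sample space, the example below uses a small made-up tibble (`toy`, with hypothetical genre labels) rather than the class data. distinct() keeps each value of a column once, so the result lists every outcome that was observed. Keep in mind that a sample space contains every possible outcome, so an option nobody chose would not show up this way.

```r
library(dplyr)

# Hypothetical responses; the real survey's genre labels may differ.
toy <- tibble(movie = c("comedy", "action", "comedy", "drama", "action"))

# Each distinct value of `movie` is one outcome in the observed sample space.
sample_space <- toy %>%
  distinct(movie) %>%
  arrange(movie)

sample_space
```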

Part 3

Q- How large is the sample space of any individual’s response?

# code here

The sample space for the four survey questions contains different outcomes.
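One way to think about the size of the joint sample space, sketched on made-up data: if every combination of answers is possible, the number of outcomes is the product of the number of options for each question. The tibble `toy` and its values are hypothetical, and counting observed values with n_distinct() assumes every option was chosen by at least one person.

```r
library(dplyr)

# Hypothetical responses to three questions (made up for illustration).
toy <- tibble(
  year   = c("First-year", "Senior", "Junior"),
  animal = c("cats", "dogs", "dogs"),
  movie  = c("comedy", "comedy", "drama")
)

# Number of distinct observed values in each column.
n_levels <- toy %>%
  summarise(across(everything(), n_distinct))

# If every combination of answers is possible, the counts multiply.
sample_space_size <- prod(unlist(n_levels))

sample_space_size
```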

Probability

Part 4

Let’s make a table that includes year, the number of students in each, and the associated probabilities.

Q- What is the probability a randomly selected STA199 student is a freshman?

# code here
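A table like this can be built with count() and mutate(). The sketch below runs on made-up data (`toy` and its year labels are hypothetical, not the class responses) and treats the relative frequency of each year as its probability.

```r
library(dplyr)
library(knitr)

# Hypothetical class-year responses (made up for illustration).
toy <- tibble(year = c("First-year", "Sophomore", "First-year", "Junior",
                       "Senior", "Sophomore", "First-year", "Junior"))

year_probs <- toy %>%
  count(year) %>%            # one row per year, with the count in column n
  mutate(prob = n / sum(n))  # relative frequency used as the probability

# kable() renders the table neatly in the knitted document.
kable(year_probs, digits = 3)
```

The probability for a single year is then the value in the prob column of that year's row.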

Part 5

Q- What is the probability a randomly selected STA199 student favors cats? Answer it with a table that includes animal, the number of students who prefer each, and the associated probabilities.

# code here

Part 6

Q- What is the probability a randomly selected STA199 student is not a senior and prefers dogs?

Consider the event that someone is not a senior and prefers dogs.

# code here
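For an event defined by two conditions that must both hold, one possible approach is to combine the conditions with `&` and take the proportion of rows where the result is TRUE. This is a sketch on made-up data; `toy` and the labels "Senior" and "dogs" are assumptions about how the values are coded.

```r
library(dplyr)

# Hypothetical responses (made up for illustration).
toy <- tibble(
  year   = c("Senior", "Junior", "Sophomore", "Senior", "First-year", "Junior"),
  animal = c("dogs",   "dogs",   "cats",      "cats",   "dogs",       "cats")
)

# The mean of a logical vector is the proportion of TRUE values,
# i.e. the share of rows where both conditions hold.
p_event <- toy %>%
  summarise(prob = mean(year != "Senior" & animal == "dogs")) %>%
  pull(prob)

p_event
```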

Part 7

Q- What is the probability a randomly selected STA199 student likes either action movies or comedy and is a statsci major?

Consider the event that someone is an action or comedy movie lover and a statsci major.

# code here

Part 8

Now we examine the relationship between favorite animal and favorite movie. Let’s make a table of the number of students for every combination of favorite animals and movie genres.

# code here

Using pivot_wider(), we’ll reformat the data into a contingency table, a table frequently used to study the association between two categorical variables. In this contingency table, each row will represent an animal, each column will represent a movie genre, and each cell will hold the number of students who have that particular combination of animal and movie genre.

# code here
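The two steps described above, counting every combination and then reshaping the counts, can be sketched as follows. The data are made up (`toy` and its labels are hypothetical), and `values_fill = 0` is a choice made here so that combinations nobody picked show 0 instead of NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical responses (made up for illustration).
toy <- tibble(
  animal = c("cats", "dogs", "dogs", "cats", "dogs", "dogs"),
  movie  = c("comedy", "comedy", "drama", "comedy", "sci-fi", "drama")
)

contingency <- toy %>%
  count(animal, movie) %>%   # long format: one row per observed combination
  pivot_wider(
    id_cols = animal,        # one row per animal
    names_from = movie,      # one column per movie genre
    values_from = n,         # cells hold the counts
    values_fill = 0          # unobserved combinations become 0
  )

contingency
```

Column totals then answer questions about one genre, and single cells answer questions about one animal and genre combination.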

Q - How many students in STA199 like sci-fi movies?

Q - How many students in STA199 like dogs and dramas?

Practice

For each of the following exercises:

Calculate the probability using a relevant contingency table.
Then write code to check your answer using the sta199 data frame and dplyr functions.
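For an "or" question, the by-hand calculation typically uses the addition rule, P(A or B) = P(A) + P(B) - P(A and B), and the code check filters for rows in either event. The sketch below shows both on made-up data; `toy` and the "yes"/"no" coding of stat_major are assumptions, not the class data.

```r
library(dplyr)

# Hypothetical responses (made up for illustration).
toy <- tibble(
  year       = c("Junior", "Junior", "Senior", "Sophomore", "Senior"),
  stat_major = c("yes",    "no",     "no",     "yes",       "yes")
)

n_total <- nrow(toy)

# By hand, from counts: P(A or B) = P(A) + P(B) - P(A and B).
p_junior   <- sum(toy$year == "Junior") / n_total
p_not_stat <- sum(toy$stat_major == "no") / n_total
p_both     <- sum(toy$year == "Junior" & toy$stat_major == "no") / n_total
p_by_rule  <- p_junior + p_not_stat - p_both

# Check with dplyr: keep rows in either event, then divide by the total.
n_either    <- nrow(filter(toy, year == "Junior" | stat_major == "no"))
p_by_filter <- n_either / n_total

c(rule = p_by_rule, filter = p_by_filter)
```

Subtracting P(A and B) matters because rows in both events would otherwise be counted twice.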

Relationship between year and major

Q - What is the probability a randomly selected STA199 student is a junior or not a statsci major?

# code here

# code here

Q - What is the probability a randomly selected STA199 student is a statsci major?

# code here

Relationship between animal and major

Q - What is the probability a randomly selected STA199 student likes dogs and is a statsci major?

# code here

# code here

Relationship between year and movie

Q - What is the probability a randomly selected STA199 student is a senior or does not pick sci-fi as their favorite movie genre?

# code here

# code here

Submitting Application Exercises

Once you have completed the activity, push your final changes to your GitHub repo.
Make sure you have committed at least three times.
Check that your repo is updated on GitHub, and that’s all you need to do to submit application exercises for participation.