
# Final Prep

Last edited 3 days ago by Eddie Coda

## Question 1.

A die is tossed until the first 6 occurs. What is the probability that it takes 4 or more tosses? Estimate the probability for this geometric distribution by simulating 1000 random samples. Create a histogram of your simulations and describe the shape of the distribution.
# Step 1: Set up the problem
p_success <- 1/6
p_failure <- 5/6

# Step 2: Calculate the exact probability using the CCDF
# P(X >= 4) means the first three tosses are all failures, so it equals p_failure^3
p_4or_more <- p_failure^3
p_4or_more

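# Step 3: Simulate 10,000 random samples using the geometric distribution
# (the question asks for 1000; a larger sample reduces sampling noise)
set.seed(42) # Set the seed for reproducibility
n <- 10000
# rgeom() counts the failures before the first success, so add 1 to get the number of tosses
simulations <- rgeom(n, prob = p_success) + 1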

# Step 4: Create a histogram using ggplot2
library(ggplot2)
ggplot(data.frame(simulations), aes(x = simulations)) +
  geom_histogram(binwidth = 1, color = "black", fill = "skyblue") +
  labs(title = "Waiting for the 6: A Board Game Enthusiast's Journey",
       x = "Number of Rolls to Get the First 6",
       y = "Frequency") +
  theme_minimal()

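# Step 5: Describe the distribution
# The histogram is strongly right-skewed: 1 toss is the most common outcome,
# and the frequencies shrink by a factor of about 5/6 with each additional toss.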
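
The question also asks for a simulation-based estimate of the probability. A minimal sketch of that comparison follows; the names `tosses`, `exact_prob`, `exact_via_pgeom`, and `estimated_prob` are illustrative and not part of the original solution.

```r
# Illustrative check for Question 1 (variable names are our own).
set.seed(42)
p_success <- 1 / 6
n <- 10000

# rgeom() returns the number of failures before the first success,
# so adding 1 gives the number of tosses needed to see the first 6.
tosses <- rgeom(n, prob = p_success) + 1

# Exact: needing 4 or more tosses means the first three tosses all miss.
exact_prob <- (1 - p_success)^3

# The same value from the geometric CCDF: more than 2 failures before success.
exact_via_pgeom <- pgeom(2, prob = p_success, lower.tail = FALSE)

# Estimated: share of simulated runs that needed 4 or more tosses.
estimated_prob <- mean(tosses >= 4)

print(c(exact = exact_prob, pgeom = exact_via_pgeom, estimated = estimated_prob))
```

The exact value is (5/6)^3 = 125/216, a little under 0.58, and the simulated share should land close to it.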

## Question 2.

UFO sightings have been reported to occur at an average rate of five per hour during certain clear nights. What is the probability that a UFO hunter will spot exactly ten UFOs in two hours?
- Run a random sample of this event and simulate it to estimate the probability and compare it to the exact probability.
- Create a histogram and describe the shape of the distribution.
lambda <- 5 * 2 # Rate per hour * number of hours
k <- 10 # Number of UFOs
exact_prob <- dpois(k, lambda)

n_simulations <- 10000
simulated_UFOs <- rpois(n_simulations, lambda)

estimated_prob <- sum(simulated_UFOs == k) / n_simulations

hist(
  simulated_UFOs,
  main = "Simulated UFO Sightings 🛸",
  xlab = "Number of UFOs Spotted",
  col = "lightblue", border = "black",
  # Offset the breaks by 0.5 so each integer count gets its own bar
  breaks = seq(min(simulated_UFOs) - 0.5, max(simulated_UFOs) + 0.5, 1)
)
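
To make the comparison the question asks for explicit, the sketch below computes the Poisson probability by hand from the formula P(X = k) = e^(-λ) λ^k / k!, checks it against `dpois()`, and prints it next to a simulated estimate. The names `manual_prob`, `draws`, and `comparison` are illustrative.

```r
# Illustrative check for Question 2 (variable names are our own).
lambda <- 5 * 2   # five sightings per hour over two hours
k <- 10

# Poisson pmf written out by hand.
manual_prob <- exp(-lambda) * lambda^k / factorial(k)

# The built-in pmf should agree with the hand calculation.
builtin_prob <- dpois(k, lambda)

# Simulated estimate: share of simulated two-hour windows with exactly k sightings.
set.seed(42)
draws <- rpois(10000, lambda)
estimated_prob <- mean(draws == k)

comparison <- data.frame(
  Method = c("Formula", "dpois", "Simulated"),
  Probability = c(manual_prob, builtin_prob, estimated_prob)
)
print(comparison)
```

The exact probability is about 0.125. With λ = 10 the histogram should look roughly bell-shaped and centered near 10, with a slightly longer right tail.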

## Question 3.

Let W ∼ Uniform(8, 12). Let M be the growth of a mystical tree in centimeters after being exposed to enchanted unicorn droppings, with the growth rate per day being equal to W.
- Use R to simulate W. Simulate the mean and pdf of M and compare to the exact results.
- Create one graph with both the theoretical density and the simulated distribution.
n_simulations <- 10000
simulated_W <- runif(n_simulations, min = 8, max = 12)

# Calculate the exact mean of W:
exact_mean_W <- (8 + 12) / 2

# Estimate the mean of M using the simulated data (this treats M as one day's growth, so M = W):
estimated_mean_M <- mean(simulated_W)

# Compare the exact and estimated means:
comparison_table <- data.frame(
Means = c("Exact", "Estimated"),
Values = c(exact_mean_W, estimated_mean_M)
)
print(comparison_table)

# Create a density plot of the simulated data:
plot(density(simulated_W), main = "Mystical Tree Growth Distribution", xlab = "Growth Rate (cm/day)", ylim = c(0, 0.35), col = "blue")

# Overlay the theoretical density of the uniform distribution:
curve(dunif(x, min = 8, max = 12), add = TRUE, col = "red", lwd = 2)
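
A kernel density estimate rounds off the sharp edges at 8 and 12, so a binned comparison can be a useful numeric cross-check of the pdf. The sketch below is one way to do that under the same Uniform(8, 12) assumption; the bin width of 0.5 and the names `w`, `h`, and `pdf_check` are arbitrary choices for illustration.

```r
# Illustrative pdf check for Question 3 (variable names are our own).
set.seed(42)
w <- runif(10000, min = 8, max = 12)

# Bin the simulated values without drawing; h$density holds the
# histogram heights scaled so the total area is 1.
h <- hist(w, breaks = seq(8, 12, by = 0.5), plot = FALSE)

# Compare each bin's simulated density to the exact density 1 / (12 - 8).
pdf_check <- data.frame(
  bin_mid = h$mids,
  simulated_density = h$density,
  exact_density = dunif(h$mids, min = 8, max = 12)
)
print(pdf_check)
```

Every row should show a simulated density near the exact value of 0.25.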

## Question 4.

As an adventurer, you've found a legendary key that can open secret passages in an ancient temple. The key, being centuries old, has a 12% chance of breaking permanently each day. You want to calculate the probability that the key remains intact on each day, from day 1 to day 30. You also want to create a plot of this to demonstrate.
days <- 1:30
prob_breaking <- 0.12
prob_intact <- pgeom(days - 1, prob_breaking, lower.tail = FALSE)

# Create a plot of the probabilities:
plot(days, prob_intact, type = "l", main = "Probability of the Legendary Key Remaining Intact", xlab = "Day", ylab = "Probability", col = "darkgreen", lwd = 2)
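
The `pgeom()` call can be cross-checked against a closed form: the key is intact after day d only if it survives d independent days, each with probability 1 - 0.12. A short sketch, with `closed_form` and `via_pgeom` as illustrative names:

```r
# Illustrative check for Question 4 (variable names are our own).
days <- 1:30
prob_breaking <- 0.12

# Surviving d days in a row has probability (1 - p)^d.
closed_form <- (1 - prob_breaking)^days

# Same quantity from the geometric CCDF: more than (d - 1) intact days
# occur before the first break.
via_pgeom <- pgeom(days - 1, prob_breaking, lower.tail = FALSE)

# Confirm the two approaches agree, then show a few selected days.
stopifnot(isTRUE(all.equal(closed_form, via_pgeom)))
print(round(closed_form[c(1, 5, 10, 30)], 4))
```

The curve decays geometrically, so the probability drops by 12% of its current value each day.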

## Question 5.

In a thrilling game of "Guess the Jellybeans," there are exactly 100 jellybeans in a jar, with 70 being red and 30 being green. Participants need to draw 5 jellybeans without looking. What is the probability that a participant draws 3 red jellybeans and 2 green jellybeans? Use a simulation to estimate the probability and compare it to the exact probability. Create a histogram of your simulations and describe the shape of the distribution.
1. Understand the problem: We need the probability of drawing exactly 3 red and 2 green jellybeans when 5 jellybeans are drawn at once from a jar of 100 (70 red and 30 green).
2. Identify the distribution: This is a hypergeometric problem because we draw without replacement from a finite population (100 jellybeans) and count how many of the drawn jellybeans are red.
3. Calculate the exact probability using the hypergeometric distribution function in R:
k_red <- 3 # Number of red jellybeans
k_green <- 2 # Number of green jellybeans
N_red <- 70 # Total red jellybeans
N_green <- 30 # Total green jellybeans
n_draw <- 5 # Number of jellybeans drawn

exact_prob <- dhyper(k_red, N_red, N_green, n_draw)

# 4 Simulate the situation:
n_simulations <- 10000
jellybean_colors <- c(rep("red", 70), rep("green", 30))
simulated_draws <- replicate(n_simulations, sample(jellybean_colors, n_draw))
################START CODE HERE################
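# Each column of simulated_draws is one draw of 5 jellybeans; count the reds in each
num_red_in_draws <- apply(simulated_draws, 2, function(draw) sum(draw == "red"))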
#################END CODE HERE#################

# 5 Estimate the probability from the simulation:
estimated_prob <- sum(num_red_in_draws == k_red) / n_simulations

# 6 Compare the exact and estimated probabilities:
# - Print out the exact and estimated probabilities.
# - Discuss the differences, if any.
comparison_table <- data.frame(
Probabilities = c("Exact", "Estimated"),
Values = c(exact_prob, estimated_prob)
)
print(comparison_table)

# 7: Create a histogram of the simulated data:
hist(
  num_red_in_draws,
  main = "Simulated Jellybean Draws",
  xlab = "Number of Red Jellybeans",
  col = "purple", border = "black",
  # Offset the breaks by 0.5 so each integer count gets its own bar
  breaks = seq(min(num_red_in_draws) - 0.5, max(num_red_in_draws) + 0.5, 1)
)
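
As a cross-check on the label-sampling approach, the number of red jellybeans can also be drawn directly with `rhyper()`, and the exact probability can be written out with binomial coefficients. This is an alternative sketch, not the method used above; `reds` and `manual_prob` are illustrative names.

```r
# Illustrative cross-check for Question 5 (variable names are our own).
set.seed(42)
n_simulations <- 10000

# rhyper(nn, m, n, k): nn draws of the number of "successes" (red) when
# k items are taken without replacement from m red and n green.
reds <- rhyper(n_simulations, m = 70, n = 30, k = 5)
estimated_prob <- mean(reds == 3)

# Exact probability two ways: built-in pmf and counting argument.
exact_prob <- dhyper(3, m = 70, n = 30, k = 5)
manual_prob <- choose(70, 3) * choose(30, 2) / choose(100, 5)

print(c(dhyper = exact_prob, by_hand = manual_prob, simulated = estimated_prob))
```

The exact probability is about 0.316. The histogram of red counts should peak at 3 or 4 and lean toward the higher counts, since 70% of the jar is red.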

## Question 6.

A group of 200 students is taking an online statistics course. On average, 35% of students complete their homework each day. Simulate the number of students who complete their homework on a given day using a binomial distribution. Run 10000 simulations and create a histogram to visualize the distribution.
# Step 1: Set up the problem
n_students <- 200
p_complete <- 0.35

# Step 2: Run simulations
set.seed(42) # Set the seed for reproducibility
n_simulations <- 10000
################START CODE HERE################
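# One binomial draw per simulated day: how many of the 200 students finish
students_complete <- rbinom(n_simulations, size = n_students, prob = p_complete)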
#################END CODE HERE#################

# Step 3: Create a histogram
hist(
  students_complete,
  main = "Online Stats Course: Daily Homework Completion",
  xlab = "Number of Students Completing Homework",
  col = "orange", border = "black",
  # Offset the breaks by 0.5 so each integer count gets its own bar
  breaks = seq(min(students_complete) - 0.5, max(students_complete) + 0.5, 1)
)
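
One way to sanity-check the simulation is to compare its mean and standard deviation with the theoretical binomial values n·p and sqrt(n·p·(1 - p)). The sketch below assumes the same parameters as the question; `sims` and `summary_table` are illustrative names.

```r
# Illustrative check for Question 6 (variable names are our own).
set.seed(42)
n_students <- 200
p_complete <- 0.35

sims <- rbinom(10000, size = n_students, prob = p_complete)

# Theoretical mean and standard deviation of a Binomial(n, p) count.
theory_mean <- n_students * p_complete
theory_sd <- sqrt(n_students * p_complete * (1 - p_complete))

summary_table <- data.frame(
  Quantity = c("Mean", "SD"),
  Theoretical = c(theory_mean, theory_sd),
  Simulated = c(mean(sims), sd(sims))
)
print(summary_table)
```

With n = 200 and p = 0.35 the histogram should look approximately bell-shaped, centered near 70 students.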

## Question 7.

In a fantasy game, a player can find rare gemstones in a cave with a 5% chance of success. The player is allowed to enter the cave 15 times per day. What is the probability that the player will find at least 3 gemstones in a day? Run a simulation to estimate the probability and create a histogram to visualize the distribution.
# Step 1: Set up the problem
n_tries <- 15
p_success <- 0.05

# Step 2: Run simulations
set.seed(42) # Set the seed for reproducibility
n_simulations <- 10000

################START CODE HERE################
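# One binomial draw per simulated day: gemstones found in 15 cave entries
gemstones_found <- rbinom(n_simulations, size = n_tries, prob = p_success)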
#################END CODE HERE#################

# Step 3: Estimate the probability
prob_at_least_3_gemstones <- sum(gemstones_found >= 3) / n_simulations

# Step 4: Create a histogram
hist(
  gemstones_found,
  main = "Fantasy Game: Gemstones Found in a Day",
  xlab = "Number of Gemstones Found",
  col = "purple", border = "black",
  # Offset the breaks by 0.5 so each integer count gets its own bar
  breaks = seq(min(gemstones_found) - 0.5, max(gemstones_found) + 0.5, 1)
)
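
The simulated estimate can be compared with the exact binomial probability, which the steps above do not compute. A minimal sketch, with `exact_prob`, `exact_by_sum`, and `sims` as illustrative names:

```r
# Illustrative check for Question 7 (variable names are our own).
n_tries <- 15
p_success <- 0.05

# Exact P(X >= 3) as the upper tail beyond 2 successes.
exact_prob <- pbinom(2, size = n_tries, prob = p_success, lower.tail = FALSE)

# Same value as 1 minus the probabilities of 0, 1, and 2 gemstones.
exact_by_sum <- 1 - sum(dbinom(0:2, size = n_tries, prob = p_success))

# Simulated estimate for comparison.
set.seed(42)
sims <- rbinom(10000, size = n_tries, prob = p_success)
estimated_prob <- mean(sims >= 3)

print(c(exact = exact_prob, by_sum = exact_by_sum, simulated = estimated_prob))
```

The exact probability is roughly 0.036, so finding three or more gemstones in a day is rare, and the histogram should be right-skewed with most days at 0 or 1.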

## Question 8.

In the world of "Sleepy Scholars," college students observe their classmates during lectures. On average, they notice 5 classmates dozing off during a single lecture. The number of dozing students follows a Poisson distribution with a mean of 5. What is the probability of witnessing at least 8 students dozing off during a single lecture? Run a simulation to estimate the probability and create a histogram to illustrate the distribution.
# Step 1: Understand the problem. We need the probability of witnessing at least 8 students dozing off during a lecture, given that the number of dozing students follows a Poisson distribution with a mean of 5.
# Step 2: Calculate the exact probability using the Poisson distribution

lambda <- 5 # Mean of the Poisson distribution
k <- 8 # Number of dozing students
exact_prob <- 1 - ppois(k - 1, lambda)

# Step 3: Simulate the situation
n_simulations <- 10000
################START CODE HERE################
# One Poisson draw per simulated lecture
simulated_dozing_students <- rpois(n_simulations, lambda)
#################END CODE HERE#################

# Step 4: Estimate the probability from the simulation
estimated_prob <- sum(simulated_dozing_students >= k) / n_simulations

# Step 5: Compare the exact and estimated probabilities
comparison_table <- data.frame(
Probabilities = c("Exact", "Estimated"),
Values = c(exact_prob, estimated_prob)
)
print(comparison_table)

# Step 6: Create a histogram of the simulated data:
hist(
  simulated_dozing_students,
  main = "Sleepy Scholars 😴",
  xlab = "Number of Dozing Students",
  col = "lightblue", border = "black",
  # Offset the breaks by 0.5 so each integer count gets its own bar
  breaks = seq(min(simulated_dozing_students) - 0.5,
               max(simulated_dozing_students) + 0.5, 1)
)

# Step 7: Interpret the results

Based on the simulation, the estimated probability of witnessing at least 8 students dozing off during a lecture is approximately 0.13 (the estimate varies slightly from run to run because the simulation is random). The histogram shows that the most common counts are 4 and 5 dozing classmates, with the distribution tailing off to the right. The estimate is close to the exact Poisson probability, 1 - ppois(7, 5) ≈ 0.133, which indicates that the simulation approximates the true probability well.
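
To put a number on "close", a Monte Carlo standard error can be attached to the estimate. The sketch below uses the usual normal approximation for a proportion; the names `p_hat`, `se`, and `ci` are illustrative, and the 1.96 multiplier assumes an approximate 95% interval.

```r
# Illustrative precision check for Question 8 (variable names are our own).
set.seed(42)
n_simulations <- 10000
lambda <- 5
k <- 8

sims <- rpois(n_simulations, lambda)
p_hat <- mean(sims >= k)

# Standard error of a simulated proportion and an approximate 95% interval.
se <- sqrt(p_hat * (1 - p_hat) / n_simulations)
ci <- p_hat + c(-1, 1) * 1.96 * se

# Exact upper-tail probability P(X >= 8) for comparison.
exact_prob <- ppois(k - 1, lambda, lower.tail = FALSE)

print(c(estimate = p_hat, std_error = se, lower = ci[1], upper = ci[2], exact = exact_prob))
```

With 10,000 simulations the standard error is only a few thousandths, so differences between the estimate and the exact value in the third decimal place are expected.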
