You’re on a game show with three doors, each hiding a prize. Two of the doors hide goats and the remaining door hides a car. The car is the desirable prize; the goats are not. As a contestant, you pick one door. The host, who knows where the car is, then opens one of the two other doors to reveal a goat. Should you stay with the door you originally chose or switch to the other unopened door? You should always switch: switching wins two-thirds of the time, while staying wins only one-third.
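The reasoning behind that answer can be written as a short derivation. This is a sketch under the standard assumptions: the contestant's first pick is uniformly random, and the host always opens a goat door the contestant did not pick. The symbol C (the event that the first pick hides the car) is introduced here for illustration.

```latex
\begin{align}
P(C) &= \tfrac{1}{3} \\
P(\text{win} \mid \text{stay})   &= P(C) = \tfrac{1}{3} \\
P(\text{win} \mid \text{switch}) &= P(\lnot C) = 1 - \tfrac{1}{3} = \tfrac{2}{3}
\end{align}
```

Staying wins exactly when the first pick was the car. Switching wins exactly when it was not, because the host has already removed the only other goat.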
We’ll simulate this scenario and see how often we win when keeping the original door versus switching.
19 Simulate Monty Hall problem
# load libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
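## simulate 1000 examples of 3 doors
doors <- c("1", "2", "3")          # 3 doors
win <- c("goat", "goat", "car")    # what you can win
experiments <- 1:1000
pick <- c(0, 0, 1)                 # 1 marks the door the contestant picks

# one row per door per experiment; keep door labels as character, not factor
df <- expand.grid(doors, experiments, stringsAsFactors = FALSE)
df <- df |>
  group_by(Var2) |>
  mutate(winnings = sample(win, 3)) |>          # randomly assign winnings (goats, car)
  mutate(contestantdoor = sample(pick, 3)) |>   # randomly pick a door
  # flag each door the host could open: a goat door the contestant did not pick
  # ("0" otherwise); both goat doors are flagged when the contestant picked the car
  mutate(hostdoor = if_else(contestantdoor == 0 & winnings == "goat", Var1, "0"))

knitr::kable(head(df, 6))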
|Var1 | Var2|winnings | contestantdoor|hostdoor |
|:----|----:|:--------|--------------:|:--------|
|1    |    1|goat     |              0|1        |
|2    |    1|goat     |              0|2        |
|3    |    1|car      |              1|0        |
|1    |    2|goat     |              1|0        |
|2    |    2|goat     |              0|2        |
|3    |    2|car      |              0|0        |
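In the table above, hostdoor flags every goat door the contestant did not pick, so experiment 1 (where the contestant picked the car) has two flagged doors. The win rates do not depend on which of those the host opens. If you want the host to open exactly one door, a minimal sketch could look like the following. The helper `host_opens` and its arguments are hypothetical names, not part of the original code.

```r
# Hypothetical helper (not in the original code): return the single door the
# host opens, given the car's door and the contestant's pick (both in 1:3).
host_opens <- function(car_door, pick_door) {
  candidates <- setdiff(1:3, c(car_door, pick_door))  # goat doors not picked
  # Index explicitly: sample(x, 1) on a length-one numeric x samples from 1:x
  candidates[sample.int(length(candidates), 1)]
}

host_opens(car_door = 3, pick_door = 1)  # only door 2 qualifies
host_opens(car_door = 3, pick_door = 3)  # door 1 or 2, chosen at random
```

Indexing with `sample.int()` avoids the surprise that `sample(2, 1)` draws from `1:2` instead of returning 2.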
# check whether any door the host could open hides the car (expect zero rows)
df |>
  filter(hostdoor != "0") |>
  filter(winnings == "car")
## ok now, calculate strategy for picking the original door vs switching
dfpick <- df |>
  dplyr::group_by(Var2) |>
  dplyr::summarise(
    # same = 1 if the car is behind the door the contestant picked
    same = max(if_else(contestantdoor == 1 & winnings == "car", 1, 0)),
    # switch = 1 if the car is behind a door the contestant did not pick
    switch = max(if_else(contestantdoor != 1 & winnings == "car", 1, 0))
  )

## calculate the proportion of winners
dfpick |>
  dplyr::summarise(
    samewin = sum(same) / length(same),
    switchwin = sum(switch) / length(switch)
  )
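As a cross-check on the grouped data-frame approach, the same experiment can be sketched in base R with one vector element per game. This is an illustrative alternative, not the method used above. The function name `simulate_monty_hall`, the argument `n_games`, and the seed are assumptions made for the example.

```r
# Hypothetical helper (not part of the original analysis): simulate n_games
# independent games and return the win rate of each strategy.
simulate_monty_hall <- function(n_games = 1000) {
  car_door  <- sample(1:3, n_games, replace = TRUE)  # where the car is hidden
  pick_door <- sample(1:3, n_games, replace = TRUE)  # contestant's first choice
  stay_wins <- car_door == pick_door
  # The host always reveals a goat, so switching wins exactly when the
  # first pick was wrong.
  switch_wins <- !stay_wins
  c(samewin = mean(stay_wins), switchwin = mean(switch_wins))
}

set.seed(42)  # arbitrary seed, only so the run is reproducible
simulate_monty_hall(10000)
```

With enough games, samewin should settle near 1/3 and switchwin near 2/3, matching the proportions computed from dfpick.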