17 Monty Hall problem

You’re on a game show with three doors. Behind two of the doors are goats, and behind the remaining door is a car. The car is the prize you want; the goats are not. As a contestant, you pick one door. The host, who knows where the car is, then opens one of the two other doors to reveal a goat. Should you stick with the door you originally chose or switch to the other unopened door? You should always switch.
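The reason switching helps can be written as a short probability argument. This is a sketch that assumes your first pick is uniformly random over the three doors and that the host always opens a door hiding a goat:

```latex
\begin{align*}
P(\text{win} \mid \text{stay})   &= P(\text{first pick is the car}) = \tfrac{1}{3},\\
P(\text{win} \mid \text{switch}) &= P(\text{first pick is a goat}) = \tfrac{2}{3}.
\end{align*}
```

Switching wins exactly when the first pick was a goat, because the host then has to reveal the only other goat, leaving the car behind the remaining door.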

We’ll simulate this scenario and compare how often we win by keeping the original door versus switching.

19 Simulate Monty Hall problem

#load libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## simulate 1000 games with 3 doors each

doors <- c("1", "2", "3") # 3 doors
win <- c("goat", "goat", "car") # what you can win
experiments <- 1:1000
pick <- c(0, 0, 1) # 1 marks the door the contestant picks

df <- expand.grid(doors, experiments) # Var1 = door, Var2 = experiment

df <- df |>
  group_by(Var2) |>
  mutate(winnings = sample(win, 3)) |> # randomly assign winnings (goats, car)
  mutate(contestantdoor = sample(pick, 3)) |> # randomly pick a door
  # flag every goat door the contestant did not pick, i.e. each door the host could open
  mutate(hostdoor = if_else(contestantdoor == 0 & winnings == "goat", as.character(Var1), "0"))
knitr::kable(head(df,6))
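| Var1 | Var2 | winnings | contestantdoor | hostdoor |
|:-----|-----:|:---------|---------------:|:---------|
| 1    |    1 | goat     |              0 | 1        |
| 2    |    1 | goat     |              0 | 2        |
| 3    |    1 | car      |              1 | 0        |
| 1    |    2 | goat     |              1 | 0        |
| 2    |    2 | goat     |              0 | 2        |
| 3    |    2 | car      |              0 | 0        |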
#check that no door the host could open hides the car
df|>
  filter(hostdoor != "0")|>
  filter(winnings=="car")
# A tibble: 0 × 5
# Groups:   Var2 [0]
# ℹ 5 variables: Var1 <fct>, Var2 <int>, winnings <chr>, contestantdoor <dbl>,
#   hostdoor <chr>
##ok now, compare the strategy of keeping the original door vs switching
dfpick <- df |>
  dplyr::group_by(Var2) |>
  dplyr::summarise(
    same = max(if_else(contestantdoor == 1 & winnings == "car", 1, 0)), # car is behind the picked door
    switch = max(if_else(contestantdoor != 1 & winnings == "car", 1, 0)) # car is behind an unpicked door
  )

##calculate the proportion of winners 
dfpick|>
  dplyr::summarise(samewin=sum(same)/length(same),switchwin=sum(switch)/length(switch))
# A tibble: 1 × 2
  samewin switchwin
    <dbl>     <dbl>
1   0.327     0.673

This simulation shows that switching wins about 67% of the time, compared with about 33% for keeping the original door, so switching roughly doubles your chances of winning!
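The table-based simulation above flags every goat door the host could open, so when the contestant's first pick is the car, both goat doors are marked. As a cross-check, here is a minimal base-R sketch that plays one game at a time and has the host open exactly one door. The function name `play_game`, the seed, and the number of games are illustrative choices, not part of the original analysis.

```r
# Per-game sketch in base R: the host opens exactly one goat door.
set.seed(42)        # arbitrary seed, used only for reproducibility
n_games <- 10000    # illustrative number of simulated games

play_game <- function() {
  doors <- 1:3
  car   <- sample(doors, 1)   # door hiding the car
  pick  <- sample(doors, 1)   # contestant's first choice
  # Doors the host may open: neither the contestant's pick nor the car
  openable <- setdiff(doors, c(pick, car))
  # Index into the vector so a single remaining door is not treated as 1:door by sample()
  host <- openable[sample.int(length(openable), 1)]
  # Switching means taking the one door that is neither picked nor opened
  switched <- setdiff(doors, c(pick, host))
  c(stay = pick == car, switch = switched == car)
}

# One row per game, one column per strategy
results <- t(replicate(n_games, play_game()))
colMeans(results)   # share of games won by each strategy
```

Under these assumptions the two proportions should land close to 1/3 for staying and 2/3 for switching, in line with the table-based result.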