H: The set of dice to choose from: a 4-, 6-, 8-, 12- or 20-sided die.

D: The set of numbers the chosen die could roll: $1, 2, \dots, 20$.

--
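As a concrete sketch in R (variable names are illustrative, not part of the exercise), the hypothesis space and one observed roll might be represented as:

```r
# Hypothesis space H: the five candidate dice, identified by number of sides
dice <- c(4, 6, 8, 12, 20)

# Data D: a single observed roll
x <- 7

# A roll of 7 already rules out the 4- and 6-sided dice
possible <- dice[dice >= x]
possible
```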

The Prior: Each die is equally likely to be selected, so $P(H)$ should be uniform.

The Likelihood: The dice are fair, so $P(D | H)$ should be uniform over the range of values the chosen die could roll. Remember that a 6-sided die cannot roll a seven, but a 20-sided die can!

---

### Implementing your results

```{r, eval=FALSE}
dice <- c(4, 6, 8, 12, 20)

dice_likelihood <- function(x, dice) {
  if (x > dice) {
    l <- # The roll was greater than the number of sides!
  } else {
    l <- # The roll was at most the number of sides.
  }
  return(l)
}

prior <- # Make a vector of probabilities

dice_posterior <- function(x, dice) {
  post <- numeric(length(dice))
  for (i in seq_along(dice)) {
    post[i] <- # The posterior numerator for dice[i]
  }
  PD <- # The posterior denominator
  post <- post / PD
  data.frame(dice, post)
}
```

---

class: center, inverse, middle

# The Locomotive Problem

---

"A railroad numbers its locomotives in order 1..N. One day you see a locomotive with the number 60. Estimate how many locomotives the railroad has."

```{r, eval=FALSE}
library(tidyverse)

loco_likelihood <- function(x, N) {
  if (x > N) {
    l <- # This is similar to the dice problem
  } else {
    l <-
  }
  l
}

loco_posterior <- function(x, mu, maxN) {
  support <- 1:maxN
  prior <- # What is this?
  likelihood <- numeric(maxN)
  for (i in 1:maxN) {
    likelihood[i] <- # Use the function above
  }
  post <- # Get the numerator of the posterior using Bayes' Rule
  pN <- # The denominator of the posterior
  post <- post / pN
  plot <- # Make a ggplot geom_line object
  print(plot)
}
```

---

class: center, inverse, middle

# Conjugate Priors

---

### The Normal / Normal Problem

Let $x_i \sim \mathcal{N}(\theta, 9)$. We can estimate the value of $\theta$ that maximises the probability of observing the $x_i$ with Maximum Likelihood Estimation.
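For reference, here is one possible way to complete the two exercise templates above. This is a sketch, not the official solution: it assumes uniform priors in both problems, and it drops the unused `mu` argument and the ggplot step from `loco_posterior` so the example runs on its own.

```r
dice <- c(4, 6, 8, 12, 20)

# P(D | H): zero if the roll exceeds the number of sides, else uniform
dice_likelihood <- function(x, sides) {
  if (x > sides) 0 else 1 / sides
}

dice_posterior <- function(x, dice) {
  prior <- rep(1 / length(dice), length(dice))          # uniform P(H)
  post  <- numeric(length(dice))
  for (i in seq_along(dice)) {
    post[i] <- dice_likelihood(x, dice[i]) * prior[i]   # posterior numerator
  }
  post <- post / sum(post)                              # divide by P(D)
  data.frame(dice, post)
}

dice_posterior(7, dice)   # the 4- and 6-sided dice get posterior 0

# The locomotive problem has the same structure, with support 1..maxN
loco_likelihood <- function(x, N) {
  if (x > N) 0 else 1 / N
}

loco_posterior <- function(x, maxN) {
  support    <- 1:maxN
  prior      <- rep(1 / maxN, maxN)                     # assumed uniform prior
  likelihood <- vapply(support, function(N) loco_likelihood(x, N), numeric(1))
  post <- likelihood * prior
  post <- post / sum(post)
  data.frame(N = support, post)
}

lp <- loco_posterior(60, 1000)   # posterior is zero below N = 60, then decays in N
```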
Or, if $\theta \sim \mathcal{N}(\mu, \tau^2)$, we can instead obtain a whole distribution of plausible values for $\theta$:

$$p(\theta | x) = \frac{\prod_{i=1}^N \mathcal{N}(x_i | \theta, 9) \, \mathcal{N}(\theta | \mu, \tau^2)}{\int_{\theta}\prod_{i=1}^N \mathcal{N}(x_i | \theta, 9) \, \mathcal{N}(\theta | \mu, \tau^2) \, d\theta}$$

which simplifies to

$$p(\theta | x) = \sqrt{\frac{N\tau^2 + 9}{18 \pi \tau^2}} \exp\left\{ \frac{-(N \tau^2 + 9)}{18 \tau^2} \left(\theta - \frac{\tau^2 \sum_i x_i + 9 \mu}{N \tau^2 + 9} \right)^2 \right\}$$

[Use this link to plot some posteriors](https://ebsmonash.shinyapps.io/etc2420bayes/)

---

### The Beta / Binomial Problem

Let $y_i \sim Bernoulli(\theta)$ and $\theta \sim Beta(\alpha, \beta)$. Then the likelihood is

$$p(y_1, \dots, y_N | \theta) = \prod_{i=1}^N \theta^{y_i} (1 - \theta)^{1 - y_i} = \theta^{\sum y_i} (1-\theta)^{N - \sum y_i},$$

the prior is

$$p(\theta) = \frac{1}{ B(\alpha, \beta)}\theta^{\alpha - 1} (1-\theta)^{\beta - 1}$$

and the posterior is

$$p(\theta | y) = \frac{1}{ B(\alpha + \sum y_i, \beta + N - \sum y_i)} \theta^{\sum y_i + \alpha - 1} (1-\theta)^{N - \sum y_i + \beta - 1}$$

---

class: center, inverse, middle

# Extra Material: Deriving the Beta Posterior

---

### The posterior distribution is

$$p(\theta | y) = \frac{ \frac{1}{ B(\alpha, \beta)} \theta^{\sum y_i} (1-\theta)^{N - \sum y_i}\theta^{\alpha - 1} (1-\theta)^{\beta - 1}} { \frac{1}{ B(\alpha, \beta)} \int_0^1 \theta^{\sum y_i} (1-\theta)^{N - \sum y_i} \theta^{\alpha - 1} (1-\theta)^{\beta - 1} \, d\theta}.$$

First: in general we cannot solve such an integral analytically, but it does not depend on $\theta$, so we can absorb it into a constant $C$:

$$p(\theta | y) = C \times \theta^{\sum y_i} (1-\theta)^{N - \sum y_i}\theta^{\alpha - 1} (1-\theta)^{\beta - 1}.$$

Second: Simplify the remaining terms.
$$p(\theta | y) = C \times \theta^{\sum y_i + \alpha - 1} (1-\theta)^{N - \sum y_i + \beta - 1}.$$

Third: Remember this is a probability distribution, so

$$\int_0^1 C \, \theta^{\sum y_i + \alpha - 1} (1-\theta)^{N - \sum y_i + \beta - 1} \, d\theta = 1,$$

hence

$$C^{-1} = \int_0^1\theta^{\sum y_i + \alpha - 1} (1-\theta)^{N - \sum y_i + \beta - 1} \, d\theta.$$

---

### What do we know about this integral?

For any positive $\delta$ and $\lambda$, the $Beta(\delta, \lambda)$ distribution has this probability density function:

$$ \frac{1}{B(\delta, \lambda)} \theta^{\delta - 1}(1 - \theta)^{\lambda - 1},$$

which integrates to one, so

$$\int_0^1\theta^{\delta - 1}(1 - \theta)^{\lambda - 1}d\theta = B(\delta, \lambda).$$

Compare this to the last slide and we get

$$C^{-1} = B\left(\alpha + \sum y_i, \beta + N - \sum y_i\right),$$

and therefore

$$p(\theta | y) = \frac{1}{ B(\alpha + \sum y_i, \beta + N - \sum y_i)} \theta^{\sum y_i + \alpha - 1} (1-\theta)^{N - \sum y_i + \beta - 1},$$

another Beta distribution with parameters $\alpha + \sum y_i$ and $\beta + N - \sum y_i.$
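The conjugate update is easy to check numerically. The sketch below uses made-up data and an illustrative $Beta(2, 2)$ prior; `dbeta` and `beta` are base R functions.

```r
# Prior Beta(alpha, beta); y are Bernoulli observations (illustrative values)
alpha <- 2
beta_ <- 2                       # beta() is base R's Beta function, so avoid shadowing it
y <- c(1, 0, 1, 1, 0, 1, 1, 1)   # N = 8, sum(y) = 6
N <- length(y)

# Conjugate update: the posterior is Beta(alpha + sum(y), beta + N - sum(y))
a_post <- alpha + sum(y)
b_post <- beta_ + N - sum(y)

# Check the derived density against dbeta at an arbitrary point
theta  <- 0.6
manual <- theta^(sum(y) + alpha - 1) * (1 - theta)^(N - sum(y) + beta_ - 1) /
  beta(a_post, b_post)
all.equal(manual, dbeta(theta, a_post, b_post))
```

The same check works at any point of the support, since the two expressions are the same function of $\theta$.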