---
title: "Homework Posterior Probabilities - M3L6HW1"
subtitle: "Bayesian Statistics: From Concept to Data Analysis"
categories:
- Bayesian Statistics
keywords:
- Posterior Probabilities
- Homework
---
::::: {.content-visible unless-profile="HC"}
::: {.callout-caution}
Section omitted to comply with the Honor Code
:::
:::::
::::: {.content-hidden unless-profile="HC"}
For the next two questions, consider the following experiment:
::: {#exm-themometer-calibration}
### Calibrating a thermometer
[**calibrating a thermometer**]{.column-margin}
Suppose you are trying to calibrate a thermometer by testing the temperature it reads when water begins to boil. Because of natural variation, you take several measurements (experiments) to estimate $\theta$, the mean temperature reading for this thermometer at the boiling point.
You know that at sea level, water should boil at 100 degrees Celsius, so you use a precise prior with $\mathbb{P}r(\theta = 100) = 1$. You then observe the following five measurements: $94.6, 95.4, 96.2, 94.9, 95.9$.
:::
::: {#exr-pdf-1}
[**calibrating a thermometer** see @exm-themometer-calibration]{.column-margin} What will the posterior for $\theta$ look like?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
- [ ] Most posterior probability will be concentrated near the sample mean of $95.4 \degree C$.
- [ ] Most posterior probability will be spread between the sample mean of $95.4 \degree C$ and the prior mean of $100 \degree C$.
- [x] The posterior will be $\theta=100$ with probability 1, regardless of the data.
- [ ] None of the above.
Because all prior probability is on a single point ($100 \degree C$), the prior completely dominates any data. If we are 100% certain of the outcome before the experiment, we learn nothing by performing it.
This was a poor choice of prior, especially in light of the data we collected.
In general, a degenerate prior that assigns $\mathbb{P}r(\theta = \theta_0) = 1$ (or $0$) ignores any amount of additional data.
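A minimal numerical sketch of this point, assuming (for illustration only) a $N(\theta, 1)$ likelihood and an alternative candidate value of $95 \degree C$ that is not part of the original problem:

```python
import numpy as np
from scipy.stats import norm

# Two candidate values for theta; 95 is an assumed alternative for illustration.
thetas = np.array([95.0, 100.0])
data = np.array([94.6, 95.4, 96.2, 94.9, 95.9])

def posterior(prior):
    # Discrete Bayes update with an assumed Normal(theta, 1) likelihood.
    likelihood = np.array([norm.pdf(data, loc=t, scale=1.0).prod() for t in thetas])
    unnormalized = likelihood * prior
    return unnormalized / unnormalized.sum()

print(posterior(np.array([0.0, 1.0])))  # point mass on 100 -> stays [0, 1]
print(posterior(np.array([0.5, 0.5])))  # open-minded prior -> almost all mass on 95
```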
:::
::: {#exr-pdf-2}
[**calibrating a thermometer** see @exm-themometer-calibration]{.column-margin} Suppose you believe before the experiments that the thermometer is biased high so that on average it would read $105 \degree C$, and you are 95% confident that the average would be between 100 and 110.
Which of the following prior PDFs most accurately reflects this prior belief?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
The prior mean is 105 degrees Celsius and approximately 95% of the prior probability is assigned to the interval (100, 110). For a normal prior this corresponds to $\theta \sim N(105, \sigma^2)$ with $1.96\,\sigma \approx 5$, i.e. $\sigma \approx 2.55$.
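The answer options (density plots) are omitted here, but a minimal sketch of one density matching this description, assuming a normal prior whose central 95% interval is (100, 110):

```python
from scipy.stats import norm

# Choose sigma so the central 95% interval has half-width 5: 1.96 * sigma = 5.
sigma = 5 / norm.ppf(0.975)             # approximately 2.55
prior = norm(loc=105, scale=sigma)

print(prior.mean())                     # 105.0
print(prior.cdf(110) - prior.cdf(100))  # approximately 0.95
```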
:::
::: {#exr-pdf-3}
Recall that for a positive integer $n$, the gamma function has the following property: $\Gamma(n)=(n-1)!$. What is the value of $\Gamma(6)$?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
$$
\Gamma(6)=5!=120
$$
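A quick check with Python's standard library:

```python
import math

print(math.gamma(6))      # 120.0, since Gamma(6) = 5!
print(math.factorial(5))  # 120
```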
:::
::: {#exr-pdf-4}
Find the value of the normalizing constant, $c$, which will cause the following integral to evaluate to 1.
$$
\int_0^1 c \cdot z^3 (1-z)^1 \, dz.
$$
:::
::: {.callout-important}
### Hint:
Notice that the integrand is proportional to a beta density. We only need to find the values of the parameters $\alpha$ and $\beta$ and plug them into the usual normalizing constant for a $\text{Beta}(\alpha, \beta)$ density.
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
The integrand $z^3(1-z)^1$ is the kernel of a $\text{Beta}(4, 2)$ density, so
$$
c = \frac{\Gamma(4+2)}{\Gamma(4)\,\Gamma(2)} = \frac{5!}{3!\,1!} = 20 .
$$
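A numerical check of the constant, assuming SciPy is available:

```python
from scipy.integrate import quad
from scipy.special import beta

value, _ = quad(lambda z: 20 * z**3 * (1 - z), 0, 1)
print(value)           # 1.0, so c = 20 normalizes the density
print(1 / beta(4, 2))  # 20.0, the same constant via the Beta function
```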
:::
::: {#exr-pdf-5}
[**coin-flipping** see @exm-frequentist-coinflip]{.column-margin} Consider the **coin-flipping** example from Lesson 5. The likelihood for each coin flip was Bernoulli with probability of heads $\theta$, that is $f(y \mid \theta) = \theta^y (1-\theta)^{1-y}$ for $y=0$ or $y=1$, and we used a uniform prior on $\theta$.
Recall that if we had observed $Y_1 = 0$ instead of $Y_1 = 1$, the posterior distribution for $\theta$ would have been $f(\theta \mid Y_1 = 0) = 2(1-\theta)\,\mathbb{I}_{\{0 \le \theta \le 1\}}$. Which of the following is the correct expression for the posterior predictive distribution of the next flip, $Y_2 \mid Y_1 = 0$?
$$
f(y_2 \mid Y_1 = 0) = \int_0^1 \theta^{y_2}(1-\theta)^{1-y_2} \cdot 2(1-\theta) \, d\theta \qquad \text{for } y_2 = 0 \text{ or } y_2 = 1
$$
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
This is the integral of the likelihood times the posterior. Recognizing the integral as a Beta function, $B(y_2+1,\, 3-y_2)$, the expression simplifies to
$$
\begin{aligned}
f(y_2 \mid Y_1 = 0) &= \int_0^1 2\, \theta^{y_2} (1-\theta)^{2-y_2} \, d\theta \; \mathbb{I}_{(y_2 \in \{0,1\})} \\
&= \frac{2\, \Gamma(y_2+1)\, \Gamma(3-y_2)}{\Gamma(4)} \, \mathbb{I}_{(y_2 \in \{0,1\})} \\
&= \frac{2}{3}\, \mathbb{I}_{(y_2 = 0)} + \frac{1}{3}\, \mathbb{I}_{(y_2 = 1)}
\end{aligned}
$$
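A numerical check of both probabilities, integrating the Bernoulli likelihood against the posterior $2(1-\theta)$:

```python
from scipy.integrate import quad

def predictive(y2):
    # Integrate the Bernoulli likelihood times the posterior 2 * (1 - theta).
    integrand = lambda t: t**y2 * (1 - t)**(1 - y2) * 2 * (1 - t)
    prob, _ = quad(integrand, 0, 1)
    return prob

print(predictive(0))  # 0.666..., i.e. 2/3
print(predictive(1))  # 0.333..., i.e. 1/3
```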
:::
::: {#exr-pdf-6}
The prior predictive distribution for $X$, when $\theta$ is continuous, is given by
$$
\int f(x \mid \theta) \cdot f(\theta)\, d\theta \qquad \text{(prior predictive distribution)}
$$
The analogous expression when $\theta$ is discrete is
$$
\sum_{\theta} f(x \mid \theta) \cdot f(\theta) \qquad \text{(prior predictive distribution)}
$$
summing over all possible values of $\theta$.
Let's return to your brother's **loaded coin** [**loaded coin** see @exm-two-coins]{.column-margin}. Recall that he has a fair coin for which heads come up on average $50\%$ of the time $(p=0.5)$ and a loaded coin $(p=0.7)$. If we flip the coin five times, the likelihood is binomial: $f(x \mid p) = {5 \choose x} p^x (1-p)^{5-x}$, where $X$ counts the number of heads.
Suppose you are confident, but not certain, that he has brought you the loaded coin, so your prior is $f(p) = 0.9\, \mathbb{I}_{(p=0.7)} + 0.1\, \mathbb{I}_{(p=0.5)}$. Which of the following expressions gives the prior predictive distribution of $X$?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
$$
f(x) = {5 \choose x} (0.7)^x (0.3)^{5-x} (0.9) + {5 \choose x} (0.5)^x (0.5)^{5-x} (0.1)
$$
This is a *weighted average* of binomials, with weights being your prior probabilities for each scenario (loaded or fair).
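A short sketch evaluating this mixture pmf for each possible $x$, assuming SciPy is available:

```python
from scipy.stats import binom

# Prior weights on (loaded, fair) and the corresponding heads probabilities.
weights = [0.9, 0.1]
ps = [0.7, 0.5]

for x in range(6):
    fx = sum(w * binom.pmf(x, 5, p) for w, p in zip(weights, ps))
    print(x, round(fx, 4))
```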
:::
:::::