---
title: "Homework Posterior Probabilities - M3L6HW1"
subtitle: "Bayesian Statistics: From Concept to Data Analysis"
categories:
- Bayesian Statistics
keywords:
- Posterior Probabilities
- Homework
---
::::: {.content-visible unless-profile="HC"}
::: {.callout-caution}
Section omitted to comply with the Honor Code
:::
:::::
::::: {.content-hidden unless-profile="HC"}
For the next two questions, consider the following experiment:
::: {#exm-themometer-calibration}
### Calibrating a thermometer
[**calibrating a thermometer**]{.column-margin}
Suppose you are trying to calibrate a thermometer by testing the temperature it reads when water begins to boil. Because of natural variation, you take several measurements (experiments) to estimate $\theta$, the mean temperature reading for this thermometer at the boiling point.
You know that at sea level, water should boil at 100 degrees Celsius, so you use a precise prior with $\mathbb{P}r(\theta = 100) = 1$. You then observe the following five measurements: $94.6, 95.4, 96.2, 94.9, 95.9$.
:::
::: {#exr-pdf-1}
[**calibrating a thermometer** see @exm-themometer-calibration]{.column-margin} What will the posterior for $\theta$ look like?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
- [ ] Most posterior probability will be concentrated near the sample mean of $95.4 \degree C$.
- [ ] Most posterior probability will be spread between the sample mean of $95.4 \degree C$ and the prior mean of $100 \degree C$.
- [x] The posterior will be $\theta=100$ with probability 1, regardless of the data.
- [ ] None of the above.
Because all prior probability is on a single point ($100 \degree C$), the prior completely dominates any data. If we are 100% certain of the outcome before the experiment, we learn nothing by performing it.
This was a poor choice of prior, especially in light of the data we collected.
In general, a degenerate prior that assigns $\mathbb{P}r(\theta = \theta_0) = 1$ (or $0$) ignores any amount of additional data.
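A minimal numerical sketch of this point, assuming (for illustration only) a $N(\theta, 1)$ likelihood and an alternative candidate value of $95 \degree C$ that is not part of the original problem:

```python
import numpy as np
from scipy.stats import norm

# Two candidate values for theta; 95 is an assumed alternative for illustration.
thetas = np.array([95.0, 100.0])
data = np.array([94.6, 95.4, 96.2, 94.9, 95.9])

def posterior(prior):
    # Discrete Bayes update with an assumed Normal(theta, 1) likelihood.
    likelihood = np.array([norm.pdf(data, loc=t, scale=1.0).prod() for t in thetas])
    unnormalized = likelihood * prior
    return unnormalized / unnormalized.sum()

print(posterior(np.array([0.0, 1.0])))  # point mass on 100 -> stays [0, 1]
print(posterior(np.array([0.5, 0.5])))  # open-minded prior -> almost all mass on 95
```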
:::
::: {#exr-pdf-2}
[**calibrating a thermometer** see @exm-themometer-calibration]{.column-margin} Suppose you believe before the experiments that the thermometer is biased high so that on average it would read $105 \degree C$, and you are 95% confident that the average would be between 100 and 110.
Which of the following prior PDFs most accurately reflects this prior belief?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
The prior mean is 105 degrees Celsius and approximately 95% of the prior probability is assigned to the interval (100, 110). For a normal prior this corresponds to $\theta \sim N(105, \sigma^2)$ with $1.96\,\sigma \approx 5$, i.e. $\sigma \approx 2.55$.
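The answer options (density plots) are omitted here, but a minimal sketch of one density matching this description, assuming a normal prior whose central 95% interval is (100, 110):

```python
from scipy.stats import norm

# Choose sigma so the central 95% interval has half-width 5: 1.96 * sigma = 5.
sigma = 5 / norm.ppf(0.975)             # approximately 2.55
prior = norm(loc=105, scale=sigma)

print(prior.mean())                     # 105.0
print(prior.cdf(110) - prior.cdf(100))  # approximately 0.95
```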
:::
::: {#exr-pdf-3}
Recall that for a positive integer $n$, the gamma function has the following property: $\Gamma(n)=(n-1)!$. What is the value of $\Gamma(6)$?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
$$
\Gamma(6)=5!=120
$$
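A quick check with Python's standard library:

```python
import math

print(math.gamma(6))      # 120.0, since Gamma(6) = 5!
print(math.factorial(5))  # 120
```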
:::
::: {#exr-pdf-4}
Find the value of the normalizing constant, $c$, which will cause the following integral to evaluate to 1.
$$
\int_0^1 c \cdot z^3 (1-z)^1 \, dz.
$$
:::
::: {.callout-important}
### Hint:
Notice that the integrand is proportional to a beta density. We only need to find the values of the parameters $\alpha$ and $\beta$ and plug them into the usual normalizing constant for a $\text{Beta}(\alpha, \beta)$ density.
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
The integrand $z^3(1-z)^1$ is the kernel of a $\text{Beta}(4, 2)$ density, so
$$
c = \frac{\Gamma(4+2)}{\Gamma(4)\,\Gamma(2)} = \frac{5!}{3!\,1!} = 20 .
$$
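A numerical check of the constant, assuming SciPy is available:

```python
from scipy.integrate import quad
from scipy.special import beta

value, _ = quad(lambda z: 20 * z**3 * (1 - z), 0, 1)
print(value)           # 1.0, so c = 20 normalizes the density
print(1 / beta(4, 2))  # 20.0, the same constant via the Beta function
```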
:::
::: {#exr-pdf-5}
[**coin-flipping** see @exm-frequentist-coinflip]{.column-margin} Consider the **coin-flipping** example from Lesson 5. The likelihood for each coin flip was Bernoulli with probability of heads $\theta$, that is $f(y \mid \theta) = \theta^y (1-\theta)^{1-y}$ for $y=0$ or $y=1$, and we used a uniform prior on $\theta$.
Recall that if we had observed $Y_1 = 0$ instead of $Y_1 = 1$, the posterior distribution for $\theta$ would have been $f(\theta \mid Y_1 = 0) = 2(1-\theta)\,\mathbb{I}_{\{0 \le \theta \le 1\}}$. Which of the following is the correct expression for the posterior predictive distribution of the next flip, $Y_2 \mid Y_1 = 0$?
$$
f(y_2 \mid Y_1 = 0) = \int_0^1 \theta^{y_2}(1-\theta)^{1-y_2} \cdot 2(1-\theta) \, d\theta \qquad \text{for } y_2 = 0 \text{ or } y_2 = 1
$$
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
This is the integral of the likelihood times the posterior. Recognizing the integral as a Beta function, $B(y_2+1,\, 3-y_2)$, the expression simplifies to
$$
\begin{aligned}
f(y_2 \mid Y_1 = 0) &= \int_0^1 2\, \theta^{y_2} (1-\theta)^{2-y_2} \, d\theta \; \mathbb{I}_{(y_2 \in \{0,1\})} \\
&= \frac{2\, \Gamma(y_2+1)\, \Gamma(3-y_2)}{\Gamma(4)} \, \mathbb{I}_{(y_2 \in \{0,1\})} \\
&= \frac{2}{3}\, \mathbb{I}_{(y_2 = 0)} + \frac{1}{3}\, \mathbb{I}_{(y_2 = 1)}
\end{aligned}
$$
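A numerical check of both probabilities, integrating the Bernoulli likelihood against the posterior $2(1-\theta)$:

```python
from scipy.integrate import quad

def predictive(y2):
    # Integrate the Bernoulli likelihood times the posterior 2 * (1 - theta).
    integrand = lambda t: t**y2 * (1 - t)**(1 - y2) * 2 * (1 - t)
    prob, _ = quad(integrand, 0, 1)
    return prob

print(predictive(0))  # 0.666..., i.e. 2/3
print(predictive(1))  # 0.333..., i.e. 1/3
```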
:::
::: {#exr-pdf-6}
The prior predictive distribution for $X$, when $\theta$ is continuous, is given by
$$
\int f(x \mid \theta) \cdot f(\theta)\, d\theta \qquad \text{(prior predictive distribution)}
$$
The analogous expression when $\theta$ is discrete is
$$
\sum_{\theta} f(x \mid \theta) \cdot f(\theta) \qquad \text{(prior predictive distribution)}
$$
summing over all possible values of $\theta$.
Let's return to your brother's **loaded coin** [**loaded coin** see @exm-two-coins]{.column-margin}. Recall that he has a fair coin for which heads come up on average $50\%$ of the time $(p=0.5)$ and a loaded coin $(p=0.7)$. If we flip the coin five times, the likelihood is binomial: $f(x \mid p) = {5 \choose x} p^x (1-p)^{5-x}$, where $X$ counts the number of heads.
Suppose you are confident, but not certain, that he has brought you the loaded coin, so your prior is $f(p) = 0.9\, \mathbb{I}_{(p=0.7)} + 0.1\, \mathbb{I}_{(p=0.5)}$. Which of the following expressions gives the prior predictive distribution of $X$?
:::
::: {.solution .callout-tip collapse="true"}
#### Solution:
$$
f(x) = {5 \choose x} (0.7)^x (0.3)^{5-x} (0.9) + {5 \choose x} (0.5)^x (0.5)^{5-x} (0.1)
$$
This is a *weighted average* of binomials, with weights being your prior probabilities for each scenario (loaded or fair).
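A short sketch evaluating this mixture pmf for each possible $x$, assuming SciPy is available:

```python
from scipy.stats import binom

# Prior weights on (loaded, fair) and the corresponding heads probabilities.
weights = [0.9, 0.1]
ps = [0.7, 0.5]

for x in range(6):
    fx = sum(w * binom.pmf(x, 5, p) for w, p in zip(weights, ps))
    print(x, round(fx, 4))
```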
:::
:::::