107.1 Practice Quiz for Week 1
Caution
Section omitted to comply with the Honor Code
Capstone Project: Bayesian Conjugate Analysis for Autogressive Time Series Models
Oren Bochman
July 2, 2025
Time Series, R code
Section omitted to comply with the Honor Code
---
date: 2025-07-02
title: "Homework - Practice Quiz for Week 1 -- M1L1"
subtitle: "Capstone Project: Bayesian Conjugate Analysis for Autogressive Time Series Models"
description: ""
categories:
- Bayesian Statistics
- Time Series
keywords:
- Time Series
- R code
fig-caption: Notes about ... Bayesian Statistics
title-block-banner: images/banner_deep.jpg
---
## Practice Quiz for Week 1
::::: {.content-visible unless-profile="HC"}
::: {.callout-caution}
Section omitted to comply with the Honor Code
:::
:::::
::::: {.content-hidden unless-profile="HC"}
1. Simulate 200 observations from $y_t= \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$
where $\varepsilon_t \sim N(0,\nu)$ with $\phi_1=0.1,\phi_2=0.8,\nu=1$.
You can simulate the data using the following R code:
```{r}
#| label: lbl-simulate-data
library(colorRamps)
library(leaflet)
library(fields)
library(mvtnorm)
phi1 = 0.1
phi2 = 0.8
v = 1
set.seed(1)
m1.sample = arima.sim(n=200,model=list(order=c(2,0,0),ar=c(phi1,phi2),sd=sqrt(v)))
##plot the simulate data
plot.ts(m1.sample,ylab=expression(italic(y)[italic(t)]),xlab=expression(italic(t)),main='')
```
Perform a conjugate Bayesian analysis of AR(2) model to your simulated data.
Using prior $m_0 = (0,0)^\top$, $C_0 = 10 I_2$, $n_0 = d_0 = 2$. Simulated 5000 samples of $\phi_1,\phi_2,v$. What is the posterior means of these three parameters? (Do the same calculations for $\phi_2^,\nu^$ below.)
```{r}
#| label: lbl-bayesian-analysis-q1-3
y.sample = m1.sample # fix here
N=5000
p=2 ## order of AR process
n.all=length(y.sample) ## T, total number of data
Y=matrix(y.sample[3:n.all],ncol=1)
Fmtx=matrix(c(y.sample[2:(n.all-1)],y.sample[1:(n.all-2)]),nrow=p,byrow=TRUE)
n=length(Y)
m0=matrix(rep(0,p),ncol=1)
C0=10*diag(p)
n0=2
d0=2
e=Y-t(Fmtx)%*%m0
Q=t(Fmtx)%*%C0%*%Fmtx+diag(n)
Q.inv=chol2inv(chol(Q))
A=C0%*%Fmtx%*%Q.inv
m=m0+A%*%e
C=C0-A%*%Q%*%t(A)
n.star=n+n0
d.star=t(Y-t(Fmtx)%*%m0)%*%Q.inv%*%(Y-t(Fmtx)%*%m0)+d0
n.sample=5000
nu.sample=rep(0,n.sample)
phi.sample=matrix(0,nrow=n.sample,ncol=p)
for (i in 1:n.sample) {
set.seed(i)
nu.new=1/rgamma(1,shape=n.star/2,rate=d.star/2)
nu.sample[i]=nu.new
phi.new=rmvnorm(1,mean=m,sigma=nu.new*C)
phi.sample[i,]=phi.new
}
par(mfrow=c(1,3))
hist(phi.sample[,1],freq=FALSE,xlab=expression(phi[1]),main="",ylim=c(0,6.4))
lines(density(phi.sample[,1]),type='l',col='red')
hist(phi.sample[,2],freq=FALSE,xlab=expression(phi[2]),main="",ylim=c(0,6.4))
lines(density(phi.sample[,2]),type='l',col='red')
hist(nu.sample,freq=FALSE,xlab=expression(nu),main="")
lines(density(nu.sample),type='l',col='red')
# assert phi1=0.1, phi2=0.8, nu=1
library(testthat)
test_that("Posterior means are close to expected values", {
expect_equal(mean(phi.sample[,1]), 0.1121369, tolerance = 0.01)
expect_equal(mean(phi.sample[,2]), 0.7796674, tolerance = 0.01)
expect_equal(mean(nu.sample), 1.02, tolerance = 0.01)
})
```
2. Using prior $m_0 = (1,1)^\top$, $C_0 = I_2$, $n_0 = d_0 = 0.01$. Simulated 5000 samples of $\phi_1,\phi_2,\nu$. Obtain the posterior means of these three parameters. Do they differ a lot with the posterior means you get from question 1?
```{r}
#| label: lbl-bayesian-analysis-q4
y.sample = m1.sample # fix here
p=2 ## order of AR process
n.all=length(y.sample) ## T, total number of data
Y=matrix(y.sample[3:n.all],ncol=1)
Fmtx=matrix(c(y.sample[2:(n.all-1)],y.sample[1:(n.all-2)]),nrow=p,byrow=TRUE)
n=length(Y)
m0=matrix(rep(0,p),ncol=1)
C0=10*diag(p)
n0=.01
d0=.01
e=Y-t(Fmtx)%*%m0
Q=t(Fmtx)%*%C0%*%Fmtx+diag(n)
Q.inv=chol2inv(chol(Q))
A=C0%*%Fmtx%*%Q.inv
m=m0+A%*%e
C=C0-A%*%Q%*%t(A)
n.star=n+n0
d.star=t(Y-t(Fmtx)%*%m0)%*%Q.inv%*%(Y-t(Fmtx)%*%m0)+d0
n.sample=5000
nu.sample=rep(0,n.sample)
phi.sample=matrix(0,nrow=n.sample,ncol=p)
for (i in 1:n.sample) {
set.seed(i)
nu.new=1/rgamma(1,shape=n.star/2,rate=d.star/2)
nu.sample[i]=nu.new
phi.new=rmvnorm(1,mean=m,sigma=nu.new*C)
phi.sample[i,]=phi.new
}
par(mfrow=c(1,3))
hist(phi.sample[,1],freq=FALSE,xlab=expression(phi[1]),main="",ylim=c(0,6.4))
lines(density(phi.sample[,1]),type='l',col='red')
hist(phi.sample[,2],freq=FALSE,xlab=expression(phi[2]),main="",ylim=c(0,6.4))
lines(density(phi.sample[,2]),type='l',col='red')
hist(nu.sample,freq=FALSE,xlab=expression(nu),main="")
lines(density(nu.sample),type='l',col='red')
# assert phi1=0.1, phi2=0.8, nu=1
library(testthat)
phi1_expected = 0.1121369
phi2_expected = 0.7796674
nu_expected = 1.02
test_that("Posterior means are close to expected values", {
expect_equal(mean(phi.sample[,1]), phi1_expected, tolerance = 0.01)
expect_equal(mean(phi.sample[,2]), phi2_expected, tolerance = 0.01)
expect_equal(mean(nu.sample), nu_expected, tolerance = 0.01)
})
cat("Posterior mean of phi1:", mean(phi.sample[,1]), "\n")
cat("Posterior mean of phi2:", mean(phi.sample[,2]), "\n")
cat("Posterior mean of nu:", mean(nu.sample), "\n")
```
5. In this simulated case, we know the true value is $\phi_1=0.1$, $\phi_2=0.8$, $\nu=1$. Are the posterior means you get for the three parameters close to the true value?
:::::