105 Homework - Practice Quiz for Week 1 – M1L1

105.1 Practice Quiz for Week 1

Caution

Section omitted to comply with the Honor Code

--- date: 2025-07-02 title: "Homework - Practice Quiz for Week 1 -- M1L1" subtitle: "Capstone Project: Bayesian Conjugate Analysis for Autogressive Time Series Models" description: "" categories: - Bayesian Statistics - Time Series keywords: - Time Series - R code fig-caption: Notes about ... Bayesian Statistics title-block-banner: images/banner_deep.jpg --- ## Practice Quiz for Week 1 ::::: {.content-visible unless-profile="HC"} ::: {.callout-caution} Section omitted to comply with the Honor Code ::: ::::: ::::: {.content-hidden unless-profile="HC"} 1. Simulate 200 observations from $y_t= \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$ where $\varepsilon_t \sim N(0,\nu)$ with $\phi_1=0.1,\phi_2=0.8,\nu=1$. You can simulate the data using the following R code: ```{r} #| label: lbl-simulate-data library(colorRamps) library(leaflet) library(fields) library(mvtnorm) phi1 = 0.1 phi2 = 0.8 v = 1 set.seed(1) m1.sample = arima.sim(n=200,model=list(order=c(2,0,0),ar=c(phi1,phi2),sd=sqrt(v))) ##plot the simulate data plot.ts(m1.sample,ylab=expression(italic(y)[italic(t)]),xlab=expression(italic(t)),main='') ``` Perform a conjugate Bayesian analysis of AR(2) model to your simulated data. Using prior $m_0 = (0,0)^\top$, $C_0 = 10 I_2$, $n_0 = d_0 = 2$. Simulated 5000 samples of $\phi_1,\phi_2,v$. What is the posterior means of these three parameters? (Do the same calculations for $\phi_2^,\nu^$ below.) ```{r} #| label: lbl-bayesian-analysis-q1-3 y.sample = m1.sample # fix here N=5000 p=2 ## order of AR process n.all=length(y.sample) ## T, total number of data Y=matrix(y.sample[3:n.all],ncol=1) Fmtx=matrix(c(y.sample[2:(n.all-1)],y.sample[1:(n.all-2)]),nrow=p,byrow=TRUE) n=length(Y) m0=matrix(rep(0,p),ncol=1) C0=10*diag(p) n0=2 d0=2 e=Y-t(Fmtx)%*%m0 Q=t(Fmtx)%*%C0%*%Fmtx+diag(n) Q.inv=chol2inv(chol(Q)) A=C0%*%Fmtx%*%Q.inv m=m0+A%*%e C=C0-A%*%Q%*%t(A) n.star=n+n0 d.star=t(Y-t(Fmtx)%*%m0)%*%Q.inv%*%(Y-t(Fmtx)%*%m0)+d0 n.sample=5000 nu.sample=rep(0,n.sample) phi.sample=matrix(0,nrow=n.sample,ncol=p) for (i in 1:n.sample) { set.seed(i) nu.new=1/rgamma(1,shape=n.star/2,rate=d.star/2) nu.sample[i]=nu.new phi.new=rmvnorm(1,mean=m,sigma=nu.new*C) phi.sample[i,]=phi.new } par(mfrow=c(1,3)) hist(phi.sample[,1],freq=FALSE,xlab=expression(phi[1]),main="",ylim=c(0,6.4)) lines(density(phi.sample[,1]),type='l',col='red') hist(phi.sample[,2],freq=FALSE,xlab=expression(phi[2]),main="",ylim=c(0,6.4)) lines(density(phi.sample[,2]),type='l',col='red') hist(nu.sample,freq=FALSE,xlab=expression(nu),main="") lines(density(nu.sample),type='l',col='red') # assert phi1=0.1, phi2=0.8, nu=1 library(testthat) test_that("Posterior means are close to expected values", { expect_equal(mean(phi.sample[,1]), 0.1121369, tolerance = 0.01) expect_equal(mean(phi.sample[,2]), 0.7796674, tolerance = 0.01) expect_equal(mean(nu.sample), 1.02, tolerance = 0.01) }) ``` 2. Using prior $m_0 = (1,1)^\top$, $C_0 = I_2$, $n_0 = d_0 = 0.01$. Simulated 5000 samples of $\phi_1,\phi_2,\nu$. Obtain the posterior means of these three parameters. Do they differ a lot with the posterior means you get from question 1? ```{r} #| label: lbl-bayesian-analysis-q4 y.sample = m1.sample # fix here p=2 ## order of AR process n.all=length(y.sample) ## T, total number of data Y=matrix(y.sample[3:n.all],ncol=1) Fmtx=matrix(c(y.sample[2:(n.all-1)],y.sample[1:(n.all-2)]),nrow=p,byrow=TRUE) n=length(Y) m0=matrix(rep(0,p),ncol=1) C0=10*diag(p) n0=.01 d0=.01 e=Y-t(Fmtx)%*%m0 Q=t(Fmtx)%*%C0%*%Fmtx+diag(n) Q.inv=chol2inv(chol(Q)) A=C0%*%Fmtx%*%Q.inv m=m0+A%*%e C=C0-A%*%Q%*%t(A) n.star=n+n0 d.star=t(Y-t(Fmtx)%*%m0)%*%Q.inv%*%(Y-t(Fmtx)%*%m0)+d0 n.sample=5000 nu.sample=rep(0,n.sample) phi.sample=matrix(0,nrow=n.sample,ncol=p) for (i in 1:n.sample) { set.seed(i) nu.new=1/rgamma(1,shape=n.star/2,rate=d.star/2) nu.sample[i]=nu.new phi.new=rmvnorm(1,mean=m,sigma=nu.new*C) phi.sample[i,]=phi.new } par(mfrow=c(1,3)) hist(phi.sample[,1],freq=FALSE,xlab=expression(phi[1]),main="",ylim=c(0,6.4)) lines(density(phi.sample[,1]),type='l',col='red') hist(phi.sample[,2],freq=FALSE,xlab=expression(phi[2]),main="",ylim=c(0,6.4)) lines(density(phi.sample[,2]),type='l',col='red') hist(nu.sample,freq=FALSE,xlab=expression(nu),main="") lines(density(nu.sample),type='l',col='red') # assert phi1=0.1, phi2=0.8, nu=1 library(testthat) phi1_expected = 0.1121369 phi2_expected = 0.7796674 nu_expected = 1.02 test_that("Posterior means are close to expected values", { expect_equal(mean(phi.sample[,1]), phi1_expected, tolerance = 0.01) expect_equal(mean(phi.sample[,2]), phi2_expected, tolerance = 0.01) expect_equal(mean(nu.sample), nu_expected, tolerance = 0.01) }) cat("Posterior mean of phi1:", mean(phi.sample[,1]), "\n") cat("Posterior mean of phi2:", mean(phi.sample[,2]), "\n") cat("Posterior mean of nu:", mean(nu.sample), "\n") ``` 5. In this simulated case, we know the true value is $\phi_1=0.1$, $\phi_2=0.8$, $\nu=1$. Are the posterior means you get for the three parameters close to the true value? :::::