108 Homework - Determine the order of your data – M2L2

Caution

Section omitted to comply with the Honor Code

--- date: 2025-07-02 title: "Homework - Determine the order of your data -- M2L2" subtitle: "Capstone Project: Bayesian Conjugate Analysis for Autogressive Time Series Models" description: "" categories: - Bayesian Statistics - Time Series keywords: - Time Series - R code fig-caption: Notes about ... Bayesian Statistics title-block-banner: images/banner_deep.jpg --- ::::: {.content-visible unless-profile="HC"} ::: {.callout-caution} Section omitted to comply with the Honor Code ::: ::::: ::::: {.content-hidden unless-profile="HC"} 1. For your choice of the data, calculate the AIC and BIC for different values of p Based on your calculation, what is the order you will use for earthquake data? ```{r} #| label: lst-capstone-earthquake-data ## read data, you need to make sure the data file is in your current working directory earthquakes.dat <- read.delim("data/earthquakes.txt") earthquakes.dat$Quakes=as.numeric(earthquakes.dat$Quakes) y.dat=earthquakes.dat$Quakes[1:100] ## this is the training data y.new=earthquakes.dat$Quakes[101:103] ## this is the test data ``` ```{r} #| label: fig-capstone-ar-earthquake-acf-pacf #| #| fig-cap: "ACF and PACF of Earthquake Data" #| fig-subcap: #| - ACF #| - PACF y.sample=y.dat par(mfrow=c(1,2)) acf(y.sample,main="",xlab='Lag') pacf(y.sample,main="",xlab='Lag') ``` ```{r} #| label: fig-capstone-ar-order-earthquake #| fig-cap: "AIC and BIC for Earthquake Data" n.all=length(y.sample) p.star=5 Y=matrix(y.sample[(p.star+1):n.all],ncol=1) sample.all=matrix(y.sample,ncol=1) n=length(Y) p=seq(2,p.star,by=1) design.mtx=function(p_cur){ Fmtx=matrix(0,ncol=n,nrow=p_cur) for (i in 1:p_cur) { start.y=p.star+1-i end.y=start.y+n-1 Fmtx[i,]=sample.all[start.y:end.y,1] } return(Fmtx) } criteria.ar=function(p_cur){ Fmtx=design.mtx(p_cur) beta.hat=chol2inv(chol(Fmtx%*%t(Fmtx)))%*%Fmtx%*%Y R=t(Y-t(Fmtx)%*%beta.hat)%*%(Y-t(Fmtx)%*%beta.hat) sp.square=R/(n-p_cur) aic=2*p_cur+n*log(sp.square) bic=log(n)*p_cur+n*log(sp.square) result=c(aic,bic) return(result) } criteria=sapply(p,criteria.ar) plot(p,criteria[1,],type='p',pch='a',col='red',xlab='AR order p',ylab='Criterion',main='', ylim=c(min(criteria)-10,max(criteria)+10)) points(p,criteria[2,],pch='b',col='blue') ``` 2. Calculate the AIC and BIC for different values of p Based on your calculation, what is the order you will use for Google search index data? ```{r} #| label: lst-capstone-google-search-index-covid ## read data, you need to make sure the data file is in your current working directory covid.dat <- read.delim("data/GoogleSearchIndex.txt") covid.dat$Week=as.Date(as.character(covid.dat$Week),format = "%Y-%m-%d") y.dat=covid.dat$covid[1:57] ## this is the training data y.new=covid.dat$covid[58:60] ## this is the test data ``` ```{r} #| label: fig-capstone-ar-order-covid #| fig-cap: "AIC and BIC for Google Search Index Data" y.sample=y.dat n.all=length(y.sample) p.star=5 Y=matrix(y.sample[(p.star+1):n.all],ncol=1) sample.all=matrix(y.sample,ncol=1) n=length(Y) p=seq(1,p.star,by=1) design.mtx=function(p_cur){ Fmtx=matrix(0,ncol=n,nrow=p_cur) for (i in 1:p_cur) { start.y=p.star+1-i end.y=start.y+n-1 Fmtx[i,]=sample.all[start.y:end.y,1] } return(Fmtx) } criteria.ar=function(p_cur){ Fmtx=design.mtx(p_cur) beta.hat=chol2inv(chol(Fmtx%*%t(Fmtx)))%*%Fmtx%*%Y R=t(Y-t(Fmtx)%*%beta.hat)%*%(Y-t(Fmtx)%*%beta.hat) sp.square=R/(n-p_cur) aic=2*p_cur+n*log(sp.square) bic=log(n)*p_cur+n*log(sp.square) result=c(aic,bic) return(result) } criteria=sapply(p,criteria.ar) plot(p,criteria[1,],type='p',pch='a',col='red',xlab='AR order p',ylab='Criterion',main='', ylim=c(min(criteria)-10,max(criteria)+10)) points(p,criteria[2,],pch='b',col='blue') ``` :::::