Caution
Section omitted to comply with the Honor Code
Capstone Project: Bayesian Conjugate Analysis for Autogressive Time Series Models
Oren Bochman
July 2, 2025
Time Series, R code
Section omitted to comply with the Honor Code
---
date: 2025-07-02
title: "Homework - Determine the order of your data -- M2L2"
subtitle: "Capstone Project: Bayesian Conjugate Analysis for Autogressive Time Series Models"
description: ""
categories:
- Bayesian Statistics
- Time Series
keywords:
- Time Series
- R code
fig-caption: Notes about ... Bayesian Statistics
title-block-banner: images/banner_deep.jpg
---
::::: {.content-visible unless-profile="HC"}
::: {.callout-caution}
Section omitted to comply with the Honor Code
:::
:::::
::::: {.content-hidden unless-profile="HC"}
1. For your choice of the data, calculate the AIC and BIC for different values of
p Based on your calculation, what is the order you will use for earthquake data?
```{r}
#| label: lst-capstone-earthquake-data
## read data, you need to make sure the data file is in your current working directory
earthquakes.dat <- read.delim("data/earthquakes.txt")
earthquakes.dat$Quakes=as.numeric(earthquakes.dat$Quakes)
y.dat=earthquakes.dat$Quakes[1:100] ## this is the training data
y.new=earthquakes.dat$Quakes[101:103] ## this is the test data
```
```{r}
#| label: fig-capstone-ar-earthquake-acf-pacf
#| #| fig-cap: "ACF and PACF of Earthquake Data"
#| fig-subcap:
#| - ACF
#| - PACF
y.sample=y.dat
par(mfrow=c(1,2))
acf(y.sample,main="",xlab='Lag')
pacf(y.sample,main="",xlab='Lag')
```
```{r}
#| label: fig-capstone-ar-order-earthquake
#| fig-cap: "AIC and BIC for Earthquake Data"
n.all=length(y.sample)
p.star=5
Y=matrix(y.sample[(p.star+1):n.all],ncol=1)
sample.all=matrix(y.sample,ncol=1)
n=length(Y)
p=seq(2,p.star,by=1)
design.mtx=function(p_cur){
Fmtx=matrix(0,ncol=n,nrow=p_cur)
for (i in 1:p_cur) {
start.y=p.star+1-i
end.y=start.y+n-1
Fmtx[i,]=sample.all[start.y:end.y,1]
}
return(Fmtx)
}
criteria.ar=function(p_cur){
Fmtx=design.mtx(p_cur)
beta.hat=chol2inv(chol(Fmtx%*%t(Fmtx)))%*%Fmtx%*%Y
R=t(Y-t(Fmtx)%*%beta.hat)%*%(Y-t(Fmtx)%*%beta.hat)
sp.square=R/(n-p_cur)
aic=2*p_cur+n*log(sp.square)
bic=log(n)*p_cur+n*log(sp.square)
result=c(aic,bic)
return(result)
}
criteria=sapply(p,criteria.ar)
plot(p,criteria[1,],type='p',pch='a',col='red',xlab='AR order p',ylab='Criterion',main='', ylim=c(min(criteria)-10,max(criteria)+10))
points(p,criteria[2,],pch='b',col='blue')
```
2. Calculate the AIC and BIC for different values of p Based on your calculation, what is the order you will use for Google search index data?
```{r}
#| label: lst-capstone-google-search-index-covid
## read data, you need to make sure the data file is in your current working directory
covid.dat <- read.delim("data/GoogleSearchIndex.txt")
covid.dat$Week=as.Date(as.character(covid.dat$Week),format = "%Y-%m-%d")
y.dat=covid.dat$covid[1:57] ## this is the training data
y.new=covid.dat$covid[58:60] ## this is the test data
```
```{r}
#| label: fig-capstone-ar-order-covid
#| fig-cap: "AIC and BIC for Google Search Index Data"
y.sample=y.dat
n.all=length(y.sample)
p.star=5
Y=matrix(y.sample[(p.star+1):n.all],ncol=1)
sample.all=matrix(y.sample,ncol=1)
n=length(Y)
p=seq(1,p.star,by=1)
design.mtx=function(p_cur){
Fmtx=matrix(0,ncol=n,nrow=p_cur)
for (i in 1:p_cur) {
start.y=p.star+1-i
end.y=start.y+n-1
Fmtx[i,]=sample.all[start.y:end.y,1]
}
return(Fmtx)
}
criteria.ar=function(p_cur){
Fmtx=design.mtx(p_cur)
beta.hat=chol2inv(chol(Fmtx%*%t(Fmtx)))%*%Fmtx%*%Y
R=t(Y-t(Fmtx)%*%beta.hat)%*%(Y-t(Fmtx)%*%beta.hat)
sp.square=R/(n-p_cur)
aic=2*p_cur+n*log(sp.square)
bic=log(n)*p_cur+n*log(sp.square)
result=c(aic,bic)
return(result)
}
criteria=sapply(p,criteria.ar)
plot(p,criteria[1,],type='p',pch='a',col='red',xlab='AR order p',ylab='Criterion',main='', ylim=c(min(criteria)-10,max(criteria)+10))
points(p,criteria[2,],pch='b',col='blue')
```
:::::