104  Q&A from Bayesian Forecasting and Dynamic Models - Chapter 3

Author

Oren Bochman

104.1 Question from Chapter 3

Tip

Unless stated otherwise, these exercises refer to a univariate time series \{Y_t\} modelled by the closed regression DLM \{F_t , 1, V_t , W_t\} of Definition 3.2, with known variances and/or a known discount factor \delta, as follows:

  • Observation equation: Y_t = F_t \theta_t + \nu_t, \nu_t \sim \mathcal{N}[0, V_t],
  • System equation: \theta_t = \theta_{t-1} + \omega_t, \omega_t \sim \mathcal{N}[0, W_t],
  • Initial prior: (\theta_0 \mid \mathcal{D}_0) \sim \mathcal{N}[m_0 , C_0].

Exercise 104.1 (Bivariate Normal Distribution in DLM) In the DLM \{F_t , 1, 100, 0\} suppose that the sequence \theta_t = \theta is a precisely known constant but that the regressor variable sequence F_t is uncontrollable. You model F_t as a sequence of independent normal random quantities, F_t \sim \mathcal{N}[0, 400]. Given \mathcal{D}_{t-1}, answer the following questions.

  1. Prove that Y_t and F_t have a bivariate normal distribution and identify its mean vector and variance matrix.

  2. What is the correlation between Y_t and F_t?

  3. What is the regression coefficient of F_t on Y_t?

  4. What is the posterior distribution of (F_t \mid Y_t, \mathcal{D}_{t-1})?
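These are pencil-and-paper results, but a Monte Carlo check of the claimed moments is quick. A minimal sketch; \theta = 2 here is an illustrative assumption, the variances are those of the exercise:

# Monte Carlo sanity check for Exercise 104.1 (not a proof).
import numpy as np

rng = np.random.default_rng(42)
theta, V, U = 2.0, 100.0, 400.0                 # known state, obs. variance, Var(F_t)
n = 200_000

F = rng.normal(0.0, np.sqrt(U), n)              # F_t ~ N[0, 400]
Y = theta * F + rng.normal(0.0, np.sqrt(V), n)  # Y_t = F_t*theta + nu_t

# Theory: mean vector (0, 0), Var(F)=U, Var(Y)=theta^2*U + V, Cov(F,Y)=theta*U
cov_theory = np.array([[U, theta * U],
                       [theta * U, theta**2 * U + V]])
print("empirical cov:\n", np.cov(F, Y))
print("theoretical cov:\n", cov_theory)

rho = theta * U / np.sqrt(U * (theta**2 * U + V))
print("corr(F,Y): empirical", np.corrcoef(F, Y)[0, 1].round(4), "theory", round(rho, 4))

b = theta * U / (theta**2 * U + V)              # regression coefficient of F_t on Y_t
print("regression coefficient of F_t on Y_t:", round(b, 4))
# Bivariate-normal conditioning gives (F_t | Y_t, D_{t-1}) ~ N[b*Y_t, U - b*theta*U]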

Exercise 104.2 (Regression DLM with Known F_t) In Exercise 104.1 suppose that the sequence \theta_t is known but not constant. You also adopt a random walk model for F_t, so that F_t = F_{t-1} + \varepsilon_t with independent \varepsilon_t \sim \mathcal{N}[0, U ]. Show that your overall model is equivalent to the simple regression DLM \{\theta_t , 1, 100, U \} with regressor variable \theta_t and parameter F_t.

Exercise 104.3 (First-order Polynomial DLM) In the DLM \{F_t , 1, V_t , W_t\}, suppose that F_t \neq 0 for all t.

  1. Show that the series X_t = Y_t / F_t follows a first-order polynomial DLM and identify it fully.

  2. Verify that the updating equations for the regression DLM can be deduced from those of the first-order polynomial DLM.

  3. What happens if F_t = 0?
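The identities in parts 1 and 2 are easy to verify numerically. A minimal sketch with synthetic, illustrative data, running both filters side by side:

# Exercise 104.3, parts 1-2: check that filtering X_t = Y_t / F_t with the
# first-order polynomial DLM {1, 1, V_t/F_t^2, W} reproduces the posterior
# of the regression DLM {F_t, 1, V_t, W}.
import numpy as np

rng = np.random.default_rng(3)
T, W = 15, 0.1
F = rng.normal(2, 0.5, T)
Vt = rng.uniform(0.5, 2.0, T)
Y = rng.normal(0, 1, T)

m1 = m2 = 0.0
C1 = C2 = 4.0
for t in range(T):
    # regression DLM update
    R = C1 + W
    Q = F[t]**2 * R + Vt[t]
    A = R * F[t] / Q
    m1 = m1 + A * (Y[t] - F[t] * m1)
    C1 = (1 - A * F[t]) * R
    # equivalent first-order polynomial update on X_t = Y_t / F_t
    X, Vx = Y[t] / F[t], Vt[t] / F[t]**2
    R2 = C2 + W
    A2 = R2 / (R2 + Vx)
    m2 = m2 + A2 * (X - m2)
    C2 = (1 - A2) * R2
print(np.isclose(m1, m2), np.isclose(C1, C2))   # True True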

Exercise 104.4 (Predictability Measure) One measure of the predictability of Y_t at time t-1 is the modulus of the reciprocal of the coefficient of variation, given by |f_t|/Q_t^{1/2}. Explore this measure as a function of F_t \in [-100, 100] for each of the cases R_t = 0, 10, 20, 50, when m_{t-1} = 10 and V_t = 100.
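A minimal sketch of this exploration (a plot over the F_t grid is the natural companion), using f_t = F_t m_{t-1} and Q_t = F_t^2 R_t + V_t:

# Exercise 104.4: |f_t| / Q_t^{1/2} over F_t in [-100, 100].
import numpy as np

m_prev, V = 10.0, 100.0
F = np.linspace(-100, 100, 401)
for R in (0, 10, 20, 50):
    P = np.abs(F * m_prev) / np.sqrt(F**2 * R + V)
    # for R > 0 the measure increases in |F_t| towards the bound m_prev/sqrt(R)
    print(f"R_t={R:>2}: sup over grid = {P.max():.2f}, "
          f"large-|F_t| limit = {'unbounded' if R == 0 else f'{m_prev/np.sqrt(R):.2f}'}")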

Exercise 104.5 (Optimal Design of Control Variable) Suppose that V_t is a known function of a control variable F_t. In particular, let V_t = V(a + |F_t|^p) for known quantities V, a and p.

  1. How should F_t be chosen in order to maximise the posterior precision C_t^{-1}, subject to |F_t| < k for some k > 0? What is the optimal design value of F_t in the case a = 8, p = 3 and k = 10? (A numerical sketch follows this exercise.)

  2. How does this problem change when V is unknown and is estimated from the data?
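Since C_t^{-1} = R_t^{-1} + F_t^2/V_t and R_t does not depend on F_t, part 1 reduces to maximising g(x) = x^2/(a + x^p) over x = |F_t|. A minimal numerical sketch, with the values from the exercise:

# Exercise 104.5, part 1: maximise the per-observation precision gain
#   F_t^2 / V_t = F_t^2 / (V (a + |F_t|^p))  over |F_t| < k.
import numpy as np

a, p, k = 8.0, 3.0, 10.0
x = np.linspace(0.0, k, 100_001)[:-1]       # grid on [0, k)
g = x**2 / (a + x**p)                       # V cancels in the argmax
print(f"numerical optimum |F_t| ~ {x[np.argmax(g)]:.4f}")
# Calculus: g'(x) = 0 gives x^p = 2a/(p - 2), so x = (2a/(p-2))^(1/p)
print(f"analytic optimum  |F_t| = {(2*a/(p - 2))**(1/p):.4f}")   # 16^(1/3) ~ 2.52
# Either sign of F_t is optimal, since only F_t^2 enters the precision.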

Exercise 104.6 For the discount regression DLM in which V_t is known for all t and R_t = C_{t-1} /\delta, show that the updating equations can be written as

m_t = C_t \, \delta^{t} \, C_{0}^{-1} m_0 + C_t \sum_{j=0}^{t-1} \delta^{j} F_{t-j} \, V_{t-j}^{-1} \, Y_{t-j}

and

C_{t}^{-1} = \delta^{t} C_{0}^{-1} + \sum_{j=0}^{t-1} \delta^{j} F_{t-j}^{2} \, V_{t-j}^{-1}.

Deduce that as t \to \infty, \delta^{t} C_{0}^{-1} \to 0, so that the influence of the prior vanishes and

m_t \to \frac{\sum_{j=0}^{t-1} \delta^{j} F_{t-j} \, V_{t-j}^{-1} \, Y_{t-j}} {\sum_{j=0}^{t-1} \delta^{j} F_{t-j}^{2} \, V_{t-j}^{-1}}.
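A numerical check of these closed forms against the sequential updating equations, using synthetic F_t, V_t and Y_t (all values illustrative):

# Exercise 104.6: verify the closed forms for m_t and C_t^{-1}.
import numpy as np

rng = np.random.default_rng(0)
delta, m0, C0, T = 0.9, 0.5, 4.0, 25
F = rng.normal(1, 1, T); Vt = rng.uniform(0.5, 2.0, T); Y = rng.normal(0, 1, T)

m, C = m0, C0
for t in range(T):                 # sequential updates with R_t = C_{t-1}/delta
    R = C / delta
    Q = F[t]**2 * R + Vt[t]
    A = R * F[t] / Q
    m = m + A * (Y[t] - F[t] * m)
    C = (1 - A * F[t]) * R

j = np.arange(T)                   # j = 0..t-1, so Y_{t-j} = Y[T-1-j], etc.
w = delta**j
Fr, Vr, Yr = F[::-1], Vt[::-1], Y[::-1]
Cinv = delta**T / C0 + np.sum(w * Fr**2 / Vr)
m_closed = (delta**T * m0 / C0 + np.sum(w * Fr * Yr / Vr)) / Cinv
print(np.isclose(1 / C, Cinv), np.isclose(m, m_closed))   # True True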

Exercise 104.7 Consider discount weighted regression (DWR) applied to the estimation of a parameter \theta_t by the value m_t. In DWR, the estimate m_t is chosen to minimise the discounted sum of squares

S(\theta) = \sum_{j=0}^{t-1} \delta^{j} \left(Y_{t-j} - F_{t-j}\,\theta\right)^{2},

where all quantities other than \theta are known.

  1. Show that

m_t = \frac{\sum_{j=0}^{t-1} \delta^{j} F_{t-j} Y_{t-j}}{\sum_{j=0}^{t-1} \delta^{j} F_{t-j}^{2}}.

  2. Generalising part 1, suppose that

Y_t = F_t \theta + \nu_t, \qquad \nu_t \sim \mathcal{N}[0, V_t],

and that m_t is more appropriately chosen to minimise

S(\theta) = \sum_{j=0}^{t-1} \delta^{j} V_{t-j}^{-1} \left(Y_{t-j} - F_{t-j}\,\theta\right)^{2}.

Show that

m_t = \frac{\sum_{j=0}^{t-1} \delta^{j} F_{t-j} V_{t-j}^{-1} Y_{t-j}}{\sum_{j=0}^{t-1} \delta^{j} F_{t-j}^{2} V_{t-j}^{-1}}.

  3. Compare these results with those of the previous exercise to see that the estimates correspond to those from the discount DLM with the uninformative prior C_0^{-1} = 0. (A numerical check follows.)
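As a numerical check of part 2 (part 1 is the special case V_{t-j} = 1), a sketch with synthetic data:

# Exercise 104.7: the weighted closed form minimises the discounted SS.
import numpy as np

rng = np.random.default_rng(1)
delta, T = 0.8, 30
F = rng.normal(1, 1, T); Vt = rng.uniform(0.5, 2.0, T); Y = rng.normal(0, 1, T)

w = delta**np.arange(T) / Vt[::-1]           # weight on (Y_{t-j}, F_{t-j})
Fr, Yr = F[::-1], Y[::-1]
m_t = np.sum(w * Fr * Yr) / np.sum(w * Fr**2)

S = lambda th: np.sum(w * (Yr - Fr * th)**2)  # discounted, weighted sum of squares
grid = np.linspace(m_t - 1, m_t + 1, 2001)
print("closed form:", m_t, "| grid argmin:", grid[np.argmin([S(t) for t in grid])])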

Exercise 104.8 Suppose that V_t = V\,k_t, where V = 1/\phi is unknown and k_t is a known variance multiplier. Show how the analysis summarised in the table in Section 3.4.1 is modified.

Exercise 104.9 Consider the simple regression DLM \{(-1)^t k, 1, V, W\}, in which k > 0 is a known constant.

  1. By reference to the first-order polynomial constant DLM convergence results or otherwise, prove that \lim_{t \to \infty} C_t = C exists. Obtain C and the limiting values of Q_t and |A_t|.

  2. Treating the limiting value C as a function of k, verify that it is equal to W when k^2 = V/(2W).
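A quick numerical check, iterating the updating recursions with the alternating regressor under illustrative values V = 1, W = 0.25:

# Exercise 104.9: C_t converges for F_t = (-1)^t k; check C = W when k^2 = V/(2W).
import numpy as np

V, W = 1.0, 0.25
k = np.sqrt(V / (2 * W))            # so that k^2 = V/(2W)
C = 10.0                            # arbitrary starting C_0
for t in range(1, 501):
    F = (-1)**t * k                 # alternating sign; enters updates only via F^2
    R = C + W
    Q = F * F * R + V
    A = R * F / Q
    C = (1 - A * F) * R
print("limiting C:", round(C, 6), "| W:", W)   # C -> 0.25 = W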

Exercise 104.10 Consider the company sales/total market series in the example of Section 3.4.2. Perform similar analyses of this data using the same DLM but varying the discount factor over the range 0.6, 0.65, \ldots, 1. Explore the sensitivity of inferences about the time trajectory of \theta_t as the discount factor varies in the following ways:

  1. Plot m_t versus t, with intervals based on C_t^{1/2} to represent uncertainty, for each value of \delta and comment on differences with respect to \delta.

  2. Compare the final estimates of observational variance S_{42} as \delta varies. Do the same for prediction variances Q_{42}. Discuss the patterns of behaviour.

  3. Use MSE, MAD and LLR measures to assess the predictive performance of the models relative to the static model defined by \delta = 1.
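For part 3, a minimal sketch of the three measures, assuming one-step forecast errors e_t with normal predictive variances Q_t from each model (e0, Q0 denote the same quantities from the static \delta = 1 reference model):

# Exercise 104.10(3): MSE, MAD and LLR comparison measures (sketch).
import numpy as np

def mse(e):
    return np.mean(np.asarray(e)**2)

def mad(e):
    return np.mean(np.abs(e))

def llr(e, Q, e0, Q0):
    # log-likelihood ratio: sum of one-step normal log predictive densities,
    # model minus the delta=1 reference model
    e, Q, e0, Q0 = map(np.asarray, (e, Q, e0, Q0))
    ll  = -0.5 * np.sum(np.log(2 * np.pi * Q)  + e**2  / Q)
    ll0 = -0.5 * np.sum(np.log(2 * np.pi * Q0) + e0**2 / Q0)
    return ll - ll0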

Exercise 104.11 Consider a retrospective analysis in which inferences are made about historical parametric values based on the current data. In particular, this question concerns inferences about \theta_{t-1} given \mathcal{D}_t for the DLM \{F_t, 1, V_t, W_t\} with known variances.

  1. Use the system equation directly to show that

C[\theta_t, \theta_{t-1}\mid \mathcal{D}_{t-1}] = B_{t-1}\,V[\theta_t \mid \mathcal{D}_{t-1}],

for some B_{t-1} lying between 0 and 1, and identify B_{t-1}.

  2. Deduce that

C[\theta_{t-1}, Y_t \mid \mathcal{D}_{t-1}] = B_{t-1}\,C[\theta_t, Y_t \mid \mathcal{D}_{t-1}].

  3. Hence identify the moments of the joint normal distribution of (\theta_{t-1}, \theta_t, Y_t \mid \mathcal{D}_{t-1}), and from this, those of the conditional distribution of (\theta_{t-1} \mid \mathcal{D}_t) (by conditioning on Y_t in addition to \mathcal{D}_{t-1}). Verify that the regression coefficient of \theta_{t-1} on Y_t is B_{t-1} A_t, where A_t is the usual regression coefficient (adaptive coefficient) of \theta_t on Y_t given \mathcal{D}_{t-1}.

  4. Deduce that (\theta_{t-1} \mid \mathcal{D}_t) is normal with moments that can be written as

E[\theta_{t-1} \mid \mathcal{D}_t] = m_{t-1} + B_{t-1}\big(E[\theta_t \mid \mathcal{D}_t] - E[\theta_t \mid \mathcal{D}_{t-1}]\big)

and

V[\theta_{t-1} \mid \mathcal{D}_t] = C_{t-1} - B_{t-1}^{2}\,\big(V[\theta_t \mid \mathcal{D}_{t-1}] - V[\theta_t \mid \mathcal{D}_t]\big).

Exercise 104.12 Generalize the results of the previous exercise to allow retrospection back over time for more than one step, calculating the distribution of (\theta_{t-k} \mid \mathcal{D}_t) for any k, (0 \le k \le t). Do this as follows:

  1. Using the observation and evolution equations directly, show that for any r \ge 1, C[\theta_{t-k}, Y_{t-k+r} \mid \mathcal{D}_{t-k}] \;=\; B_{t-k}\, C[\theta_{t-k+1}, Y_{t-k+r} \mid \mathcal{D}_{t-k}], where for any s, B_s = C_s / R_{s+1} lies between 0 and 1.

  2. Writing X_t(k) = (Y_{t-k+1}, \ldots, Y_t), deduce from (a) that C[\theta_{t-k}, X_t(k) \mid \mathcal{D}_{t-k}] \;=\; B_{t-k}\, C[\theta_{t-k+1}, X_t(k) \mid \mathcal{D}_{t-k}].

  3. Hence identify the moments of the joint normal distribution of (\theta_{t-k}, \theta_{t-k+1}, X_t(k) \mid \mathcal{D}_{t-k}), and from this those of the conditional distributions of (\theta_{t-k} \mid \mathcal{D}_t) and (\theta_{t-k+1} \mid \mathcal{D}_t) (by conditioning on X_t(k) in addition to \mathcal{D}_{t-k} and noting that \mathcal{D}_t = \{X_t(k), \mathcal{D}_{t-k}\}). Using (b), verify that the regression coefficient vector of \theta_{t-k} on X_t(k) is B_{t-k} times that of \theta_{t-k+1} on X_t(k).

  4. Deduce that (\theta_{t-k} \mid \mathcal{D}_t) is normal with moments that can be written as E[\theta_{t-k} \mid \mathcal{D}_t] \;=\; m_{t-k} + B_{t-k}\big(E[\theta_{t-k+1} \mid \mathcal{D}_t] - E[\theta_{t-k+1} \mid \mathcal{D}_{t-k}]\big) and V[\theta_{t-k} \mid \mathcal{D}_t] \;=\; C_{t-k} - B_{t-k}^{2}\,\big(V[\theta_{t-k+1} \mid \mathcal{D}_{t-k}] - V[\theta_{t-k+1} \mid \mathcal{D}_t]\big).

  5. Let the above moments be denoted by a_t(-k) and R_t(-k), so that (\theta_{t-k} \mid \mathcal{D}_t) \sim \mathcal{N}[a_t(-k), R_t(-k)]. Verify that the above retrospective updating equations provide these moments backwards over time for k = t - 1, t - 2, \ldots, 0 via a_t(-k) = m_{t-k} + B_{t-k}\,[a_t(-k + 1) - a_{t-k+1}] and R_t(-k) = C_{t-k} - B_{t-k}^{2}\,[R_t(-k + 1) - R_{t-k+1}] with a_s = m_{s-1} and R_s = C_{s-1} + W_s for all s, and initial values a_t(0) = m_t and R_t(0) = C_t.
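A sketch of these backward recursions, run after a forward filter on data simulated from the DLM \{F_t, 1, V, W\} with illustrative values:

# Exercise 104.12(5): retrospective (smoothing) recursions on simulated data.
import numpy as np

rng = np.random.default_rng(2)
T, V, W = 20, 1.0, 0.1
F = rng.normal(1, 0.5, T)
theta = np.cumsum(rng.normal(0, np.sqrt(W), T))   # true random-walk states
Y = F * theta + rng.normal(0, np.sqrt(V), T)

# Forward filter, storing posteriors (m_t, C_t) and priors (a_t, R_t)
m = np.zeros(T + 1); C = np.zeros(T + 1); a = np.zeros(T + 1); R = np.zeros(T + 1)
m[0], C[0] = 0.0, 5.0
for t in range(1, T + 1):
    a[t], R[t] = m[t - 1], C[t - 1] + W
    Q = F[t - 1]**2 * R[t] + V
    A = R[t] * F[t - 1] / Q
    m[t] = a[t] + A * (Y[t - 1] - F[t - 1] * a[t])
    C[t] = (1 - A * F[t - 1]) * R[t]

# Backward recursion: a_T(-k), R_T(-k) for k = 1..T-1, with B_s = C_s / R_{s+1}
ak, Rk = m[T], C[T]                               # k = 0 initial values
for k in range(1, T):
    s = T - k                                     # time index t - k
    B = C[s] / R[s + 1]
    ak = m[s] + B * (ak - a[s + 1])
    Rk = C[s] - B**2 * (R[s + 1] - Rk)
    print(f"theta_{s} | D_T ~ N[{ak:.3f}, {Rk:.4f}]  (true {theta[s - 1]:.3f})")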

Exercise 104.13 In the setting of the last two exercises, employing the discount regression model \{F_t , 1, V, W_t\}, where W_t = C_{t-1} (\delta^{-1} - 1), show that for k \ge 0, B_{t-1} = \delta, a_t (-k) = a_{t-1} (-k + 1) + \delta^{k} A_t e_t and R_t (-k) = R_{t-1} (-k + 1) - \delta^{2k} A_t^{2} Q_t. This provides neat and simple updating of the retrospective means and variances.

Exercise 104.14 Suppose the yield Y_t of the tth batch of a manufacturing plant is truly represented by

\begin{aligned} Y_t &= 70 - (X_t - 3)^{2} + \eta_t, & \eta_t &\sim \mathcal{N}[0, V], \\ X_t &\sim \{F_t , 1, V, W\}, & (\theta_0 \mid \mathcal{D}_0) &\sim \mathcal{N}[1, V]. \end{aligned}

Initially, the setting F_1 = 3 is optimal in the sense of maximising the expected yield.

  1. If F_t is kept constant at 3, or if from any other specified time it is kept constant at its then perceived optimal value, what are the consequences?

  2. Plant managers have production targets to meet and dislike changing operating conditions, fearing a drop in yield. If you were the production director would you approve this attitude or would you introduce a policy encouraging plant managers to make regular small experimental variations about the then perceived optimal value of F_t?

Exercise 104.15 The following data set refers to an internationally famous canned product. The objective is to establish a relationship between market share and price to make short-term pricing decisions. The observation series Y_t is the percentage market share for quarter t minus 42\%, and F_t is a linear function of the real price.

Qtr. t    1      2      3      4      5      6      7      8      9     10     11     12
Y_t     0.45   0.83   1.45   0.88  -1.43  -1.50  -2.33  -0.78   0.58   1.10      ?      ?
F_t    -0.50  -1.30  -1.50  -0.84  -0.65  -1.19   2.12   0.46  -0.63  -1.22  -2.00   2.00

Adopt the simple discount regression DLM \{F_t , 1, V, W_t \} with \delta = 0.975.

  1. Carry out sequential forecasting with known variance V = 0.2 and \theta_0 \sim \mathcal{N}[0, 1]. Either by hand or by computer, prepare a calculation table that produces R_t , A_t , f_t , Q_t , e_t , m_t and C_t for each t. What are your final inferences about the price elasticity \theta_{10}? What is your forecast of the market share for the next two quarters?

  2. Repeat the analysis and inference when V is an unknown constant variance starting with n_0 = 1 and S_0 = 0.2.

Solution. Here are elementary solutions in Python and R.

Updating equations

The updating equations are:

Known V:

\begin{aligned} a_t&=m_{t-1} & \text{(prior mean)}\\ R_t&=\tfrac{C_{t-1}}{\delta} & \text{(prior var, discount)}\\ f_t&=F_t a_t,\quad e_t=Y_t-f_t & \text{(forecast \& error)}\\ Q_t&=F_t^2R_t+V & \text{(forecast var)}\\ A_t&=\tfrac{R_tF_t}{Q_t} & \text{(gain)}\\ m_t&=a_t+A_te_t & \text{(posterior mean)}\\ C_t&=(1-A_tF_t)R_t & \text{(posterior var)} \end{aligned}

Unknown V (NIG, “starred” form):

\begin{aligned} a_t&=m_{t-1},\quad R_t^*=\tfrac{C_{t-1}^*}{\delta} \\ f_t&=F_t a_t,\quad e_t=Y_t-f_t \\ q_t^*&=F_t^2R_t^*+1,\quad A_t=\tfrac{R_t^*F_t}{q_t^*} \\ m_t&=a_t+A_te_t,\quad C_t^*=(1-A_tF_t)R_t^* \\ n_t&=n_{t-1}+1,\quad S_t=\frac{n_{t-1}S_{t-1}+e_t^2/q_t^*}{n_t} \end{aligned}

h-step-ahead forecasts:

\begin{aligned} &\text{mean } F_{t+h}\,m_t, \quad \text{var } F_{t+h}^2\,(C_t/\delta^h)+V && \text{(known } V\text{)} \\ &\text{mean } F_{t+h}\,m_t, \quad \text{var } S_t\,q_{t+h}^* \text{ with } q_{t+h}^*=F_{t+h}^2\,(C_t^*/\delta^h)+1 && \text{(unknown } V\text{)} \end{aligned}

# Compute tables for Exercise ch3-ex15 parts (a) and (b) and save as CSVs
import pandas as pd
import math

delta = 0.975
V_known = 0.2
m0 = 0.0
C0 = 1.0

Y = [0.45, 0.83, 1.45, 0.88, -1.43, -1.50, -2.33, -0.78, 0.58, 1.10]  # t=1..10
F = [-0.50, -1.30, -1.50, -0.84, -0.65, -1.19, 2.12, 0.46, -0.63, -1.22, -2.00, 2.00]  # t=1..12

def run_known_variance(Y, F, m0, C0, delta, V):
    rows = []
    m_prev = m0
    C_prev = C0
    for t in range(1, 11):  # t=1..10
        Ft = F[t-1]
        yt = Y[t-1]
        a_t = m_prev
        R_t = C_prev / delta
        f_t = Ft * a_t
        Q_t = Ft*Ft * R_t + V
        e_t = yt - f_t
        A_t = R_t * Ft / Q_t
        m_t = a_t + A_t * e_t
        C_t = (1 - A_t * Ft) * R_t
        rows.append({
            "t": t, "Ft": Ft, "Yt": yt, "R_t": R_t, "A_t": A_t, "f_t": f_t, "Q_t": Q_t, "e_t": e_t, "m_t": m_t, "C_t": C_t
        })
        m_prev, C_prev = m_t, C_t
    # forecasts for t=11 and t=12 without new data
    forecasts = []
    C_t = C_prev
    m_t = m_prev
    for h in [1,2]:
        Ft = F[10 + h - 1]  # F_11, F_12
        R_h = C_t / (delta**h)
        f_h = Ft * m_t
        Q_h = Ft*Ft * R_h + V
        forecasts.append({"h": h, "Ft": Ft, "f": f_h, "Q": Q_h, "sd": math.sqrt(Q_h),
                          "Share_mean_%": 42.0 + f_h})
    return pd.DataFrame(rows), {"m10": m_prev, "C10": C_prev}, pd.DataFrame(forecasts)

def run_unknown_variance(Y, F, m0, C0_star, delta, n0, S0):
    rows = []
    m_prev = m0
    C_prev = C0_star
    n = n0
    S = S0
    for t in range(1, 11):
        Ft = F[t-1]; yt = Y[t-1]
        a_t = m_prev
        R_t = C_prev / delta  # starred
        f_t = Ft * a_t
        q_t = Ft*Ft * R_t + 1.0  # starred
        e_t = yt - f_t
        A_t = R_t * Ft / q_t
        m_t = a_t + A_t * e_t
        C_t = (1 - A_t * Ft) * R_t
        n_new = n + 1
        S_new = (n * S + (e_t**2) / q_t) / n_new
        rows.append({
            "t": t, "Ft": Ft, "Yt": yt, "R_t*": R_t, "A_t": A_t, "f_t": f_t, "q_t*": q_t, "e_t": e_t, 
            "m_t": m_t, "C_t*": C_t, "n_t": n_new, "S_t": S_new
        })
        m_prev, C_prev, n, S = m_t, C_t, n_new, S_new
    forecasts = []
    for h in [1,2]:
        Ft = F[10 + h - 1]
        R_h = C_prev / (delta**h)
        q_h = Ft*Ft * R_h + 1.0
        var = S * q_h
        mean = Ft * m_prev
        forecasts.append({"h": h, "Ft": Ft, "f": mean, "q*": q_h, "S": S, "Q": var, "sd": math.sqrt(var),
                          "Share_mean_%": 42.0 + mean})
    return pd.DataFrame(rows), {"m10": m_prev, "C10*": C_prev, "n10": n, "S10": S}, pd.DataFrame(forecasts)

# Run both parts
df_known, state10_known, fc_known = run_known_variance(Y, F, m0, C0, delta, V_known)
df_u, state10_u, fc_u = run_unknown_variance(Y, F, m0, 1.0, delta, 1, 0.2)

# Save CSVs
df_known.to_csv('exr-ch3-ex15_partA_knownV_filter.csv', index=False)
fc_known.to_csv('exr-ch3-ex15_partA_knownV_forecasts.csv', index=False)
df_u.to_csv('exr-ch3-ex15_partB_unknownV_filter.csv', index=False)
fc_u.to_csv('exr-ch3-ex15_partB_unknownV_forecasts.csv', index=False)

# Show compact summaries
state10_known, state10_u, fc_known.round(4), fc_u.round(4)
Output:

known V:   m10 = -0.6355690433103104, C10 = 0.01650248358749487
unknown V: m10 = -0.6045867145935548, C10* = 0.07849015335007685, n10 = 11, S10 = 0.9760333998940381

Known-V forecasts:
 h   Ft       f       Q      sd  Share_mean_%
 1 -2.0  1.2711  0.2677  0.5174       43.2711
 2  2.0 -1.2711  0.2694  0.5191       40.7289

Unknown-V forecasts:
 h   Ft       f      q*      S       Q      sd  Share_mean_%
 1 -2.0  1.2092  1.3220  0.976  1.2903  1.1359       43.2092
 2  2.0 -1.2092  1.3303  0.976  1.2984  1.1395       40.7908
## exr-ch3-ex15 — discount regression DLM in base R
## Part (a): known V; Part (b): unknown V via NIG (starred) recursions.
## References: Prado, Ferreira & West (2023), Ch.3 discount regression.

Y <- c(0.45, 0.83, 1.45, 0.88, -1.43, -1.50, -2.33, -0.78, 0.58, 1.10) # t=1..10
Fv <- c(-0.50, -1.30, -1.50, -0.84, -0.65, -1.19, 2.12, 0.46, -0.63, -1.22, -2.00, 2.00) # t=1..12
delta <- 0.975

kstep_R <- function(C, delta, h) C / (delta^h)

## ---------- Part (a): KNOWN V ----------
dlm_knownV <- function(Y, Fv, delta, V, m0=0, C0=1) {
  stopifnot(length(Y) == 10L, length(Fv) >= 12L)
  n <- length(Y)
  rows <- vector("list", n)
  m <- m0; C <- C0
  for (t in seq_len(n)) {
    Ft <- Fv[t]; yt <- Y[t]
    a  <- m
    R  <- C / delta
    f  <- Ft * a
    Q  <- Ft*Ft * R + V
    e  <- yt - f
    A  <- R * Ft / Q
    m  <- a + A * e
    C  <- (1 - A * Ft) * R
    rows[[t]] <- data.frame(t=t, Ft=Ft, Yt=yt, R_t=R, A_t=A, f_t=f, Q_t=Q, e_t=e, m_t=m, C_t=C)
  }
  filt <- do.call(rbind, rows)

  ## 1- and 2-step forecasts (t=11,12) using last (m,C)
  fcast <- lapply(1:2, function(h) {
    Ft <- Fv[10 + h]
    R_h <- kstep_R(C, delta, h)
    f_h <- Ft * m
    Q_h <- Ft*Ft * R_h + V
    data.frame(h=h, Ft=Ft, f=f_h, Q=Q_h, sd=sqrt(Q_h),
               Share_mean_pct = 42 + f_h,
               Share_L95 = 42 + (f_h - qnorm(0.975)*sqrt(Q_h)),
               Share_U95 = 42 + (f_h + qnorm(0.975)*sqrt(Q_h)))
  })
  fcast <- do.call(rbind, fcast)

  list(filter=filt, state=list(m10=m, C10=C), forecast=fcast)
}

## ---------- Part (b): UNKNOWN V (NIG; starred) ----------
dlm_unknownV <- function(Y, Fv, delta, n0=1, S0=0.2, m0=0, C0_star=1) {
  stopifnot(length(Y) == 10L, length(Fv) >= 12L)
  n <- length(Y)
  rows <- vector("list", n)
  m <- m0; C <- C0_star; ndf <- n0; S <- S0
  for (t in seq_len(n)) {
    Ft <- Fv[t]; yt <- Y[t]
    a   <- m
    R   <- C / delta                   # starred
    f   <- Ft * a
    qst <- Ft*Ft * R + 1               # q_t^*
    e   <- yt - f
    A   <- R * Ft / qst
    m   <- a + A * e
    C   <- (1 - A * Ft) * R
    ndf_new <- ndf + 1
    S_new   <- (ndf*S + (e^2)/qst) / ndf_new
    rows[[t]] <- data.frame(t=t, Ft=Ft, Yt=yt, R_t_star=R, A_t=A, f_t=f,
                            q_t_star=qst, e_t=e, m_t=m, C_t_star=C,
                            n_t=ndf_new, S_t=S_new)
    ndf <- ndf_new; S <- S_new
  }
  filt <- do.call(rbind, rows)

  ## Forecasts (t=11,12): Student-t with df = n_t, mean = Ft*m, var = S * q*_h
  tcrit <- function(df) qt(0.975, df=df)
  fcast <- lapply(1:2, function(h) {
    Ft <- Fv[10 + h]
    R_h <- kstep_R(C, delta, h)
    qh  <- Ft*Ft * R_h + 1
    mean <- Ft * m
    var  <- S * qh
    sd   <- sqrt(var)
    df   <- ndf
    mL   <- mean - tcrit(df)*sd
    mU   <- mean + tcrit(df)*sd
    data.frame(h=h, Ft=Ft, f=mean, q_star=qh, S=S, Q=var, sd=sd, df=df,
               Share_mean_pct = 42 + mean,
               Share_L95 = 42 + mL,
               Share_U95 = 42 + mU)
  })
  fcast <- do.call(rbind, fcast)

  list(filter=filt, state=list(m10=m, C10_star=C, n10=ndf, S10=S),
       forecast=fcast)
}

## ---------- Run both parts ----------
res_a <- dlm_knownV(Y, Fv, delta, V=0.2, m0=0, C0=1)
res_b <- dlm_unknownV(Y, Fv, delta, n0=1, S0=0.2, m0=0, C0_star=1)

## Save tables if desired:
# write.csv(res_a$filter, "exr-ch3-ex15_partA_knownV_filter.csv", row.names=FALSE)
# write.csv(res_a$forecast, "exr-ch3-ex15_partA_knownV_forecasts.csv", row.names=FALSE)
# write.csv(res_b$filter, "exr-ch3-ex15_partB_unknownV_filter.csv", row.names=FALSE)
# write.csv(res_b$forecast, "exr-ch3-ex15_partB_unknownV_forecasts.csv", row.names=FALSE)

## ---------- Print key answers ----------
cat("Part (a): known V=0.2\n")
Part (a): known V=0.2
cat(sprintf("theta_10 mean=%.4f, Var=%.5f, SE=%.4f\n",
            res_a$state$m10, res_a$state$C10, sqrt(res_a$state$C10)))
theta_10 mean=-0.6356, Var=0.01650, SE=0.1285
ci_a <- res_a$state$m10 + c(-1,1)*qnorm(0.975)*sqrt(res_a$state$C10)
cat(sprintf("95%% CI for theta_10: [%.4f, %.4f]\n\n", ci_a[1], ci_a[2]))
95% CI for theta_10: [-0.8873, -0.3838]
print(res_a$forecast, row.names=FALSE)
 h Ft         f         Q        sd Share_mean_pct Share_L95 Share_U95
 1 -2  1.271138 0.2677025 0.5173997       43.27114  42.25705  44.28522
 2  2 -1.271138 0.2694385 0.5190746       40.72886  39.71149  41.74623
cat("\nPart (b): unknown V (n0=1, S0=0.2)\n")

Part (b): unknown V (n0=1, S0=0.2)
cat(sprintf("theta_10 mean=%.4f, Var|V = S10*C10* = %.5f, SE=%.4f, n10=%d\n",
            res_b$state$m10, res_b$state$S10*res_b$state$C10_star,
            sqrt(res_b$state$S10*res_b$state$C10_star), res_b$state$n10))
theta_10 mean=-0.6046, Var|V = S10*C10* = 0.07661, SE=0.2768, n10=11
tcrit10 <- qt(0.975, df=res_b$state$n10)
ci_b <- res_b$state$m10 + c(-1,1)*tcrit10*sqrt(res_b$state$S10*res_b$state$C10_star)
cat(sprintf("95%% t CI for theta_10: [%.4f, %.4f]\n\n", ci_b[1], ci_b[2]))
95% t CI for theta_10: [-1.2138, 0.0046]
print(res_b$forecast, row.names=FALSE)
 h Ft         f   q_star         S        Q       sd df Share_mean_pct
 1 -2  1.209173 1.322011 0.9760334 1.290327 1.135926 11       43.20917
 2  2 -1.209173 1.330268 0.9760334 1.298386 1.139467 11       40.79083
 Share_L95 Share_U95
  40.70902  45.70933
  38.28288  43.29878
## ------- Expected numeric (sanity, should match ±1e-3):
## Part (a): theta_10 ≈ -0.6356, Var ≈ 0.01650
## Q11 mean share ≈ 43.271, Q12 ≈ 40.729
## Part (b): theta_10 ≈ -0.6046, Var|V ≈ 0.0766, n10=11
## Q11 mean share ≈ 43.209, Q12 ≈ 40.791

Here’s the full sequential fit and the forecasts.

104.1.1 (a) Known V=0.2, \delta=0.975, \theta_0\sim\mathcal N(0,1)

  • Final posterior for the price elasticity at t=10: \hat\theta_{10}= -0.6356,\quad \mathrm{Var}(\theta_{10})=0.01650 \Rightarrow \text{s.e.} =0.1285. 95% CI: [-0.8873,\,-0.3838].

  • One- and two-step forecasts (market share = 42\%+Y):

    • Q11: \mathbb E[\text{share}]=43.27\%, 95% PI [42.26,\,44.29]\%.
    • Q12: \mathbb E[\text{share}]=40.73\%, 95% PI [39.71,\,41.75]\%.

Calculation table (per t=1{:}10) with R_t, A_t, f_t, Q_t, e_t, m_t, C_t: saved as exr-ch3-ex15_partA_knownV_filter.csv and exr-ch3-ex15_partA_knownV_forecasts.csv.


104.1.2 (b) Unknown V with n_0=1,\,S_0=0.2 (standard NIG DLM; starred form, \delta=0.975)

  • Final posterior for price elasticity at t=10: \hat\theta_{10}= -0.6046,\quad \mathrm{Var}(\theta_{10}\mid V)=S_{10}C_{10}^*=0.9760\times0.07849=0.0766 \Rightarrow s.e. =0.2768, n_{10}=11. 95% t_{11} CI: [-1.2138,\,0.0046].

  • Forecasts (wider, since V unknown):

    • Q11: \mathbb E[\text{share}]=43.21\%, 95% PI [40.71,\,45.71]\%.
    • Q12: \mathbb E[\text{share}]=40.79\%, 95% PI [38.28,\,43.30]\%.

Calculation table (per t=1{:}10) with R_t^*, A_t, f_t, q_t^*, e_t, m_t, C_t^*, n_t, S_t: saved as exr-ch3-ex15_partB_unknownV_filter.csv and exr-ch3-ex15_partB_unknownV_forecasts.csv.


Notes: Discount-regression recursions as in Prado, Ferreira & West (2023), ch. 3. For (b), starred forms use q_t^*=F_t^2 R_t^*+1, A_t=R_t^*F_t/q_t^*, C_t^*=(1-A_tF_t)R_t^*, and S_t=(n_{t-1}S_{t-1}+e_t^2/q_t^*)/n_t; predictive variances are S_t q_{t+h}^*.

Note

Overthinking Elasticity

So this is not a typical price elasticity calculation. The usual definitions are due to Alfred Marshall's text Principles of Economics, circa 1890.

E_d = \frac {\Delta Q/Q}{\Delta P/P} \qquad \text{(elasticity)} \tag{104.1}

E_d = \frac{(P_{1}+P_{2})/2}{(Q_{d_{1}}+Q_{d_{2}})/2} \times \frac{\Delta Q_{d}}{\Delta P} = \frac{P_{1}+P_{2}}{Q_{d_{1}}+Q_{d_{2}}} \times \frac{\Delta Q_{d}}{\Delta P} \qquad \text{(arc elasticity)} \tag{104.2}

where Q is quantity sold, P is price, and E_d is the elasticity.

E_d= \frac{\partial Q}{\partial P} \cdot \frac{P}{Q} \qquad \text{(point elasticity)} \tag{104.3}

For the point elasticity we need to know the demand function and be able to differentiate it.
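For instance, for an illustrative linear demand curve Q = a - bP,

E_d = \frac{\partial Q}{\partial P} \cdot \frac{P}{Q} = \frac{-bP}{a - bP},

so the elasticity varies along the curve and equals -1 at the midpoint price P = a/(2b).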

Another subtlety is due to the distinction between Marshallian and Hicksian demand. The former includes income effects: the ratio of the budget (a function of income) to the price can be substantial, possibly dominating the price effect (e.g., for Giffen goods).

In practice we usually work with the percentage of units sold, i.e., market share.

To estimate elasticity, we need to know the price at which the market share is measured. In this case, the price is not given in the data, but it is implied by the values of F_t. The elasticity can be interpreted as the percentage change in market share for a 1% change in price.