8 Feb, 2016
Sequential MCMC (2016)
Algorithm
- At time $t+1$ obtain $D_{t+1}$
- Update sufficient stats
$C_{j,t+1} = g(C_{j,t},D_{t+1})$ for $j=1,…,m$,
where each $C_{j,t}$ is a sufficient statistic and $m$ is the number of sufficient stats
- Draw $n_{t+1}$ MCMC samples from the conditional distributions.
- Set $t$ to $t+1$ and repeat the three steps above
Since the conditional distributions at time $t+1$ depend on the data only through a
few sufficient statistics, once we update the sufficient stats at time $t+1$ we can
draw samples from the full conditionals.
Let $(\theta_1^{(t)},…,\theta_k^{(t)})$ be the $n_t$-th MCMC sample at time $t$.
At time $t+1$, this sample initializes the chain, so it is the first MCMC sample.
Model: $y_t = X_t \beta + \epsilon_t$, $\epsilon_t \sim N(0,\sigma^2 I)$
$\beta \sim N(0,I)$, $\sigma^2 \sim InvGamma(a,b)$
$\beta | \sigma^2, D_1,…,D_t \sim N\left(S\sum_{l=1}^t X_l'y_l/\sigma^2,\; S\right)$, where $S=\left(\frac{\sum_{l=1}^t X_l'X_l}{\sigma^2}+I\right)^{-1}$
$\sigma^2 | \beta, D_1,…,D_t \sim InvGamma\left(a+st/2,\; b+\sum_{l=1}^t (y_l - X_l\beta)'(y_l - X_l\beta)/2\right)$, where $s$ is the number of observations in each batch $D_l$
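For example, $C_{1,t} = \sum_{l=1}^t X_l'X_l$, so when $D_{t+1}$ arrives, $g(C_{1,t},D_{t+1}) = C_{1,t}+X_{t+1}'X_{t+1}$. The other statistics, $\sum_{l=1}^t X_l'y_l$ and $\sum_{l=1}^t y_l'y_l$, are updated the same way.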
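As a rough sketch of how the steps above could look for this regression model, the Python below keeps three running statistics ($\sum X_l'X_l$, $\sum X_l'y_l$, $\sum y_l'y_l$) plus the observation count, and runs a few Gibbs sweeps per batch, warm-started from the last draw at the previous time. It uses the expansion $\sum_l (y_l - X_l\beta)'(y_l - X_l\beta) = \sum_l y_l'y_l - 2\beta'\sum_l X_l'y_l + \beta'\sum_l X_l'X_l\beta$, so the raw data never has to be stored. The function names, the hyperparameters `a` and `b`, the batch size, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def update_stats(stats, X, y):
    """Add one batch (X, y) to the running sufficient statistics."""
    XtX, Xty, yty, n = stats
    return XtX + X.T @ X, Xty + X.T @ y, yty + y @ y, n + len(y)

def gibbs(stats, beta, sigma2, n_samples, rng, a=2.0, b=2.0):
    """Run n_samples Gibbs sweeps using only the sufficient statistics."""
    XtX, Xty, yty, n = stats
    p = len(beta)
    draws = np.empty((n_samples, p + 1))
    for i in range(n_samples):
        # beta | sigma2, data ~ N(S Xty / sigma2, S), S = (XtX / sigma2 + I)^{-1}
        S = np.linalg.inv(XtX / sigma2 + np.eye(p))
        L = np.linalg.cholesky(S)
        beta = S @ Xty / sigma2 + L @ rng.standard_normal(p)
        # Residual sum of squares recovered from the statistics alone
        rss = yty - 2 * beta @ Xty + beta @ XtX @ beta
        # sigma2 | beta, data ~ InvGamma(a + n/2, b + rss/2);
        # NumPy's gamma takes (shape, scale), so scale = 1 / rate
        sigma2 = 1.0 / rng.gamma(a + n / 2, 1.0 / (b + rss / 2))
        draws[i] = np.append(beta, sigma2)
    return draws

rng = np.random.default_rng(0)
p, s = 2, 50                                   # assumed: 2 coefficients, 50 rows per batch
true_beta, true_sigma = np.array([1.0, -0.5]), 0.7
stats = (np.zeros((p, p)), np.zeros(p), 0.0, 0)
beta, sigma2 = np.zeros(p), 1.0
for t in range(200):
    X = rng.normal(size=(s, p))                # simulated batch D_{t+1}
    y = X @ true_beta + true_sigma * rng.normal(size=s)
    stats = update_stats(stats, X, y)
    draws = gibbs(stats, beta, sigma2, n_samples=20, rng=rng)
    beta, sigma2 = draws[-1, :p], draws[-1, p]  # warm start for the next time step
posterior_mean = draws.mean(axis=0)             # (beta_1, beta_2, sigma^2) at the final time
```

The cost per time step depends on the number of coefficients and the number of sweeps, not on how much data has arrived so far, which is the main appeal of working through sufficient statistics.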
Let $\pi_t$ be the full posterior distribution at time $t$, and let $T_t$ be the
transition kernel (proposal density).
- If the posterior $\pi_t$ changes slowly over time as $t\rightarrow\infty$
- and the transition kernel is “ergodic” (a condition required for a Markov chain to converge),
then the sequential MCMC samples correspond to samples from $\pi_t$ at large $t$.
Sequential MC for Fixed Parameter Settings (2002)
- An important idea that tech companies are starting to adopt, at least those
using Bayesian statistics at all
- Based on importance sampling
- refer to notes
- Suppose the aim is to compute $E[h(\theta)]$. This means evaluating the integral
$\frac{\int~h(\theta)\pi(\theta)~d\theta}{\int~\pi(\theta)~d\theta}$
- This is useful if it's hard to sample from $\pi$ but easy to sample from $g$
- $\frac{\int~h(\theta)\pi(\theta)~d\theta}{\int~\pi(\theta)~d\theta} = \frac{\int~h(\theta)\frac{\pi(\theta)}{g(\theta)}g(\theta)~d\theta}{\int~\frac{\pi(\theta)}{g(\theta)}g(\theta)~d\theta}$
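The ratio above can be estimated by drawing $\theta_i \sim g$ and taking a weighted average of $h(\theta_i)$ with weights $w_i = \pi(\theta_i)/g(\theta_i)$. A minimal sketch follows, assuming an unnormalized $N(1,1)$ target, an $N(0,4)$ proposal, and $h(\theta)=\theta^2$; these choices and the helper name `snis` are illustrative only. Normalizing constants of both $\pi$ and $g$ cancel in the ratio, so they are dropped.

```python
import numpy as np

def snis(h, log_pi, sample_g, log_g, n, rng):
    """Self-normalized importance sampling estimate of E_pi[h(theta)]."""
    theta = sample_g(n, rng)
    log_w = log_pi(theta) - log_g(theta)
    w = np.exp(log_w - log_w.max())        # subtract the max for numerical stability
    return np.sum(w * h(theta)) / np.sum(w)

rng = np.random.default_rng(1)
log_pi = lambda th: -0.5 * (th - 1.0) ** 2             # unnormalized N(1, 1)
log_g = lambda th: -0.5 * (th / 2.0) ** 2              # N(0, 4) up to a constant
sample_g = lambda n, rng: 2.0 * rng.standard_normal(n)

estimate = snis(lambda th: th ** 2, log_pi, sample_g, log_g, 200_000, rng)
# Under N(1, 1) the exact value is E[theta^2] = 1 + 1 = 2
```

The proposal is deliberately wider than the target, which keeps the weights bounded.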
Algorithm:
- Draw samples from $\pi_0$, the prior for $\theta$
- Observe $D_1$
- Draw samples from $\pi_1$ using importance sampling with $g=\pi_0$
- Observe $D_2$
- Use importance sampling with $g=\pi_1$ to draw samples from $\pi_2$.
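With $g=\pi_t$, the importance weight $\pi_{t+1}/\pi_t$ is proportional to the likelihood of the new batch, so the weights can be updated multiplicatively as data arrive. The sketch below illustrates only this reweighting step (no resampling or move step) on an assumed toy model, $y \sim N(\theta,1)$ with prior $\theta \sim N(0,1)$, chosen because the exact posterior mean is available for comparison. The data, batch size, and function name are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_particles = 100_000
theta = rng.standard_normal(n_particles)      # samples from pi_0 = N(0, 1)
log_w = np.zeros(n_particles)                 # equal weights under the prior

def reweight(log_w, theta, batch):
    """Multiply weights by pi_{t+1}/pi_t, i.e. the likelihood of the new batch."""
    # log N(y | theta, 1) summed over the batch, constants dropped
    return log_w - 0.5 * ((batch[:, None] - theta[None, :]) ** 2).sum(axis=0)

all_y = []
for t in range(2):                            # D_1, then D_2
    batch = 0.5 + rng.standard_normal(5)      # simulated data with true mean 0.5
    all_y.append(batch)
    log_w = reweight(log_w, theta, batch)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    is_mean = np.sum(w * theta)               # estimate of E[theta | D_1, ..., D_t]

y = np.concatenate(all_y)
exact_mean = y.sum() / (1 + len(y))           # conjugate posterior mean
ess = 1.0 / np.sum(w ** 2)                    # effective sample size of the weights
```

As more batches arrive the weights concentrate on fewer particles and `ess` drops, which is why practical schemes add resampling and a move step.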
Possible Project
- Compare sequential updating of the MLE with sequential MCMC, whose samples should
correspond to samples from $\pi_t$ at large $t$.