7 Jan, 2016
Posterior Predictive: $p(y^* | y) = \displaystyle \int p(y^* | \theta,y)~p(\theta | y)~d\theta$
Note that the second factor in the integral is what we sample from (the posterior in this case), and the first factor is what we draw $y^*$ from, given each sampled $\theta$, to obtain our posterior predictive samples. The first factor is also the data-generating mechanism, which simplifies to $p(y^*|\theta)$ if the mechanism doesn't depend on previous draws (which isn't the case in a time series).
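A minimal sketch of this two-step sampling, assuming a Normal likelihood with known $\sigma$ and a conjugate Normal prior so the posterior is available in closed form (all constants below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed data: y_i ~ N(theta, sigma^2), sigma known (illustrative values).
sigma = 1.0
y = rng.normal(loc=2.0, scale=sigma, size=20)

# Conjugate N(mu0, tau0^2) prior on theta gives a Normal posterior.
mu0, tau0 = 0.0, 10.0
post_prec = 1 / tau0**2 + len(y) / sigma**2
post_var = 1 / post_prec
post_mean = post_var * (mu0 / tau0**2 + y.sum() / sigma**2)

# Step 1: sample theta from the posterior p(theta | y).
theta_draws = rng.normal(post_mean, np.sqrt(post_var), size=5000)

# Step 2: for each sampled theta, draw y* from the data-generating mechanism p(y* | theta).
y_star = rng.normal(theta_draws, sigma)

print(y_star.mean(), y_star.var())  # approx. post_mean and post_var + sigma^2
```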
Prior Predictive (a.k.a. Marginal): $p(y^*) = \displaystyle \int p(y^*|\theta)~p(\theta)~d\theta$
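As a concrete example (a standard one, not worked in these notes): with a $\text{Binomial}(n,\theta)$ likelihood and a $\text{Beta}(a,b)$ prior, the prior predictive integral has a closed form,

$\displaystyle p(y^*) = \int_0^1 \binom{n}{y^*}\theta^{y^*}(1-\theta)^{n-y^*}\,\frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}\,d\theta = \binom{n}{y^*}\,\frac{B(a+y^*,\,b+n-y^*)}{B(a,b)},$

i.e. the Beta-Binomial distribution.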
Conjugate Prior: A prior is conjugate if the posterior comes from the same distributional family as the prior. Note that if the likelihood is in the exponential family, there exists a conjugate prior (often from the exponential family). You can also find likelihoods that are not in the exponential family that have conjugate priors (e.g. Uniform likelihood, Pareto prior). You can also get non-conjugate priors with closed form posteriors (e.g. Normal likelihood, Laplace prior, used in LASSO).
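For example, the Uniform/Pareto pair mentioned above can be checked directly (using the scale-$m$, shape-$k$ parameterization of the Pareto, an assumed convention here): if $x_1,\dots,x_n \mid \theta \sim \text{Unif}(0,\theta)$ and $\pi(\theta) \propto \theta^{-(k+1)}$ for $\theta > m$, then

$\displaystyle \pi(\theta\mid x) \propto \theta^{-n}\,\theta^{-(k+1)}\,\mathbb{1}\{\theta > \max(m,\, \max_i x_i)\},$

so $\theta\mid x \sim \text{Pareto}(\max(m,\max_i x_i),\, k+n)$, again a Pareto.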
Mixtures of conjugate priors are conjugate
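To see why (a one-line sketch of the update, which the note leaves implicit): if $\pi(\theta) = \sum_j w_j\,\pi_j(\theta)$ with each $\pi_j$ conjugate for the likelihood, then

$\displaystyle \pi(\theta\mid y) \propto p(y\mid\theta)\sum_j w_j\,\pi_j(\theta) = \sum_j w_j\, m_j(y)\,\pi_j(\theta\mid y), \qquad m_j(y) = \int p(y\mid\theta)\,\pi_j(\theta)\,d\theta,$

so the posterior is again a mixture from the same family, with component-wise conjugate updates and new weights $\tilde w_j \propto w_j\, m_j(y)$.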
Stopping Rule Principle: The stopping rule $\tau$ should have no influence on the final reported evidence about $\theta$ obtained from the data. Here $\tau_p: \mathbb R^p \rightarrow [0,1]$, with $\tau_p(x_1,\dots,x_p)$ the probability of stopping after observing $x_1,\dots,x_p$.
Natural Exponential Family: $p(x|\eta) = h(x)\exp\{\sum_{k=1}^K \eta_k t_k(x) - \psi(\eta)\}$, with natural parameter $\eta = (\eta_1,\dots,\eta_K)$ and sufficient statistic $t(x) = (t_1(x),\dots,t_K(x))$.
If $p(x|\eta)$ is the sampling distribution in the natural exponential family, then the conjugate prior is $\pi(\eta) = A(\nu,\mu)\exp\{\sum_{k=1}^K \mu_k\eta_k - \nu\psi(\eta)\}$, where $\nu,\mu$ are hyperparameters.
The posterior $p(\eta|x_{1:n}) \propto \pi(\eta)\prod_{i=1}^n p(x_i|\eta)$ stays in the same conjugate family, with updated hyperparameters $\tilde\nu = \nu+n$ and $\tilde{\mu}_k = \mu_k + \sum_{i=1}^n t_k(x_i)$.
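As a standard illustration of this update (not worked in the notes): the Poisson($\lambda$) likelihood in natural form is

$\displaystyle p(x\mid\eta) = \frac{1}{x!}\exp\{\eta\, x - e^\eta\}, \qquad \eta = \log\lambda,\quad t(x) = x,\quad \psi(\eta) = e^\eta,$

so the conjugate prior is $\pi(\eta) \propto \exp\{\mu\eta - \nu e^\eta\}$ (equivalently, after the change of variables, a $\text{Gamma}(\mu,\text{rate}=\nu)$ prior on $\lambda$), and the posterior has $\tilde\nu = \nu + n$ and $\tilde\mu = \mu + \sum_{i=1}^n x_i$.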
Exercises:
Derive the marginal distribution $p(x) = \displaystyle\int\prod_{i=1}^n N(x_i;\theta,\sigma^2)\, N(\theta;\mu,\tau^2)\, d\theta$.
Notice that:

$\mathbf x = \mathbf 1 \theta + \boldsymbol\epsilon$, where $\theta = \mu + \delta$, with $\delta \sim N(0,\tau^2)$ and $\boldsymbol\epsilon \sim N(\mathbf 0, \sigma^2 I_n)$ independent.

So,

$\mathbf x = \mathbf 1 (\mu+\delta) + \boldsymbol\epsilon$

$E[\mathbf x] = \mathbf 1 \mu$

$V[\mathbf x] = \tau^2 \mathbf{11}' + \sigma^2 I_n$

and, since $\mathbf x$ is an affine function of jointly Gaussian variables, $p(x) = N_n(\mathbf x;\ \mathbf 1\mu,\ \tau^2\mathbf{11}' + \sigma^2 I_n)$.
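A quick numerical check of this marginal (a sketch with illustrative constants), simulating the hierarchy and comparing the sample moments against $\mathbf 1\mu$ and $\tau^2\mathbf{11}' + \sigma^2 I_n$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, tau, sigma = 3, 1.0, 2.0, 0.5
reps = 200_000

# Simulate the hierarchy: theta ~ N(mu, tau^2), then x | theta ~ N(1*theta, sigma^2 I_n).
theta = rng.normal(mu, tau, size=reps)
x = theta[:, None] + rng.normal(0.0, sigma, size=(reps, n))

# Marginal moments implied by the derivation above.
cov_theory = tau**2 * np.ones((n, n)) + sigma**2 * np.eye(n)

print(x.mean(axis=0))           # should be close to [mu, mu, mu]
print(np.cov(x, rowvar=False))  # should be close to cov_theory
print(cov_theory)
```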
Original r.v. $\phi$ | Transformation $w$ | Back-transformation | Density of $w$, $f_w(w)$ |
---|---|---|---|
$\phi\sim \text{Unif}(a,b)$ | $w = \displaystyle\log\left(\frac{\phi-a}{b-\phi}\right)$ | $\phi=\displaystyle\frac{be^w+a}{1+e^w}$ | $\displaystyle\frac{e^w}{\left(1+e^w\right)^2}$ |
$\phi\sim \text{InvGamma}(a,b)$ | $w=\log(\phi)$ | $\phi=e^w$ | $\displaystyle\frac{b^a}{\Gamma(a)}\, e^{-aw-be^{-w}}$ |
$\phi\sim \text{Gamma}(a,\text{rate}=b)$ | $w=\log(\phi)$ | $\phi=e^w$ | $\displaystyle\frac{b^a}{\Gamma(a)}\, e^{aw-be^{w}}$ |
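A small sanity check of the last row (a sketch, with arbitrarily chosen $a$ and $b$): draw $\phi\sim\text{Gamma}(a,\text{rate}=b)$, transform to $w=\log(\phi)$, and compare a kernel density estimate of $w$ against the table's $f_w(w)$.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

rng = np.random.default_rng(2)
a, b = 3.0, 2.0

# Draw phi ~ Gamma(shape=a, rate=b), then transform to w = log(phi).
phi = rng.gamma(shape=a, scale=1.0 / b, size=500_000)
w = np.log(phi)

# f_w(w) = b^a / Gamma(a) * exp(a*w - b*e^w), evaluated in log space for stability.
def f_w(w):
    return np.exp(a * np.log(b) - gammaln(a) + a * w - b * np.exp(w))

grid = np.array([-1.0, 0.0, 0.5, 1.0])
kde = stats.gaussian_kde(w)
print(f_w(grid))   # density from the table
print(kde(grid))   # estimate from the simulated draws; should be close
```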