17 Feb, 2016
Estimated posterior \(\hat{p}(u) = \frac{q^{-i}(u) t_i(u)}{q^{-i}(u) t_i(u)}\)
where w is known.
$ p(u) = N(u | 0,100 I)$. \(D = \brac{y_1,...,y_n}\)
Goal: Find posterior $p(u | y_1,…,y_n$. We want to find an approximating posterior using ADF.
\[\brac{q:q(u) = N(u|m_u,v_uI)}\]So the posterior and prior have the same form.
\[\begin{aligned} q^{-i} &= N(u|m_u^{-i},v_u^{-i}I_d) \\ \\ z_i &= \int~p(y_i|u)q{-i}(u)~du \\ &=(1-w)N(y_i | m^{-i}_u, (v_u^{-i})I) + w N(y_i|0,10I)\\ \\ \hat{p}(u) &= \frac{p(y_i|u)q^{-i}(u)}{\int~p(y_i|u)q^{-i}(u)~du} \\ &= \frac{p(y_i|u)q^{-i}(u)}{z_i} \end{aligned}\]choose $q$ that minimizes KL with $\hat{p}$
$\hat{p}(u)$ approximate this density with a spherical normal distribution $q(u)$. This approximation is wrt the KL divergence. Match moments: solve these two equations for first and second moments: $E_{hat{p}}[u] = E_q[u]$, $E_{hat{p}}[uu’] = E_q[uu’]$. This will be $m_u,v_u I$.
In the example above, we’ll get that \(m_u = m_u^{-i} + v_u^{-i} r_i \frac{y_i-m_i^{-i}}{v_u^{-i}+1}\), where
\[\begin{aligned} r_i &= 1-\frac{w}{z_i} N(y_i \| 0, 10I_d) \\ v_u &= v_u^{-i} - \frac{r_i(v_u^{-i})^2}{v_u^{-i}+1} + r_i(1-r_i)(v_u^{-i})^2\frac{\norm{y_i-m_u^{-i}}^2}{d(v_u^{-i}+1)} \\ \end{aligned}\]Initialize \(m_u,v_u\) at the prior.
Check out Ormerod (2015)
treat \(s_i, m_i, v_i\) as parameters and update them.