22 Jan, 2016

Stochastic Search Variable Selection (Cont'd)

Problems with SSVN

Computational issues (large $p$ is a problem)
Correlation among predictors, say $x_1,x_2,x_3$ are highly correlated, then each of them will be included $1/3$ of the time, instead of including one of the predictors (if they’re supposed to be included).
SSVS not great for more than hundred of predictors
Not appealing philosophically for a Bayesian

Shrinkage Priors

unimportant predictors shrunk to 0; important ones are kept.
Laplace (Bayesian Lasso)
- bad shrinkage prior: $f(x) \sim \exp^{-x}$
- good shrinkage prior: $f(x) \sim a^{-x}$
research started in 2005
Polson & Scott (2010, 2012)

Global Local Representation

global parameter shrinks all coefficients to 0
local parameter avoids over-shrinking
Normal Means Problem

\[\begin{matrix} y_1 & = & \beta_1 + \epsilon_1 \\ \vdots & = & \vdots \\ y_n & = & \beta_n + \epsilon_n \\ \end{matrix}\]

\[\begin{matrix} \beta_j | \psi_j,\tau & \sim & N(0,\psi_j\tau) \\ \psi_j & \overset{iid}{\sim} & g \\ \tau & \sim & f \\ \end{matrix}\]

$g$ should be heavy tailed
$f$ should have sufficient mass around zero

Bayesian Lasso (Park and Casella, 2008 (read); Hans 2009). Also a bad prior.