Note that setting $\alpha$ large encourages each $\nu_n$ to be close to
$\gamma$. The four possible prior configurations have distinct implications.
\begin{enumerate}
  \item A diffuse prior for $\gamma$ with large $\alpha$ conveys that $\gamma$
  should be learned from the data, while the $\nu_n$ should be close to one
  another (and to $\gamma$). This promotes borrowing of strength. If $\alpha$
  is too large, you might as well replace the hierarchy with
  $\nu \sim \InverseGamma(a, b)$, i.e., drop the index $n$ altogether.
  \item An informative (concentrated) prior for $\gamma$ with small $\alpha$
  conveys that the $\nu_n$ may be far from the global variance $\gamma$. If
  $\alpha$ is very small, it may make sense to remove the dependence on
  $\gamma$ and simply model $\nu_n \sim \InverseGamma(a, b)$ independently.
  \item A diffuse prior for $\gamma$ with small $\alpha$ can be problematic:
  mixing issues will arise, since the priors impose too little structure for
  the data to inform anything meaningful.
  \item A concentrated prior for $\gamma$ with large $\alpha$ means you are
  certain of the value of the global variance, and that the individual
  variances are close to it. In the extreme case, you might consider simply
  fixing all the $\nu_n$ at the prior mean of $\gamma$ (a single shared value).
\end{enumerate}
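The concentration effect of $\alpha$ can be checked with a quick simulation. The exact parameterization linking $\nu_n$ to $\gamma$ is not specified here, so this is only a sketch under one plausible assumption: $\nu_n \mid \gamma \sim \InverseGamma(\alpha, \alpha\gamma)$ (shape $\alpha$, scale $\alpha\gamma$), which has mean $\alpha\gamma/(\alpha - 1) \to \gamma$ and variance shrinking to zero as $\alpha$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 2.0  # global variance, fixed here purely for illustration

for alpha in (3.0, 10.0, 100.0):
    # If X ~ Gamma(shape=alpha, rate=alpha*gamma), then 1/X ~ InverseGamma
    # with shape alpha and scale alpha*gamma (the assumed parameterization).
    # NumPy's gamma() takes a scale parameter, i.e. 1/rate.
    x = rng.gamma(shape=alpha, scale=1.0 / (alpha * gamma), size=100_000)
    nu = 1.0 / x
    print(f"alpha = {alpha:6.1f}: mean(nu) = {nu.mean():.3f}, sd(nu) = {nu.std():.3f}")
```

As $\alpha$ increases, the sampled $\nu_n$ concentrate around $\gamma$; with small $\alpha$ they spread widely, matching the qualitative behavior of the configurations above.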
Thus, configurations (1) and (2) above are likely the most commonly used in practice.