Spatial and Spatio-temporal Bayesian Models with R-INLA, by
Marta Blangiardo and Michela Cameletti, is a short textbook that introduces INLA
(integrated nested Laplace approximations) and some preliminaries, and works
through the math and R code of a variety of problems that the R-INLA package
can tackle. Several non-spatial regression models are also covered in enough
detail to get you started on R-INLA in general.

Why I read it:

I’m working on a research problem in which INLA was
recommended as a possible solution. I was sold on the idea, but also warned
that the original paper describing the INLA method is a hard read. (Link) When Dr. Dave
Campbell says something is hard, you can safely translate this as ‘herculean’.
The paper (Rue, et al. 2009) was 74 pages of material too intimidating
for me to examine. So, I looked for other summaries of INLA, and came across
this book, which describes INLA and the use of R-INLA, an R package being
developed by a community of researchers. R-INLA is not available on CRAN at the
time of writing this, but it is available at www.r-inla.org

What is INLA:

INLA, or integrated nested Laplace approximations, is a method
for approximating posterior distributions in high-dimensional cases. It integrates
out the nuisance variables and then approximates the posterior marginals of the
remaining variables of interest, rather than simulating from the joint posterior. Compared to
MCMC, or Markov chain Monte Carlo methods, INLA scales much better with the
number of dimensions, but it is reliant on the quality of that extra layer of
approximation. Also, INLA is deterministic, instead of simulation-based like
most popular Bayesian methods, including ABC (approximate Bayesian
computation). This has the advantage of replicability, but the disadvantage that
you can’t improve the results simply by throwing more computers at it.

Who is this book for:

This book is for people with some statistical and
mathematical fluency already. If you feel ready for Julian J. Faraway’s
“Extending the Linear Model with R”, a popular book for teaching GLMs, then you
are probably ready for this R-INLA book. That is, a background in calculus and
statistical inference is required, and backgrounds in R and in Bayesian
statistics are helpful.

From what I’ve read (Chapters 4 and 5), R-INLA is a beast
that can provide quick solutions to a vast set of popular analyses, and
solutions of any sort for high-dimensional problems. It could be worth your
reading time, especially if the INLA method becomes more popular.

There are 300 pages in this text, but 100 pages are dedicated
to bringing the reader up to speed on using R and some basic analyses, on
Bayesian theory, and on Bayesian computational methods like MCMC,
Metropolis-Hastings, and BUGS.

The rest is a guide to applying INLA to Bayesian regression,
hierarchical models (my focus for this project), and spatial and spatio-temporal
models.

What I learned about using R-INLA:

At the basic level, the inla() function in R-INLA is used a
lot like the glm() function:

1. Specify a formula, y ~ x. (You can also specify custom functions; see below.)

2. Specify a family of distributions for the likelihood.

3. Specify a dataset.
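The three steps above can be sketched as follows. This is an illustrative sketch, not run here; the data frame d and its columns y and x are placeholders, and the package must be installed from www.r-inla.org.

```r
library(INLA)  # installed from www.r-inla.org, not CRAN

formula <- y ~ x                 # 1. the formula
fit <- inla(formula,
            family = "gaussian", # 2. the likelihood family
            data = d)            # 3. the dataset
summary(fit)                     # posterior summaries, much like summary.glm()
```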

However, under the hood, it’s performing a Bayesian
regression (or other estimation, if you wish), with very diffuse
priors by default. For a regression, for example, you can use the control.fixed=list()
setting in inla() and...

- change the priors on the intercept and all the regression
parameters.

- change the prior on the precision (the inverse of the variance
sigma^2) of the response, which defaults to Gamma(1,10^-5) and which you can
replace with another gamma or even a distribution from another family. (This
one is set through control.family rather than control.fixed.)

The book provides a way to specify the standard deviation
sigma directly, but it’s a bit of an awkward workaround.
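A sketch of overriding those defaults, assuming the same placeholder data frame d. The argument names follow the R-INLA documentation; the numeric values are illustrative, not recommendations.

```r
fit <- inla(y ~ x, family = "gaussian", data = d,
            # priors on the intercept and the regression parameters
            control.fixed = list(
              mean.intercept = 0, prec.intercept = 0.001,
              mean = 0, prec = 0.01),
            # prior on the (log) precision of the Gaussian likelihood
            control.family = list(
              hyper = list(prec = list(prior = "loggamma",
                                       param = c(1, 1e-5)))))
```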

To get the fitted values that you would normally get from
$fitted in an lm() or glm() model, you have to specify control.predictor =
list(compute=TRUE) in inla() to produce any fitted values in your model
object. If you wish to predict for new data where the response is missing, you
can use link= in the control.predictor list to specify which link function
applies to the missing responses, and make sure to include the rows in which
the response is missing.
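As a sketch of that prediction workflow, assuming placeholder data frames d (observed) and d.new (rows to predict, with NA responses):

```r
d.new$y <- NA              # inla() treats NA responses as values to predict
d.all <- rbind(d, d.new)

fit <- inla(y ~ x, family = "gaussian", data = d.all,
            control.predictor = list(compute = TRUE,
                                     link = 1))  # use the model's first link

fit$summary.fitted.values  # fitted values for observed and missing rows alike
```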

1. The custom functions that you can specify in the formula that gets fed into inla()
have many options that allow you to specify that you are doing something more
complex than a linear regression. The list of currently available ones is
here: http://www.r-inla.org/models/latent-models

For example, you can set up an autocorrelational model by
adding f(varname, model="rw1", hyper=list(…), prior=…, param=…) where "rw1"
stands for Random Walk of Order 1, hyper stands for the hyper-prior like the
one set for precision of estimates, and prior and param define the prior for
the parameter(s) of the effect of varname. You can also specify multiple such
functions by adding several f() terms: f(var1, …) + f(var2, …) + ….

If you want to treat levels of a particular variable as
random effects, use f(varname, model="iid"), and inla() will treat the
effects of varname as independent and identically distributed values from some
common distribution. This is particularly useful in hierarchical models. These
random effects can be used with different family specifications.
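Putting the two together, a formula might combine an rw1 term and iid random effects. A sketch, where the column names time and group are placeholders:

```r
formula <- y ~ x +
  f(time, model = "rw1") +    # first-order random walk over time
  f(group, model = "iid")     # exchangeable random effects per group level

fit <- inla(formula, family = "gaussian", data = d)
fit$summary.random$group      # posterior summaries of the group effects
```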

2. When specifying a family, the default is Gaussian. If you
want to reduce the effect of outliers, use a family with fatter tails, like
~~Gosset’s~~ Student’s T. Remember, this is maximizing likelihood, not minimizing
squared error, so using a fatter-tailed family means that the likelihood
doesn’t drop off as quickly as deviations get larger. As with other parts of
inla(), you can override the defaults (like df for T, which it will get from
n-p) with a list in control.family.

Other options, listed
at http://www.r-inla.org/models/likelihoods , include Weibull, skew
normal, binomial (for logistic regression), Poisson (for regression on counts),
and Cox (as in Cox Proportional Hazard for survival data).
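Switching family is just a change of the family argument. A sketch, again with placeholder data; the exact family names (e.g. for the Student likelihood) should be checked against the likelihoods page above:

```r
# counts: Poisson regression with a log link
fit.pois <- inla(y ~ x, family = "poisson", data = d)

# binary outcomes: logistic regression
fit.logit <- inla(y ~ x, family = "binomial", data = d)

# heavier tails than Gaussian: the Student T likelihood
fit.t <- inla(y ~ x, family = "T", data = d)
```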

What else I learned:

A Markov random field, or MRF, is a network of variables that
shows explicitly which variables depend on which others. A connection between
nodes represents a dependency, and a lack of a connection between nodes
represents conditional independence. If the variables form a multivariate
normal distribution, you have a GMRF, or Gaussian Markov random field. I can
see how a concept like this would be useful in identifying which variables
could be safely integrated out. The ‘Markov’ part specifies that only the immediate
connections imply dependence.

The approxfun() function. It’s in base R, and it takes two
numeric vectors of equal length, x and y, and makes a function that you can use
to put any new x in to get an interpolated y value. By default, it does nothing
more sophisticated than linear interpolation, but base R’s splinefun() provides
a cubic-spline counterpart. Functions output by
approxfun() can be used just like other continuous functions of x, like
sin() and cos(), which is very handy when integrating with integrate().
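A quick base-R illustration, using points sampled from sin() so the results are easy to sanity-check:

```r
# build a linear interpolator from 100 points on the sine curve
x <- seq(0, pi, length.out = 100)
f <- approxfun(x, sin(x))

f(pi / 2)           # close to 1, the true value of sin(pi/2)
integrate(f, 0, pi) # close to 2, the exact integral of sin on [0, pi]
```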

Original paper:

Rue, H., Martino, S., & Chopin, N. (2009). Approximate
Bayesian inference for latent Gaussian models by using integrated nested
Laplace approximations. *Journal of the Royal Statistical Society: Series B
(Statistical Methodology)*, *71*(2), 319–392.