Basis function models are especially suited to areal data — they reduce \(O(n^2)\) double integrals over observation polygons to \(O(nK)\) single integrals.
The Heaton et al. (2019) competition
Heaton et al. (2019) compared 12 leading methods on 105,000 satellite surface temperature observations. See my writeup of these methods.
9 of 12 methods fit the \(y = Ax + \varepsilon\) framework:
Category
Methods
Low-rank \(C\)
FRK, Pred. Process, Partitioning, Metakriging
Sparse \(C^{-1}\)
SPDE/INLA, LatticeKrig, MRA, NNGP
Diagonal via Fourier
Periodic Embedding
Two performers: SPDE/INLA (the lowest RMSE) and LatticeKrig — both use SAR-like priors on basis coefficients.
Why SAR priors win:
Sparse \(Q\) via local differential operator
Matérn covariance emerges from \(( \kappa^2 - \Delta)x = \mathcal{W}\)
Encodes smoothness without dense precision
This is exactly the prior used in this demo. We also show how to match a Fourier basis to this for grid-free inference.
The Model
\[y = A x + \varepsilon, \qquad x \sim \mathcal{N}(0,\, \phi\, Q^{-1}), \qquad \varepsilon \sim \mathcal{N}(0,\, \sigma^2_e I)\]
\(y\) — observed SIF values (one per footprint)
\(A\) — matrix of basis function integrals: \(A_{ik} = \frac{1}{|D_i|}\int_{D_i} \phi_k(s)\,ds\). With a raster basis, \(\phi_k\) is the indicator of pixel \(k\), so \(A_{ik}\) is the overlap fraction.
\(x\) — basis coefficients. With a raster basis these coincide with grid-cell field values.
fit <-fit_fastblm(y = y_sif, # n observationsA = A, # n x p basis integral matrixQ = Q_sar, # p x p prior precisionphi = phi_hat, # signal-to-noise: sigma2_b / sigma2_esolver ="cholesky"# "cholesky" | "woodbury" | "pcg")fit$posterior_mean # p-vector: E[x | y]fit$sigma2e # estimated noise variance
Returns a fastblm_fit object used by tune_cv and posterior_se.
fastblm: Function Philosophy
Where possible, fastblm accepts operator functions instead of explicit matrices — avoiding materialisation of \(Q^{-1}\).
# Diagonal Q: pass inverse as a function instead of a matrixfit <-fit_fastblm(...,Q =Diagonal(x = q_diag),Q_inv =function(v) v / q_diag # never forms Q^{-1} explicitly)# PCG: Q only needs to be applied as a matrix-vector productfit <-fit_fastblm(...,solver ="pcg",apply_Q =function(v) Q %*% v)
PCG never stores \(Q^{-1}\) — enables millions of parameters
Woodbury uses Q_inv to avoid \(p \times p\) solves when \(p \gg n\)
fastblm: Tuning \(\phi\)
# To also tune rho, Q_fun can depend on theta:Q_fun_rho <-function(theta) { rho <- theta[1] W <-make_row_normalised_W(grid) # row-normalised adjacency Q <-crossprod(Diagonal(p) - rho * W)list(Q = Q)}# Tune both phi and rho jointlytuned <-tune_cv(y = y_sif,A = A,Q_fun = Q_fun_rho,theta_init =c(rho =0.99), # starting value for rhok =5L,verbose =TRUE)tuned$phi # CV-optimal phituned$theta # CV-optimal rho
tune_cv optimises \(\phi\) analytically on each fold; \(\theta\) via optim().
fastblm: Three Solvers
\(n\) = observations, \(p\) = basis functions (pixels or modes)
Solver
When to use
Cost
"cholesky"
\(p\) small–medium, sparse \(Q\)
\(O(p^{1.5})\)
"woodbury"
\(p \gg n\), diagonal \(Q\)
\(O(n^2 p)\)
"pcg"
\(p\) very large, \(Q\) cheap to apply
\(O(n_{\text{iter}} \cdot p)\)
Rule of thumb for this demo:
Raster SAR (\(p = 28{,}900\) pixels, sparse \(Q\)) → "cholesky"
Woodbury — exact, reuses cached \(n \times n\) factor. Cost independent of \(p\) — SEs over 28k cells cost the same as 100 cells
PCG — approximate via Lanczos/Hutchinson. Builds a low-rank approximation to the posterior covariance via Lanczos iteration, then estimates diagonal stochastically. Accuracy controlled by n_probes.
# Raster basis: phi_k = indicator of pixel k -> overlap fractionsA_raster <-compute_overlap_fractions(soundings, grid)# Fourier basis: phi_k = cos(kx*pi*x) * cos(ky*pi*y) -> exact integralsA_fourier <-fourier_integrate_basis(soundings, freq_grid)# General: phi_k evaluated at quasi-random points inside each polygonA_qmc <-qmc_integrate_basis(soundings, basis, n_points =5000)
spatintegrate: Three Integration Types
Raster overlap
Exact polygon intersection via sf. No approximation.
Best for: SAR/INLA-style raster models.
Fourier / sinusoidal
Closed-form integral of \(\cos(k\pi x)\) over triangulated polygon.
Best for: stationary covariances, large \(K\), irregular prediction geometry.
QMC average
Quasi-random point average of \(\phi_k\) inside polygon.
Best for: non-standard bases, rapid prototyping.
Worked Example: Data
We use projected coordinates for accurate intersection math.
# ensure_projected() reprojects to UTM if neededsoundings_proj <- spatintegrate::ensure_projected( goebel2026::soundings_augmented)# target_grid_square: 170x170 cells at 330m, square domaintarget_proj <- spatintegrate::ensure_projected( goebel2026::target_grid_square)
Worked Example: Observation Matrix
# Each entry A[i,k] = fraction of footprint i covered by pixel kA <- spatintegrate::compute_overlap_fractions( soundings_proj, target_proj, parallel =TRUE)p <-ncol(A) # number of grid cells
Row sums of \(A\) equal 1 — each footprint is fully partitioned across grid cells.
Worked Example: Response
y_sif <- soundings_proj$SIF_757nm
No noise added — raw SIF_757nm observations directly.
Worked Example: SAR Prior
spdep::nb2listw gives a row-normalised weight matrix directly.
# Fix Q, tune only phi# To also tune rho, make Q_fun depend on theta (see fastblm slide)Q_fun <-function(theta) list(Q = Q_sar)tuned <- fastblm::tune_cv(y = y_sif, A = A, Q_fun = Q_fun,theta_init =numeric(0), k =5L,verbose =TRUE)phi_hat <- tuned$phi
Worked Example: Fit SAR Model
fit_sar <- fastblm::fit_fastblm(y = y_sif, A = A, Q = Q_sar,phi = phi_hat, solver ="cholesky")pred_sar <-as.numeric(fit_sar$posterior_mean)fitted_sar <-as.numeric(A %*% pred_sar)
pred_sar is the posterior mean field — one value per grid cell.
Raster: \(p = 28{,}900\) parameters at this resolution — grows quadratically finer
Fourier: \(K \ll p\) modes represent the smooth part of the field
Fourier basis diagonalises all stationary covariances — \(Q\) becomes diagonal
Predictions at arbitrary locations, not tied to a fixed grid
Key insight: the SAR prior and Matérn covariance both penalise high-frequency modes — they are spectrally equivalent. We can match one to the other exactly.
In this demo:\(K = 10{,}000\) Fourier modes instead of \(p = 28{,}900\) pixels.