Cross-Validation for Spatial Downscaling

spatial statistics

cross-validation

methodology

Why spatial blocking isn’t always the right choice for cross-validation, and what observation density has to do with it

Published

March 22, 2026

The problem

Cross-validation on spatial data needs care, because nearby observations are correlated. The standard advice is to use spatially blocked folds, so that spatial autocorrelation between training and held-out points doesn’t produce optimistic error estimates. That advice doesn’t always apply.

Simple random CV

Simple random CV is standard k-fold cross-validation: each sounding is assigned to one of k folds at random, without regard to its location.
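Random fold assignment can be sketched in a few lines of plain Python. This is a minimal illustration, not the SpatialBasis implementation; the function names `random_folds` and `train_test_split_for_fold` are hypothetical and chosen only for this example.

```python
import random

def random_folds(n_obs, k, seed=0):
    """Assign each of n_obs observations to one of k folds at random.

    Returns a list of fold labels (0..k-1); fold sizes differ by at most one.
    """
    labels = [i % k for i in range(n_obs)]   # balanced labels 0, 1, ..., k-1, 0, 1, ...
    random.Random(seed).shuffle(labels)      # location plays no role in the assignment
    return labels

def train_test_split_for_fold(labels, fold):
    """Return (train, test) index lists for one held-out fold."""
    test = [i for i, f in enumerate(labels) if f == fold]
    train = [i for i, f in enumerate(labels) if f != fold]
    return train, test

# Ten soundings, five folds: each fold holds out two soundings.
labels = random_folds(n_obs=10, k=5)
train, test = train_test_split_for_fold(labels, fold=0)
```

Because the labels are shuffled independently of the coordinates, each held-out sounding will usually have training soundings close by.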

Spatially blocked CV

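Spatially blocked CV holds out whole regions at a time instead of individual soundings. With k-means blocking, the sounding coordinates are clustered with k-means and each cluster becomes one fold. People prefer this for spatial data because the held-out soundings are separated from the training soundings, so spatial autocorrelation can’t leak information across folds.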
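As a rough sketch of k-means blocking, the code below runs Lloyd's algorithm on 2-D coordinates and uses the resulting cluster labels as fold labels. It is written in plain Python to stay self-contained; the function name `kmeans_blocks`, the synthetic coordinates, and the choice of four blocks are assumptions made for illustration, and this is not the SpatialBasis code path.

```python
import random

def kmeans_blocks(coords, k, n_iter=50, seed=0):
    """Cluster 2-D coordinates with Lloyd's k-means; return one block label per point."""
    rng = random.Random(seed)
    centers = rng.sample(coords, k)          # initialise centres at k observations
    labels = [0] * len(coords)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2)
            for x, y in coords
        ]
        # Update step: move each centre to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [coords[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[c])   # leave an empty cluster's centre in place
        if new_centers == centers:               # converged
            break
        centers = new_centers
    return labels

# Synthetic sounding locations in the unit square (illustrative only).
rng = random.Random(1)
coords = [(rng.random(), rng.random()) for _ in range(60)]
block_labels = kmeans_blocks(coords, k=4)

# Each block is held out in turn; everything outside the block is training data.
folds = []
for b in range(4):
    test = [i for i, lab in enumerate(block_labels) if lab == b]
    train = [i for i, lab in enumerate(block_labels) if lab != b]
    folds.append((train, test))
```

Unlike random folds, the blocks are generally unequal in size, and k-means can settle in a local optimum, so the blocks depend on the seed.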

Why observation density changes the calculus

The key argument is this: in the downscaling setting, what you’re evaluating is how well the model recovers the latent field given overlapping footprints. Spatial blocking removes the dense overlap that the model actually relies on, which makes the CV estimate overly pessimistic. Simple random CV better reflects the real prediction task.
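One way to see how blocking changes the observation density around held-out points is to measure how far each held-out point sits from its nearest training point under the two schemes. The sketch below does this on synthetic point locations. It is a simplified proxy: it treats soundings as points and ignores footprints, and it uses equal-count strips along x as a stand-in for k-means blocks. All names and settings are hypothetical.

```python
import math
import random

def mean_nearest_train_distance(coords, labels):
    """Average, over all points, of the distance to the nearest point in another fold."""
    total = 0.0
    for i, (xi, yi) in enumerate(coords):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(coords)
            if labels[j] != labels[i]        # training points = all other folds
        )
    return total / len(coords)

rng = random.Random(0)
n, k = 200, 4
coords = [(rng.random(), rng.random()) for _ in range(n)]

# Random folds: balanced labels shuffled without regard to location.
random_labels = [i % k for i in range(n)]
rng.shuffle(random_labels)

# Blocked folds: k contiguous strips along x, each with the same number of points.
order = sorted(range(n), key=lambda i: coords[i][0])
blocked_labels = [0] * n
for rank, i in enumerate(order):
    blocked_labels[i] = rank * k // n

d_random = mean_nearest_train_distance(coords, random_labels)
d_blocked = mean_nearest_train_distance(coords, blocked_labels)
print(f"random:  mean distance to nearest training point = {d_random:.3f}")
print(f"blocked: mean distance to nearest training point = {d_blocked:.3f}")
```

Under blocking, held-out points in the interior of a block have no training neighbours nearby, so the blocked distance comes out larger. That is the loss of local density the argument above refers to.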

In SpatialBasis

[How to run both, how to compare them]

Example

[Code demonstrating both approaches, comparison of results]