4-5 Sep 2025 Fontainebleau (France)
Benchmarking probabilistic spatial machine learning models with complex sample distributions
Jérémy Rohmer 1@, Julie Billy 1, Vivien Baudouin 1
1 : BRGM
Bureau de Recherches Géologiques et Minières (BRGM)

In recent years, machine learning (ML) models have become increasingly important for spatial interpolation. The cornerstone of these approaches is prediction accuracy, i.e. how well the target variable is predicted at locations, described by their geographical coordinates and/or spatial features, that were not used for training. Beyond spatial prediction, the quantification of predictive uncertainty has emerged as a challenge. Although the connection between ML and uncertainty quantification is an active research field, developments for spatial data still leave open questions. A major concern stems from the specificities of the spatial context, i.e. spatial dependence, uneven sample distribution, and the presence of strong anisotropies and/or non-stationarities. In this study, we benchmark four types of probabilistic ML models adapted to the spatial context: (1) kriging, (2) deep Gaussian process regression, (3) conformal approaches adapted to non-exchangeable data, and (4) adversarial-based generative models. We define repeated random experiments using synthetic random fields as well as real cases. The latter are based on high-resolution L1B radiance data sets measured by the MODIS satellite instrument, which present different degrees of non-stationarity and anisotropy depending on the presence of ice or clouds. We discuss the influence of the characteristics of the sample distribution (sparsity, degree of clustering, spatial dependence, non-stationarity) on the quality of the predictive uncertainty, using multiple performance metrics (coverage of the prediction intervals, informativeness, sharpness and calibration of the full predictive probability distribution) as well as practical considerations (computational burden, stability of the results, implementation effort, degree of required expertise).
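As an illustration of two of the metrics named above, the following minimal sketch computes the empirical coverage of prediction intervals and their mean width (a common proxy for sharpness). The function names and the toy values are assumptions made for this example; they are not taken from the study.

```python
def interval_coverage(y_true, lower, upper):
    """Fraction of held-out observations that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


def mean_interval_width(lower, upper):
    """Average interval width; narrower intervals are sharper."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Toy held-out values and interval bounds (hypothetical, for illustration only).
y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 5.0]

coverage = interval_coverage(y_true, lower, upper)  # 3 of 4 observations covered
width = mean_interval_width(lower, upper)           # (1 + 1 + 2 + 2) / 4
print(coverage, width)
```

For intervals with a nominal level of, say, 90%, a well-calibrated model would show an empirical coverage close to 0.9 while keeping the mean width as small as possible; the two quantities are best read together.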
Our results show the importance of using multiple performance metrics and highlight the strong influence of the validation procedure (random or structured cross-validation), which affects the selection of the best-performing model. Finally, these results are used to guide the selection of ML interpolation models for a highly clustered real case, which aims at mapping bedrock topography in order to model the land-sea continuum of the hard basement beneath sediment cover in the Pays-De-Monts coastal dune system, Atlantic coast, France.
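The difference between random and structured cross-validation can be sketched as follows. This is a minimal example under stated assumptions: the structured variant is taken here to be a simple square-block split, and the names `random_folds`, `block_folds` and `block_size` are hypothetical; the study may use a different structuring scheme.

```python
import random


def random_folds(points, k, seed=0):
    """Random k-fold assignment: neighbouring points often land in different folds."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    folds = [0] * len(points)
    for rank, i in enumerate(order):
        folds[i] = rank % k
    return folds


def block_folds(points, block_size, k):
    """Structured assignment: all points in the same square block share a fold."""
    blocks = [(int(x // block_size), int(y // block_size)) for x, y in points]
    block_to_fold = {b: i % k for i, b in enumerate(sorted(set(blocks)))}
    return [block_to_fold[b] for b in blocks]


# Toy clustered sample locations (hypothetical coordinates).
points = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2), (1.7, 0.4), (0.3, 1.6), (1.2, 1.9)]

rnd = random_folds(points, k=2)
blk = block_folds(points, block_size=1.0, k=2)
print(rnd, blk)
```

With clustered samples, random folds tend to place test points next to training points, which can make the predictive uncertainty look better than it would be when predicting far from the data; block folds keep whole neighbourhoods out of training.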

