A scalable framework for soil property mapping tested across a highly diverse tropical data-scarce region
ZENODO
Rodrigo de Q. Miranda,
Rodolfo L.B. Nóbrega,
Anne Verhoef,
Estevão L.R. da Silva,
Jadson F. da Silva,
José C. de Araújo Filho,
Magna S.B. de Moura,
Alexandre H.C. Barros,
Alzira G.S.S. Souza,
Wanhong Yang,
Hui Shao,
Raghavan Srinivasan,
Feras Ziadat,
Suzana M.G.L. Montenegro,
Maria do S.B. Araújo,
Josiclêda D. Galvíncio
Affiliations
Rodrigo de Q. Miranda
PRODEMA, Universidade Federal de Pernambuco, Recife, Brazil
Rodolfo L.B. Nóbrega
University of Bristol, School of Geographical Sciences, University Road, Bristol BS8 1SS, UK; Cabot Institute for the Environment, University of Bristol, Bristol, UK; Imperial College London, Georgina Mace Centre for the Living Planet, Department of Life Sciences, Silwood Park Campus, Buckhurst Road, Ascot SL5 7PY, UK; Corresponding author at: University of Bristol, School of Geographical Sciences, University Road, Bristol BS8 1SS, UK.
Anne Verhoef
The University of Reading, Department of Geography and Environmental Science, Reading, UK
Estevão L.R. da Silva
PRODEMA, Universidade Federal de Pernambuco, Recife, Brazil; Brazilian Agricultural Research Corporation – Embrapa Soils, Recife, Brazil
Jadson F. da Silva
PRODEMA, Universidade Federal de Pernambuco, Recife, Brazil
José C. de Araújo Filho
Brazilian Agricultural Research Corporation – Embrapa Soils, Recife, Brazil
Magna S.B. de Moura
Brazilian Agricultural Research Corporation – Embrapa Semi-arid Region, Petrolina, Brazil; Brazilian Agricultural Research Corporation – Embrapa Tropical Agroindustry, Fortaleza, Brazil
Alexandre H.C. Barros
Brazilian Agricultural Research Corporation – Embrapa Soils, Recife, Brazil
Alzira G.S.S. Souza
Instituto Federal Baiano, Uruçuca, Bahia 45680-000, Brazil
Wanhong Yang
University of Guelph, Department of Geography, Guelph, Ontario N1G 2W1, Canada
Hui Shao
University of Guelph, Department of Geography, Guelph, Ontario N1G 2W1, Canada
Raghavan Srinivasan
Spatial Sciences Laboratory, Texas A&M University, College Station, USA; Blackland Research and Extension Center, Agrilife Research, Temple, USA
Feras Ziadat
Food and Agriculture Organization of the United Nations (FAO), Rome 00153, Italy
Suzana M.G.L. Montenegro
Departamento de Engenharia Civil, Universidade Federal de Pernambuco, Recife, Brazil
Maria do S.B. Araújo
Departamento de Ciências Geográficas, Universidade Federal de Pernambuco, Recife, Brazil
Josiclêda D. Galvíncio
PRODEMA, Universidade Federal de Pernambuco, Recife, Brazil
Reliable soil property maps are essential for environmental modeling, yet conventional mapping methods remain costly and time-consuming. We developed a machine learning framework that integrates the Soil-Landscape Estimation and Evaluation Program (SLEEP) with gradient boosting to predict soil properties at regional scales and multiple depths. Our approach addresses multicollinearity through a recursive feature selection algorithm. We applied this framework to a tropical region characterized by a ∼700-km longitudinal gradient of contrasting topography, climate, and vegetation (∼98,000 km²; NE Brazil), where scarce soil physicochemical data limit environmental modeling. We used six topographical, ten climate, and two vegetation covariates, along with data from 223 soil profiles (∼1 profile per 440 km²). Training and testing of our framework demonstrated strong spatial performance (r² = 0.79–0.98 and percent bias from −1.39 to 1.14 %). Topographic and climatic factors held greater weight than other variables in predicting soil layers, texture, and sum of bases. Moreover, we combined our soil parameters with multiple pedotransfer functions (PTFs) to derive soil hydraulic properties. Our PTF-derived estimates of hydraulic conductivity were considerably lower than high-resolution global predictions available for our study area due to differences in clay fraction and mineralogy. Therefore, we recommend the use of region-specific PTFs for hydraulic properties based on multi-covariate soil property maps. This cost-effective framework accurately integrates diverse environmental covariates, adapts to varying soil data availability, and scales across spatial resolutions, making it highly transferable to other data-scarce regions.
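For readers unfamiliar with the two performance metrics reported above, the following is a minimal sketch of how r² and percent bias are commonly computed. It is an illustration, not the authors' implementation. The function names are hypothetical, r² is assumed here to be the squared Pearson correlation, and the sign convention for percent bias (positive meaning overestimation) is an assumption, since some studies use the opposite sign.

```python
def percent_bias(observed, predicted):
    """Percent bias, assumed here as 100 * sum(predicted - observed) / sum(observed).

    With this (assumed) sign convention, positive values indicate that the
    predictions overestimate the observations on average.
    """
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("percent bias is undefined when observations sum to zero")
    total_error = sum(p - o for o, p in zip(observed, predicted))
    return 100.0 * total_error / total_observed


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("r squared is undefined for constant series")
    return cov * cov / (var_o * var_p)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the study.
    obs = [10.0, 20.0, 30.0, 40.0]
    pred = [11.0, 19.0, 31.0, 41.0]
    print(percent_bias(obs, pred), r_squared(obs, pred))
```

The two metrics are complementary: r² measures how well predictions track the variation in observations, while percent bias measures systematic over- or underestimation, so a model can score well on one and poorly on the other.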