Soil Organic Carbon (SOC) is a vital component of ecosystem services, influencing carbon sequestration and soil health, particularly in cash crop ecosystems. However, the impact of feature selection and predictor optimization on SOC estimation accuracy remains underexplored. This study integrates two feature selection techniques, Pearson correlation and Mutual Information Maximization (MIM), to identify key predictors while minimizing redundancy. SOC predictions were derived from multispectral Sentinel-2 and Synthetic Aperture Radar (SAR) Sentinel-1 data, fused with environmental variables. The Extreme Gradient Boosting (XGBoost) model was applied and optimized using Bayesian hyperparameter tuning, and its performance was compared with that of Random Forest (RF) and Support Vector Regression (SVR). Model evaluation metrics included the Coefficient of Determination (R²), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Ratio of Performance to Interquartile Range (RPIQ). The results indicated that dual feature selection improved predictor relevance and reduced redundancy, resulting in superior model performance. XGBoost achieved the highest accuracy, with an R² of 0.905, RMSE of 2.453 t C ha−1, MAE of 1.185, and RPIQ of 16.691. Rubber tree plantations exhibited the highest SOC content (14.29 t C ha−1), while paddy fields displayed the lowest (9.83 t C ha−1). Perennial crops exhibited higher uncertainty in SOC estimation than annual crops. By integrating data from Sentinel-2, Sentinel-1, and environmental variables, the proposed framework provides a scalable and accurate solution for carbon accounting and sustainable soil management at a spatial resolution of 30 m. This methodology proves to be a valuable tool for capturing and addressing variability in SOC dynamics across diverse agroecosystems.
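The abstract names four evaluation metrics but does not give their formulas. The following is a minimal sketch of how they are commonly computed, not the authors' implementation. It assumes RPIQ is the interquartile range of the observed values divided by RMSE, and that quartiles use Python's "inclusive" method; the function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE, MAE and RPIQ for paired observed/predicted values.

    Assumes the observed values are not all identical and the predictions
    are not perfect (otherwise R2 or RPIQ would divide by zero).
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    # Residual and total sums of squares
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    r2 = 1.0 - ss_res / ss_tot

    # Assumed RPIQ definition: IQR of the observations divided by RMSE
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse

    return {"r2": r2, "rmse": rmse, "mae": mae, "rpiq": rpiq}


# Hypothetical SOC values in t C ha^-1 (illustration only)
observed = [9.0, 10.0, 12.0, 14.0, 15.0]
predicted = [9.5, 10.0, 11.0, 14.5, 15.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because RPIQ scales the error by the spread of the observations, it can be compared across datasets with different SOC ranges, whereas RMSE and MAE stay in the units of the target variable.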
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.