AI-Driven Soil Organic Carbon Prediction Using Random Forest: A Data-Driven Study on Uzbekistan’s Agricultural Soils
Abstract
Soil organic carbon (SOC) is a key determinant of soil health and agricultural sustainability. However, large-scale assessment remains challenging because laboratory procedures are costly and labor-intensive. This challenge is particularly relevant to Uzbekistan, where soil degradation and nutrient depletion threaten long-term productivity. To address the need for scalable SOC estimation, this study evaluated whether machine learning, specifically a Random Forest (RF) model, can accurately predict SOC using only low-cost agrochemical data and geospatial information. Unlike many existing studies, which depend on either extensive physicochemical soil profiles or spatially rich environmental covariates, it introduces a hybrid minimal-feature approach that combines basic laboratory measurements with approximated sampling coordinates. The aim was to develop, optimize, and evaluate an RF-based SOC prediction model using a comprehensive dataset of 97,449 soil samples collected between 2022 and 2024. The paper outlines the methodological workflow, including data preprocessing, correlation analysis, baseline modeling, and hyperparameter optimization using GridSearchCV.
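The workflow named above (baseline RF, then a GridSearchCV search, then test-set evaluation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual code: the data are synthetic, the five feature columns (pH, phosphorus, potassium, latitude, longitude) are hypothetical stand-ins for "agrochemical data and geospatial information", and the parameter grid is an arbitrary small example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
n = 400

# Hypothetical features (assumed, not taken from the paper):
# pH, phosphorus, potassium, latitude, longitude.
X = np.column_stack([
    rng.uniform(6.5, 8.5, n),
    rng.uniform(5, 60, n),
    rng.uniform(50, 400, n),
    rng.uniform(37, 45, n),
    rng.uniform(56, 73, n),
])
# Synthetic SOC-like target, invented purely so the sketch runs.
y = 0.5 + 0.01 * X[:, 1] + 0.002 * X[:, 2] + rng.normal(0, 0.1, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Baseline: Random Forest with scikit-learn defaults.
baseline = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Hyperparameter search; the grid values are illustrative only.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 10],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2"
)
search.fit(X_train, y_train)


def evaluate(model):
    """Return (R^2, RMSE) of a fitted model on the held-out test set."""
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return r2_score(y_test, pred), rmse


baseline_r2, baseline_rmse = evaluate(baseline)
tuned_r2, tuned_rmse = evaluate(search.best_estimator_)
```

After fitting, `search.best_params_` holds the selected configuration, and comparing `(baseline_r2, baseline_rmse)` with `(tuned_r2, tuned_rmse)` mirrors the baseline-versus-optimized comparison described in the abstract.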
The optimized RF model achieved an $R^2$ of 0.619 and an RMSE of 0.243 on the test set, outperforming the baseline configuration and demonstrating stable predictive behavior across most SOC values. These results show that meaningful SOC estimation is possible even in data-limited contexts, and the work marks the first large-scale AI-driven SOC prediction study based on nationally collected soil laboratory data from Uzbekistan. The findings highlight a practical pathway for developing digital soil monitoring tools in regions with sparse environmental datasets. Future research should incorporate temporal indicators, additional soil attributes, and remote sensing features to further enhance model accuracy and support advanced spatiotemporal soil analytics.
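For readers less familiar with the two reported metrics, their standard definitions are given below. The abstract does not spell out the formulas, so this assumes the conventional forms, with $y_i$ the measured SOC of test sample $i$, $\hat{y}_i$ the model prediction, $\bar{y}$ the mean measured SOC, and $n$ the number of test samples (symbols introduced here for illustration).

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
$$

Under these definitions, an $R^2$ of 0.619 means the model accounts for roughly 62% of the variance in measured SOC on the test set, and the RMSE of 0.243 is expressed in the same units as the SOC measurements.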
License
Copyright (c) 2026 Central Asian Journal of STEM

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.