Publication type: journal article
Year of publication: 2024
DOI: 10.21177/1998-4502-2024-16-3-1144-1154
Keywords: mountain soils, soil acidity, machine learning, random forest, prediction
Abstract: This article addresses the problem of predicting soil acidity in mountainous regions using machine learning methods. The main goal of the study was to optimize a model for predicting soil acidity by eliminating redundant factors and focusing on the most significant variables. A Random Forest model was used in the experiments. Feature importance analysis showed that nitrogen content, nitrogen isotope ratio, and clay percentage are the key factors affecting soil acidity. A model built on these factors retained sufficient predictive accuracy while requiring less data. The results demonstrate that data collection costs can be reduced and the analysis simplified, which matters for effective soil resource management in mountainous regions.
Introduction. The paper considers the problem of predicting soil acidity in mountainous regions using machine learning methods. The study focuses on optimizing the model by removing redundant factors and identifying the most significant variables affecting soil acidity. The primary goal was to build an accurate model that predicts soil pH from various soil properties.
Materials and methods. The research used soil property data from the Marshall Gulch project in Arizona, including nitrogen content, clay percentage, and organic matter, among others. Random Forest and Gradient Boosting models were employed to analyze the data and predict soil pH. Missing values were filled with median values, and categorical variables were excluded. Numerical features were standardized using StandardScaler.
Results. The Random Forest model was applied to the dataset. After removing factors highly correlated with soil pH, such as pH-KCl and pH-CaCl2, the model's accuracy decreased only slightly. Further analysis identified nitrogen content, nitrogen isotope ratio, and clay percentage as the most influential factors. A simplified model using only these three factors demonstrated sufficient accuracy for practical applications.
Discussion. Reducing the number of factors led to a more interpretable model without significant loss in prediction accuracy. While excluding the pH-related factors decreased precision slightly, the analysis highlighted the importance of nitrogen and clay content in predicting soil acidity. The findings suggest that data collection efforts can be reduced without compromising model accuracy, which can streamline soil resource management.
Conclusion. The study demonstrated that selecting features by their importance maintains predictive accuracy while reducing the amount of required data. This is crucial for effective soil resource management, particularly in mountainous areas.
Summary. The research focused on predicting soil acidity using machine learning. Key factors such as nitrogen content and clay percentage were identified. A Random Forest model showed that using fewer factors could still provide accurate predictions, simplifying data collection and analysis.
Suggestions for practical applications and directions for future research. The results can be applied to improve soil resource management in mountainous areas, reducing costs associated with data collection. Future research should focus on refining prediction models for other soil properties and environmental factors using a similar approach.
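Journal: Sustainable Development of Mountain Territories (Устойчивое развитие горных территорий)
Journal issue: Vol. 16, No. 3
Pages: 1144-1154
Journal ISSN: 1998-4502
Place of publication: Vladikavkaz
Publisher: North Caucasian Mining and Metallurgical Institute (Северо-Кавказский горно-металлургический институт)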
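
Illustrative sketch (not part of the record). The workflow summarized in the abstract (median imputation, exclusion of categorical variables, standardization, a Random Forest model, and a refit on the three most important features) could look roughly like the Python below, assuming scikit-learn and pandas. The data are synthetic and every column name (`nitrogen_pct`, `delta15N`, `clay_pct`, `noise_a`, `noise_b`, `horizon`) is a hypothetical stand-in; this is not the Marshall Gulch dataset, and it is not the authors' code or their hyperparameters.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three informative features,
# two pure-noise features, and one categorical column.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "nitrogen_pct": rng.normal(0.3, 0.1, n),
    "delta15N": rng.normal(2.0, 1.0, n),
    "clay_pct": rng.normal(20.0, 5.0, n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
    "horizon": rng.choice(["A", "B", "C"], n),
})
# Hypothetical relationship between the features and pH, plus noise.
df["pH"] = (6.0 - 4.0 * df["nitrogen_pct"] + 0.3 * df["delta15N"]
            + 0.05 * df["clay_pct"] + rng.normal(0.0, 0.1, n))
# Blank out a few values so the median imputation has something to do.
df.loc[rng.choice(n, 20, replace=False), "clay_pct"] = np.nan


def fit_and_score(features):
    """Fit impute -> scale -> Random Forest; return the model and test R^2."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df["pH"], test_size=0.25, random_state=0)
    model = make_pipeline(
        SimpleImputer(strategy="median"),
        StandardScaler(),
        RandomForestRegressor(n_estimators=200, random_state=0),
    )
    model.fit(X_train, y_train)
    return model, r2_score(y_test, model.predict(X_test))


# Keep numeric predictors only, which drops the categorical column.
numeric = df.drop(columns=["pH"]).select_dtypes(include="number").columns.tolist()
full_model, full_r2 = fit_and_score(numeric)

# Rank features by impurity-based importance and refit on the top three.
forest = full_model.named_steps["randomforestregressor"]
importances = pd.Series(forest.feature_importances_, index=numeric)
importances = importances.sort_values(ascending=False)
top3 = importances.index[:3].tolist()
_, reduced_r2 = fit_and_score(top3)

print(importances.round(3))
print(f"R^2 with all numeric features: {full_r2:.3f}")
print(f"R^2 with top 3 features ({', '.join(top3)}): {reduced_r2:.3f}")
```

Comparing the two R^2 values mirrors the trade-off the abstract describes: how much accuracy is given up when only the highest-ranked features are measured. Scaling does not change what a tree-based model learns; it is kept here only because the abstract lists it as a preprocessing step.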