Habitat suitability modeling of dominant weed in canola (Brassica napus) fields using machine learning techniques

Published online by Cambridge University Press:  27 January 2025

Emran Dastres
Affiliation:
Ph.D, Department of Plant Production and Genetics, School of Agriculture, Shiraz University, Fars Province, Iran
Ghazal Shafiee Sarvestani
Affiliation:
Ph.D, Department of Plant Production and Genetics, School of Agriculture, Shiraz University, Fars Province, Iran
Mohsen Edalat*
Affiliation:
Associate Professor, Department of Plant Production and Genetics, School of Agriculture, Shiraz University, Fars Province, Iran
Hamid Reza Pourghasemi
Affiliation:
Professor, Department of Soil Science, School of Agriculture, Shiraz University, Fars Province, Iran
*Corresponding author: Mohsen Edalat; Email: edalat@shirazu.ac.ir

Abstract

Weed infestations are a major cause of yield reductions in canola (Brassica napus L.), a vital oil crop that has gained significant prominence in Iran, especially within Fars Province. Weed management using machine learning algorithms has become a crucial approach within precision agriculture for improving the efficacy and efficiency of weed control strategies. Habitat suitability models for weeds represent a significant advancement in agricultural technology, offering the capability to predict weed occurrence and proliferation accurately and reliably. This study addresses infestation by wild oat (Avena fatua L.), the dominant weed species in canola fields in 2023. We collected data on 12 environmental variables related to topography, climate, and soil properties to develop habitat suitability models. Three machine learning techniques, random forest (RF), support vector machine (SVM), and boosted regression tree (BRT), were used to model the distribution of A. fatua, and their performance was evaluated using the receiver operating characteristic (ROC) curve and the area under the curve (AUC) to identify the best predictive algorithm. The findings indicated that the RF, BRT, and SVM models exhibited accuracies of 99%, 97%, and 96%, respectively, for the habitat suitability of A. fatua. The Boruta feature selection method identified slope as the most influential variable in A. fatua habitat suitability modeling, followed by plan curvature, clay, temperature, and silt. This case study highlights the utility of machine learning for habitat suitability prediction when information on multiple environmental variables is available. This approach supports effective weed management strategies, potentially enhancing canola productivity and mitigating the ecological impacts associated with weed infestation.
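
The model comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes scikit-learn, uses synthetic presence/absence data in place of the field survey, uses `GradientBoostingClassifier` as a stand-in for BRT, and assumes a 70/30 training/validation split. All names and hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for presence/absence records with 12 environmental
# predictors (assumption: the real study uses field survey data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, random_state=0
)

# Assumed 70/30 split into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Three classifiers; gradient boosting stands in for BRT, and the SVM
# is wrapped in a scaler because it is sensitive to feature ranges.
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "BRT": GradientBoostingClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
}

auc_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Predicted probability of presence serves as the suitability score.
    suitability = model.predict_proba(X_val)[:, 1]
    auc_scores[name] = roc_auc_score(y_val, suitability)

# Rank the models by validation AUC, best first.
for name, auc in sorted(auc_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
```

With real data, `X` would hold the 12 environmental layers sampled at the survey points and `y` the observed presence or absence of A. fatua; the AUC values from this synthetic example carry no meaning for the study.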

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Weed Science Society of America

Figure 1. Location of the study region in Fars Province, Iran (left). Topographic map of the study region showing the locations of the training and validation datasets (right).

Figure 2. Flowchart of Avena fatua habitat suitability mapping. AUC, area under the curve; BRT, boosted regression tree; DEM, digital elevation model; EC, electrical conductivity; RF, random forest; ROC, receiver operating characteristic; SVM, support vector machine.

Table 1. Survey of the fields in each county of Fars Province.

Figure 3. Spatial distribution of canola and weed sampling.

Figure 4. Environmental variable layers: (A) mean annual temperature, (B) mean annual precipitation, (C) sand percent, (D) silt percent, (E) clay percent, (F) electrical conductivity (EC), (G) pH, (H) elevation/digital elevation model, (I) slope degree, (J) slope aspect, (K) plan curvature, and (L) distance from rivers.

Table 2. The receiver operating characteristic (ROC) curve classification (Richardson et al. 2024).

Table 3. Frequency (%) of weeds in canola fields.

Table 4. Variance inflation factor (VIF).

Figure 5. Habitat suitability maps of Avena fatua based on (A) random forest (RF), (B) boosted regression tree (BRT), and (C) support vector machine (SVM).

Table 5. Areas of the habitat suitability classes for all applied models.

Figure 6. The receiver operating characteristic (ROC) curve for evaluating algorithms. BRT, boosted regression tree; RF, random forest; SVM, support vector machine.

Table 6. Area under the curve (AUC).

Table 7. Significance of variables according to the Boruta algorithm.