Introduction
The turfgrass industry plays a significant role in various sectors, including sports, landscaping, and recreation, contributing to the economy and enhancing environmental aesthetics (Breuninger et al. 2013). However, weed infestations pose a persistent challenge, compromising turfgrass quality (Brosnan et al. 2020; Yu and McCullough 2016). Weeds compete with turfgrass for essential resources such as nutrients, water, and light, leading to reduced vigor and aesthetic appeal (Brosnan et al. 2020). Effective weed management is crucial to maintaining healthy turfgrass. Among available strategies, herbicides have long been a cornerstone of weed management in turfgrass (Hahn et al. 2020; Yu and McCullough 2016). However, the industry faces persistent challenges due to the emergence of herbicide-resistant weed populations (Brosnan et al. 2020) and the lack of new herbicide modes of action (Duke and Dayan 2022). In response, chemical companies are actively developing and evaluating new herbicide products to strengthen weed management practices.
Accurate estimation of weed coverage and density is essential for evaluating herbicide efficacy and developing effective weed management tactics (Liu et al. 2021; Shorewala et al. 2021). By assessing weed density before and after herbicide application, researchers can quantify herbicide effectiveness and facilitate comparisons across treatments (Ozaslan et al. 2024; Patton et al. 2019). Such evaluations are essential for determining the optimal herbicide dosages that maximize control while minimizing environmental impact (Jin et al. 2025). Moreover, weed density data support the development of integrated weed management strategies (Harker and O’Donovan 2013). Understanding weed density and distribution within a field allows agronomists and researchers to tailor management practices, improving overall weed management efficacy (Norsworthy et al. 2018). In addition, in the context of herbicide development, analyzing the interaction between weed density and herbicide effectiveness is crucial (Harker and O’Donovan 2013). Such analyses contribute to designing targeted weed control solutions that address specific weed density pressures, ensuring precise and effective management (Dammer 2016; Jin et al. 2025; Zheng et al. 2025).
In herbicide experiments, the collection of weed control data requires regular monitoring of weed coverage following herbicide application. However, human visual observation often leads to errors, especially at low (e.g., 5%, 10%) and intermediate (e.g., 50%) levels of weed density (Andújar et al. 2010). These limitations hinder visual methods in threshold-based weed management programs (Andújar et al. 2010), which are common in turfgrass weed evaluations (Bertin et al. 2009). Moreover, data collected solely through visual observation are often insufficient for publication in many scientific journals, making it challenging for researchers to share their findings. To address this issue, researchers have used wooden frames with a grid of lines to create multiple squares (Fermanian et al. 1980). In this method, the frame is randomly placed in the field plot, and the number of squares containing weeds is manually counted. The percentage of squares with weeds, relative to the total number of squares, provides an estimate of weed coverage (Fermanian et al. 1980). These frames typically contain a 10 by 10 grid or more. While increasing the number of squares improves estimation accuracy, it also increases the time required for manual counting.
Convolutional neural networks (CNNs) have revolutionized agricultural practices, enhancing precision in crop management (Jin et al. 2024; Tao and Wei 2025; Zhuang et al. 2023). These algorithms are particularly useful in precision weed management, enabling the identification of specific weed species, counting tillers, and pinpointing weed locations for targeted herbicide application (Deng et al. 2025; Jin et al. 2023). Despite advancements, there remains a gap in using CNNs for accurate weed density and coverage estimation, particularly in turfgrass scenarios. Previous research has explored weed localization using CNNs by generating grid cells on input images and using image classification networks to detect weeds within those cells (Jin et al. 2022). This approach, which employs a grid-based segmentation method, has demonstrated high performance (F1 score ≥ 0.972) (Jin et al. 2022). However, its applicability for estimating turfgrass weed coverage and density has not yet been investigated. We hypothesize that this grid-based segmentation method can be adapted for weed coverage estimation in turfgrass, a concept that, to our knowledge, remains unexplored in the literature.
Developing deep learning models capable of identifying grid cells on input images offers the potential to automate weed coverage and density estimation, significantly reducing time and effort while improving the accuracy of herbicide efficacy assessments. This highlights the need for research to improve both the accuracy and efficiency of herbicide evaluations. Therefore, the objectives of this research were (1) to develop a mobile application (app) that automates the process of weed detection and density estimation using deep learning and (2) to compare the performance of various deep learning models in estimating weed coverage in turfgrass.
Materials and Methods
Overview
Seven CNN architectures—ResNet (He et al. 2016), ResNeXt (Xie et al. 2017), ShuffleNetV1 (Zhang et al. 2018), ShuffleNetV2 (Ma et al. 2018), EfficientNet (Tan and Le 2019), MobileNetV3 (Howard et al. 2019), and MobileOne (Vasu et al. 2023)—were investigated to detect weeds in bahiagrass (Paspalum notatum Flueggé), dormant bermudagrass [Cynodon dactylon (L.) Pers.], and perennial ryegrass (Lolium perenne L.).
ResNet is a groundbreaking CNN architecture that addresses the issue of vanishing gradients in very deep networks (He et al. 2016). It introduces residual learning by using shortcut connections, allowing for the training of extremely deep networks with improved accuracy. ResNet’s architecture significantly advanced the field of deep learning by enabling the successful training of networks with hundreds or even thousands of layers (He et al. 2016).
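The core idea can be summarized by the residual block formulation of He et al. (2016), in which a stack of layers learns a residual mapping that is added to an identity shortcut:

\[ \mathbf{y} = \mathcal{F}(\mathbf{x}, \{W_i\}) + \mathbf{x} \]

where \(\mathbf{x}\) and \(\mathbf{y}\) denote the input and output of a block and \(\mathcal{F}\) is the residual function learned by the stacked layers.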
ResNeXt builds upon the ResNet architecture, aiming to improve accuracy and efficiency in deep learning tasks (Xie et al. 2017). It employs a strategy called “cardinality,” grouping multiple parallel paths (similar to ResNet blocks) to increase model capacity without significantly raising computational complexity. ResNeXt demonstrates that improving network performance is more effectively accomplished by widening the network using grouped convolutions, as opposed to merely increasing its depth (Xie et al. 2017).
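In the notation of Xie et al. (2017), a ResNeXt block aggregates a set of C parallel transformations, where C is the cardinality:

\[ \mathbf{y} = \mathbf{x} + \sum_{i=1}^{C} \mathcal{T}_i(\mathbf{x}) \]

where each \(\mathcal{T}_i\) is a bottleneck transformation sharing the same topology.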
ShuffleNetV1 is a lightweight CNN architecture designed for efficient image classification (Zhang et al. 2018). It uses group convolution and channel shuffling to reduce computational cost while maintaining accuracy. Its low complexity and resource requirements make it particularly suitable for mobile and embedded devices.
ShuffleNetV2 further optimizes the original ShuffleNet architecture to enhance both efficiency and accuracy (Ma et al. 2018). It introduces the concept of equal channel width and simplified network design, addressing the limitations of the first version, particularly in terms of memory access cost. The architecture also features a more effective channel split and a new feature map shuffle operation, enhancing both speed and performance (Ma et al. 2018).
EfficientNet employs a state-of-the-art CNN architecture with a compound scaling method to uniformly scale network depth, width, and resolution (Tan and Le 2019). This approach creates a family of models that achieve superior accuracy with fewer parameters and lower computational resources than previous architectures, making it ideal for tasks requiring high efficiency, particularly in resource-constrained environments.
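In the original formulation (Tan and Le 2019), depth, width, and input resolution are scaled jointly by a single compound coefficient \(\phi\):

\[ d = \alpha^{\phi}, \quad w = \beta^{\phi}, \quad r = \gamma^{\phi}, \quad \text{subject to } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \; \alpha, \beta, \gamma \ge 1 \]

where the constants \(\alpha\), \(\beta\), and \(\gamma\) are determined by a small grid search on the baseline network.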
MobileNetV3 optimizes the earlier MobileNet architecture, focusing on enhancing both efficiency and accuracy for mobile and edge devices (Howard et al. 2019). It combines neural architecture search with platform-aware network optimization, resulting in a model that balances high performance with low computational costs.
MobileOne is a streamlined CNN architecture designed for real-time apps on mobile and edge devices (Vasu et al. 2023). It uses a re-parameterization technique to enhance both speed and accuracy during inference, achieving high efficiency with minimal computational requirements.
Image Acquisition
Images of Florida pusley (Richardia scabra L.) in actively growing bahiagrass were captured over several time periods from May to August 2018. The training images were taken at the Gulf Coast Research and Education Center in Balm, FL, USA (27.71°N, 82.29°W), while the test images were collected from multiple commercial and residential lawns in Riverview, FL, USA (27.81°N, 82.42°W), and the University of South Florida campus in Tampa, FL, USA (27.95°N, 82.45°W).
For dormant bermudagrass, training images were captured at the University of Georgia Griffin Campus in Griffin, GA, USA (33.26°N, 84.28°W), while the test images were primarily collected from various golf courses in Peachtree City, GA, USA (33.39°N, 84.59°W). The training and test images predominantly contained annual bluegrass (Poa annua L.) and various winter annual broadleaf weeds.
For perennial ryegrass, images of common dandelion (Taraxacum officinale F.H. Wigg.), ground ivy (Glechoma hederacea L.), and spotted spurge [Chamaesyce maculata (L.) Small; syn.: Euphorbia maculata L.] growing within perennial ryegrass were collected from various golf courses and institutional lawns in Indianapolis, IN, USA (39.76°N, 86.15°W) to construct the training datasets. The testing dataset consisted of images of the same weed species collected from multiple institutional lawns and golf courses in Carmel, IN, USA (39.97°N, 86.11°W).
All training and testing images were acquired using a Sony Cyber-shot camera (Sony, Minato-ku, Tokyo, Japan) with a resolution of 1,920 × 1,080 pixels and were captured between 9:00 AM and 5:00 PM under diverse weather and outdoor lighting conditions, including clear, cloudy, and partly cloudy skies.
Training and Testing
During image classification training, images of bahiagrass, dormant bermudagrass, and perennial ryegrass were cropped into subimages with a resolution of 426 × 240 pixels using Irfanview (v. 5.50, Irfan Skiljan, Jajce, Bosnia). The training dataset included 20,000 positive images (containing weeds) and 20,000 negative images (without weeds). For validation, an additional set of 5,000 positive images and 5,000 negative images was utilized.
The image classification neural networks were trained and evaluated using the PyTorch (v. 1.12.0) open-source deep learning framework developed by Facebook (San Jose, CA, USA). All computations were performed on an NVIDIA GeForce RTX 3080 graphics processing unit (Santa Clara, CA, USA). To ensure consistency across models, the following hyperparameters were standardized for each experimental setup:
- Training epochs: 30
- Learning rate: 0.001
- Batch size: 32
In image classification, precision (Equation 1), recall (Equation 2), and F1 score (Equation 3) are widely used metrics for evaluating model performance. These metrics are derived from the confusion matrix, which compares a model’s predicted classifications with the actual labels. The confusion matrix sorts predictions into four categories: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), defined as follows:
- TP: Number of instances correctly classified as positive.
- FN: Number of instances incorrectly classified as negative when actually positive.
- FP: Number of instances incorrectly classified as positive when actually negative.
- TN: Number of instances correctly classified as negative.
Precision is the ratio of instances correctly predicted as positive to the total number of instances predicted as positive, focusing on minimizing FPs to ensure accurate positive predictions (Sokolova and Lapalme 2009):

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{1} \]
Recall is the proportion of actual positive instances that the model correctly classifies as positive. A high recall indicates the model’s effectiveness in identifying positive instances and minimizing FNs (Sokolova and Lapalme 2009):

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{2} \]
The F1 score is a comprehensive metric that combines precision and recall into a single value. It is calculated as the harmonic mean of precision and recall, offering a balanced assessment of the model’s performance, especially in scenarios where maintaining both high precision and high recall is crucial (Sokolova and Lapalme 2009):

\[ \text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3} \]
The rationale for selecting these metrics lies in their critical role in evaluating weed coverage and density using a classification neural network. Precision measures the proportion of grids predicted as containing weeds that are actually weed infested, helping to reduce FPs that could lead to overestimation of weed density. Recall quantifies the proportion of actual weed-infested grids that are correctly identified, ensuring that the model effectively captures weed presence. Because both FPs and FNs impact estimation, the F1 score, which balances precision and recall, provides a meaningful overall assessment of the model’s classification performance.
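As an illustration, the three metrics can be computed directly from grid-level counts. The following Kotlin sketch is ours rather than code from the app; the data class and function names are hypothetical.

```kotlin
// Minimal sketch: computing precision, recall, and F1 score from grid-level counts
// (Equations 1-3). The data class and function names are illustrative only.
data class GridCounts(val tp: Int, val fp: Int, val fn: Int, val tn: Int)

fun evaluate(counts: GridCounts): Triple<Double, Double, Double> {
    val precision = counts.tp.toDouble() / (counts.tp + counts.fp)   // Equation 1
    val recall = counts.tp.toDouble() / (counts.tp + counts.fn)      // Equation 2
    val f1 = 2 * precision * recall / (precision + recall)           // Equation 3
    return Triple(precision, recall, f1)
}

fun main() {
    // Example: of 100 grid cells, 38 weed-infested cells are detected correctly,
    // 2 weed-free cells are false alarms, and 3 weed-infested cells are missed.
    val (p, r, f1) = evaluate(GridCounts(tp = 38, fp = 2, fn = 3, tn = 57))
    println("Precision=%.3f Recall=%.3f F1=%.3f".format(p, r, f1))
}
```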
Development of the Mobile App
The mobile app for this study was developed using Android Studio, with Kotlin as the programming language. Kotlin offers several advantages, such as code conciseness and reduced NullPointerExceptions (Ardito et al. 2020). The Android Software Development Kit version employed was Application Programming Interface (API) level 33. The app was designed following the Model-View-ViewModel (MVVM) architectural pattern, utilizing Kotlin coroutines to create a robust and efficient app (Chauhan et al. 2021). In Android development, MVVM provides a clear separation of concerns among components, promoting clean code and enhancing testability (Lou 2016). Figure 1 illustrates the MVVM pattern’s three fundamental components: the Model, the View, and the ViewModel.
- Model: The Model component represents the data and core business logic of the app. It interacts with underlying data sources, such as databases or network APIs, to retrieve and manage data efficiently.
- View: The View component is responsible for rendering and handling the user interface (UI). It displays data provided by the ViewModel and triggers actions based on user input.
- ViewModel: The ViewModel serves as an intermediary between the Model and the View. It holds the data to be displayed in the UI and provides methods and properties that can be bound directly to UI elements. Designed to be independent of the specific UI framework, the ViewModel enhances testability and allows for better decoupling from the View layer.

Figure 1. The Model-View-ViewModel (MVVM) architectural diagram.
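A minimal Kotlin sketch of this pattern is shown below, illustrating how a ViewModel could expose a weed-coverage result to the View as observable state while a coroutine keeps inference off the main thread. The class, property, and parameter names are hypothetical and are not taken from the app’s source code.

```kotlin
// Illustrative MVVM sketch (all names are hypothetical). The Model layer is
// represented by a suspending recognizer function; the ViewModel exposes
// observable state; the View (an Activity or Fragment) simply renders it.
import androidx.lifecycle.LiveData
import androidx.lifecycle.MutableLiveData
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

class WeedCoverageViewModel(
    private val recognizer: suspend (imagePath: String) -> Double   // Model layer (assumed)
) : ViewModel() {

    private val _coverage = MutableLiveData<Double>()
    val coverage: LiveData<Double> = _coverage   // observed by the View for display

    fun analyze(imagePath: String) {
        // Kotlin coroutine: run recognition off the main thread, then post the result.
        viewModelScope.launch {
            val percent = withContext(Dispatchers.Default) { recognizer(imagePath) }
            _coverage.value = percent
        }
    }
}
```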
To deploy the model on Android, this study utilized the Open Neural Network Exchange (ONNX) platform. Research has demonstrated that converting machine learning models to the ONNX format and deploying them with ONNX Runtime significantly improves inference speed while maintaining accuracy, making it particularly beneficial for real-time apps on mobile devices (Openja et al. 2022). ONNX serves as an open standard for representing machine learning models, enabling AI developers to use models across multiple frameworks, tools, runtimes, and compilers (Lin et al. 2019). It plays a crucial role in promoting interoperability and standardization in machine learning, allowing for more efficient model development, deployment, and optimization workflows (Kim et al. 2021).
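As an illustration of this deployment path, the Kotlin sketch below runs a single subimage through a classifier exported to ONNX using ONNX Runtime’s Android/Java API. The input size (224 by 224), tensor layout, class ordering, and output shape are assumptions made for illustration and would need to match the exported model; in practice, the session would be created once and reused rather than per image.

```kotlin
// Sketch of single-subimage inference with ONNX Runtime on Android.
// Input size, tensor layout, and class indices are assumptions, not the app's exact code.
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import java.nio.FloatBuffer

fun classifySubimage(modelBytes: ByteArray, chw: FloatArray): Boolean {
    val env = OrtEnvironment.getEnvironment()
    env.createSession(modelBytes).use { session ->
        // Assumed input layout: 1 x 3 x 224 x 224 (batch, channels, height, width).
        val input = OnnxTensor.createTensor(env, FloatBuffer.wrap(chw), longArrayOf(1, 3, 224, 224))
        val inputName = session.inputNames.first()
        session.run(mapOf(inputName to input)).use { result ->
            @Suppress("UNCHECKED_CAST")
            val scores = (result[0].value as Array<FloatArray>)[0]
            return scores[1] > scores[0]   // index 1 assumed to be the "weed" class
        }
    }
}
```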
Figure 2 illustrates the app’s UI. The process begins with the user importing an image from the camera or gallery and selecting a suitable recognition model based on the image content (Figure 2A). Once the model is prepared, the user clicks the recognition button to initiate weed coverage analysis with the chosen model, and the results are then displayed (Figure 2B).

Figure 2. User interface of the app. (A) Configuration page showing (I) button to select an image from the camera; (II) button to select an image from the gallery; (III) dropdown to select a model; (IV) button to start recognition; and (V) image view displaying the selected image. (B) Result page showing (VI) text view displaying weed coverage; (VII) image view showing the recognized image.
A 10 by 10 grid was selected because it balances segmentation precision with computational efficiency and matches the wooden grid frames commonly used by weed scientists. Because mobile devices have limited memory and processing power, the segmentation was performed in memory to keep the process feasible on typical hardware. For recognition, each input image was divided into 100 subimages using the 10 by 10 grid. In the Android system, the screen follows a two-dimensional coordinate system, with the origin at the top left corner, the positive x axis extending to the right, and the positive y axis extending downward (Figure 3). The blue area represents the selected image, while the red area indicates a subimage. Each subimage’s dimensions were determined by dividing the width and height of the selected image by 10. Starting from the top left corner, subimages were extracted sequentially from left to right across each row; once a row was completed, the process shifted to the next row, continuing until the bottom-right subimage was reached. Each subimage was stored in a list for subsequent recognition operations. The detailed segmentation process is illustrated in Figure 4.

Figure 3. Grid-based coordinate system for image segmentation on Android. This figure shows how the input image is divided into a 10 by 10 grid for subimage processing. The origin (0, 0) starts at the top left, and subimages (highlighted in red) are extracted in row-major order from the original image (in blue).

Figure 4. Workflow of mobile-based image segmentation. This figure outlines the process of segmenting the input image into a grid of subimages. After subimage dimensions are computed, a nested loop iterates through each cell to extract and store subimages for further classification.
Detailed Process Description
1. Start: The function begins, ready to process the source image.
2. Retrieve source image dimensions: The width and height of the source image are obtained for subsequent calculations.
3. Calculate segment dimensions: The width and height of each segment are calculated based on a 10 by 10 grid structure.
4. Initialize subimage list: An empty list is created to store the subimages.
5. Outer loop (columns): Iterates over columns, controlled by columnIndex (0 to columnTotal - 1).
6. Inner loop (rows): Iterates over rows, controlled by rowIndex (0 to rowTotal - 1).
7. Calculate starting coordinates: For each segment, the starting coordinates (subImageXValue and subImageYValue) are calculated based on rowIndex and columnIndex.
8. Create sub-bitmap: A sub-bitmap is created from the source image at the calculated starting coordinates (subImageXValue, subImageYValue) with the determined width and height using Bitmap.createBitmap.
9. Add sub-bitmap to list: The new sub-bitmap is added to subImageList.
10. Inner loop condition check: If rowIndex < rowTotal - 1, the inner loop continues.
11. Outer loop condition check: If columnIndex < columnTotal - 1, the outer loop continues.
12. Return subimage list: After all loops complete, the function returns subImageList containing all subimages.
13. End: The function execution concludes.
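A condensed Kotlin sketch of the routine described in steps 1 to 13 is given below, assuming the source image is available as an android.graphics.Bitmap; the iteration order follows steps 5 and 6 and does not affect the resulting coverage estimate.

```kotlin
import android.graphics.Bitmap

// Splits the source image into a columnTotal x rowTotal grid of subimages
// (10 x 10 by default), following steps 1-13 above.
fun segmentIntoGrid(source: Bitmap, columnTotal: Int = 10, rowTotal: Int = 10): List<Bitmap> {
    val subImageWidth = source.width / columnTotal        // step 3
    val subImageHeight = source.height / rowTotal
    val subImageList = mutableListOf<Bitmap>()            // step 4
    for (columnIndex in 0 until columnTotal) {            // step 5: outer loop (columns)
        for (rowIndex in 0 until rowTotal) {              // step 6: inner loop (rows)
            val subImageXValue = columnIndex * subImageWidth    // step 7
            val subImageYValue = rowIndex * subImageHeight
            val subBitmap = Bitmap.createBitmap(                // step 8
                source, subImageXValue, subImageYValue, subImageWidth, subImageHeight
            )
            subImageList.add(subBitmap)                         // step 9
        }
    }
    return subImageList                                         // step 12
}
```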
Given the requirement to process 100 segmented images, the initial sequential recognition method was found to be inefficient. Dynamic batch processing using ONNX was introduced to improve efficiency and overall performance. Dynamic batching adjusts the batch size during inference based on real-time input data, enabling more flexible utilization of computational resources. This approach significantly improves system efficiency and response time, particularly when handling asynchronous or real-time requests.
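The sketch below illustrates how the 100 preprocessed subimages might be stacked into a single tensor and classified in one ONNX Runtime call, assuming the model was exported with a dynamic batch axis; the input size and preprocessing are again assumptions rather than the app’s exact implementation.

```kotlin
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.FloatBuffer

// Classifies all subimages in one batched call and returns the estimated weed
// coverage (percentage of grid cells classified as containing weeds). Assumes a
// dynamic batch dimension and subimages already preprocessed to 3 x 224 x 224 floats.
fun estimateCoverage(env: OrtEnvironment, session: OrtSession, subImages: List<FloatArray>): Double {
    val n = subImages.size
    val buffer = FloatBuffer.allocate(n * 3 * 224 * 224)
    subImages.forEach { buffer.put(it) }
    buffer.rewind()

    val batch = OnnxTensor.createTensor(env, buffer, longArrayOf(n.toLong(), 3, 224, 224))
    session.run(mapOf(session.inputNames.first() to batch)).use { result ->
        @Suppress("UNCHECKED_CAST")
        val scores = result[0].value as Array<FloatArray>   // shape: n x 2
        val weedCells = scores.count { it[1] > it[0] }      // class index 1 assumed "weed"
        return 100.0 * weedCells / n
    }
}
```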
App-Based Counting versus Manual Counting
In this study, seven models were trained for each turfgrass: bahiagrass, dormant bermudagrass, and perennial ryegrass. The optimal model for each turfgrass was selected through a comprehensive comparison of precision, recall, and F1 score from both the validation and test datasets. The best model for each turfgrass species was integrated into the app for weed density estimation. Weed density was assessed manually and using the app, with each method repeated four times per image. A Student’s t-test was conducted to compare the accuracy of the app-based method with manual counting, using a significance level of 0.05 to determine whether statistically significant differences existed between the two approaches.
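For transparency, the two-sample comparison can be reproduced from the four estimates per method with a pooled-variance t statistic; the Kotlin sketch below uses hypothetical numbers and illustrative helper names, and compares |t| against the two-tailed critical value for 6 degrees of freedom at α = 0.05 (approximately 2.447).

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

// Two-sample Student's t statistic with pooled variance (equal group sizes here).
// Helper name and example data are illustrative, not the study's actual records.
fun pooledTStatistic(a: DoubleArray, b: DoubleArray): Double {
    val meanA = a.average()
    val meanB = b.average()
    val varA = a.map { (it - meanA) * (it - meanA) }.sum() / (a.size - 1)
    val varB = b.map { (it - meanB) * (it - meanB) }.sum() / (b.size - 1)
    val pooledVar = ((a.size - 1) * varA + (b.size - 1) * varB) / (a.size + b.size - 2)
    return (meanA - meanB) / sqrt(pooledVar * (1.0 / a.size + 1.0 / b.size))
}

fun main() {
    // Hypothetical example: four app-based and four manual coverage estimates (%).
    val app = doubleArrayOf(38.0, 39.0, 38.5, 40.0)
    val manual = doubleArrayOf(27.0, 28.0, 27.5, 27.0)
    val t = pooledTStatistic(app, manual)
    val criticalT = 2.447   // two-tailed critical value, df = 4 + 4 - 2 = 6, alpha = 0.05
    println("t = %.3f, significant at 0.05 = %b".format(t, abs(t) > criticalT))
}
```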
Results and Discussion
Model Performance
For bahiagrass detection, EfficientNet and MobileOne demonstrated superior performance with identical F1 scores (0.988), outperforming other architectures by notable margins. Although ResNet and ResNeXt maintained strong detection capabilities, they showed relatively lower generalization in this category (Table 1).
Table 1. Neural network validation and testing results for detection of weeds in turfgrass.

a Neural network: The specific neural network model used in the weed detection task.
b Turfgrass species: The turfgrass species used in the weed detection model.
c Validation: Precision, recall, and F1 score results on the validation dataset, used for model tuning.
d Testing: Precision, recall, and F1 score results on the independent testing dataset, used to evaluate model generalization.
In the case of dormant bermudagrass, ResNet achieved the highest F1 score (0.996), with ShuffleNetV1 and MobileOne forming a close second tier. The performance hierarchy revealed consistent patterns of difference between validation and testing phases across architectures.
Lolium perenne detection results exhibited tight clustering among the top performers, with ResNeXt achieving the highest F1 score (0.996) on the testing dataset, followed by ShuffleNetV2 and ResNet. Notably, MobileOne showed relatively reduced effectiveness in this category compared with its strong performance on the other species.
App-Based versus Manual Weed Coverage Estimation
The app integrates the top-performing models for each turfgrass species to estimate weed coverage and density: EfficientNet, ResNet, and ResNeXt for bahiagrass, dormant bermudagrass, and perennial ryegrass, respectively. For each turfgrass species, four individuals provided manual weed density estimates, and the app generated four corresponding estimates for comparison. As shown in Table 2, the app-based method produced a significantly higher average weed coverage estimate for bahiagrass compared with manual counting (P < 0.0001; 95% confidence interval (CI): [10.7600, 11.7400]; Cohen’s d = 31.8198). In contrast, the app-based method provided estimates comparable to manual estimates for dormant bermudagrass (P = 0.3560; 95% CI: [−0.7400, 0.2400]; Cohen’s d = −0.7071) and perennial ryegrass (P = 0.1340; 95% CI: [−1.0658, 0.0658]; Cohen’s d = −1.2247), with no significant differences observed between the two methods.
Table 2. Comparison of weed coverage percentage estimated by app-based and manual counting methods.

a App-based method: Percentage of weed coverage estimated using an automated app.
b Manual counting: Percentage of weed coverage estimated through manual counting.
c P-value (t-test): The statistical significance of the difference between the app-based and manual methods for estimating weed coverage. A P-value less than 0.05 indicates a significant difference.
d Specification: Details the specific turfgrass images used along with the associated weed species.
Time Efficiency
The comparative analysis revealed substantially improved time efficiency for the app-based method compared with manual counting across all turfgrass species (Table 3). For bahiagrass, the app-based approach reduced processing time by approximately 79% relative to manual counting. Similar efficiency gains were observed for dormant bermudagrass (79% reduction) and L. perenne (65% reduction), with the app consistently completing tasks in under 15 s, compared with more than 37 s for manual counting. These improvements demonstrate the app’s ability to accelerate weed coverage assessment while maintaining accuracy across the diverse growth conditions and geographic locations specified in Table 3.
Table 3. Descriptive statistics of model and manual recognition time efficiency.

a App-based method: Time (in seconds) taken by the app to recognize weeds.
b Manual counting: Time (in seconds) taken for manual counting of weeds.
c Specification: Details the specific turfgrass images used along with the corresponding weed species.
The application of deep learning models such as ResNet, ResNeXt, and EfficientNet for detecting weeds in bahiagrass, dormant bermudagrass, and perennial ryegrass has shown promising results that can benefit weed management practices (He et al. 2016; Tan and Le 2019; Xie et al. 2017). These models effectively balance the need for high accuracy with computational efficiency, offering a tool that can save time and resources in large-scale agricultural operations. EfficientNet’s compound scaling method enabled it to excel in detecting weeds in bahiagrass, capturing complex features with high precision and recall. The residual connections in ResNet and ResNeXt enhanced their capacity to learn deep patterns, contributing to their success with dormant bermudagrass and perennial ryegrass (Jastrzębski et al. 2018; Zhou et al. 2021). In contrast, the lightweight models ShuffleNetV1 and MobileOne, although computationally efficient, lacked the depth needed for detailed feature extraction, resulting in lower accuracy and F1 scores, especially on the testing dataset (Vasu et al. 2023; Zhang et al. 2018).
The app tended to overestimate weed coverage in bahiagrass, particularly when R. scabra was present. To effectively distinguish R. scabra growing in bahiagrass, augmenting the training dataset by incorporating more images of R. scabra in bahiagrass is essential. In contrast, the app performed well in determining weed coverage in dormant bermudagrass and actively growing perennial ryegrass, showing strong agreement with manual counting. However, further improvements are needed to enhance its accuracy in more complex, actively growing environments.
This study demonstrates that integrating deep learning models into a mobile app for weed coverage and density estimation provides a significant time-saving advantage over traditional manual methods. This efficiency not only streamlines the estimation process but also reduces labor costs, making it particularly well suited for large-scale agricultural experiments. By automating weed coverage assessments, the app-based method offers a faster and more accurate alternative, potentially facilitating herbicide efficacy experiments and improving overall field management practices.
Beyond its technical merits, the app offers substantial economic value for turfgrass practitioners and agricultural researchers. By automating weed coverage assessments, it mitigates human error, enhances consistency, and accelerates data collection. These capabilities facilitate more timely and informed decisions regarding herbicide applications and other weed management strategies. Ultimately, the app enhances both the precision and sustainability of weed control efforts, serving as a practical and cost-effective tool for improving field management and supporting high-throughput herbicide efficacy evaluations.
The app-based method demonstrated promise in estimating weed coverage but exhibited several limitations that warrant improvement. A primary issue was the misclassification of bahiagrass as a weed, particularly in areas where R. scabra was present. This misidentification significantly contributed to an overestimation of weed coverage during the active growing phase of bahiagrass and R. scabra, underscoring the app’s difficulty in distinguishing morphologically similar species. An insufficient amount of R. scabra training data may have led to an overestimation of bahiagrass weed coverage, particularly during active growth stages. This finding suggests that further refinement of the app’s deep learning models, particularly for actively growing turfgrass species mixed with morphologically similar weeds, is necessary to improve accuracy. In contrast, the app performed well in detecting weeds in dormant bermudagrass and perennial ryegrass, closely aligning with manual counting and indicating its effectiveness in scenarios where turfgrass and weeds are more morphologically distinct.
Due to the limitations of mobile device processors, achieving high-performance weed recognition remains challenging, highlighting the need for a more efficient and accurate algorithm. While our model is currently deployed locally on mobile devices, future deployments could leverage cloud computing. By utilizing the powerful computational capabilities of the cloud, we can overcome the limitations of mobile hardware. Cloud-based deployment would allow for batch processing, significantly improving recognition efficiency. Additionally, cloud-based models could be updated seamlessly, providing greater flexibility and ensuring users have access to the latest advancements in weed detection.
The app’s effectiveness in estimating weed coverage may vary across geographic regions and management practices, influenced by factors such as weed species composition, turfgrass dormancy, and turfgrass surface quality. These variations present challenges to the model’s generalization, potentially limiting its performance in unfamiliar environments. The training dataset used to develop the model is currently limited in geographic range and weed diversity, raising concerns about its generalizability to other regions or weed species; users are therefore encouraged to manually verify the app’s weed identification accuracy under their specific local conditions. Future efforts will focus on expanding the training image database to include a broader range of turfgrass types, weed species, growth stages, weed densities, and biotypes, with the aim of ensuring consistent performance across diverse regions, weeds, and management practices. To address the misclassification of bahiagrass and R. scabra, several strategies can be employed. First, expanding the dataset to include more diverse images of R. scabra, particularly in mixed stands with bahiagrass and across varying growth stages, would improve the model’s ability to differentiate between the two species. Second, incorporating spectral or texture-based features could improve species differentiation by capturing unique characteristics. Finally, implementing a multi-class classification layer to explicitly distinguish turfgrass from weeds would help reduce overestimation and enhance detection accuracy.
This study evaluated various deep learning models for detecting weeds in actively growing bahiagrass, dormant bermudagrass, and actively growing perennial ryegrass and selected the best-performing model for integration into a mobile app. Subsequently, the app-based method was compared with traditional manual counting. For bahiagrass, the app-based method tended to overestimate weed coverage, likely due to challenges in distinguishing bahiagrass from morphologically similar weeds, such as R. scabra, during active growth stages. This highlights the need for improved species differentiation. This gap in detection accuracy reflects a broader challenge in the field of weed management, where distinguishing between species with similar morphology is crucial for precision agriculture and effective herbicide application. Encouragingly, the app demonstrated strong alignment with manual counting for dormant bermudagrass and perennial ryegrass. These results suggest that the app is effective for weed detection in cases where turfgrass is dormant or when weed species are morphologically distinct from turfgrasses. While the app performs well for dormant bermudagrass and perennial ryegrass, further refinement is needed for more complex cases like bahiagrass during active growth. Addressing these challenges will be a crucial step toward improving the utility of mobile app–based weed detection in diverse agricultural settings. Future research should expand to cover a broader range of turfgrass species and other crops, such as wheat (Triticum aestivum L.). Additionally, collaborations with agricultural researchers and field practitioners would facilitate the development of region-specific models and datasets, improving the app’s adaptability to different environments and weed species. Such expansion would allow users to rapidly assess weed coverage and density through simple image capture, supporting enhanced weed management and herbicide efficacy evaluation.
Funding statement
This work was supported by the National Natural Science Foundation of China (grant no. 32072498), the Weifang Science and Technology Development Plan Project (grant no. 2024ZJ1097), the Key R&D Program of Shandong Province, China (grant no. 202211070163), the Taishan Scholar Program of Shandong Province, and the Yuandu Scholar Program of Weifang, Shandong, China.
Competing interests
The authors declare no conflicts of interest.