
EBS-YOLO: edge-optimized bidirectional spatial feature augmentation for in-field detection of wheat Fusarium head blight epidemics

Abstract

Fusarium head blight (FHB), caused by the Fusarium species complex, significantly endangers wheat yield and safety. Accurate and timely assessment of FHB epidemic level in the field is crucial for effective disease management. However, the complex environment and indistinct edges of diseased areas present substantial challenges in distinguishing between healthy and diseased ears, thereby impacting the accuracy of FHB epidemic level detection. This study proposes EBS-YOLO, a novel Edge-Optimized Bidirectional Spatial Feature Augmentation YOLO Network, specifically designed for the rapid and precise determination of FHB epidemic levels at the canopy level. The Focal-Edge Selection Module (FSM) within the backbone replaces the original C2f module to enhance edge feature representation and facilitate multi-scale feature extraction. Furthermore, the Dual Spatial-Connection Feature Pyramid Network (DSCFPN), integrating Global-to-Local Spatial Aggregation (GLSA) with bidirectional pyramid interaction, balances global and local feature acquisition while optimizing the feature fusion mechanism. This design enables the model to effectively handle occlusions, scale variations, and complex environments. Experimental results demonstrate substantial improvements over eight comparative models in detecting healthy and diseased wheat ears, achieving mean Average Precision (mAP) values of 86.1% and 82.9%, respectively. Notably, the model achieved a mean accuracy of 94.7% in detecting FHB epidemic levels through rigorous spatiotemporal validation using datasets collected from independent fields across different years, underscoring its robust generalization capability. Characterized by its low complexity and lightweight design, EBS-YOLO features a parameter count of 2.05 M, 7.4 GFLOPs, and a model size of 5.0 MB, making it an efficient approach for real-time FHB epidemic level detection.

Background

Wheat, as the world’s most extensively cultivated cereal crop, yields over 700 million tons annually, serving as a staple food for more than 40% of the global population. This highlights the indispensable role of wheat productivity and quality in sustaining global food security [1, 2]. Fusarium head blight (FHB), caused by the Fusarium species complex, poses a significant threat to wheat by substantially reducing yields and causing economic losses. In China, FHB affects over 4.5 million hectares each year, representing approximately 20% of the country’s wheat cultivation, with average annual losses surpassing 3.41 million tons from 2000 to 2018 [3, 4]. Furthermore, the increased reliance on fungicides to maintain wheat production has raised environmental concerns [5, 6]. Therefore, accurately identifying wheat FHB epidemic levels is crucial for effective disease management and fungicide application.

The assessment of an FHB epidemic is based on the diseased ear rate, which is the percentage of diseased wheat ears within the total number of wheat ears per unit area. Traditionally, FHB assessment in wheat has relied on manual observation, a method that is both time-consuming and prone to observer bias, leading to inconsistent and inaccurate detection [7, 8]. Recent advancements in deep learning techniques and computational analytics have revolutionized crop disease detection on a global scale [9]. Numerous deep learning algorithms, such as Faster R-CNN, SSD, CenterNet, EfficientDet and the YOLO series, have been successfully applied to crop disease detection. These technologies facilitate more precise evaluation of FHB epidemic levels in wheat canopies by both counting wheat ears and distinguishing between infected and healthy ones [10,11,12,13,14].

Accurate detection and counting of wheat ears are essential for estimating yield and evaluating epidemic levels. Hasan et al. [15] implemented a deep learning model, R-CNN, showing remarkable capabilities in detecting and counting wheat ears in complex environments, achieving an average accuracy of 93.4% and an F1 score of 95% across 20 images featuring 1570 ears. Wang et al. [16] developed GrainNet, with a lightweight and efficient feature fusion module for wheat grain detection and counting, achieving a mean average precision (mAP) of 93.15%, an F1 score of 94.6%, and a detection speed of 29.10 frames per second (FPS). Dandrifosse et al. [17] combined YOLOv5 with DeepMAC segmentation technique for both counting and segmentation of wheat ears, achieving a bounding box detection F1-score of 93% and a segmentation F1-score of 86%. Li et al. [18] presented an improved wheat ear counting method combining YOLOv7, multiple object tracking, and cross-line partitioning counting. The improved YOLOv7 achieves a detection precision of 93.8%, with mAP50 reaching 94.9% on the test set. Meng et al. [19] introduced the YOLOv7-MA model, which addresses issues of ear overlap and small sizes in complex backgrounds, achieving an impressive mAP of 93.7% on the Global Wheat Head Dataset 2021. Yu et al. [20] proposed the Oriented Feature Pyramid Network, achieving an mAP of 85.77% for oriented wheat ear detection and a counting accuracy of 93.97%. Although prior studies have achieved remarkable accuracy in the detection and counting of wheat ears, limitations persist in distinguishing wheat ears infected by FHB from healthy ones, classifying and calculating them, and detecting the epidemic levels of FHB.

To accurately assess FHB epidemic levels, it is crucial to detect and enumerate wheat ears while accurately identifying and quantifying the infected ones. Kukreja and Kumar [21] employed a Deep Convolutional Neural Network (DCNN) to categorize four types of wheat rust, achieving 97.2% accuracy. Yang et al. [22] introduced a framework integrating a spectrogram generative adversarial network with progressive neural architecture search to classify mildew-damaged, insect-damaged, and healthy wheat kernels with a 96.2% F1 score through 5-fold cross-validation. X. Zhang et al. [23] developed an algorithm utilizing an enhanced YOLOv4 model and EfficientNet for detection and classification tasks, distinguishing five common citrus diseases with 89% accuracy and an F1 score of 87.2%. Zhao et al. [24] employed a Faster R-CNN architecture to tackle complicated backgrounds and tiny diseased spots in strawberry disease images, achieving an mAP of 92.2% and a mean detection time of merely 229 milliseconds. Pan et al. [25] developed RiceNet, a two-stage method for identifying four rice diseases, achieving 99.0% accuracy. Specifically, for assessing the degree of FHB infection in individual wheat ears, Su et al. [26] developed a technique using a double-mask R-CNN network, attaining a detection accuracy of 77.8% and an assessment accuracy of 77.2%. Mao et al. [27] introduced the GSEYOLOX-s model for assessing wheat FHB severity at the individual ear level, incorporating a parameterless attention module (SimAM), ghost convolution, and Efficient Intersection over Union (EIoU) loss, achieving 99.2% accuracy. Bao et al. [28] proposed an FHB detection algorithm, PCSA-YOLO, based on the YOLO detection framework. This method facilitates the detection of wheat FHB using UAV remote sensing images at the field scale. Experimental results indicate that this approach achieves a precision of 80.6%, an mAP of 83.2%, and a recall of 74.5% in detecting wheat FHB. These studies have significantly advanced the identification of crop susceptibility and FHB severity in individual wheat ears. However, challenges persist in accurately predicting FHB epidemic levels at the field scale due to varied FHB manifestations across growth stages, indistinct edges of diseased areas, and interference from complex backgrounds.

In response, this study introduces an innovative approach to enhance the precision of detecting FHB epidemic levels in wheat fields with complex backgrounds. We propose EBS-YOLO, an Edge-optimized Bidirectional Spatial feature augmentation YOLO network. EBS-YOLO substitutes the original C2f modules in the backbone network and feature pyramid network with the Focal-Edge Selection Module (FSM), enhancing edge feature representation and extracting multi-scale feature information. Through dual spatial feature enhancement and bidirectional pyramid interaction, the Dual Spatial-Connection Feature Pyramid Network (DSCFPN) balances global and local feature acquisition while optimizing the feature fusion mechanism. The substitution of the Complete-IoU (CIoU) loss function with the more efficient Wise-IoU (WIoU) loss function improves localization accuracy. These innovations enable exceptional accuracy in identifying and classifying diseased and healthy wheat ears, thereby significantly enhancing our ability to detect the epidemic levels of wheat FHB.

Materials and methods

Data collection

Images utilized in this study were sourced from eight experimental bases located in the winter wheat region of Shaanxi Province and three FHB survey sites in Hubei and Anhui Provinces between 2023 and 2024. The Shaanxi sites included Yaoliu Experimental Base in Meixian County, Wuhe Experimental Base in Huayin County, Zhangqiao Experimental Base in Fuping County, Wanxing Experimental Base in Pucheng County, Baijia Experimental Base in Meixian County, and Songcun, Qijia, and Heyuan Experimental Bases in Qishan County (Fig. 1). The survey sites in Hubei and Anhui were located in Tianmen, Jianli, and Lujiang. Over 50 wheat varieties, such as ‘Xinong No. 822’, ‘Xinong No. 100’, ‘Jinmai No. 1’, and ‘Weilong No. 169’, were cultivated across these sites.

Fig. 1 The geographical location of the Shaanxi experimental bases in this study. *Map lines indicate the study areas and do not necessarily represent officially recognized national boundaries

Wheat FHB typically occurs in northern regions like Shaanxi from May to June. During this period, images were collected to capture varying epidemic levels of FHB. Experts from the College of Plant Protection at Northwest A&F University supervised the image collection process. To reflect real-world conditions, images were taken under different weather and lighting conditions between 10:00 a.m. and 5:00 p.m. using various devices, including smartphones and digital cameras. Images were saved in JPG format at two resolutions: 6000 × 4000 pixels and 4032 × 3024 pixels. To meet the training criteria of the proposed detection model, the images were uniformly resized to 640 × 640 pixels. All devices were set to automatic white balance and optical focusing. Specific samples from the wheat FHB disease dataset, captured at the canopy level, are shown in Fig. 2.

Fig. 2 Different samples of the wheat FHB disease dataset. a Sunny day without shadows, b sunny day with shadows on the wheat ears, c cloudy day, d early stage of wheat FHB, e middle stage of wheat FHB, f late stage of wheat FHB

Data partitioning

The total dataset consists of 1152 images, with 1002 collected in 2023 (23A, 23B) and 150 in 2024 (24A, 24B, 24C), as outlined in Table 1. Of these, plant protection specialists captured the 150 images in 2024 during disease investigations under natural field conditions. Dataset 23B was utilized to evaluate the model’s performance, whereas datasets 24A, 24B, and 24C were used to assess its robustness. Dataset 23A was specifically allocated for model training, featuring an approximate 8:1 ratio of training to validation sets.

Table 1 The details of dataset partitioning

Data augmentation

This study employed online data augmentation to increase the diversity of training samples, applying real-time transformations exclusively to the training set during the model training phase. This approach is crucial for increasing the model’s robustness and effectiveness, enabling better adaptation to unfamiliar data. The augmentation procedures involved adjustments in image HSV (hue, saturation, value), translation, scaling, left-right flipping, and mosaic creation.
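
For illustration, these operations map onto the standard hyperparameters used by YOLO-style training pipelines; the sketch below lists such a configuration with placeholder values, which are assumptions rather than the exact settings used in this study.

```python
# Hypothetical online-augmentation settings in the style of YOLO training
# hyperparameters; the values are illustrative placeholders, not the exact
# configuration used for EBS-YOLO.
augmentation = {
    "hsv_h": 0.015,    # random hue shift (fraction of the hue range)
    "hsv_s": 0.7,      # random saturation gain
    "hsv_v": 0.4,      # random value (brightness) gain
    "translate": 0.1,  # random translation as a fraction of image size
    "scale": 0.5,      # random scaling gain
    "fliplr": 0.5,     # probability of a left-right flip
    "mosaic": 1.0,     # probability of composing four images into a mosaic
}
```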

Criteria for assessing the epidemic levels of wheat FHB

The classification of wheat FHB epidemic levels adhered to the National Standard of the People’s Republic of China (GB/T 15796-2011), which was established by the State Administration of Quality Supervision in 2011. The primary criterion for classifying the epidemic level of wheat FHB is the percentage of diseased wheat ears (Q), as described in Eq. (1). The epidemic levels were categorized based on thresholds of Q. Diseased wheat ears are denoted as SW, representing the total number of diseased ears in the image, while AW denotes the total number of wheat ears, including both diseased and healthy ones. The epidemic levels are categorized into five stages: level 1 for minor damage; level 2 for light damage; level 3 for moderate damage; level 4 for severe damage; and level 5 for extreme damage. Further details on these levels are provided in Table 2.

Table 2 The classification of FHB epidemic levels
$$Q=SW/AW \times 100\% $$
(1)
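
To make the criterion concrete, the sketch below computes Q from the detected counts and maps it to an epidemic level; the level thresholds are illustrative placeholders, since the authoritative cut-offs of GB/T 15796-2011 are those summarized in Table 2.

```python
def fhb_epidemic_level(sw: int, aw: int,
                       thresholds=(10.0, 20.0, 30.0, 40.0)) -> int:
    """Map the diseased ear rate Q (Eq. 1) to an epidemic level from 1 to 5.

    `sw` is the number of diseased ears and `aw` the total number of ears.
    The threshold values are placeholders; the official cut-offs are given
    by GB/T 15796-2011 (Table 2).
    """
    if aw <= 0:
        raise ValueError("the total number of wheat ears must be positive")
    q = sw / aw * 100.0                      # Eq. (1): Q = SW / AW * 100%
    for level, upper in enumerate(thresholds, start=1):
        if q <= upper:
            return level
    return 5                                 # extreme damage


# Example: 12 diseased ears out of 80 detected ears gives Q = 15%,
# which falls into level 2 under the placeholder thresholds.
print(fhb_epidemic_level(12, 80))
```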

EBS-YOLO structure

Despite YOLOv8’s advantages over other YOLO series networks in detection accuracy, challenges persist in detecting FHB epidemic levels in wheat, including indistinct edges of diseased areas, diverse disease manifestations across various growth stages, and interference from complex backgrounds. To address these challenges, we developed EBS-YOLO, an edge-optimized bidirectional spatial feature augmentation YOLO network. The EBS-YOLO architecture comprises three primary components: a backbone, a neck, and a head, as illustrated in Fig. 3. In the backbone, the Focal-Edge Selection Module (FSM) optimizes the original C2f module to improve edge feature representation and extract multi-scale feature information, allowing the model to identify diseased areas of varying sizes and stages more accurately. In the neck, a Dual Spatial-Connection Feature Pyramid Network (DSCFPN) employs dual spatial feature enhancement and bidirectional pyramid interaction. This architecture achieves a balance between global and local feature acquisition and optimizes the feature fusion mechanism, significantly improving the model’s capacity to differentiate wheat ears from background noise. The head predicts bounding boxes, categories, and object confidence based on feature maps, with localization accuracy further improved by substituting the CIoU loss function with the more efficient WIoU loss function.

Fig. 3 The structure of EBS-YOLO

Focal-Edge selection module (FSM)

FSM aims to systematically enhance the model’s proficiency in capturing the intricate details and edges of wheat spikes and diseased areas at various granularities. This is achieved through a multi-scale feature extraction strategy and an edge information selection strategy. As illustrated in Fig. 4, convolutional layers initially capture local features from the original input. Assuming the input feature is \(X\in R^{H\times W\times C}\), where \(H\times W\) represents the spatial dimensions and \(C\) represents the channel count, the feature extraction process can be represented by Eq. (2):

Fig. 4 The structure of the FSM

$$F'~=~Con{v_3}\left( X \right)$$
(2)

Here, \(Conv_3(\cdot)\) represents a convolution layer with a \(3\times 3\) kernel, resulting in \(F'\in R^{H\times W\times C}\).

An adaptive average pooling operation processes the input feature map at multiple scales to capture local information of varying sizes. Two convolutional layers are subsequently applied: the first reduces the channel dimension, while the second employs depthwise separable convolution for efficient feature extraction, defined in Eq. (3):

$${F_i}~=~\left( {DCon{v_3}\left( {Con{v_1}\left( {AAvgPool\left( {X,i} \right)} \right)} \right)} \right)$$
(3)

Here, \(i\in\{1, 2, 3, 4\}\) indexes the branches at different scales, \(DConv_3(\cdot)\) denotes the depthwise separable convolution layer with a \(3\times 3\) kernel size, \(Conv_1(\cdot)\) stands for a convolution layer with a kernel size set at \(1\times 1\), \(AAvgPool(\cdot)\) indicates adaptive average pooling, and \(F_i\in R^{H_i\times W_i\times \frac{C}{4}}\).

The Edge Feature Enhance (EFE) module reinforces edge information, significantly improving the network’s perception of edge features. The input feature representation is smoothed via average pooling to capture its low-frequency information. Enhanced edge information is derived by subtracting the smoothed feature map from the original input feature map. Subsequently, the enhanced edge information is further refined using a convolutional layer, and a weight mask ranging from 0 to 1 is produced by the Sigmoid activation function. The processed edge information is subsequently added into the initial input feature map to yield an edge-enhanced feature map, as described in Eq. (4):

$$EFE\left( {{F_i}~} \right)~=~{F_i}~+~\sigma \left( {Con{v_1}\left( {{F_i}~ - AvgPool\left( {{F_i}~} \right)} \right)} \right)$$
(4)

where \(AvgPool(\cdot)\) represents the average pooling, and \(\sigma(\cdot)\) denotes the Sigmoid function that regulates the amplitude of edge enhancement.
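
As a concrete illustration of Eq. (4), the following PyTorch sketch implements the EFE operation; the 3 × 3 window of the smoothing average pooling is an assumption, since the text only states that average pooling extracts the low-frequency component.

```python
import torch
import torch.nn as nn


class EFE(nn.Module):
    """Edge Feature Enhance block, Eq. (4):
    EFE(F) = F + sigmoid(Conv1x1(F - AvgPool(F)))."""

    def __init__(self, channels: int):
        super().__init__()
        # Stride-1 pooling keeps the spatial size so the subtraction is valid;
        # the 3x3 window is an assumed detail.
        self.smooth = nn.AvgPool2d(kernel_size=3, stride=1, padding=1)
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        high_freq = x - self.smooth(x)               # edge (high-frequency) part
        mask = torch.sigmoid(self.conv(high_freq))   # 0-1 edge weight mask
        return x + mask                              # edge-enhanced feature map
```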

According to Eqs. (5) and (6), the feature maps from each scale are upsampled to match the original dimensions, thereby preserving spatial consistency. Local features are concatenated and fused with enhanced edge features from multi-scale processing:

$$\hat{F}_i = Interpolate\left( EFE\left( F_i \right) \right)$$
(5)
$$\bar{F} = Concat\left( F', \hat{F}_1, \ldots , \hat{F}_4 \right)$$
(6)

where \(Interpolate(\cdot)\) represents the upsampling method, \(EFE(\cdot)\) denotes edge information enhancement processing, and \(Concat(\cdot)\) signifies feature concatenation. These operations yield \(\hat{F}_i\in R^{H\times W\times \frac{C}{4}}\) and \(\bar{F}\in R^{H\times W\times 2C}\).

Leveraging the Dual-Domain Selection Mechanism (DSM) [29], key features aligned with the target task are adaptively selected from the multi-scale edge information, markedly improving feature selection accuracy and overall model performance. The enhanced feature map is then output through a sequential convolution operation, as described in Eq. (7).

$$\tilde{F} = Conv_1\left( DSM\left( \bar{F} \right) \right)$$
(7)

where \(DSM(\cdot)\) refers to the adaptive selection operation, and \(Conv_1(\cdot)\) involves sequential convolution, batch normalization, and activation functions, resulting in \(\tilde{F}\in R^{H\times W\times C}\).

Overall, the FSM module exhibits superior representational performance through its efficient feature extraction and edge enhancement strategies, significantly improving the model’s capability to depict image edges and detailed features.
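
Putting Eqs. (2)-(7) together, a simplified forward pass of the FSM can be sketched as follows. The pooled output sizes, the plain depthwise convolution standing in for the depthwise separable block, and the identity placeholder used in place of the DSM of ref. [29] are all assumptions; the EFE class is the one sketched above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FSMSketch(nn.Module):
    """Simplified Focal-Edge Selection Module following Eqs. (2)-(7)."""

    def __init__(self, channels: int, pool_sizes=(2, 4, 8, 16)):
        super().__init__()
        branch_ch = channels // 4
        self.local = nn.Conv2d(channels, channels, 3, padding=1)       # Eq. (2)
        self.pool_sizes = pool_sizes                                    # assumed sizes
        self.reduce = nn.ModuleList(
            [nn.Conv2d(channels, branch_ch, 1) for _ in pool_sizes])
        self.dwconv = nn.ModuleList(
            [nn.Conv2d(branch_ch, branch_ch, 3, padding=1, groups=branch_ch)
             for _ in pool_sizes])                                      # Eq. (3)
        self.efe = nn.ModuleList([EFE(branch_ch) for _ in pool_sizes])  # Eq. (4)
        self.dsm = nn.Identity()          # placeholder for the DSM of ref. [29]
        self.fuse = nn.Conv2d(2 * channels, channels, 1)                # Eq. (7)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        feats = [self.local(x)]                                         # F'
        for size, red, dw, efe in zip(self.pool_sizes, self.reduce,
                                      self.dwconv, self.efe):
            f = F.adaptive_avg_pool2d(x, size)                          # AAvgPool(X, i)
            f = efe(dw(red(f)))                                         # Eqs. (3)-(4)
            f = F.interpolate(f, size=(h, w), mode="bilinear",
                              align_corners=False)                      # Eq. (5)
            feats.append(f)
        fused = torch.cat(feats, dim=1)                                 # Eq. (6), 2C channels
        return self.fuse(self.dsm(fused))                               # Eq. (7)
```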

Dual spatial-connection feature pyramid network (DSCFPN)

In the field of object detection and image recognition within deep learning, the accurate and efficient extraction and fusion of multi-scale feature information remains a pivotal issue. As demonstrated in the neck of Fig. 3, this study proposes a DSCFPN that employs dual spatial feature enhancement and bidirectional information interaction. By synergizing the Global-to-Local Spatial Aggregation (GLSA) [30] with a bidirectionally connected feature pyramid architecture [31], DSCFPN enhances both global contextual features and local fine-grained details while optimizing the feature fusion mechanism.

The GLSA plays a crucial role in extracting and fusing local-global spatial features from the backbone network. It employs a dual-branch channel separation mechanism: one channel captures global feature representations through the global context block, while the other channel processes local feature information extracted by multiple deep convolutions. Specifically, given a feature map \(X\in R^{H\times W\times C}\), it is evenly divided into two groups of feature maps, \(X^1\) and \(X^2\), which are then input into the global and local spatial attention modules, respectively. Subsequently, the outputs of the two attention units are combined and processed through a 1 × 1 convolutional layer. This process can be represented as Eqs. (8) and (9):

$${X^1},~{X^2}=Split\left( X \right)$$
(8)
$$X'=Con{v_1}\left( {Concat\left( {{G_{sa}}\left( {{X^1}} \right),{L_{sa}}\left( {{X^2}} \right)} \right)} \right)$$
(9)

where \(G_{sa}(\cdot)\) indicates the global spatial attention, and \(L_{sa}(\cdot)\) indicates the local spatial attention, leading to \(X'\in R^{\frac{H}{8}\times \frac{W}{8}\times \frac{C}{2}}\). This dual-channel separation design greatly preserves the modeling capabilities of both local and global information, making the module more adaptable and efficient in processing targets of diverse scales.
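
The channel-split design of Eqs. (8) and (9) can be sketched as follows. The global and local branches here are simplified stand-ins (a global pooling gate and a depthwise convolution gate) for the global context block and stacked depthwise convolutions of the original GLSA [30], and the spatial reduction implied by the output size H/8 × W/8 is omitted.

```python
import torch
import torch.nn as nn


class GLSASketch(nn.Module):
    """Channel-split global/local spatial aggregation, Eqs. (8)-(9)."""

    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        # Global branch: squeeze spatial context, then re-weight the features.
        self.global_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(half, half, 1), nn.Sigmoid())
        # Local branch: a depthwise 3x3 convolution produces a spatial gate.
        self.local_att = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1, groups=half), nn.Sigmoid())
        self.fuse = nn.Conv2d(channels, channels, 1)        # Eq. (9): Conv1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.chunk(x, 2, dim=1)                   # Eq. (8): Split(X)
        g = x1 * self.global_att(x1)                        # G_sa(X^1)
        l = x2 * self.local_att(x2)                         # L_sa(X^2)
        return self.fuse(torch.cat([g, l], dim=1))          # Eq. (9)
```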

In the design of the feature pyramid architecture, a weighted bidirectional connection feature pyramid network is selected. Compared to the traditional FPN structure, this architecture exhibits several significant advantages. It facilitates rapid multi-scale feature fusion. By introducing learnable weights, the network can automatically gauge the relevance of diverse input features. Additionally, through iterative application of top-down and bottom-up multi-scale feature fusion, it realizes effective bidirectional cross-scale connections and weighted feature fusion. In this DSCFPN, the GLSA and the FSM serve as core components, significantly enhancing the weighted bidirectional structure to achieve efficient feature fusion and optimization. GLSA processes global and local spatial features in parallel, effectively integrating semantic information across diverse scales and enhancing the feature pyramid with multi-scale representations. FSM focuses on enhancing image edge details by extracting high-frequency information to heighten feature distinctiveness. The two modules work together within the framework of the weighted bidirectional structure. They facilitate bidirectional cross-scale information flow, from higher to lower levels and vice versa, while adaptively weighting and fusing features of varying levels and types. This enables the network to capture both the overall contours of objects and small details in complex scenes. Ultimately, it significantly improves the model’s performance in distinguishing wheat spikes from background noise.
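
To illustrate the learnable weighting described above, the sketch below shows the fast normalized fusion step commonly used in weighted bidirectional feature pyramids [31]; it covers only a single fusion node, not the full top-down and bottom-up connectivity of DSCFPN, and the ReLU-based weight normalization is an assumption.

```python
import torch
import torch.nn as nn


class WeightedFusion(nn.Module):
    """Fast normalized fusion of same-resolution feature maps with
    learnable, non-negative weights."""

    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, feats):
        # feats: list of tensors already resized to a common resolution.
        w = torch.relu(self.weights)          # keep fusion weights non-negative
        w = w / (w.sum() + self.eps)          # normalize so the weights sum to ~1
        return sum(wi * fi for wi, fi in zip(w, feats))
```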

WIoU

WIoU, introduced by Zheng et al. [32], is an advanced bounding box loss function using an adaptive, non-monotonic focusing mechanism. WIoU emphasizes the importance of object bounding boxes, distinguishes between different object categories, and provides a more nuanced performance evaluation. It incorporates a weighting factor that tailors the assessment of different target box categories, thereby enhancing the precision of target detection. Traditional IoU calculations are susceptible to inaccuracies due to size discrepancies, potentially skewing the similarity assessment. However, WIoU addresses this by integrating weight factors, which refine the similarity calculations for smaller target boxes. Additionally, WIoU considers both similarity and weight factors in target boxes, improving the accuracy of overlap measurements and thereby bolstering the robustness of object detection algorithms while mitigating false detection and omissions. WIoU v1 offers an attention-based bounding box loss, whereas WIoU v2 integrates a focusing mechanism through a gradient gain (focusing coefficient) calculation, enhancing the model’s sensitivity to complex cases. This study employs WIoU v2, which applies a monotonic focusing mechanism to the cross-entropy calculation, effectively minimizing the impact of less complex examples on loss computation. During training, as \(L_{IoU}\) decreases, so does the gradient gain \((L_{IoU}^{*})^{\gamma}\), leading to a slower convergence rate in later training stages. Therefore, the mean of \(L_{IoU}\) is introduced as a normalization factor.

$$L_{WIoUv2}=\left( \frac{L_{IoU}^{*}}{\overline{L_{IoU}}} \right)^{\gamma} L_{WIoUv1}$$
(10)
$$\begin{aligned} {L_{WIoUv1}} & ={R_{WIoU}}{L_{IoU}} \\ &={\text{exp}}\left( {\frac{{{{\left( {x - {x_{gt}}} \right)}^2}+{{\left( {y - {y_{gt}}} \right)}^2}}}{{{{\left( {W_{g}^{2}+H_{g}^{2}} \right)}^{\text{*}}}}}} \right) \cdot \left( {1 - \frac{{{W_i}{H_i}}}{{{s_u}}}} \right) \end{aligned}$$
(11)

where \(\overline{L_{IoU}}\) denotes the exponential moving average with momentum \(m\). By dynamically updating the normalizing factor, it is feasible to uphold the gradient gain \(r=\left(\frac{L_{IoU}^{*}}{\overline{L_{IoU}}}\right)^{\gamma}\) at a consistently high level. This approach effectively mitigates the issue of slowed convergence in the later phases of training.
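
A minimal sketch of Eqs. (10) and (11) for boxes in (x1, y1, x2, y2) format is given below; the gamma and momentum values are illustrative (the paper does not report them), and the asterisk in Eq. (11) is interpreted as detaching the enclosing-box term from the gradient.

```python
import torch


def wiou_v2_loss(pred, target, mean_liou, gamma=0.5, momentum=0.01):
    """WIoU v2 loss for (x1, y1, x2, y2) boxes, following Eqs. (10)-(11).

    `mean_liou` is the running (exponential moving) average of L_IoU used as
    the normalizing factor; returns the loss and the updated running mean.
    """
    # Intersection and union areas -> L_IoU = 1 - W_i*H_i / S_u.
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = (area_p + area_t - inter).clamp(min=1e-7)
    l_iou = 1.0 - inter / union

    # Center distances and smallest enclosing box for R_WIoU (Eq. 11).
    cxp, cyp = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cxt, cyt = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    enc_lt = torch.min(pred[:, :2], target[:, :2])
    enc_rb = torch.max(pred[:, 2:], target[:, 2:])
    wg, hg = enc_rb[:, 0] - enc_lt[:, 0], enc_rb[:, 1] - enc_lt[:, 1]
    # The enclosing-box term is detached, matching the asterisk in Eq. (11).
    r_wiou = torch.exp(((cxp - cxt) ** 2 + (cyp - cyt) ** 2)
                       / (wg ** 2 + hg ** 2).detach().clamp(min=1e-7))
    l_wiou_v1 = r_wiou * l_iou

    # Eq. (10): monotonic focusing normalized by the running mean of L_IoU.
    gain = (l_iou.detach() / mean_liou) ** gamma
    loss = (gain * l_wiou_v1).mean()
    new_mean = (1 - momentum) * mean_liou + momentum * l_iou.detach().mean()
    return loss, new_mean
```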

Evaluation

In this research, the EBS-YOLO model’s performance was assessed via precision (P), recall (R), average precision (AP), mean average precision (mAP), parameter count (Params), giga Floating Point Operations Per Second (GFLOPs), and accuracy (Acc). P quantifies the ratio of true positives to all positive predictions, whereas R gauges the ratio of actual positives accurately detected by the model. AP represents the average precision for a single class, derived from the precision-recall curve and calculated as the area under this curve. mAP is the average of AP values across all categories. Specifically, mAP50 denotes the mAP when the IoU between the predicted and ground-truth bounding boxes is at least 0.5. mAP50-95 refers to the mAP averaged over IoU thresholds ranging from 0.5 to 0.95. Model complexity was quantified by the Params, while computational efficiency was measured in GFLOPs. Acc is the proportion of correctly predicted epidemic levels to all predictions made by the model. The formulas for these metrics are provided in Eqs. (12)–(16).

$${\text{P}}=\frac{{TP}}{{TP+FP}}~ \times 100\% $$
(12)
$${\text{R}}=\frac{{TP}}{{TP+FN}}~ \times 100\% $$
(13)
$${\text{AP}}=\mathop \smallint \limits_{0}^{1} ~P\left( R \right)dR~ \times 100\% $$
(14)
$${\text{mAP}}=\frac{{\mathop \sum \nolimits_{{i=1}}^{n} ~AP\left( i \right)}}{n}~ \times 100\% $$
(15)
$${\text{Acc}}=\frac{{\left( {{\text{TP}}+{\text{TN}}} \right)}}{{\left( {{\text{TP}}+{\text{TN}}+{\text{FP}}+{\text{FN}}} \right)}} \times 100\% $$
(16)
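
The metrics of Eqs. (12)-(16) can be computed directly from confusion counts and from the precision-recall curve, as in the minimal sketch below; the trapezoidal approximation of the area under the curve is a simplification of the interpolation schemes used by standard detection toolkits.

```python
import numpy as np


def precision_recall_accuracy(tp, fp, fn, tn):
    """Eqs. (12), (13) and (16) expressed directly from confusion counts."""
    p = tp / (tp + fp) * 100.0
    r = tp / (tp + fn) * 100.0
    acc = (tp + tn) / (tp + tn + fp + fn) * 100.0
    return p, r, acc


def average_precision(recalls, precisions):
    """Eq. (14): area under the precision-recall curve, approximated here
    with the trapezoidal rule over recall values sorted in ascending order."""
    r = np.asarray(recalls, dtype=float)
    p = np.asarray(precisions, dtype=float)
    order = np.argsort(r)
    r, p = r[order], p[order]
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0) * 100.0)


# Eq. (15): mAP is the mean of the per-class AP values, e.g.
# map50 = np.mean([ap_diseased, ap_healthy])
```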

Results and discussion

Experimental environments

The experiment was conducted using the PyTorch deep learning framework, with specific hardware and software configurations listed in Table 3. The model was trained over 200 epochs with a batch size of 16. A Stochastic Gradient Descent (SGD) optimizer was used to optimize the model.
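
For reference, the sketch below shows how such a training run might be launched, assuming an Ultralytics-style training interface (YOLOv8, which EBS-YOLO modifies, is distributed through that package); the model and dataset configuration file names are hypothetical placeholders.

```python
# Hypothetical training launch; "ebs-yolo.yaml" and "wheat_fhb.yaml" are
# placeholder configuration files, not artifacts released with this paper.
from ultralytics import YOLO

model = YOLO("ebs-yolo.yaml")      # custom model definition (placeholder)
model.train(
    data="wheat_fhb.yaml",         # dataset definition (placeholder)
    epochs=200,                    # 200 training epochs, as reported
    batch=16,                      # batch size of 16, as reported
    imgsz=640,                     # images resized to 640 x 640
    optimizer="SGD",               # SGD optimizer, as reported
)
```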

Table 3 Hardware and software specifications

Comparison experiments with different IoU losses

In the comparative experiments on loss functions (Table 4), both CIoU and WIoU3 achieved an mAP50 value of 84.1%, ranking them second, whereas WIoU1 had a relatively lower mAP50 value of 83.3%. WIoU2 recorded the highest mAP50 value of 84.5%, indicating that this loss function enables the model to identify objects more precisely during detection. Overall, the WIoU2 loss function demonstrated strong performance across mAP50, mAP50-95, and recall metrics. Although its precision was not the highest, its overall performance was noteworthy, striking a favorable balance between detection accuracy and recall, thus making it the optimal choice in the experiments. While WIoU3 excelled in precision, it did not match WIoU2’s comprehensive performance across other metrics.

Table 4 Comparative experimental results of different losses

Ablation experiments and analysis

As demonstrated by ablation experiments (Table 5), sequentially integrating the DSCFPN, FSM, and WIoU2 modules into YOLOv8 significantly enhances its performance: the mAP50 metric increases from 83.1% to 84.5%, and mAP50-95 grows from 48.0% to 49.3%, indicating improved detection accuracy, stability, and generalization across IoU thresholds. Each module uniquely affects precision and recall: DSCFPN increases recall from 78.6 to 79.0% (with precision stable); FSM slightly enhances precision but decreases recall; WIoU2 elevates recall to the highest recorded value (80.6%) with a minor precision loss, achieving a superior precision-recall balance. Progressive module integration also reduces model complexity: parameters decrease from 3.00 to 2.05 M, GFLOPs from 8.1 to 7.4, and weights from 6.3 to 5.0 MB. Notably, DSCFPN cuts parameters and computational requirements without compromising performance gains. Overall, DSCFPN, FSM, and WIoU2 collectively enhance YOLOv8’s detection accuracy while reducing its size and computational cost, achieving a favorable balance between performance and efficiency.

Table 5 Ablation experiments

Real-time inference performance verification of EBS-YOLO model

The inference performance of EBS-YOLO across multiple hardware configurations is presented in Table 6, demonstrating excellent real-time detection capabilities in all test scenarios. On the NVIDIA GeForce RTX 3090, EBS-YOLO achieves a maximum frame rate of 38.4 FPS and a single-frame inference time of only 26.07 ms, well above the real-time detection baseline of ≥ 25 FPS. Even in virtualized environments such as vGPU-32GB and on mid-range hardware like the RTX 3080 Ti, it maintains stable performance between 33.7 and 34.3 FPS, with inference times kept within 29 ms. These results confirm that EBS-YOLO meets real-time detection requirements across various GPU configurations, specifically maintaining efficient performance on low-end and mid-range hardware and in virtualized environments, thereby providing robust technical support for edge deployment scenarios like field inspections.

Table 6 Inference performance test results

Comparative experiments and analysis in wheat ear detection

Comparisons with different detection algorithms

As shown in Table 7, EBS-YOLO achieves the highest mAP50 at 84.5%, which significantly outperforms CenterNet’s 52.6%. It also exhibits the strongest target-capturing capability with a recall of 80.6%, contrasting sharply with CenterNet’s 39.0% recall. Notably, CenterNet has a leading precision of 82.0%, but suffers from severe missed detections due to its low recall. EBS-YOLO also surpasses the latest YOLOv12 in both mAP50 and recall. In terms of model complexity, EBS-YOLO has 2.05 M parameters, second only to YOLOv10’s 2.26 M. In contrast, CenterNet’s parameter count of 32.66 M leads to higher training and computational costs. EfficientDet requires the least computation with 5.2 GFLOPs, while RTDETR demands a high computation of 56.9 GFLOPs. Model weight is crucial for storage and transmission costs; EBS-YOLO is remarkably lightweight at 5.0 MB, substantially lighter than CenterNet’s 97.8 MB, which incurs significantly higher costs. Compared to the published wheat scab detection model PCSA-YOLO, the proposed EBS-YOLO achieves superior performance in the comprehensive metric mAP50 and in recall, with only 2.05 M parameters. EBS-YOLO balances high accuracy and efficiency, delivering superior mAP50 and recall for accuracy, while maintaining low parameters, low computation, and a lightweight design for efficiency. Conversely, CenterNet underperforms overall, and RTDETR lacks competitiveness in mAP50, recall, and efficiency. Therefore, EBS-YOLO is the preferred choice for scenarios requiring both high detection accuracy and model efficiency.

Table 7 Performance comparison of different detection algorithms

Figure 5 compares model performance in detecting diseased and healthy wheat ears using the AP-diseased and AP-healthy metrics. The YOLO series achieved high detection accuracy overall: YOLOv11 excelled in detecting healthy ears (AP-healthy = 84.8%), while YOLOv8 was most effective for diseased ears (AP-diseased = 82%). Conversely, CenterNet underperformed across both metrics, revealing the limitations of its architecture for this task. Models such as RTDETR, SSD, and EfficientDet exhibited basic detection capabilities but lacked the accuracy required for practical applications, suggesting the need for further algorithmic refinement. Notably, EBS-YOLO demonstrated exceptional performance, with high precision and robust generalization for detecting both diseased and healthy wheat ears.

Fig. 5 Comparison of different detection algorithms for healthy and diseased wheat ear detection

Su et al. [26] used Mask R-CNN with a ResNet-101 FPN for wheat ear disease detection and severity assessment. They achieved 77.8% accuracy for infected ears, 98.8% for affected areas, and 77.2% for FHB severity prediction based on the lesion-to-ear ratio. While their study showed potential for rapid FHB severity evaluation, it also identified areas needing improvement in diseased ear detection. In contrast, our research emphasizes the evaluation of FHB epidemic level at the canopy scale rather than individual ears. With EBS-YOLO, we achieved 84.5% AP in wheat ear detection, which is 27.8% higher than Su et al.’s method, and a recall improvement of 9.6%. EBS-YOLO excels at distinguishing infected from healthy wheat ears, providing a foundation for FHB epidemic level estimation. Its compact parameters and low computational complexity enable rapid data processing, supporting real-time detection and deployment in resource-limited environments. Continuous optimization of the proposed method remains crucial to enhance its practicality. Future research will focus on improving the detection framework’s performance and adaptability across diverse agricultural scenarios.

Comparative analysis of heatmaps for YOLO series methods

Heatmap comparisons of YOLO series methods illuminate their focus areas concerning target features. Based on previous performance assessments, we conducted a detailed heatmap analysis that compares EBS-YOLO with other YOLO models [Fig. 6: (a) original image; (b)–(f) heatmaps for EBS-YOLO, YOLOv8, YOLOv10, YOLOv11, YOLOv12]. In complex field environments (with weeds), EBS-YOLO outperforms its counterparts in capturing and localizing target features. Its heatmaps accurately encompass target regions with concentrated feature responses and distinct boundaries, effectively highlighting crucial semantic information. Conversely, YOLOv11 exhibits feature diffusion, with responses deviating from target cores. YOLOv8, YOLOv10, and YOLOv12 can identify targets but show weak focus on lesion regions and incomplete feature coverage. This analysis confirms EBS-YOLO’s superiority in target localization and feature representation, providing visual evidence for its effectiveness.

Fig. 6 Comparison of heatmaps for YOLO series methods

Comparison and analysis of wheat FHB epidemic level detection

This study evaluates the accuracy of various models in identifying wheat FHB epidemic levels, as illustrated in Fig. 7. Among baseline models, SSD achieved the second-highest accuracy (69%), closely followed by YOLOv8 (68%) and YOLOv12 (67%). While these models show basic reliability, they perform over 20% less accurately than EBS-YOLO. Notably, both CenterNet and EfficientDet recorded accuracies below 60%, making them unsuitable for high-precision FHB epidemic detection due to increased risks of misjudgment and missed detections. In summary, EBS-YOLO’s significantly superior accuracy positions it as the optimal solution for detecting wheat FHB epidemic levels.

Fig. 7 The accuracy of FHB epidemic level detection in comparison with other detection methods

Figure 8 illustrates confusion matrices comparing the performance of these models in recognizing wheat FHB epidemic levels. EBS-YOLO demonstrates exceptional accuracy and stability in detection tasks across levels 1 to 5. In contrast, other models, including YOLOv8, YOLOv10, YOLOv11, YOLOv12, RTDETR, CenterNet, SSD, and EfficientDet, achieve high correct identification rates only at levels 1 and 5. This can be attributed to the distinct characteristics of these levels. At epidemic level 1, most wheat plants are healthy, allowing models to identify the level by recognizing numerous healthy ears. Conversely, at epidemic level 5, fields are heavily infected with FHB, allowing models to identify the level by recognizing numerous diseased ears. However, at epidemic levels 2, 3, and 4, disease progression is dynamic, resulting in visually complex scenes of healthy and diseased ears. Detecting these levels requires models to distinguish wheat ear health accurately and achieve precise spatial localization; failure to do so leads to inaccuracies in calculating the diseased ear rate, lowering detection accuracy for levels 2 to 4.

Fig. 8 Confusion matrices for detection of FHB epidemic levels in comparison with other detection methods

EBS-YOLO exhibits robust performance in detecting wheat FHB epidemic levels, comparable to established studies. For example, Zhang et al. [33] proposed an enhanced YOLOv5 method (integrating background removal, multi-modal feature extraction, and random forest classification) for wheat ear detection and FHB epidemic assessment. Validated across two locations during 2020 and 2021, their method achieved 92% (2020A), 90% (2021A), and 90% (2021B) accuracy. EBS-YOLO matches Zhang et al.’s 2020A performance and surpasses their 2021 outcomes. Moreover, EBS-YOLO offers a significant advantage for real-time processing applications, unlike Zhang et al.’s method, which requires complex segmentation and feature extraction. This advantage facilitates early FHB detection and management, helping reduce potential yield losses.

Visual comparison of detection results of different FHB epidemic levels

Figure 9 showcases examples of wheat at various epidemic levels, as identified by EBS-YOLO, YOLOv8, YOLOv11, CenterNet, and RTDETR. Specifically, the third column (Ⅲ) illustrates detection outcomes of the YOLOv8 model, while the second column (Ⅱ) presents the results from the EBS-YOLO model. Comparing these results with the ground truth in the first column (Ⅰ) reveals that the YOLOv8 model frequently misses instances and makes false detections, particularly in challenging scenarios involving densely packed wheat ears, exposed ground, and yellowing wheat. In contrast, the EBS-YOLO model exhibits significantly fewer inaccuracies. For example, at epidemic level 1, YOLOv8 erroneously detects one instance, whereas EBS-YOLO reports no false detection. This pattern of improved accuracy persists across all levels, highlighting EBS-YOLO’s superior performance in detecting smaller targets. The remaining columns (Ⅳ to Ⅵ) illustrate detection results from YOLOv11, CenterNet, and RTDETR, which all exhibit increasing rates of missed and false detections, thereby compromising accurate prediction of FHB epidemic levels. These findings emphasize EBS-YOLO’s ability to precisely identify and differentiate various stages of FHB, confirming its efficiency in disease detection.

Fig. 9 Example of detection results. *Yellow circles represent false detection, and orange circles represent missed detection

Spatiotemporal independent validation experiments

The datasets 24A, 24B, and 24C, constructed from FHB survey samples collected by plant protection experts in 2024, were kept independent of the temporal and geographical context of the training data. To evaluate the robustness of EBS-YOLO in detecting FHB epidemic levels under natural field conditions, a spatiotemporal independent validation was applied to datasets 24A, 24B, and 24C. EBS-YOLO’s accuracy in identifying FHB epidemic levels on these datasets was 96%, 96%, and 92% (Table 8), respectively. Compared to 23B, EBS-YOLO’s accuracy remained above 92%, demonstrating the model’s reliability in detecting FHB epidemic levels. Employing spatiotemporal independent validation provides a more accurate reflection of real-world detection scenarios and effectively assesses EBS-YOLO’s generalization ability under diverse spatiotemporal conditions.

Table 8 The accuracy of EBS-YOLO in detecting wheat FHB epidemic levels evaluated through spatiotemporal independent validation experiments

Conclusion

In this study, the EBS-YOLO model was developed to enable rapid and precise detection of FHB epidemic levels in wheat at the canopy level within field environments. The feature extraction capability of YOLOv8 was enhanced by substituting the C2f module with FSM in the backbone. Additionally, the DSCFPN achieved a balance between global and local feature acquisition, optimizing the feature fusion mechanism. Replacing the CIoU loss with the more efficient WIoU loss improved localization accuracy. The model achieved an mAP of 84.5% for wheat ear detection, outperforming the YOLOv8, YOLOv10, YOLOv11, YOLOv12, RTDETR, CenterNet, SSD, and EfficientDet models by 1.4%, 1.9%, 1.5%, 1.8%, 9.2%, 31.9%, 8.5%, and 15%, respectively. Furthermore, EBS-YOLO reached an accuracy of 92% in detecting FHB epidemic levels, significantly surpassing these models by 24%, 29%, 30%, 25%, 31%, 33%, 23%, and 42%, respectively. Additionally, EBS-YOLO contains only 2.05 M parameters, representing a 31.6% reduction compared to YOLOv8, with a computational requirement of just 7.4 GFLOPs, highlighting its efficiency. In spatiotemporal independent validation, EBS-YOLO demonstrated a high mean detection accuracy of 94.7% on test datasets collected under natural field conditions for FHB surveys.

Notably, EBS-YOLO’s advantages—including its lightweight architecture, robust feature extraction, and adaptive localization—suggest strong potential for application to other crop disease detection tasks. For instance, its ability to capture fine-grained features in complex field backgrounds (such as distinguishing wheat ears from weeds) could be extended to detecting wheat blast (Magnaporthe oryzae pathotype Triticum), rice blast (by focusing on leaf lesion characteristics), or corn ear rot (by identifying diseased kernel textures).

Overall, the EBS-YOLO model represents a significant advancement for field applications, delivering superior performance in FHB epidemic detection while offering advantages in terms of complexity and efficiency. Future research will focus on deploying EBS-YOLO in WeChat mini programs or on unmanned aircraft systems to provide growers with user-friendly FHB detection and expert decision support, while also exploring its adaptation to rice, corn, and other crop disease detection scenarios, thereby broadening its agricultural utility.

Data availability

The datasets supporting the conclusions of this article are available in the GitHub repository, https://github.com/yuanYuan8686/wheat-FHB-dataset.

Abbreviations

EBS-YOLO:

Edge-Optimized Bidirectional Spatial Feature Augmentation YOLO

FHB:

Fusarium head blight

FSM:

Focal-Edge Selection Module

DSCFPN:

Dual Spatial-Connection Feature Pyramid Network

GLSA:

Global-to-Local Spatial Aggregation

mAP:

Mean average precision

AP:

Average precision

DCNN:

Deep Convolutional Neural Network

EIoU:

Efficient Intersection over Union

CIoU:

Complete Intersection over Union

WIoU:

Wise Intersection over Union

HSV:

Hue, saturation, value

DSM:

Dual-Domain Selection Mechanism

GFLOPs:

Giga Floating Point Operations Per Second

Acc:

Accuracy

P:

Precision

R:

Recall

Params:

Parameter count

SGD:

Stochastic Gradient Descent

References

  1. Ma Z, Xie Q, Li G, Jia H, Zhou J, Kong Z, et al. Germplasms, genetics and genomics for better control of disastrous wheat fusarium head blight. Theor Appl Genet. 2020;133:1541–68.

  2. Zhang Q, Men X, Hui C, Ge F, Ouyang F. Wheat yield losses from pests and pathogens in China. Agric Ecosyst Environ. 2022;326:107821.

  3. Chen Y, Kistler HC, Ma Z. Fusarium graminearum trichothecene mycotoxins: Biosynthesis, Regulation, and management. Annu Rev Phytopathol. 2019;57:15–39.

  4. Wang H, Sun S, Ge W, Zhao L, Hou B, Wang K, et al. Horizontal gene transfer of Fhb7 from fungus underlies Fusarium head blight resistance in wheat. Science. 2020;368:eaba5435.

  5. Chen A, Islam T, Ma Z. An integrated pest management program for managing fusarium head blight disease in cereals. J Integr Agric. 2022;21:3434–44.

  6. Jia H, Zhou J, Xue S, Li G, Yan H, Ran C, et al. A journey to understand wheat fusarium head blight resistance in the Chinese wheat landrace Wangshuibai. Crop J. 2018;6:48–59.

  7. Barbedo JGA. A review on the main challenges in automatic plant disease identification based on visible range images. Biosyst Eng. 2016;144:52–60.

  8. Zhang D, Gu C, Wang Z, Zhou X, Li W. Evaluating the efficacy of fungicides for wheat scab control by combined image processing technologies. Biosyst Eng. 2021;211:230–46.

  9. Barbedo JGA. Plant disease identification from individual lesions and spots using deep learning. Biosyst Eng. 2019;180:96–107.

  10. Chen S, Sun P, Song Y, Luo P. DiffusionDet: diffusion model for object detection. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE; 2023. p. 19773–86. https://ieeexplore.ieee.org/document/10378478/. Accessed 10 Dec 2024.

  11. Duan K, Bai S, Xie L, Qi H, Huang Q, Tian Q. CenterNet: keypoint triplets for object detection. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea (South): IEEE; 2019. p. 6568–77. https://ieeexplore.ieee.org/document/9010985/. Accessed 10 Dec 2024.

  12. Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE; 2016. p. 779–88. http://ieeexplore.ieee.org/document/7780460/. Accessed 10 Dec 2024.

  13. Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2017;39:1137–49.

  14. Tan M, Pang R, Le QV. EfficientDet: scalable and efficient object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, USA: IEEE; 2020. p. 10778–87. https://ieeexplore.ieee.org/document/9156454/. Accessed 10 Dec 2024.

  15. Hasan MM, Chopin JP, Laga H, Miklavcic SJ. Detection and analysis of wheat spikes using convolutional neural networks. Plant Methods. 2018;14:100.

  16. Wang X, Li C, Zhao C, Jiao Y, Xiang H, Wu X, et al. GrainNet: efficient detection and counting of wheat grains based on an improved YOLOv7 modeling. Plant Methods. 2025;21:44.

  17. Dandrifosse S, Ennadifi E, Carlier A, Gosselin B, Dumont B, Mercatoris B. Deep learning for wheat ear segmentation and ear density measurement: from heading to maturity. Comput Electron Agric. 2022;199:107161.

  18. Li Z, Zhu Y, Sui S, Zhao Y, Liu P, Li X. Real-time detection and counting of wheat ears based on improved YOLOv7. Comput Electron Agric. 2024;218:108670.

  19. Meng X, Li C, Li J, Li X, Guo F, Xiao Z. YOLOv7-MA: improved YOLOv7-Based wheat head detection and counting. Remote Sens. 2023;15:3770.

  20. Yu J, Chen W, Liu N, Fan C. Oriented feature pyramid network for small and dense wheat heads detection and counting. Sci Rep. 2024;14:8106.

  21. Kukreja V, Kumar D. Automatic classification of wheat rust diseases using deep convolutional neural networks. In: 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). Noida, India: IEEE; 2021. p. 1–6. https://ieeexplore.ieee.org/document/9596133/. Accessed 10 Dec 2024.

  22. Yang X, Guo M, Lyu Q, Ma M. Detection and classification of damaged wheat kernels based on progressive neural architecture search. Biosyst Eng. 2021;208:176–85.

  23. Zhang X, Xun Y, Chen Y. Automated identification of citrus diseases in orchards using deep learning. Biosyst Eng. 2022;223:249–58.

  24. Zhao S, Liu J, Wu S. Multiple disease detection method for greenhouse-cultivated strawberry based on multiscale feature fusion faster R_CNN. Comput Electron Agric. 2022;199:107176.

  25. Pan J, Wang T, Wu Q. RiceNet: a two-stage machine learning method for rice disease identification. Biosyst Eng. 2023;225:25–40.

  26. Su W, Zhang J, Yang C, Page R, Szinyei T, Hirsch CD, et al. Automatic evaluation of wheat resistance to fusarium head blight using dual Mask-RCNN deep learning frameworks in computer vision. Remote Sens. 2020;13:26.

  27. Mao R, Wang Z, Li F, Zhou J, Chen Y, Hu X. GSEYOLOX-s: an improved lightweight network for identifying the severity of wheat fusarium head blight. Agronomy. 2023;13:242.

  28. Bao W, Huang C, Hu G, Su B, Yang X. Detection of fusarium head blight in wheat using UAV remote sensing based on parallel channel space attention. Comput Electron Agric. 2024;217:108630.

  29. Cui Y, Ren W, Cao X, Knoll A. Focal network for image restoration. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE; 2023. p. 12955–65. https://ieeexplore.ieee.org/document/10377428/. Accessed 18 May 2025.

  30. Tang F, Xu Z, Huang Q, Wang J, Hou X, Su J, et al. DuAT: dual-aggregation transformer network for medical image segmentation. In: Pattern Recognition and Computer Vision. Singapore: Springer Nature Singapore; 2024. p. 343–56. https://doi.org/10.1007/978-981-99-8469-5_27. Accessed 18 May 2025.

  31. Xiao J, Guo H, Zhou J, Zhao T, Yu Q, Chen Y, et al. Tiny object detection with context enhancement and feature purification. Expert Syst Appl. 2023;211:118665.

  32. Zheng H, Wang G, Xiao D, Liu H, Hu X. FTA-DETR: an efficient and precise fire detection framework based on an end-to-end architecture applicable to embedded platforms. Expert Syst Appl. 2024;248:123394.

  33. Zhang D, Luo H, Wang D, Zhou X, Li W, Gu C, et al. Assessment of the levels of damage caused by fusarium head blight in wheat using an improved YoloV5 method. Comput Electron Agric. 2022;198:107086.


Acknowledgements

The authors express gratitude to the staff of the Yaoliu, Songcun, Wuhe, Zhangqiao, Wanxing, Tianmen, Jianli, Lujiang, Heyuan, Qijia and Baijia Experimental Bases for their invaluable assistance and support in data collection. We also sincerely appreciate the experts from College of Plant Protection who provide the essential test data for the 2024 FHB field survey.

Funding

This work was supported by the National Key Research and Development Program [2022YFD1400101-1]; the China Agriculture Research System of Wheat [CARS-03-37]; the Key Research and Development Program of Shaanxi Province [2024NC-ZDCYL-05-06]; the Key Research and Development Program of Shaanxi Province [2023-YBNY-220]; the International Cooperation Project of the Ministry of Science and Technology [G2023172013L]; the Shaanxi Province Qinchuangyuan Scientists & Engineers Team Development Project [2025QCY-KXJ-070]; the Science and Technology Partnership Program, Ministry of Science and Technology of China [KY202002018] and the National Natural Science Foundation of China [32081330501].

Author information

Authors and Affiliations

Contributions

R. M.: Conceptualization, writing—review and editing, and funding acquisition. H.l. Y.: writing—original draft, methodology, and investigation. F.l. L.: validation, methodology, and formal analysis. Y. S.: resources and validation. J. Z.: data curation and visualization. X.m. H.: software and investigation. X.p. H.: supervision and funding acquisition.

Corresponding authors

Correspondence to Rui Mao or Xiaoping Hu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Mao, R., Yuan, H., Li, F. et al. EBS-YOLO: edge-optimized bidirectional spatial feature augmentation for in-field detection of wheat Fusarium head blight epidemics. Plant Methods 21, 133 (2025). https://doi.org/10.1186/s13007-025-01449-7


Keywords