Article

Advanced Multiscale Attention Network for Estrous Cycle Stage Identification from Rat Vaginal Cytology

1 The School of Mechanical and Electrical Engineering, Chengdu University of Technology, Chengdu 610059, China
2 The School of Clinical Medicine, Guizhou Medical University, Guiyang 550004, China
* Authors to whom correspondence should be addressed.
Biology 2025, 14(10), 1312; https://doi.org/10.3390/biology14101312
Submission received: 18 August 2025 / Revised: 14 September 2025 / Accepted: 15 September 2025 / Published: 23 September 2025
(This article belongs to the Section Bioinformatics)

Simple Summary

Understanding the estrous cycle of female rats is crucial for ensuring the reliability of biomedical experiments, as hormonal fluctuations can significantly affect drug responses and physiological behaviors. However, traditional manual identification of estrous stages from microscopic images is time-consuming, subjective, and requires specialized expertise. In this study, we developed a deep learning model called SLENet to automatically classify the four stages of the rat estrous cycle (proestrus, estrus, metestrus, diestrus) based on vaginal smear images. By introducing spatial and global attention mechanisms, our model achieved a high accuracy of 96.31% on a curated dataset of 2655 images. This approach not only improves classification performance compared to existing models but also reduces human workload, providing a reliable tool for researchers in reproductive biology and pharmacological studies.

Abstract

In clinical medicine, rats are commonly used as experimental subjects. However, their estrous cycle significantly impacts their biological responses, leading to differences in experimental results. Therefore, accurately determining the estrous cycle is crucial for minimizing interference. Manually identifying the estrous cycle in rats presents several challenges, including high costs, long training periods, and subjectivity. To address these issues, this paper proposes a classification network, Spatial Long-distance EfficientNet (SLENet). This network is designed based on EfficientNet, specifically modifying the Mobile Inverted Bottleneck Convolution (MBConv) module by introducing a novel Spatial Efficient Channel Attention (SECA) mechanism to replace the original Squeeze-and-Excitation (SE) module. Additionally, a non-local attention mechanism is incorporated after the last convolutional layer to enhance the network's ability to capture long-range dependencies. On 2655 microscopy images of rat vaginal epithelial cells (with 531 test images), SLENet achieves 96.31% accuracy, surpassing EfficientNet (94.20%). This finding provides practical value for optimizing experimental design in rat-based studies such as reproductive and pharmacological research. However, this study is limited to microscopy image data, without considering other factors; future work could therefore incorporate temporal patterns and multi-modal inputs to further enhance robustness.

1. Introduction

Laboratory rodents, particularly rats, are widely used animal models in many fields of study related to human disease [1,2], drug development [3], and genetic function [4]. In 2015, the National Institutes of Health (NIH), in policy NOT-OD-15-102, highlighted the significance of considering sex as an experimental variable to assess its impact on outcomes [5]. Similarly, the Canadian Institutes of Health Research (CIHR), in its Sex and Gender-Based Analysis Policy, emphasizes that sex and gender can influence disease susceptibility, response to pharmacological treatments, and patterns of healthcare utilization [6]. As a result, the number of medical experiments involving female rats has gradually increased [7,8]. However, the estrous cycle of animals (including rats) significantly affects gene expression [9], protein levels [10,11,12,13], behavior [14,15], and drug responses [16,17,18], leading to substantial differences in experimental results. For example, Kulkarni et al. [19] employed rats as an animal model to investigate the impact of different stages of the estrous cycle on the oral bioavailability of genistein, an active anticancer compound. Their findings showed that higher estrogen levels enhanced hepatic metabolism and, consequently, reduced systemic bioavailability. Another study, by Lovick et al. [20], examined the influence of the estrous cycle on anxiety-related behaviors and pharmacological responses in female rats, with the aim of informing strategies for alleviating menstrual-related anxiety disorders in women. The study revealed that female rats exhibited heightened anxiety-like behaviors and fear responses during the diestrus phase. Moreover, the anxiolytic efficacy of benzodiazepines was more pronounced during the proestrus phase, whereas selective serotonin reuptake inhibitors (SSRIs) were more effective during diestrus. These studies show that misclassifying the estrous cycle of rats can lead to erroneous conclusions, such as attributing hormone-induced behavioral variations to sex differences, or result in flawed assessments of drug metabolism and therapeutic efficacy, potentially translating into clinical risks such as inappropriate drug selection, dosing inaccuracies, and adverse treatment outcomes. Therefore, accurately determining the estrous cycle of rats is crucial for conducting experiments.
The estrous cycle in rats mainly consists of four stages: proestrus (P), estrus (E), metestrus (M), and diestrus (D), with a typical cycle lasting 4–5 days [21]. Various methods are currently used to classify the stages of the estrous cycle in rats, the most common being the identification of the types, shapes, numbers, sizes, and proportions of vaginal smear cells [22,23,24,25]. The characteristics of each stage are as follows: the D stage is marked by a large number of leukocytes and a small number of nucleated cells in the smear; the P stage contains nucleated epithelial cells and a small number of keratinized cells but no leukocytes; the E stage contains only anucleated keratinized epithelial cells, with no leukocytes; and the M stage is marked by a mixture of keratinized cells, nucleated epithelial cells, and leukocytes [26,27]. Figure 1 shows microscopic images of vaginal smear cells at the four stages of the rat estrous cycle.
In experiments, manually identifying the estrous cycle of rats is a commonly used and effective method. Hubscher et al. [27] used PAP staining of rat vaginal smears, quantifying the different cell populations throughout the cycle and providing guidelines for distinguishing the stages of the estrous cycle. However, several issues remain: (1) the efficiency of manual classification is low; (2) the classification results are subjective, leading to variations in outcomes; and (3) the diverse cell types present at different stages make them challenging to distinguish, and misidentification is likely if the examiner lacks specialized training.
In recent years, deep learning [28,29], particularly methods based on convolutional neural networks (CNNs), has been widely applied in the field of medical image processing [30,31,32,33]. By adjusting the convolutional kernel size, a CNN extracts local features at various scales [34], and its hierarchical structure captures finer-grained representations. More importantly, its design for parallel computation aligns well with the operational logic of GPUs [35,36]. Therefore, CNN-based models have achieved excellent performance in the field of medical image processing. For instance, Haq et al. [37] developed a classification model called DCNNBT by modifying and optimizing the EfficientNet network for brain tumor images, using a large dataset of brain MRI images for training and ultimately achieving an accuracy of 99.18%. In the same task, Liao et al. [38] proposed a model called GraphMriNet, achieving an average accuracy of 99.92% on four open datasets. El-Shafai et al. [39] proposed a lightweight network model for classifying multi-modal medical images based on a traditional CNN architecture; under direct training, this model achieved classification accuracies of 92.70%, 91.10%, 100%, and 100% on ultrasound, X-ray, CT, and MRI datasets, respectively.
Using CNN-based models to assess the estrous cycle in rodents has achieved some promising results. For example, Sano et al. [40] utilized a VGG16-based network, SECREIT, to classify the mouse estrous cycle from microscopic images of vaginal cells. They compared the performance of this model with experienced examiners and found that SECREIT achieved an accuracy of 93.30%, surpassing the examiners. Wolcott et al. [41] proposed EstrousNet, another CNN-based model for the same task; it uses a ResNet50 architecture with transfer learning and achieves an average accuracy of 88.90%. Babaev et al. [42] proposed ODES, a two-stage estrous cycle classification framework that first employs YOLOv8 to detect and categorize individual vaginal epithelial cells, followed by a rule-based algorithm that classifies the estrous stages based on cell types and proportions, achieving up to 88% accuracy. However, these models still present some limitations. For example, SECREIT and EstrousNet use two fully connected layers (with 500 and 3 nodes), which may excessively compress the high-dimensional convolutional features and result in the loss of complex information such as cell type proportions and spatial relationships. ODES decouples feature extraction and stage classification (e.g., high-level stage inference is based solely on discrete cell counts), which may limit the model's ability to learn hierarchical or context-dependent features, and its performance deteriorates in cases of cell clustering, low contrast, or ambiguous transitional samples, where such features are essential but underutilized.
Although existing studies have attempted to classify the estrous cycle stages of female rats using deep learning methods, most approaches still lack the capability to extract morphological and cytological features specific to vaginal smear images. Moreover, there is currently a lack of customized models and attention mechanisms that can effectively integrate the channel features, spatial features, and global information of rat vaginal cell images. As a result, existing methods often overlook unique challenges such as cell distribution patterns, ambiguous boundaries between categories, and staining artifacts. Therefore, to fill this gap and further enhance the performance of deep learning models in classifying the rat estrous cycle, we propose SLENet, a novel multiscale model that combines feature fusion and global attention; its overall structure is shown in Figure 2.
This model is designed to analyze microscopic images of rat vaginal epithelial cells and classify the four stages of the estrous cycle, providing classification results as a reference to assist researchers in their assessments.
The main contributions of this work are summarized as follows:
  • We construct a dataset of 2655 stained images of rat vaginal epithelial cells, which integrates veterinary science, biomedical research, and computer vision, providing valuable data support for applying deep learning methods to estrous cycle classification tasks.
  • We propose SLENet, a multiscale medical image classification network that integrates a novel attention mechanism, SECA. SECA improves upon conventional attention approaches by jointly capturing channel and spatial features while maintaining computational efficiency, providing a solution for medical image classification in a specific domain.
  • We further enhance the model’s ability to capture global information by integrating a non-local module after the final convolutional layer. This design enables the network to effectively model long-range dependencies, addressing the challenge of insufficient global feature modeling capability in specific medical image classification tasks.

2. Materials & Methods

2.1. Dataset

In this study, vaginal exfoliative cell staining was used to obtain microscopic images of the estrous cycle in rats. The specific procedure was as follows: a medical cotton swab moistened with saline was used to collect cells from the vaginal wall of the rat, which were then evenly smeared onto a glass slide. After air drying, the slides were stained with 0.20% methylene blue for approximately 15 minutes, followed by rinsing with water and air drying again before sealing for preservation. The experimental rats were injected with cyclophosphamide and leucovorin. Cyclophosphamide was used to induce immunosuppression, providing an experimental model to study the effects of immunosuppression on reproduction. Additionally, cyclophosphamide can cause alterations or disruptions in the estrous cycle of rats, which helps in collecting cell images representing various cycle stages with more diverse cellular morphology. Leucovorin was administered as a protective agent to mitigate the toxic side effects caused by cyclophosphamide, thereby ensuring the animals maintained a basic level of health during the experiment. Data were collected daily over a period of four weeks.
The annotator in this study is a researcher with a veterinary science and animal physiology background, with over six years of experience in laboratory animal management and reproductive physiology. Based on expert domain knowledge [27,43], the estrous cycle stages E, P, M, and D were annotated by analyzing the morphology, quantity, and proportion of cornified epithelial cells, leukocytes, and nucleated epithelial cells in the images. The research team conducted spot checks and reviews on a subset of the samples; no significant discrepancies were observed. Finally, the estrous cycle was classified into four phases (P, E, M, D), yielding 646, 672, 670, and 667 images, respectively. The dataset used in this study can be accessed from the corresponding author upon reasonable request.

2.2. Data Preprocessing

Data preprocessing aims to ensure a consistent format and enhance the comprehensiveness of the dataset, improving the accuracy and robustness of the model. In the original dataset, some images exhibited noticeable differences in brightness, which may have been caused by uneven smear thickness during slide preparation or cell overlap due to high cell density. Moreover, due to differences in cellular structures, appropriate contrast adjustments were necessary to extract richer detail from the images. To address this, we experimentally evaluated different adjustment strategies and determined that randomly increasing or decreasing brightness by up to 15% and contrast by up to 10% during preprocessing provided an empirically validated trade-off between introducing variability and preserving image detail. Additionally, data augmentation methods such as flipping and random rotation were applied to expand the dataset and improve generalization. All images were converted into JPG format, which is suitable for processing by convolutional networks.
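A minimal augmentation sketch is shown below; a PyTorch/torchvision pipeline is assumed, the rotation range and transform ordering are illustrative choices rather than values taken from the training code, and the bicubic resizing step is described later in this subsection.

```python
from torchvision import transforms

# Illustrative preprocessing pipeline: brightness/contrast jitter within the ranges
# described above, flips, a hypothetical rotation range, and bicubic resizing to 224x224.
train_transform = transforms.Compose([
    transforms.ColorJitter(brightness=0.15, contrast=0.10),   # random +/-15% brightness, +/-10% contrast
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),                     # hypothetical rotation range
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
])
```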
Before training, the images in the original dataset needed to be rescaled to match the input requirements (224 × 224) of SLENet. In this process, interpolation methods were used to reasonably estimate and reconstruct the values of new pixels, thereby preserving the visual quality and information content of the images after scaling. Commonly used simple interpolation methods present various issues. For instance, nearest-neighbor interpolation often results in a strong aliasing effect, with jagged edges and significant detail loss in the cell images. Although bilinear interpolation offers some improvement, it still lacks sufficient smoothness for cellular images, potentially causing blurred cell boundaries. Other methods, such as Lanczos or area-based interpolation, may theoretically produce better results but are computationally intensive and more sensitive to noise, which can lead to edge artifacts after scaling and compromise image quality.
In comparison, bicubic interpolation offers a better balance, preserving image detail while maintaining relatively low computational complexity. Therefore, this study adopted bicubic interpolation for image scaling. Its interpolation kernel is defined as follows:
$$W(x) = \begin{cases} (a+2)|x|^{3} - (a+3)|x|^{2} + 1 & \text{for } |x| \leq 1 \\ a|x|^{3} - 5a|x|^{2} + 8a|x| - 4a & \text{for } 1 < |x| < 2 \\ 0 & \text{otherwise} \end{cases}$$
where x represents the distance from the target pixel to a neighboring sample point, and a is a constant typically set to −0.5 or −1. This function is used to calculate the weights of the target pixel relative to the 16 surrounding neighbor pixels, and the target pixel value can then be computed using the following formula:
$$I(x, y) = \sum_{i=-1}^{2} \sum_{j=-1}^{2} W(x - x_i)\cdot W(y - y_j)\cdot I(x_i, y_j)$$
Using bicubic interpolation allows the compressed images to retain more information from the original images, resulting in more accurate model outputs.
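For illustration, a minimal NumPy sketch of the kernel W(x) with a = −0.5 is given below; the sub-pixel offsets are hypothetical and serve only to show that the 4 × 4 weights of a valid interpolation kernel sum to one.

```python
import numpy as np

def bicubic_kernel(x: np.ndarray, a: float = -0.5) -> np.ndarray:
    """Bicubic convolution kernel W(x) from the equation above (a = -0.5 assumed)."""
    x = np.abs(x)
    w = np.zeros_like(x, dtype=float)
    near = x <= 1
    far = (x > 1) & (x < 2)
    w[near] = (a + 2) * x[near] ** 3 - (a + 3) * x[near] ** 2 + 1
    w[far] = a * x[far] ** 3 - 5 * a * x[far] ** 2 + 8 * a * x[far] - 4 * a
    return w

# Weights for one output pixel: the 4x4 neighbourhood contributes W(dx_i) * W(dy_j),
# where dx, dy are the sub-pixel distances to the four nearest sample columns/rows.
dx = np.array([-1.25, -0.25, 0.75, 1.75])   # hypothetical horizontal offsets
dy = np.array([-1.60, -0.60, 0.40, 1.40])   # hypothetical vertical offsets
weights = np.outer(bicubic_kernel(dy), bicubic_kernel(dx))  # 4x4 weight matrix
print(weights.sum())  # ~1.0 for a valid interpolation kernel
```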
To train and evaluate the model’s performance, the original dataset was divided into training, validation, and test sets at a ratio of 6:2:2 (1593 images for training, 531 for validation, and 531 for testing). A random sampling strategy was employed during the splitting process to ensure independence and generalizability across the training and evaluation phases.
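A sketch of such a split is shown below; a torchvision ImageFolder layout, the folder name, and the random seed are assumptions for illustration.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets

# Hypothetical folder with one subdirectory per class (P/E/M/D);
# train_transform is the augmentation pipeline sketched earlier.
full_set = datasets.ImageFolder("estrous_dataset", transform=train_transform)
n_total = len(full_set)                                   # 2655 images in this study
n_train, n_val = int(0.6 * n_total), int(0.2 * n_total)
train_set, val_set, test_set = random_split(
    full_set,
    [n_train, n_val, n_total - n_train - n_val],           # 1593 / 531 / 531 for 2655 images
    generator=torch.Generator().manual_seed(42),           # fixed seed for a reproducible split
)
```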

2.3. EfficientNet Model

The EfficientNet model, proposed by Tan and Le in 2019 [44], is designed based on CNN. The network architecture is automatically optimized using Neural Architecture Search (NAS) algorithms, balancing computational efficiency and accuracy [45,46]. EfficientNet introduces a compound scaling technique that utilizes a fixed compound factor Φ to allow simultaneous adjustments of the network’s depth, width, and resolution. Compared to the traditional single-dimensional scaling techniques used in conventional network models, this approach more evenly increases model capacity and utilizes resources more effectively, enhancing model performance while maintaining computational efficiency, as described by Formula (3):
$$\text{Depth: } D = \alpha^{\Phi}, \quad \text{Width: } W = \beta^{\Phi}, \quad \text{Resolution: } R = \gamma^{\Phi}, \quad \text{s.t. } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$$
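As an illustration, the snippet below evaluates this compound-scaling rule with the coefficients α = 1.2, β = 1.1, γ = 1.15 reported in the original EfficientNet paper [44]; it is a sketch of the scaling arithmetic only, not of the architecture search itself.

```python
# Compound-scaling arithmetic: the coefficients come from the EfficientNet paper [44];
# phi is the user-chosen compound factor.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scaling(phi: float) -> tuple:
    """Return (depth, width, resolution) multipliers for a given compound factor phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

print(ALPHA * BETA ** 2 * GAMMA ** 2)   # ~1.92, satisfying the ~2 constraint
print(compound_scaling(1.0))            # (1.2, 1.1, 1.15): one step of compound scaling
```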
The core of this model is the Mobile Inverted Bottleneck Convolution (MBConv) module, which incorporates the Squeeze-and-Excitation (SE) channel attention mechanism. The design of this module is similar to that of MobileNetV2 and utilizes an inverted residual structure, providing strong feature extraction capabilities. The EfficientNet model is composed of 16 stacked MBConv modules, along with 2 convolutional layers, 1 global average pooling layer, and 1 fully connected layer.
However, despite its advantages, the attention mechanism of EfficientNet mainly focuses on channel information and lacks explicit spatial modeling, and it does not capture long-range dependencies. These limitations reduce its effectiveness in microscopic medical image classification, where both fine-grained local details and global cellular distributions are important. Motivated by this, we proposed SLENet, which introduces the SECA module for joint channel–spatial attention and a non-local module for global context modeling, providing a more effective architecture for medical image classification.

2.4. Improved MBConv Module

Attention mechanisms are widely applied in computer vision [47], primarily functioning to mimic human attention by extracting key information and discarding irrelevant data, thus enhancing model performance. The original MBConv module incorporates the Squeeze-and-Excitation (SE) attention mechanism, in which a Squeeze layer compresses spatial information to generate channel descriptors and an Excitation layer then learns channel weight coefficients that are applied to the original channel feature maps to improve network performance.
The classification of the estrous cycle does not rely on a single dimension: it is more about a combination of features such as cell morphology, types, and density. The channel attention mechanism SE can effectively differentiate the staining patterns of different cells; however, the average pooling function within the SE module compresses the spatial information of the image, which may diminish the attention to high-density cell clusters or characteristic morphological regions in microscopic images, potentially overlooking some important local spatial information.
In this study, the SE attention module in the MBConv module is replaced by a novel attention mechanism called SECA. The specific structure is illustrated in Figure 3. This attention mechanism utilizes local 1D convolutions instead of fully connected layers, reducing computational complexity compared to the SE module. Additionally, the new module introduces two convolutional layers with a scaling ratio of 4 and a kernel size of 7. A Sigmoid activation function is employed to learn the spatial weights of the images, which are then applied to the output. This mechanism effectively scales and extracts spatial information from the input feature maps, allowing MBConv to not only capture the channel information but also preserve spatial information, enhancing the model’s focus on locally salient features and thereby improving its accuracy.
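A minimal PyTorch sketch of this design is given below. It follows the description above (ECA-style local 1D convolution for channel attention, plus two 7 × 7 convolutions with a reduction ratio of 4 and a Sigmoid for spatial attention), while the exact layer ordering and the way the two weights are fused are assumptions rather than the released implementation.

```python
import torch
import torch.nn as nn

class SECA(nn.Module):
    """Sketch of the SECA idea: ECA-style channel attention plus a spatial branch."""

    def __init__(self, channels: int, k_size: int = 3, reduction: int = 4):
        super().__init__()
        # Channel attention: global pooling + local 1D convolution across channels (no FC layers)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv1d = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        # Spatial attention: two convolutions with reduction ratio 4 and kernel size 7
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=7, padding=3, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 1, kernel_size=7, padding=3, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel weights: (B, C, 1, 1) -> 1D conv over the channel dimension -> (B, C, 1, 1)
        w_ch = self.avg_pool(x).view(b, 1, c)
        w_ch = self.sigmoid(self.conv1d(w_ch)).view(b, c, 1, 1)
        # Spatial weights: one map per location, broadcast over channels
        w_sp = self.sigmoid(self.spatial(x))
        return x * w_ch * w_sp
```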

2.5. Non-Local Attention Mechanism

The convolution operations used to process image data present local connectivity, meaning that the correlations of features extracted through convolution are limited to local regions. In microscopic images of rat cell smears, blank areas can occupy a significant portion of the image pixels. As a result, traditional convolutional layers with kernel sizes of 3 × 3 or 5 × 5 can effectively handle local information, but may lead to the majority of processing results being irrelevant to the task since these blank regions do not contain cellular information.
When classifying the estrous cycle in rats, it is essential not only to focus on the details of individual cells to determine their types and local spatial features but also to consider the overall quantity and distribution patterns of the cells, which means that global features in the images should be emphasized. Relying solely on traditional convolution operations may overlook these critical aspects, resulting in decreased model accuracy.
To address the issue of missing global information, this study introduces a non-local attention mechanism to capture longer-distance dependencies within the image. This mechanism utilizes self-attention to compute interactions between any two positions in the image, effectively extracting long-distance dependencies. The specific calculation is illustrated by Formula (4):
$$Q = XW_q, \quad K = XW_k, \quad V = XW_v$$
$$F_A(Q, K, V) = \delta\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
where W_q, W_k, and W_v are learnable parameter matrices, X is the input matrix, F_A is the attention layer, δ is the Softmax activation function, and d_k is the dimension of the key vectors K. This mechanism effectively expands the receptive field to the entire image, allowing the capture of more global information and improving the classification accuracy of the model. In this study, the non-local attention mechanism is introduced after the final convolutional layer of SLENet, and its output is fed directly into the pooling layer, providing richer global information for subsequent classification. The specific structure is illustrated in Figure 4.
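A minimal PyTorch sketch of such a non-local block is shown below; the 1 × 1 convolutions used to form Q, K, and V, the embedding dimension, and the residual connection are common design choices assumed here for illustration, not details taken from the SLENet implementation.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """Sketch of a non-local (self-attention) block following Formula (4)."""

    def __init__(self, channels: int, embed: Optional[int] = None):
        super().__init__()
        embed = embed or channels // 2                     # assumed embedding dimension
        self.q = nn.Conv2d(channels, embed, kernel_size=1)
        self.k = nn.Conv2d(channels, embed, kernel_size=1)
        self.v = nn.Conv2d(channels, embed, kernel_size=1)
        self.out = nn.Conv2d(embed, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)           # (B, HW, E)
        k = self.k(x).flatten(2)                           # (B, E, HW)
        v = self.v(x).flatten(2).transpose(1, 2)           # (B, HW, E)
        attn = F.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)   # (B, HW, HW) pairwise weights
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)      # aggregate over all positions
        return x + self.out(y)                             # residual connection (assumed)
```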

2.6. Experimental Environment

To achieve optimal performance, this study determined reasonable ranges for key model hyperparameters, such as the learning rate, batch size, and dropout rate before the fully connected layer, based on prior related literature and experience in the field of medical image classification. A grid search strategy was then employed to evaluate various combinations of these hyperparameters. Each combination was trained under identical conditions, and the classification accuracy on the validation set was recorded. The best-performing set of hyperparameters was selected as the default configuration for this study: a learning rate of 0.01 and a batch size of 16. Additionally, we incorporated dropout layers as a regularization technique to address the risk of overfitting due to irrelevant or noisy features; using the same optimization procedure, we confirmed that the optimal dropout rate was 0.2. In addition, an early stopping mechanism was introduced during training, using validation accuracy as the evaluation criterion. The patience was set to 20, and the optimal number of training epochs was determined to be 130.
Regarding optimizer selection, we compared commonly used optimizers in image classification tasks, including SGD, standard Adam, and AdamW. Validation accuracy was used for evaluation, and AdamW was found to perform the best in this study. This is because AdamW improves parameter regularization by decoupling weight decay from the momentum update, thus avoiding the issues present in traditional Adam where L2 regularization is entangled with adaptive updates, an improvement that is especially beneficial when training deeper networks. Moreover, the cellular images in the dataset exhibit heterogeneity (e.g., variations in morphology and density), and compared to SGD, AdamW can adaptively adjust the step size based on different gradient scales, leading to faster convergence and better capture of fine-grained features, making it more suitable for this task.
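The sketch below summarizes this training setup (AdamW with a learning rate of 0.01, cross-entropy loss, batch size 16 via the data loaders, and early stopping with a patience of 20 on validation accuracy); the weight-decay value and the checkpoint filename are assumptions, since they are not reported above.

```python
import torch
import torch.nn as nn

def fit(model, train_loader, val_loader, device="cuda", epochs=130, patience=20):
    """Training sketch with the selected hyperparameters; weight decay is an assumed value."""
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=0.01, weight_decay=1e-2)
    criterion = nn.CrossEntropyLoss()
    best_acc, bad_epochs = 0.0, 0
    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        # Validation accuracy drives early stopping
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for images, labels in val_loader:
                preds = model(images.to(device)).argmax(dim=1).cpu()
                correct += (preds == labels).sum().item()
                total += labels.numel()
        val_acc = correct / total
        if val_acc > best_acc:
            best_acc, bad_epochs = val_acc, 0
            torch.save(model.state_dict(), "slenet_best.pt")   # hypothetical checkpoint name
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_acc
```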
The experiments in this study were conducted on a Windows 11 operating system with an Intel(R) Core(TM) i5-13600KF CPU, and an NVIDIA 4070 GPU was used to accelerate model training and testing. The specific model settings are detailed in Table 1.

3. Results & Discussion

3.1. Performance Indicators

When evaluating network models, specific assessment criteria are typically required to quantify and compare the model’s performance. In this study, accuracy, precision, recall, and F1 score were selected as metrics to evaluate the model’s performance. The confusion matrix serves as an analytical tool that visualizes the model’s classification of the test samples, enabling the calculation of evaluation metrics based on these visualized data.
Accuracy is a key metric that measures how correctly a model performs its classification tasks. The specific formula for accuracy is given as follows:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
Precision reflects the proportion of truly positive samples among those that the model predicts as positive. The specific formula for precision is given as follows:
$$Precision = \frac{TP}{TP + FP}$$
Recall reflects how many actual positive samples are predicted as positive. The specific formula for recall is given as follows:
$$Recall = \frac{TP}{TP + FN}$$
where TP represents true positives, reflecting the number of positive samples correctly identified by the model; TN represents true negatives, reflecting the number of negative samples correctly identified; FP represents false positives, reflecting the number of negative samples incorrectly identified as positive; and FN represents false negatives, reflecting the number of positive samples incorrectly identified as negative.
The F1 score balances the model’s precision and recall, taking into account both the completeness (recall) and the correctness (precision) of the model’s predictions. This provides a more comprehensive evaluation metric. The specific formula for the F1 score is given as follows:
$$F1\ Score = \frac{2 \times (Precision \times Recall)}{Precision + Recall}$$
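These metrics can be computed directly from the confusion matrix, as in the sketch below; the 4 × 4 matrix shown is purely illustrative, not the experimental results, and macro averaging is assumed.

```python
import numpy as np

def classification_metrics(cm: np.ndarray):
    """cm[i, j] = number of samples with true class i predicted as class j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as class k but actually another class
    fn = cm.sum(axis=1) - tp          # actually class k but predicted as another class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1

# Hypothetical 4-class (P, E, M, D) confusion matrix, for illustration only
cm = np.array([[130,   1,   0,   2],
               [  0, 133,   0,   0],
               [  0,   0, 121,  12],
               [  1,   0,   5, 126]])
acc, prec, rec, f1 = classification_metrics(cm)
print(f"accuracy={acc:.4f}, macro-F1={f1.mean():.4f}")
```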

3.2. Experimental Results

For the established training and validation sets, comparative experiments were conducted using the proposed SLENet and eight baseline models: EfficientNet, ResNet18, ResNet34, ResNet50, VGG16, MobileNetV2, GoogleNet, and DenseNet. Additionally, we introduced the Vision Transformer (ViT) as a comparative model to explore the applicability of the transformer architecture to this task. Although transformer-based models have achieved excellent performance in various computer vision tasks, in our study ViT showed a higher training loss and significantly lower validation accuracy compared to the CNN-based models, as shown in Figure 5 and Figure 6. This may be because the transformer architecture generally lacks the inductive biases inherent to CNNs, such as local receptive fields and translation invariance, which may limit its ability to extract effective features from relatively small datasets. In our study, due to the limited number of experimental animal samples, the dataset was insufficient to support the effective training of transformer-based models. Furthermore, the self-attention mechanism employed in transformers has quadratic computational complexity (O(n²)), leading to higher computational requirements during training. In contrast, convolutional operations are more efficient, making CNNs more suitable for conditions with limited computational resources. Therefore, the following analysis focuses on the performance of the CNN-based models for this task.
Figure 5 indicates that the fluctuations of EfficientNet and ResNet50 in the early training stages are more pronounced compared to the other models. After approximately 100 epochs, the validation accuracy of all models shows reduced fluctuation and begins to converge. Notably, SLENet demonstrates a clear advantage after about 60 epochs, achieving a validation accuracy exceeding 96%. This indicates an improvement in the classification performance of SLENet compared to commonly used convolutional neural networks. Importantly, SLENet exhibits smaller fluctuations in the accuracy curve throughout the training process, suggesting better generalization ability and stability than the other models.
Figure 6 presents the loss curves for each model on the training set. It can be observed that EfficientNet has the highest loss in the first 50 training epochs, while ResNet18 and ResNet34 perform well with lower loss values compared to the other models. After about 120 epochs, the loss values for all models change minimally, indicating convergence. During the 120–130 epoch range, SLENet displays lower loss values compared to the other models.
To provide a more comprehensive evaluation of the effectiveness of SLENet compared to other convolutional neural networks in the classification task of the rat estrous cycle, we present the prediction results of each model on the test set using a confusion matrix, as shown in Figure 7.
The confusion matrix indicates that SLENet demonstrates the highest accuracy in identifying the estrus (E) stage, correctly classifying all images from this phase. It also shows strong performance in recognizing the P phase, with only one image misclassified as the D phase. Additionally, there were 6 and 12 misclassifications for the D and M phases, respectively. Compared to the other classification models, whose total misclassifications exceeded 20, SLENet shows a clear advantage.
Figure 8 shows the normalized confusion matrix data in the form of a bar chart. As can be seen, the proposed model achieves the highest accuracy across the E, D, and P stages, demonstrating comparatively superior generalization and robustness. Notably, SLENet does not achieve the top accuracy in the M stage, though it still maintains competitive performance, closely following the best-performing model. This slight drop can be attributed not only to the transitional nature of the M stage but also to its mixed cytological complexity: SECA emphasizes local details and the non-local module captures global distributions, and these mechanisms are not fully optimal for transitional phases where local and global features are inconsistent, which explains the relative difficulty in accurately classifying the M stage.
To ensure the statistical reliability of our experimental results, we conducted each experiment five times with different random seeds under the same environment, reporting the mean and 95% confidence interval for each evaluation metric, and performed statistical significance testing between SLENet and the other models, as shown in Table 2, Table 3 and Table 4. Considering that precision and recall are often correlated and that the F1 score provides a balanced measure between them, we applied paired t-tests to the per-class F1 scores only to calculate the p-values, and we report the average value as an overall indicator.
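A sketch of this procedure for a single class is shown below; SciPy's paired t-test is assumed, and the per-run F1 values are placeholders, not the numbers reported in Table 4.

```python
import numpy as np
from scipy.stats import ttest_rel

# Per-run F1 scores for one class, five runs with different seeds (placeholder values)
f1_slenet = np.array([0.962, 0.958, 0.965, 0.960, 0.964])
f1_baseline = np.array([0.944, 0.940, 0.947, 0.941, 0.946])

t_stat, p_value = ttest_rel(f1_slenet, f1_baseline)   # paired t-test over matched runs
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")          # p < 0.05 -> statistically significant
```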
Based on the results, as shown in Figure 9, the average accuracy of SLENet is 96.31%, which is the highest among these models. Table 2, Table 3 and Table 4 show that SLENet’s overall average values on precision, recall, and F1 score achieved 96.27%, 96.30%, and 96.26%, respectively, with smaller confidence intervals (3.65%, 5.76%, and 4.18%), indicating excellent classification accuracy and robustness. Notably, in some cases, although SLENet shows slightly lower average precision and recall compared to the best-performing model, it consistently exhibits the smallest confidence intervals. This indicates that its predictions are more stable and reliable. More importantly, SLENet achieves the highest F1 score in all four classes, which means it can balance precision and recall more effectively, demonstrating better overall performance in this task. Additionally, the results show that SLENet achieves statistical significance (p < 0.05) when compared with most of these models. Although the comparison with EfficientNet results in a p-value of 0.13, which does not meet the significance threshold, the proposed model still outperformed EfficientNet numerically in all classes, showing an overall advantageous performance trend.
To further evaluate the performance of SLENet in this multi-class classification task, the Receiver Operating Characteristic (ROC) and precision–recall (PR) curves were generated. The ROC curves were constructed using the one-vs-rest (OvR) strategy. As shown in Figure 10 and Figure 11, the Area Under the Curve (AUC) and average precision (AP) for all classes exceed 0.99, showing that, although the overall accuracy implies a few prediction errors, the model has excellent ranking capability and robust probability outputs. The few misclassifications did not significantly impact the model's ability to distinguish between classes or to correctly identify positive samples.
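A sketch of the OvR evaluation is shown below; a scikit-learn workflow is assumed, and the labels and softmax scores are random placeholders, so the printed AUC/AP values only illustrate the computation.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.preprocessing import label_binarize

classes = [0, 1, 2, 3]                                   # P, E, M, D
y_true = np.random.randint(0, 4, size=200)               # placeholder labels
y_score = np.random.dirichlet(np.ones(4), size=200)      # placeholder softmax probabilities

# One-vs-rest: each class is scored against the binarized labels of all other classes
y_bin = label_binarize(y_true, classes=classes)
for k in classes:
    auc_k = roc_auc_score(y_bin[:, k], y_score[:, k])
    ap_k = average_precision_score(y_bin[:, k], y_score[:, k])
    print(f"class {k}: AUC={auc_k:.3f}, AP={ap_k:.3f}")
```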
Analyzing the results from a biological perspective, the estrous cycle in rats is a dynamic process; therefore, during the construction of the dataset, transitional phases are inevitable. During these periods, vaginal cytology often contains a diverse range of cell types in large quantities, resulting in images rich in detailed textures. Consequently, issues such as cell overlap, blurred edges, and uneven staining may occur in the collected images. For these atypical images, experts performing manual classification may incorporate multidimensional information to make flexible judgments: such images can be labeled as "transitional" or "suspected" stages, and multiple experts may review the samples to improve accuracy when necessary. However, network models are trained on fixed labels. These factors present challenges for relatively simple models (e.g., those without integrated attention mechanisms), which may have limitations in effectively extracting such complex features, ultimately leading to performance differences.
According to the results, it is evident that the highest misclassification rates occur between the M and D stages. This is because the cytological composition during stages M and D is quite similar, with both containing a large number of leukocytes. For the model, subtle differences in leukocyte proportions are difficult to distinguish accurately. Additionally, we observed that the P stage is often misclassified as E. This is due to the gradual keratinization of nucleated epithelial cells on the vaginal smears during the late P stage, making the image features increasingly resemble those of the E stage and thus confusing the model. Lastly, it is noteworthy that stages E and D are generally well distinguished. This is because stage E is characterized by densely packed and orderly arranged keratinized cells, whereas stage D is dominated by small, round leukocytes. The distinct morphological features between these two stages make them relatively easier for the model to differentiate.
Generally, the experimental results suggest that SLENet is more suitable for the classification of specific medical images.

3.3. Ablation Study

This section uses ablation experiments to demonstrate the effectiveness of the modules introduced in the SLENet network. The control group consists of the SLENet network and EfficientNet. All other parameters and conditions are kept the same.
To ensure the reliability of the results, we use the same strategy to repeat the experiment five times and report the F1 score and overall accuracy as mean ± 95% confidence interval. Additionally, to evaluate whether SECA provides superior enhancement, we incorporated attention modules such as Convolutional Block Attention Module (CBAM) and Coordinate Attention (CA), which also emphasize joint modeling of spatial and channel features, and calculated their performance. The specific results are shown in Table 5.
As the results show, incorporating SECA alone increases the mean F1 score and accuracy and reduces the confidence interval, but the overall improvement is limited. In contrast, when the non-local module is introduced alone, both F1 score and accuracy decrease. However, when SECA and non-local are combined, the model achieves the best performance on both metrics (F1 score = 96.26%, accuracy = 96.31%), with a reduction in confidence intervals, meaning that the combination not only enhances the model's predictive performance but also improves its stability. This result can be explained as follows: in this task, vaginal smear microscopic images present both prominent local features (such as clusters of keratinized cells or leukocytes) and global features (such as the proportion and spatial arrangement of different cell types), and relying solely on local details or on the global distribution can easily lead to misclassification. For example, a local region might already show keratinized cells while the overall distribution still resembles the previous stage, or certain areas may be dominated by leukocytes while the global proportions have not yet fully changed. The SECA module enhances the model's capability to capture critical local features, improving its sensitivity to image details. Additionally, the non-local module strengthens the model's ability to capture long-range dependencies, which is crucial for recognizing specific distribution patterns. However, when applied alone, the non-local module may overemphasize global context and suppress subtle but critical local features; moreover, the dataset size may be insufficient for it to learn stable long-range dependencies, which can explain the observed performance degradation. Therefore, compared to introducing a single module, integrating both improves the model's classification performance more effectively.
The results also show that substituting the SECA module with the CA module leads to only minimal performance improvement, while substituting it with CBAM even degrades the model's performance. This indicates that the fusion strategies of these modules are less suited to the discriminative features of this task; therefore, incorporating SECA is the better choice for classifying the estrous cycle of rats.

3.4. Complexity Analysis

To assess computational efficiency, we compared the inference time, number of parameters, and floating-point operations (FLOPs) of the baseline (EfficientNet) and SLENet. The inference time was measured over five runs under the same environment, and the average result is reported to ensure consistency.
As shown in Table 6, compared to the baseline, SLENet shows an increase in model complexity: the number of parameters increases from 4.01 M to 14.19 M, about 3.5 times larger, and the FLOPs increase from 6.58 G to 9.35 G, an increase of 42%, both reflecting higher computational complexity. This is primarily due to the introduction of the SECA and non-local modules, which enhance the model's feature extraction capability but inevitably add extra parameters and computational cost.
However, despite the increase in both parameters and FLOPs, the inference time rises only from 32.32 ms to 34.58 ms, an increase of about 7%. This result suggests that, although SLENet introduces higher theoretical complexity, the practical computational overhead remains limited and the added operations are handled efficiently. Thus, the design achieves a favorable balance between improved accuracy and computational efficiency.
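For reference, the sketch below measures the two practical quantities reported here, parameter count and average inference latency, for any PyTorch model; FLOP counting requires an external profiler (e.g., thop or fvcore) and is omitted.

```python
import time
import torch

def profile(model, device="cuda", runs=5):
    """Return (parameter count in millions, average latency in ms) for a 224x224 input."""
    model.eval().to(device)
    n_params = sum(p.numel() for p in model.parameters()) / 1e6
    x = torch.randn(1, 3, 224, 224, device=device)
    with torch.no_grad():
        model(x)                                   # warm-up pass
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    latency_ms = (time.perf_counter() - start) / runs * 1000
    return n_params, latency_ms
```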

4. Conclusions

In this work, we proposed SLENet, a CNN-based network which achieved state-of-the-art performance for classifying the estrous cycle of rats, reaching an average accuracy of 96.31%. Compared with EfficientNet, SLENet improved accuracy by 2.11% while also reducing the confidence interval by 0.56%, indicating a higher predictive accuracy and a more stable performance, which highlights the value of integrating both local feature sensitivity and global context modeling. However, the model still exhibits some limitations. In particular, its ability to distinguish between the D and M phases remains insufficient, partly due to the subtle morphological differences and the transitional nature of these stages. Moreover, the current approach relies solely on cytological images, which restricts the model’s capacity to integrate complementary biological information.
Future work may focus on incorporating temporal sequence information to better capture transitional dynamics across estrous stages, as well as extending the network to multi-modal inputs such as cytological images combined with hormone measurements or behavioral indicators. In addition, expanding the dataset and validating the method across different experimental conditions would further enhance the robustness and generalizability of SLENet, supporting its potential application in reproductive biology and preclinical research.

Author Contributions

Conceptualization, X.P.; methodology, Q.W.; software, Q.W.; validation, Q.W. and Y.Z.; formal analysis, Q.W.; investigation, Q.W.; resources, Q.W.; data curation, Q.W.; writing—original draft preparation, Q.W.; writing—review and editing, Y.Z.; visualization, Q.W.; supervision, X.P.; project administration, X.P.; funding acquisition, X.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

All experimental procedures for animal handling were approved by the Animal Ethics Committee of the Affiliated Hospital of Guizhou Medical University (Approval number: 2101344).

Data Availability Statement

Missing values, inconsistent data, and erroneous records may exist in the dataset. These issues may affect the accuracy of analysis results. Requests to access these datasets should be directed to: xiaodipu2022@163.com.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SLENet: Spatial Long-distance EfficientNet
MBConv: Mobile Inverted Bottleneck Convolution
SECA: Spatial Efficient Channel Attention
SE: Squeeze-and-Excitation
CIHR: Canadian Institutes of Health Research
NIH: National Institutes of Health

References

  1. de Jong, T.V.; Pan, Y.; Rastas, P.; Munro, D.; Tutaj, M.; Akil, H.; Benner, C.; Chen, D.; Chitre, A.S.; Chow, W.; et al. A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats. Cell Genom. 2024, 4, 100527. [Google Scholar] [CrossRef] [PubMed]
  2. Sato, M.; Nakamura, S.; Inada, E.; Takabayashi, S. Recent Advances in the Production of Genome-Edited Rats. Int. J. Mol. Sci. 2022, 23, 2548. [Google Scholar] [CrossRef]
  3. Morris, C.J.; Rolf, M.G.; Starnes, L.; Villar, I.C.; Pointon, A.; Kimko, H.; Di Veroli, G.Y. Modelling hemodynamics regulation in rats and dogs to facilitate drugs safety risk assessment. Front. Pharmacol. 2024, 15, 1402462. [Google Scholar] [CrossRef]
  4. Soufizadeh, P.; Mansouri, V.; Ahmadbeigi, N. A review of animal models utilized in preclinical studies of approved gene therapy products: Trends and insights. Lab. Anim. Res. 2024, 40, 17. [Google Scholar] [CrossRef]
  5. National Institutes of Health. NIH Policy on Sex as a Biological Variable. 2015. Available online: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-15-102.html (accessed on 5 December 2024).
  6. The Canadian Institutes of Health Research. Health Portfolio Sex- and Gender-Based Analysis Plus Policy: Advancing Equity, Diversity and Inclusion. 2023. Available online: https://www.canada.ca/en/health-canada/corporate/transparency/heath-portfolio-sex-gender-based-analysis-policy.html (accessed on 8 December 2024).
  7. Freeman, A.; Stanko, P.; Berkowitz, L.; Parnell, N.; Zuppe, A.; Bale, T.; Ziolek, T.; Epperson, C. Inclusion of sex and gender in biomedical research: Survey of clinical research proposed at the University of Pennsylvania. Biol. Sex Differ. 2017, 8, 22. [Google Scholar] [CrossRef]
  8. Prendergast, B.; Liang, J. Female rats are not more variable than male rats: A meta-analysis of neuroscience studies. Biol. Sex Differ. 2016, 7, 34. [Google Scholar] [CrossRef]
  9. Ray, S.; Tzeng, R.Y.; DiCarlo, L.M.; Bundy, J.L.; Vied, C.; Tyson, G.; Nowakowski, R.; Arbeitman, M.N. An Examination of Dynamic Gene Expression Changes in the Mouse Brain During Pregnancy and the Postpartum Period. G3 Genes|Genomes|Genetics 2016, 6, 221–233. [Google Scholar] [CrossRef]
  10. Zenclussen, M.L.; Casalis, P.A.; Jensen, F.; Woidacki, K.; Zenclussen, A.C. Hormonal Fluctuations during the Estrous Cycle Modulate Heme Oxygenase-1 Expression in the Uterus. Front. Endocrinol. 2014, 5, 32. [Google Scholar] [CrossRef] [PubMed]
  11. Spencer, J.; Waters, E.; Bath, K.; Chao, M.; Mcewen, B.; Milner, T. Distribution of Phosphorylated TrkB Receptor in the Mouse Hippocampal Formation Depends on Sex and Estrous Cycle Stage. J. Neurosci. Off. J. Soc. Neurosci. 2011, 31, 6780–6790. [Google Scholar] [CrossRef] [PubMed]
  12. Xin, H.; Li, B.; Meng, F.; Hu, B.; Wang, S.; Wang, Y.; Li, J. Quantitative proteomic analysis and verification identify global protein profiling dynamics in pig during the estrous cycle. Front. Vet. Sci. 2023, 10, 1247561. [Google Scholar] [CrossRef]
  13. Jung, W.; Yoo, I.; Han, J.; Kim, M.; Lee, S.; Cheon, Y.; Hong, M.; Jeon, B.Y.; Ka, H. Expression of Caspases in the Pig Endometrium Throughout the Estrous Cycle and at the Maternal-Conceptus Interface During Pregnancy and Regulation by Steroid Hormones and Cytokines. Front. Vet. Sci. 2021, 8, 641916. [Google Scholar] [CrossRef]
  14. Zhao, W.; Li, Q.; Ma, Y.; Wang, Z.; Fan, B.; Zhai, X.; Hu, M.; Wang, Q.; Zhang, M.; Zhang, C.; et al. Behaviors Related to Psychiatric Disorders and Pain Perception in C57BL/6J Mice During Different Phases of Estrous Cycle. Front. Neurosci. 2021, 15, 650793. [Google Scholar] [CrossRef]
  15. Milad, M.; Igoe, S.; Lebron-Milad, K.; Novales, J. Estrous cycle phase and gonadal hormones influence conditioned fear extinction. Neuroscience 2009, 164, 887–895. [Google Scholar] [CrossRef]
  16. Lebron-Milad, K.; Milad, M. Sex differences, gonadal hormones and the fear extinction network: Implications for anxiety disorders. Biol. Mood Anxiety Disord. 2012, 2, 3. [Google Scholar] [CrossRef] [PubMed]
  17. Zhang, B.; Han, Y.; Cheng, M.; Yan, L.; Gao, K.; Zhou, D.; Wang, A.; Lin, P.; Jin, Y. Metabolomic effects of intrauterine meloxicam perfusion on histotroph in dairy heifers during diestrus. Front. Vet. Sci. 2025, 12, 1528530. [Google Scholar] [CrossRef]
  18. Abulaiti, A.; Nawaz, M.; Naseer, Z.; Ahmed, Z.; Liu, W.; Abdelrahman, M.; Shaukat, A.; Sabek, A.; Pang, X.; Wang, S. Administration of melatonin prior to modified synchronization protocol improves the productive and reproductive efficiency of Chinese crossbred buffaloes in low breeding season. Front. Vet. Sci. 2023, 10, 1118604. [Google Scholar] [CrossRef] [PubMed]
  19. Kulkarni, K.H.; Yang, Z.; Niu, T.; Hu, M. Effects of Estrogen and Estrus Cycle on Pharmacokinetics, Absorption, and Disposition of Genistein in Female Sprague–Dawley Rats. J. Agric. Food Chem. 2012, 60, 7949–7956. [Google Scholar] [CrossRef]
  20. Lovick, T.A.; Zangrossi Jr, H. Effect of estrous cycle on behavior of females in rodent tests of anxiety. Front. Psychiatry 2021, 12, 711065. [Google Scholar] [CrossRef]
  21. Byers, S.L.; Wiles, M.V.; Dunn, S.L.; Taft, R.A. Mouse Estrous Cycle Identification Tool and Images. PLoS ONE 2012, 7, e35538. [Google Scholar] [CrossRef] [PubMed]
  22. Goldman, J.M.; Murr, A.S.; Cooper, R.L. The rodent estrous cycle: Characterization of vaginal cytology and its utility in toxicological studies. Birth Defects Res. B Dev. Reprod. Toxicol. 2007, 80 2, 84–97. [Google Scholar] [CrossRef] [PubMed]
  23. Gal, A.; Lin, P.C.; Barger, A.M.; MacNeill, A.L.; Ko, C. Vaginal fold histology reduces the variability introduced by vaginal exfoliative cytology in the classification of mouse estrous cycle stages. Toxicol. Pathol. 2014, 42, 1212–1220. [Google Scholar] [CrossRef] [PubMed]
  24. MacDonald, J.K.; Pyle, W.G.; Reitz, C.J.; Howlett, S.E. Cardiac contraction, calcium transients, and myofilament calcium sensitivity fluctuate with the estrous cycle in young adult female mice. Am. J. Physiol. Heart Circ. Physiol. 2014, 306, H938–H953. [Google Scholar] [CrossRef]
  25. Cora, M.C.; Kooistra, L.; Travlos, G.S. Vaginal Cytology of the Laboratory Rat and Mouse. Toxicol. Pathol. 2015, 43, 776–793. [Google Scholar] [CrossRef]
  26. Matsuda, S.; Matsuzawa, D.; Ishii, D.; Tomizawa, H.; Sutoh, C.; Shimizu, E. Sex differences in fear extinction and involvements of extracellular signal-regulated kinase (ERK). Neurobiol. Learn. Mem. 2015, 123, 117–124. [Google Scholar] [CrossRef] [PubMed]
  27. Hubscher, C.; Brooks, D.; Johnson, J. A quantitative method for assessing stages of rat estrous cycle. Biotech. Histochem. Off. Publ. Biol. Stain Comm. 2005, 80, 79–87. [Google Scholar] [CrossRef]
  28. Li, G.; Yu, Z.; Yang, K.; Lin, M.; Chen, C.P. Exploring feature selection with limited labels: A comprehensive survey of semi-supervised and unsupervised approaches. IEEE Trans. Knowl. Data Eng. 2024, 36, 6124–6144. [Google Scholar] [CrossRef]
  29. Chen, W.; Yang, K.; Yu, Z.; Shi, Y.; Chen, C.P. A survey on imbalanced learning: Latest research, applications and future directions. Artif. Intell. Rev. 2024, 57, 137. [Google Scholar] [CrossRef]
  30. Hesamian, M.H.; Jia, W.; He, X.; Kennedy, P.J. Deep Learning Techniques for Medical Image Segmentation: Achievements and Challenges. J. Digit. Imaging 2019, 32, 582–596. [Google Scholar] [CrossRef]
  31. Salehi, A.W.; Khan, S.; Gupta, G.; Alabduallah, B.I.; Almjally, A.; Alsolai, H.; Siddiqui, T.; Mellit, A. A Study of CNN and Transfer Learning in Medical Imaging: Advantages, Challenges, Future Scope. Sustainability 2023, 15, 5930. [Google Scholar] [CrossRef]
  32. Singh, S.P.; Wang, L.; Gupta, S.; Goli, H.; Padmanabhan, P.; Gulyás, B. 3D Deep Learning on Medical Images: A Review. Sensors 2020, 20, 5097. [Google Scholar] [CrossRef]
  33. Ding, R.; Zhou, X.; Tan, D.; Su, Y.; Jiang, C.; Yu, G.; Zheng, C. A deep multi-branch attention model for histopathological breast cancer image classification. Complex Intell. Syst. 2024, 10, 4571–4587. [Google Scholar] [CrossRef]
  34. You, H.; Yu, L.; Tian, S.; Cai, W. A stereo spatial decoupling network for medical image classification. Complex Intell. Syst. 2023, 9, 5965–5974. [Google Scholar] [CrossRef] [PubMed]
  35. Shin, H.C.; Roth, H.R.; Gao, M.; Lu, L.; Xu, Z.; Nogues, I.; Yao, J.; Mollura, D.; Summers, R.M. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning. IEEE Trans. Med Imaging 2016, 35, 1285–1298. [Google Scholar] [CrossRef]
  36. Yang, K.; Yu, Z.; Chen, W.; Liang, Z.; Chen, C.P. Solving the imbalanced problem by metric learning and oversampling. IEEE Trans. Knowl. Data Eng. 2024, 36, 9294–9307. [Google Scholar] [CrossRef]
  37. Haq, M.A.; Khan, I.; Ahmed, A.; Eldin, S.M.; Alshehri, A.; Ghamry, N.A. DCNNBT: A Novel Deep Convolution Neural Network-Based Brain Tumor Classification Model. Fractals 2023, 31, 2340102. [Google Scholar] [CrossRef]
  38. Liao, B.; Zuo, H.; Yu, Y.; Li, Y. GraphMriNet: A few-shot brain tumor MRI image classification model based on Prewitt operator and graph isomorphic network. Complex Intell. Syst. 2024, 10, 6917–6930. [Google Scholar] [CrossRef]
  39. El-Shafai, W.; Mahmoud, A.A.; Ali, A.M.; El-Rabaie, E.S.M.; Taha, T.E.; El-Fishawy, A.S.; Zahran, O.; El-Samie, F.E.A. Efficient classification of different medical image multimodalities based on simple CNN architecture and augmentation algorithms. J. Opt. 2024, 53, 775–787. [Google Scholar] [CrossRef]
  40. Sano, K.; Matsuda, S.; Tohyama, S.; Komura, D.; Shimizu, E.; Sutoh, C. Deep learning-based classification of the mouse estrous cycle stages. Sci. Rep. 2020, 10, 11714. [Google Scholar] [CrossRef] [PubMed]
  41. Wolcott, N.S.; Sit, K.K.; Raimondi, G.; Hodges, T.; Shansky, R.M.; Galea, L.A.; Ostroff, L.E.; Goard, M.J. Automated classification of estrous stage in rodents using deep learning. Sci. Rep. 2022, 12, 17685. [Google Scholar] [CrossRef]
  42. Babaev, B.; Goyal, S.; Arora, T.; Autry, A.; Ross, R.A. A novel method for estrous cycle staging using supervised object detection. NPP—Digital Psychiatry Neurosci. 2025, 3, 3. [Google Scholar] [CrossRef]
  43. Paccola, C.; Resende, C.; Stumpp, T.; Miraglia, S.; Cipriano, I. The rat estrous cycle revisited: A quantitative and qualitative analysis. Anim. Reprod. (AR) 2018, 10, 677–683. [Google Scholar]
  44. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2019, arXiv:1905.11946. [Google Scholar] [CrossRef]
  45. Gu, F.; Dai, Y.; Fei, J.; Chen, X. Deepfake detection and localisation based on illumination inconsistency. Int. J. Auton. Adapt. Commun. Syst. 2024, 17, 352–368. [Google Scholar] [CrossRef]
  46. Xu, J.; Wang, G.; Zhou, T. Exposing deepfakes in online communication: Detection based on ensemble strategy. Int. J. Auton. Adapt. Commun. Syst. 2024, 17, 24–38. [Google Scholar] [CrossRef]
  47. Liu, C.; Wei, Z.; Zhou, L.; Shao, Y. Multidimensional time series classification with multiple attention mechanism. Complex Intell. Syst. 2024, 11, 14. [Google Scholar] [CrossRef]
Figure 1. Stages of the estrous cycle include the (a) proestrus, (b) estrus, (c) metestrus, and (d) diestrus.
Figure 2. The overall architecture of SLENet.
Figure 3. The structure of SECA.
Figure 4. The structure of non-local.
Figure 5. Validation accuracy of each model.
Figure 6. Training loss of each model.
Figure 7. The confusion matrix of each model.
Figure 8. The bar chart of the confusion matrix.
Figure 9. The overall accuracy of each model.
Figure 10. The ROC curve of SLENet.
Figure 11. The PR curve of SLENet.
Table 1. The hyperparameters of model training.
Parameter | Value
Learning rate | 0.01
Epochs | 130
Batch size | 16
Image size | 224 × 224
Loss function | Cross-entropy loss
Optimization algorithm | AdamW
Table 2. Precision of each model (% ↑).
Model | Estrus | Metestrus | Diestrus | Proestrus | Average
EfficientNet | 99.56 ± 0.50 | 87.91 ± 3.83 | 94.87 ± 0.84 | 95.36 ± 2.13 | 94.43 ± 7.68
ResNet18 | 96.01 ± 0.94 | 89.49 ± 4.68 | 87.55 ± 2.84 | 95.28 ± 2.00 | 92.08 ± 6.68
ResNet34 | 99.40 ± 0.79 | 91.36 ± 1.28 | 90.56 ± 1.89 | 93.34 ± 1.76 | 93.67 ± 6.36
ResNet50 | 99.26 ± 0.64 | 90.14 ± 2.64 | 91.62 ± 1.65 | 86.51 ± 3.69 | 91.99 ± 8.85
VGG16 | 96.63 ± 0.66 | 88.92 ± 1.46 | 89.65 ± 3.43 | 94.52 ± 1.72 | 92.43 ± 5.96
MobileNetV2 | 98.21 ± 0.96 | 90.96 ± 2.66 | 93.83 ± 2.58 | 93.71 ± 1.71 | 93.93 ± 4.92
GoogleNet | 98.48 ± 0.69 | 93.75 ± 2.53 | 87.46 ± 2.84 | 95.90 ± 1.99 | 93.90 ± 7.49
DenseNet | 98.38 ± 1.57 | 69.42 ± 3.35 | 93.50 ± 2.54 | 95.03 ± 2.74 | 89.08 ± 21.11
SLENet | 99.69 ± 0.52 | 95.50 ± 1.21 | 94.85 ± 0.57 | 95.05 ± 1.16 | 96.27 ± 3.65
Note: In all tables, bold values indicate the highest mean, underlined values indicate the second-highest mean.
Table 3. Recall of each model (% ↑).
Model | Estrus | Metestrus | Diestrus | Proestrus | Average
EfficientNet | 99.54 ± 0.52 | 91.59 ± 2.22 | 90.61 ± 3.19 | 95.62 ± 1.80 | 94.34 ± 6.51
ResNet18 | 98.95 ± 1.04 | 85.84 ± 5.30 | 95.09 ± 2.90 | 89.48 ± 3.65 | 95.52 ± 9.45
ResNet34 | 97.88 ± 0.71 | 86.27 ± 3.15 | 94.35 ± 1.51 | 95.90 ± 2.13 | 93.60 ± 8.11
ResNet50 | 98.05 ± 1.63 | 79.84 ± 4.65 | 93.29 ± 3.56 | 98.10 ± 0.91 | 93.32 ± 13.72
VGG16 | 99.12 ± 1.12 | 84.83 ± 5.31 | 93.31 ± 2.47 | 91.20 ± 0.64 | 92.34 ± 9.95
MobileNetV2 | 99.08 ± 0.47 | 88.65 ± 1.61 | 92.65 ± 3.17 | 95.61 ± 2.66 | 94.00 ± 7.05
GoogleNet | 98.78 ± 1.11 | 83.42 ± 4.41 | 96.98 ± 1.30 | 94.84 ± 1.29 | 93.51 ± 11.00
DenseNet | 90.30 ± 3.93 | 91.92 ± 2.61 | 75.36 ± 2.25 | 89.98 ± 2.51 | 86.89 ± 12.31
SLENet | 99.56 ± 0.81 | 91.65 ± 1.39 | 95.25 ± 1.15 | 98.75 ± 0.50 | 96.30 ± 5.76
Table 4. F1 score and p-value of each model (% for F1 score ↑).
Model | Estrus | Metestrus | Diestrus | Proestrus | Average | p-Value (↓)
EfficientNet | 99.55 ± 0.38 | 89.68 ± 1.85 | 92.67 ± 1.34 | 95.49 ± 1.77 | 94.35 ± 6.69 | 0.13
ResNet18 | 97.46 ± 0.39 | 86.56 ± 4.20 | 91.48 ± 2.41 | 91.96 ± 1.59 | 91.87 ± 7.09 | 1.7 × 10⁻³
ResNet34 | 98.63 ± 0.48 | 88.69 ± 1.22 | 92.41 ± 1.50 | 94.59 ± 1.08 | 93.58 ± 6.61 | 0.038
ResNet50 | 98.86 ± 1.08 | 86.63 ± 6.07 | 91.64 ± 1.63 | 92.53 ± 3.72 | 92.41 ± 7.99 | 0.043
VGG16 | 98.29 ± 0.34 | 86.75 ± 3.58 | 91.42 ± 1.95 | 92.82 ± 0.84 | 92.32 ± 7.56 | 0.043
MobileNetV2 | 98.64 ± 0.50 | 89.79 ± 1.94 | 93.22 ± 2.40 | 94.13 ± 1.43 | 93.95 ± 5.80 | 0.023
GoogleNet | 98.47 ± 0.46 | 87.99 ± 2.96 | 91.96 ± 1.86 | 95.36 ± 1.25 | 93.45 ± 7.17 | 0.016
DenseNet | 94.14 ± 2.05 | 79.08 ± 2.58 | 83.20 ± 1.59 | 92.43 ± 2.56 | 87.21 ± 11.53 | 7 × 10⁻⁴
SLENet | 99.63 ± 0.35 | 93.53 ± 0.88 | 95.03 ± 0.78 | 96.86 ± 1.15 | 96.26 ± 4.18 | \
Note: p-values are shown in decimal; values smaller than 0.01 are presented in scientific notation.
Table 5. Classification performance table in ablation experiments (% ↑).
Baseline | SECA | Non-Local | CBAM | CA | F1 | Accuracy
✓ |   |   |   |   | 94.35 ± 6.69 | 94.20 ± 0.99
✓ | ✓ |   |   |   | 94.78 ± 5.52 | 94.55 ± 0.86
✓ |   | ✓ |   |   | 93.82 ± 4.95 | 94.12 ± 0.56
✓ |   |   | ✓ |   | 91.29 ± 9.36 | 91.29 ± 2.15
✓ |   |   |   | ✓ | 94.50 ± 5.51 | 94.39 ± 0.65
✓ | ✓ | ✓ |   |   | 96.26 ± 4.18 | 96.31 ± 0.43
Table 6. Model complexity metrics.
Model | Parameters | FLOPs | Inference Time (ms)
Baseline | 4.01 M | 6.58 G | 32.32
SLENet | 14.19 M | 9.35 G | 34.58