An Imbalanced Image Classification Method for the Cell Cycle Phase

Abstract: The cell cycle is an important process in cellular life. In recent years, several image processing methods have been developed to determine the cell cycle stages of individual cells. However, most of these methods require the cells to be segmented and their features to be extracted. During feature extraction, important information may be lost, resulting in lower classification accuracy. We therefore used a deep learning method to retain all cell features. To address the insufficient number and the imbalanced distribution of the original images, we used the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) for data augmentation, and a residual network (ResNet), one of the most widely used deep learning classification networks, for image classification. Our method reached a classification accuracy of 83.88% on cell cycle images, 4.48 percentage points higher than the 79.40% reported in previous experiments. On a second dataset used to verify the model, our accuracy was 12.52 percentage points higher than previously reported results. These results show that the new cell cycle image classification system based on WGAN-GP and ResNet is useful for classifying imbalanced images. Moreover, the method could potentially address the low classification accuracy in biomedical imaging that is caused by insufficient numbers of original images and their imbalanced distribution.


Introduction
The cell cycle is an important process in cellular life. The accurate classification of a cell's stage in its cycle is essential for determining cell changes and cellular behavior in different cell stages, as well as for clarifying the principles and regulatory mechanisms of a cell's cycle. The stages of a cell cycle are determined by changes in DNA content and levels of cell-cycle-specific proteins in different cell stages. At present, the most widely used method in cell cycle analysis is flow cytometry [1]. However, flow cytometry only determines the proportion of cells in a certain stage in a group of cells, and it is difficult to track individual cells. Moreover, relevant information pertaining to cell morphology is not obtained through this method.
According to Roukos et al. [2] and Damian et al. [3], the cell cycle stage of a single cell can be determined by calculating its DNA content; however, these methods rely on accurate segmentation of the nucleus. Schönenberger et al. [4] studied the cell cycle by labeling proliferating cell nuclear antigen (PCNA). The fluorescent ubiquitination-based cell cycle indicator (FUCCI) technology proposed by Sakaue-Sawano et al. [5] accurately distinguishes cells in the G1 phase from cells in the S/G2/M phases using two fusion fluorescent proteins. Bajar et al. [6] proposed a FUCCI-based method for analyzing the four cell cycle stages using four-color fluorescence channels. However, labeling specific cyclins usually allows only one specific cell cycle stage to be classified accurately, and a complete analysis of all cell cycle stages requires a combination of multiple staining methods. Ferro et al. [7] performed feature extraction on fluorescence images of cell nuclei and then clustered the various cell forms using the K-means algorithm; finally, they divided the cell cycle into G1, G2, and S phases. Blasi et al. [8] extracted 213 features from an acquired image of a single cell and used a boosting algorithm for machine learning; this predicted DNA content without a fluorescent label and determined the mitotic cell cycle stages. Traditional image processing methods first need to extract features, and the selection of features affects the accuracy of the subsequent classification algorithm. Therefore, feature extraction is arguably the most difficult and important part of the entire algorithm.
Information 2021, 12, 249
In recent years, deep learning has been used more and more widely in cell biology. For instance, Khan et al. [9], Wang et al. [10], and Kurnianingsih et al. [11] all used deep learning to segment and classify cell images. Dürr et al. [12] used convolutional neural networks for high-content screening-based phenotype classification of single-cell images. Although the classification of cellular images has become more popular, only a small number of applications use deep learning to classify the cell cycle. Nagao et al. [13] obtained cell images by staining subcellular structures such as the nucleus, the Golgi apparatus, and the microtubule cytoskeleton, and then used convolutional neural networks to classify the cell cycle. Eulenberg et al. [14] used deep learning to classify single-cell images acquired by imaging flow cytometry into seven cell cycle stages, comprising the phases of interphase (G1, G2, and S) and the phases of mitosis (prophase, anaphase, metaphase, and telophase). They used deep neural networks instead of traditional machine learning methods for classification, obtaining an accuracy of 79.40%. The results of deep learning were better than those of traditional machine learning methods; however, the accuracy of the seven-stage classification still needs to be improved. In addition, the number of images in some stages is too low, the number of samples varies between cell cycle stages, and the distribution of images is particularly uneven. These shortcomings all affect the final classification result, at least to some degree.
Since the duration of each cell cycle phase is different, it is difficult to obtain a balanced data set when collecting cell cycle data. Therefore, it is important to process these original images and make them more balanced. The main problem pertaining to imbalanced classification is that there are too few samples in the minority class, and the information contained in the samples is limited. It is difficult for the neural network to fully learn the characteristics of the samples through training, which will make it difficult to identify the minority class. Sampling is the most popular method for the processing of imbalanced data sets. There are several methods of over-sampling, under-sampling, and combined sampling [15,16]. Over-sampling augments the categories with fewer images and increases the number of images. Under-sampling reduces the number of images for those categories with more images. Combined sampling uses over-sampling and under-sampling simultaneously. The generative adversarial network (GAN) is an oversampling method that has seen a great deal of recent use in biomedical research. Frid-Adar et al. [17] used GANs for data augmentation of a liver lesion image dataset. Saini et al. [18] used a deep convolution generative adversarial network (DCGAN) for the data augmentation of the minority class in a breast cancer dataset. Moran et al. [19] proposed a model called the transferring of the pre-trained generative adversarial network (TOP-GAN) to solve the problem of small training datasets and applied the model for the classification of cancer cells. Zheng et al. [20] used CWGAN-GP for data augmentation and to solve the problem of classification in relation to imbalanced datasets.
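As a minimal sketch of the sampling strategies described above, the following Python snippet applies random under-sampling to a majority class and random over-sampling to a minority class. The function names, class labels, and target size are hypothetical and chosen only for illustration. Over-sampling is shown here as plain duplication with replacement, whereas this paper synthesizes new minority-class images with a GAN.

```python
import random


def random_under_sample(items, target, rng):
    """Keep a random subset of `target` items (drawn without replacement)."""
    return rng.sample(items, target) if len(items) > target else list(items)


def random_over_sample(items, target, rng):
    """Pad the class to `target` items by drawing duplicates with replacement."""
    extra = [rng.choice(items) for _ in range(max(0, target - len(items)))]
    return list(items) + extra


def combined_sample(dataset, target, seed=0):
    """Bring every class in {label: [samples]} to exactly `target` samples."""
    rng = random.Random(seed)
    balanced = {}
    for label, items in dataset.items():
        if len(items) > target:
            balanced[label] = random_under_sample(items, target, rng)
        else:
            balanced[label] = random_over_sample(items, target, rng)
    return balanced


# Toy imbalanced dataset: integers stand in for images.
toy = {"majority": list(range(1000)), "minority": list(range(20))}
balanced = combined_sample(toy, target=100)
print({label: len(items) for label, items in balanced.items()})
```

Combined sampling in this sketch simply chooses under-sampling or over-sampling per class, depending on whether the class is larger or smaller than the target size.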
To address the insufficient number of images and their extremely imbalanced distribution, we propose a new cell cycle classification system based on the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) [21] and a residual network (ResNet) [22]. The new cell images generated by WGAN-GP and the original cell images are processed together by ResNet in order to classify the cell cycle stage. The rest of the paper is organized as follows. The method of data augmentation and the deep neural networks for cell cycle classification are introduced in Section 2. The dataset used for the experiment and the parameters are described in Section 3. The results of WGAN-GP data augmentation and the experimental results of cell cycle classification are presented in Section 4. The results are discussed in Section 5, and the conclusion is given in Section 6.

Method
The contributions of the cell cycle classification method proposed in this paper are as follows: WGAN-GP is used to address the insufficient number of cell images and their imbalanced distribution, reducing the impact of the imbalance on classification. A new cell cycle classification architecture using WGAN-GP and ResNet is proposed, and better results are obtained compared with previous methods. Figure 1 shows an overview of the structure of our system.

WGAN-GP
The generative adversarial network (GAN) was proposed by Goodfellow et al. [23]. A GAN contains two networks, a discriminator and a generator. The discriminator is used to distinguish original images from generated images, and the role of the generator is to make the discriminator unable to recognize the generated images. The Wasserstein generative adversarial network (WGAN) is a GAN-based network structure proposed by Arjovsky et al. [21]. In WGAN, the Wasserstein distance is used to measure the distance between the original image distribution and the generated image distribution, which largely solves the problem of unstable GAN training. Gulrajani et al. [24] proposed adding a gradient penalty (WGAN-GP) to solve the problems of vanishing and exploding gradients. Compared to WGAN, WGAN-GP converges faster and trains more stably, leading to higher sample quality.
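For reference, the critic objective of WGAN-GP as formulated by Gulrajani et al. [24] can be written as follows. The symbols are those of that formulation and are not defined elsewhere in this text: $D$ is the critic (discriminator), $\mathbb{P}_r$ and $\mathbb{P}_g$ are the original and generated image distributions, $\hat{x}$ is a random interpolate between an original and a generated image, and $\lambda$ is the penalty coefficient. The value of $\lambda$ used in the experiments is not stated in this section.

```latex
L = \underset{\tilde{x} \sim \mathbb{P}_g}{\mathbb{E}}\left[ D(\tilde{x}) \right]
  - \underset{x \sim \mathbb{P}_r}{\mathbb{E}}\left[ D(x) \right]
  + \lambda \, \underset{\hat{x} \sim \mathbb{P}_{\hat{x}}}{\mathbb{E}}
    \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The first two terms estimate the Wasserstein distance between the two distributions, and the third term penalizes critic gradients whose norm deviates from 1.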
At present, GANs such as WGAN have been successfully applied in the classification of imbalanced biomedical images. For example, Ma et al. [25] used a deep convolutional generative adversarial network (DC-GAN) for the data augmentation of white blood cell images, which improved classification accuracy. Dimitrakopoulos et al. [26] proposed a new GAN-based model for data augmentation that simultaneously produces synthetic cell images and their segmentation maps. In addition, Chen et al. [27] used WGAN to denoise cell images and obtained cell images with clear features, providing a practical basis for generating cell cycle images with WGAN-GP.

ResNet
ResNet was proposed by He et al. [22]. By adding direct connections that skip certain layers, it resolved the problem of vanishing gradients caused by increasing network depth. ResNet achieved the best results in the ImageNet Large Scale Visual Recognition Challenge 2015 (ILSVRC 2015) and brought breakthroughs in performance in many fields, including image recognition, image detection, and image localization. ResNet has been widely applied in biomedical imaging, where it has been used for cell classification [28,29], cell detection [30,31], and early cancer detection [32,33], among other tasks.
In this work, a 41-layer ResNet was used to classify the cell cycle stage. Our structure was based on the model created by He et al. [22] and the residual module proposed by He et al. [34]. Figure 2 shows the model's structure. In the residual module, the first CONV layer had 1 × 1 filters, the second had 3 × 3 filters, and the third had 1 × 1 filters.
Our ResNet was built by stacking residual modules on top of one another in three groups of 3, 3, and 4 modules. In the first group of three residual modules, the three CONV layers learned 32, 32, and 128 filters. In the second group of three, they learned 64, 64, and 256 filters. In the final group of four, they learned 128, 128, and 512 filters. The dimensions were reduced each time a new group of residual modules was stacked. Moreover, one CONV layer was added before the residual modules, and one FC layer was added at the end of the model. As a result, our ResNet had a depth of 41 layers. The depth of the model can be changed by varying the number of residual modules. The structure of our model is shown in Figure 3.

Figure 3. The structure of our ResNet model.
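The stacking scheme described above can be written down as a small configuration table. The following sketch only enumerates the CONV layers inside the residual modules from the numbers given in the text; the names `STAGES`, `KERNELS`, and `residual_conv_specs` are hypothetical. It does not build a trainable network and does not attempt to reproduce the reported 41-layer depth, because the text does not itemize which layers are counted.

```python
# Stage layout taken from the text: (number of residual modules,
# filters learned by the three CONV layers of each module).
STAGES = [
    (3, (32, 32, 128)),
    (3, (64, 64, 256)),
    (4, (128, 128, 512)),
]
# Kernel sizes of the three CONV layers in a residual module: 1x1, 3x3, 1x1.
KERNELS = (1, 3, 1)


def residual_conv_specs(stages=STAGES, kernels=KERNELS):
    """List (stage, module, kernel_size, filters) for every CONV in the modules."""
    specs = []
    for stage_idx, (num_modules, filters) in enumerate(stages, start=1):
        for module_idx in range(1, num_modules + 1):
            for kernel, num_filters in zip(kernels, filters):
                specs.append((stage_idx, module_idx, kernel, num_filters))
    return specs


specs = residual_conv_specs()
for stage_idx, module_idx, kernel, num_filters in specs[:3]:
    print(f"stage {stage_idx}, module {module_idx}: "
          f"{kernel}x{kernel} CONV, {num_filters} filters")
print("CONV layers inside residual modules:", len(specs))
```

Changing the module counts in `STAGES` changes the number of enumerated layers, which mirrors the remark that the depth of the model depends on the number of residual modules.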

Experiment
The whole experiment included two parts, namely, dataset generation and model training.

Dataset
A total of 32,266 original images of Jurkat cells were collected by imaging flow cytometry [14] (Jurkat dataset). The dataset was divided into seven stages: three phases of interphase (G1, G2, and S) and four phases of mitosis (prophase, anaphase, metaphase, and telophase). Figure 4 shows original images of the different cell cycle stages. The study [14] first treated the G1, G2, and S phases as a single stage. Interphase (G1/G2/S) and the four phases of mitosis were then classified, and the accuracy of this five-stage classification was 98.73% ± 0.16%. However, when the interphase stage was separated into its three phases (G1, G2, and S) and all seven stages were classified, the accuracy dropped to 79.40% ± 0.77%. Thus, although combining the G1, G2, and S phases into one stage gave a higher five-stage accuracy, it did not achieve an accurate classification of the individual cell cycle stages. To classify the cell cycle, it is necessary not only to separate stages with markedly different morphological details, such as the phases of mitosis, from the other stages, but also to separate the phases of interphase (G1, G2, and S), which have similar morphological details.
In addition, it was clear from the original images that the number of images in the anaphase, metaphase, prophase, and telophase stages was too low and that the amount of data varied greatly between stages, leading to inaccurate classification results. Therefore, based on the distribution of the original images in each stage, we used WGAN-GP to increase the number of anaphase, metaphase, prophase, and telophase images tenfold. To achieve a relative balance in the number of images per cell cycle stage, random under-sampling was applied to the G1 phase, and 8610 G1 images were used for classification. The number of generated images and the number of images used for classification are shown in Table 1. After the seven-stage classification, the images were regrouped into four stages: the three phases of interphase (G1, G2, and S) and mitosis (M). The anaphase, metaphase, prophase, and telophase images were combined into a single stage, M, and the resulting dataset is shown in Table 2. It can be seen from Table 2 that the numbers of images in the four stages are balanced.

Model Training
The WGAN-GP was used to train the four stages of anaphase, metaphase, prophase, and telophase, and the batch sizes were set to 4, 16, 16, and 4, respectively, according to the number of original images. In all, 5000 epochs were set for each training process. Subsequently, the WGAN-GP model was used to generate 150, 680, 6060, and 270 images for the four stages of the cell cycle, meaning that the images for each stage were increased tenfold. For the four-stage classification of the cell cycle, the WGAN-GP model was used to generate 7,160 images of the M stage. During the process of training for WGAN-GP, the batch size was 16, and the training epoch was 5000.
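The generated-image counts quoted above can be checked with a few lines of arithmetic. The sketch assumes that the four counts are listed in the same order as the stages (anaphase, metaphase, prophase, telophase) and that "tenfold" means ten generated images per original image; the per-stage original counts it prints are implied by that assumption rather than quoted from the text.

```python
# Generated image counts per mitotic stage, as quoted in the text.
generated = {"anaphase": 150, "metaphase": 680, "prophase": 6060, "telophase": 270}

# Assumption: "tenfold" means ten generated images per original image.
FACTOR = 10
implied_originals = {stage: count // FACTOR for stage, count in generated.items()}

total_generated = sum(generated.values())
print("implied original counts:", implied_originals)
print("generated M-stage images:", total_generated)
```

The total agrees with the 7,160 generated M-stage images reported for the four-stage classification.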
The parameters of the network for classification were randomly initialized. The original size of the images was 66 × 66 × 1. All of the images were resized to 64 × 64 × 1 and divided into mini-batches for training. During the process of training for classification, the batch size was 32, the initial learning rate was 0.01, and the momentum was 0.9. The optimization strategy was the stochastic gradient descent method, and the default activation function was ReLU in the entire network.
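To illustrate the optimizer settings, the sketch below applies stochastic gradient descent with momentum, using the learning rate (0.01) and momentum (0.9) stated above, to a one-parameter toy loss. The classical update form used here and the toy loss are assumptions made for illustration; the exact variant implemented by the framework used in the experiments is not specified in the text.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.01, momentum=0.9):
    """One classical momentum update: v <- momentum * v - lr * grad; w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


def grad_toy_loss(weight, target=3.0):
    """Gradient of the toy loss (weight - target) ** 2."""
    return 2.0 * (weight - target)


weight, velocity = 0.0, 0.0
for _ in range(500):
    weight, velocity = sgd_momentum_step(weight, velocity, grad_toy_loss(weight))
print(f"weight after 500 steps: {weight:.4f}")
```

In mini-batch training, `grad` would be the gradient averaged over one batch of 32 images rather than the exact gradient of a toy function.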
For classification, 60% of the images were used as the training set, 20% as the validation set, and the remaining 20% as the testing set. For the four-stage classification, the number of images used for classification was 34,427: 20,657 images were used for training, 6885 for validation, and 6885 for testing.
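The subset sizes can be reproduced with a short calculation. In this sketch the total is taken as the sum of the three subset sizes quoted above, and the rounding convention (validation and test rounded down, training receives the remainder) is an assumption that happens to match the quoted figures; `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_total, val_pct=20, test_pct=20):
    """Return (train, val, test) sizes; validation and test are rounded down
    and the training set receives the remainder."""
    n_val = n_total * val_pct // 100
    n_test = n_total * test_pct // 100
    return n_total - n_val - n_test, n_val, n_test


n_total = 20_657 + 6_885 + 6_885  # sum of the three subset sizes quoted above
print(n_total, split_sizes(n_total))
```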
The environment for the experiments was Python 3.6, and the operating system was Linux with an Intel (R) Xeon (R) CPU E5-2682 v4 @ 2.50GHz processor, 32GB memory, and a Tesla P100-PCIE-16GB graphics card. The experiments were based on the open-source deep learning framework TensorFlow-gpu 2.0.0a0 and Keras 2.3.1.

Results of Generated Images by WGAN-GP
The images of the four stages (anaphase, metaphase, prophase, and telophase) generated by WGAN-GP were almost the same as the original cell cycle images, so they could be used for subsequent cell cycle classification. Figure 5 shows the images generated by WGAN-GP.

In order to verify the effectiveness of WGAN-GP, the following sets of classification experiments were conducted. The compared results are shown in Tables 3 and 4. The results for the original images, the images generated by WGAN-GP, the original images after under-sampling, and the images generated by WGAN-GP after under-sampling were compared with each other.

Table 3. Seven-stage classification results of the original images, the images generated by WGAN-GP, the original images after under-sampling, and the images generated by WGAN-GP after under-sampling.

Table 4. Four-stage classification results for the original images, the images generated by WGAN-GP, the original images after under-sampling, and the images generated by WGAN-GP after under-sampling.

As shown in Tables 3 and 4, the seven-stage and four-stage classification accuracies of the original images were 78.37% and 78.35%, respectively. The seven-stage and four-stage classification accuracies of the images generated by WGAN-GP were 82.25% and 82.10%, respectively. The seven-stage and four-stage classification accuracies thus improved by 3.88 and 3.75 percentage points, respectively.
In order to obtain balanced images, random under-sampling was applied to the G1 stage. The seven-stage and four-stage classification accuracies of the original images after under-sampling were 78.32% and 77.16%, respectively. The seven-stage and four-stage classification accuracies of the images generated by WGAN-GP after under-sampling were 83.60% and 83.88%, respectively. The seven-stage and four-stage classification accuracies were thus improved by 5.28 and 6.72 percentage points, respectively. From these results, it was clear that both the seven-stage and the four-stage classification accuracy were improved by WGAN-GP.
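The improvements quoted above are differences in percentage points between the accuracy without and with WGAN-GP. A short check, using only the accuracies stated in the text (the dictionary keys are descriptive labels chosen for this sketch):

```python
# Accuracies (%) quoted in the text: (without WGAN-GP, with WGAN-GP).
results = {
    "seven-stage": (78.37, 82.25),
    "four-stage": (78.35, 82.10),
    "seven-stage, G1 under-sampled": (78.32, 83.60),
    "four-stage, G1 under-sampled": (77.16, 83.88),
}

gains = {name: round(after - before, 2) for name, (before, after) in results.items()}
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
```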
Moreover, when the M-stage images were original and the G1 stage was under-sampled, the four-stage classification accuracy decreased by about 1.15%. In contrast, when WGAN-GP was used to augment the M-stage images and the G1 stage was under-sampled, so that the number of images in each stage was roughly balanced, the four-stage classification accuracy was almost unaffected. This comparison shows that the classification accuracy can be effectively improved by using WGAN-GP for data augmentation.

Results of Classification
For imbalanced image classification, accuracy alone does not accurately reflect the performance of a classifier. It is necessary to combine it with other evaluation indicators, such as the F-Score, the G-means metric, and the receiver operating characteristic (ROC) curve [35,36]. The F-Score is directly related to recall and precision: it is high only when both recall and precision are high, so the classification performance for both majority and minority categories can be evaluated correctly.
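The evaluation indicators named above can be computed from the four confusion counts TP, FP, FN, and TN. The sketch below uses the standard binary (one class versus the rest) definitions, which are an assumption here since the text does not spell them out; the function name `binary_metrics` and the example counts are hypothetical.

```python
import math


def binary_metrics(tp, fp, fn, tn, beta=1.0):
    """Accuracy, precision, recall, F-score and G-mean from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    b2 = beta * beta
    denom = b2 * precision + recall
    f_score = (1 + b2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "g_mean": math.sqrt(recall * specificity),
    }


# Hypothetical counts: 10 minority (positive) and 990 majority (negative)
# samples, with a classifier that labels everything as the majority class.
print(binary_metrics(tp=0, fp=0, fn=10, tn=990))
```

In this hypothetical case the accuracy is 0.99 although no minority sample is recognized, while the F-score and G-mean are both 0, which is why accuracy alone is not sufficient on imbalanced data.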

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

The ROC curve was drawn with the classification error rate of the majority class as the abscissa and the classification accuracy rate of the minority class as the ordinate. The ROC curve is currently one of the commonly used methods for evaluating the performance of classifiers on imbalanced data sets.
Table 5 shows the seven-stage classification result for the original images, and Table 6 shows the seven-stage classification result after using WGAN-GP. In Table 5, the precision of anaphase, metaphase, and telophase was almost 0. One reason was that the number of original images for these stages, and hence the number of images in the test set, was too low. Another reason might lie in the acquisition of the original images. The cells were changing dynamically while the images were acquired, so an image of a certain stage might show not only the characteristics of that stage but also characteristics of other stages. This made the original images difficult to classify correctly. The obtained results had large deviations, and the weighted average accuracy of the classification was 78.35%. The accuracy of each stage was 0, 83.16%, 84.56%, 0, 85.21%, 67.65%, and 0. In Table 6, the images generated by WGAN-GP for anaphase, metaphase, prophase, and telophase were used for classification, and the weighted average accuracy of the classification was 82.10%. Compared with the accuracy for the original images, the average accuracy increased by 3.75 percentage points. Additionally, the accuracy of each stage was [...]. It can be seen from Tables 5 and 6 that the classification accuracies of the anaphase, metaphase, prophase, and telophase stages were significantly improved by WGAN-GP. In addition, the combined dataset (G1, G2, M, and S phases) was used for classification.
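The ROC construction described above, with the error rate on the majority class (false positive rate) on the abscissa and the accuracy on the minority class (true positive rate) on the ordinate, can be sketched as follows. The labels and scores are toy values, and `roc_points` and `auc` are hypothetical helper names; this is a simple threshold sweep for illustration, not the plotting code used for the figures.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    `labels` are 1 for the positive (minority) class and 0 otherwise; a sample
    is predicted positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
pts = roc_points(labels, scores)
print(pts, auc(pts))
```

For a multi-class problem such as the four-stage classification, one such curve can be drawn per stage by treating that stage as the positive class and all other stages as negative.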
The classification results for the original images of the M stage and for the generated images of the M stage are shown in Tables 7 and 8, respectively. In Table 7, the weighted average accuracy of the classification was 77.16%, and the accuracy of each stage was 82.47%, 82.05%, 64.83%, and 67.99%. In Table 8, the generated images of the M stage were used for classification, and the weighted average accuracy of the classification was 83.88%. Compared with the accuracy of the original images, the weighted average accuracy increased by 6.72%. Additionally, the accuracy of each stage was 82.44%, 84.92%, 99.94%, and 68.25%. It can be seen from Tables 7 and 8 that the classification accuracy of the M stage was significantly improved by WGAN-GP. Figure 6 shows the training and validation accuracy over the training epochs. Figure 7 shows the results as a confusion matrix. Figure 8 shows the ROC curve for four-stage classification.
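
The ROC construction described above can be sketched in a few lines of Python. This is an illustrative one-vs-rest version, assuming that one stage (for example, the M stage) is treated as the positive class and that the classifier outputs one score per image. The labels and scores below are made up, and the function names are ours for illustration only.

```python
def roc_points(labels, scores):
    """Sweep a decision threshold from high to low and collect (FPR, TPR) points.

    labels: 1 for the positive (e.g. minority) class, 0 for all other classes.
    scores: classifier score for the positive class, higher means more positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples sharing the same score are counted
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k + 1] - fpr[k]) * (tpr[k + 1] + tpr[k]) / 2
               for k in range(len(fpr) - 1))


# Hypothetical scores for six images, three of which belong to the positive class
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_points(labels, scores)
print(list(zip(fpr, tpr)))
print("AUC:", auc(fpr, tpr))
```

For a multi-stage problem, one such curve would be computed per stage, each time treating that stage as positive and all remaining stages as negative.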

Verification of Results with New Dataset
In order to verify the effectiveness of our model, another cell cycle data set was used. Nagao et al. [13] collected fluorescence microscope images of cells in different cell cycle phases containing subcellular structures, such as the nucleus, the Golgi apparatus, and the microtubule cytoskeleton (HeLa dataset). The classification of different cycle stages could be carried out by extracting the characteristics of these subcellular structures. The data set contained only two categories, namely G2 and non-G2. The cell cycle images of the G2 phase were regarded as one class, and the images of the G1 phase and the S phase were regarded as the other class. The data set did not contain images of the M phase. The G2 class and the non-G2 class each contained 922 images. The original images of the G2 class and the non-G2 class, together with the images generated by WGAN-GP for both classes, are shown in Figure 9.
Although the original images of this data set were balanced, data augmentation was carried out on this data set to verify the effects of WGAN-GP. For each class, WGAN-GP was used to generate 10,000 images, from which 9220 images were selected by random under-sampling. These 9220 images per class were then used for classification. The results of the classification for the original images and generated images are shown in Table 9.
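
The under-sampling step can be sketched as follows. This is a minimal illustration, assuming that the generated images of each class are available as a list of file names (the names below are hypothetical placeholders) and that the selection is uniform without replacement. The seed is arbitrary.

```python
import random


def undersample(items, target, seed=0):
    """Randomly keep `target` items, drawn uniformly without replacement."""
    if target > len(items):
        raise ValueError("target exceeds the number of available items")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(items, target)


# Hypothetical file names standing in for 10,000 generated images per class
generated = {
    "G2": [f"G2_gen_{i:05d}.png" for i in range(10000)],
    "non-G2": [f"nonG2_gen_{i:05d}.png" for i in range(10000)],
}

# Keep 9220 generated images per class, as described in the text
balanced = {cls: undersample(paths, 9220, seed=42)
            for cls, paths in generated.items()}

for cls, paths in balanced.items():
    print(cls, len(paths))
```

Sampling without replacement guarantees that no generated image is used twice, and applying the same target to both classes keeps the augmented set balanced.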

Figure 9. The images of the G2 class and the non-G2 class. (a) Images of G2 class [13]; (b) images of non-G2 class [13]; (c) images generated by WGAN-GP of G2 class; (d) images generated by WGAN-GP of non-G2 class.
As shown in Table 9, the average accuracy of classification for original images was 87.63%, and the average accuracy of classification for generated images was 97.65%. Compared with the accuracy of original images, the average accuracy increased by 10.02%. The classification accuracy was significantly improved by WGAN-GP. Figure 10 shows the training and validation accuracy with training epochs. Figure 11 is the result as represented by a confusion matrix. Figure 12 shows the ROC curve for classification.

Discussion
To verify the effect of our method, our classification results were compared with those of existing methods in the literature on the same data sets, and the results are shown in Tables 10-12. In Table 10, the classification accuracy of anaphase, metaphase, prophase, and telophase improved by 80%, 69.49%, 38.62%, and 3.71%, respectively. In Table 11, the classification accuracy of the M phase improved by 55.9%. In Table 12, the classification accuracy improved by 12.52%. In all three comparisons, the classification accuracy was significantly improved by WGAN-GP. Therefore, WGAN-GP can be used to improve the classification of imbalanced cell cycle phases.

According to the original images of the Jurkat dataset, it was apparent from their characteristics that, except for the phase of mitosis (M), the cell cycle images of the other stages were difficult to distinguish, even for experts in the field of cell cycles. If the grayscale images of different stages were placed in the order of the cell cycle phases and the experts were informed of this, the stages might be classified by some morphological features. However, if the grayscale images of different stages were placed randomly, it was difficult for experts to classify the different stages. This was precisely because the differences between images of different cell cycle stages were not obvious. For the same reason, it was difficult to further improve classification accuracy with ResNet.
In general, determining the cell cycle phase requires the fluorescent labeling of cells, and fluorescent staining is a very complicated process. In this study, we used a deep learning framework to classify brightfield images without fluorescent staining, which makes it easier to recognize cells in different stages and reduces the operational difficulty of cell cycle classification. Furthermore, different phases of the cell cycle last for different durations, which inevitably leads to an imbalance in the number of images acquired at different stages. The use of a WGAN-GP could therefore address problems related to imbalanced cell cycle images. From the perspective of practical applications in the field, the use of a WGAN-GP is thus of great significance for the classification of the cell cycle.
These problems also reflected the difficulty in obtaining biomedical images. In some cases, time and money were required to obtain sufficient images; without high-quality images, it might be difficult to perform subsequent experiments. Follow-up experiments would certainly benefit if they were to use our method for data augmentation.

Conclusions
In this paper, deep learning technology was applied to the field of cell cycle classification, and a cell cycle classification framework based on the combination of WGAN-GP and ResNet was used. This combination yielded better classification results than the original classification framework. The WGAN-GP was used for data augmentation, and the ResNet was used for classification. The Jurkat dataset was used for the seven-stage and four-stage classification of the cell cycle, and better classification results were obtained than those found in previous papers. Additionally, another dataset (HeLa dataset) was used to validate the results of our model. By introducing the WGAN-GP network to generate additional cell cycle images, the problem of insufficient original images was solved. The imbalance between different cell cycle stages was reduced, and classification accuracy was improved.
In the future, we will continue to improve the structure of the network for classification, and we will try to use a network other than WGAN-GP for data augmentation. We will use other methods to obtain cell cycle images without a fluorescent label, and we will classify them in this framework to further improve the classification accuracy of the cell cycle, finally achieving the label-free classification of cell cycle images.

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.