Optical Machine Learning Using Time-Lens Deep Neural Networks

Abstract: As a high-throughput data analysis technique, photonic time stretching (PTS) is widely used in the monitoring of rare events such as cancer cells and rogue waves, and in the study of electronic and optical transient dynamics. PTS relies on high-speed data acquisition, and the large volume of data it generates poses a challenge to storage and real-time processing. It is therefore particularly important to filter and process the data in advance with compatible optical methods. The time-lens, an important data processing method derived from PTS and based on the duality of time and space, achieves imaging of temporal signals by controlling the phase information of the timing signals. In this paper, an optical neural network based on the time-lens (TL-ONN) is proposed, which applies the time-lens to the layer algorithm of a neural network to realize the forward transmission of one-dimensional data. The speech recognition capability of this optical neural network is verified by simulation, and the test recognition accuracy reaches 95.35%. This architecture can be applied to feature extraction and classification, and is expected to be a breakthrough in detecting rare events such as cancer cell identification and screening.


Introduction
Recently, artificial neural networks (ANNs) have developed rapidly and extensively. As the fastest-developing computing method in artificial intelligence, deep learning has made remarkable achievements in machine vision [1], image classification [2], game theory [3], speech recognition [4], natural language processing [5], and other areas. The use of elementary particles for data transmission and processing can lead to smaller equipment, greater speed, and lower energy consumption. The electron is the most widely used particle to date, and has become the cornerstone of the information society in signal transmission (cable) and data processing (electronic computers). Artificial intelligence chips represented by graphics processing units (GPUs), application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs) have enabled electronic neural networks (ENNs) to achieve high precision and fast convergence in regression and prediction tasks [6]. However, when dealing with tasks of high complexity and high data volume, ENNs exhibit shortcomings that are difficult to overcome, such as long time delays and low power efficiency caused by the interaction of the network's many parameters with the storage modules of electronic devices. Based on the duality of time and space (the diffraction of a spatial light beam and the dispersion of an optical pulse are equivalent), the imaging of a time signal can be realized by controlling the phase information of the timing signal, namely the time-lens. We establish a numerical model for simulation analysis to verify the feasibility of this architecture. By training on 20,000 sets of speech data, we obtained a stable 98% recognition accuracy within one training cycle, which shows the obvious advantages of faster convergence and stable recognition accuracy compared with a deep neural network (DNN) with the same number of layers.
This architecture implemented with all-optical components will offer outstanding improvements in biomedical science, cell dynamics, nonlinear optics, green energy, and other fields.
Here, we first introduce the architectural composition of the proposed TL-ONN, and then combine the time-lens principle with the neural network to drive the forward propagation and reverse optimization process. Finally, we use a speech dataset to train the proposed TL-ONN, and use numerical calculation to verify the classification function of this architecture.

Materials and Methods
The proposed ONN combines a conventional neural network with time stretching, realizing a deep learning function based on optics. As shown in Figure 1, two kinds of operations, the time-lens transform and matrix multiplication, must be performed in each layer. The core optical structure, which adopts the time-lens method, implements the first linear computation process. After that, the results are modulated by a weight matrix. Finally, the outputs serve as the input vector of the next layer. After calculation by the neural network composed of multiple time-lens layers, all input data are probed by a detector in the output layer. The prediction and the target output are compared by the cost function, and the gradient descent algorithm is applied to each weight matrix (W2) through backward propagation to achieve the optimal neural network structure. The input data of this network structure are generally one-dimensional time-domain data. In the input layer, the physical information at each point in the time series is transferred to the neurons of the first layer. Through the optical algorithm, the information between neurons in adjacent layers is transmitted to realize the information processing behavior of the neural network. The input data first pass through the first segment of dispersion fiber, undergo phase modulation W1 or W2 after the dispersive Fourier transform (the modulators reach the optimal solution of the network after deep learning), and finally pass through the second segment of dispersion fiber to complete the data transmission of each time-lens layer.
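The layer operations described above can be sketched numerically. The following is a minimal toy model, not the paper's implementation: it assumes elementwise weights for the W2 step, arbitrary fiber parameters, and a random phase modulation standing in for a trained one.

```python
import numpy as np

def dispersion(field, beta2, z, dt):
    """Apply the dispersive-fiber transfer function exp(-i*beta2*z*w^2/2)
    to a complex time-domain field sampled at interval dt."""
    omega = 2 * np.pi * np.fft.fftfreq(field.size, d=dt)
    return np.fft.ifft(np.fft.fft(field) * np.exp(-0.5j * beta2 * z * omega**2))

def time_lens_layer(field, beta2a, z1, phase_mod, beta2b, z2, weights, dt):
    """One TL-ONN layer: first fiber -> phase modulation -> second fiber ->
    elementwise weighting (the weight-matrix step, simplified)."""
    x = dispersion(field, beta2a, z1, dt)
    x = x * np.exp(1j * phase_mod)      # learned phase modulation
    x = dispersion(x, beta2b, z2, dt)
    return x * weights                  # learned amplitude weights

# Toy example: propagate a Gaussian pulse through one layer.
rng = np.random.default_rng(0)
t = np.linspace(-10, 10, 512)
dt = t[1] - t[0]
pulse = np.exp(-t**2).astype(complex)
out = time_lens_layer(pulse, beta2a=0.5, z1=2.0,
                      phase_mod=rng.uniform(0, 0.1, t.size),
                      beta2b=-0.5, z2=2.0,
                      weights=np.ones(t.size), dt=dt)
# With unit weights every step is unitary, so pulse energy is conserved.
print(np.allclose(np.sum(np.abs(out)**2), np.sum(np.abs(pulse)**2)))  # True
```

With unit weights, each stage only reshapes the field in time, which is why the pulse energy check holds; training would replace the random phases and unit weights with optimized values.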
Figure 1. (a) Time-lens layer structure. The input data first pass through the first segment of dispersion fiber, undergo phase modulation W1 or W2 after the dispersive Fourier transform (the modulators reach the optimal solution of the network after deep learning), and finally pass through the second segment of dispersion fiber to complete the data transmission of each time-lens layer. β2a, β2b: the group-velocity dispersion of fiber 1 and fiber 2, respectively. W1, W2: the phase modulations. (b) TL-ONN structure. It comprises multiple time-lens layers. All time points on one layer can be regarded as neurons, and the neurons are transmitted through dispersion. L1, L2, ..., Ln: layers. D1, D2, ..., Dn: detectors.
Like the diffraction of light in space, dispersion plays the corresponding role for the time-lens in time. As a result, the time-lens [32] can image a light pulse on the time scale. This is similar to the way the neurons in each layer of a neural network are derived from the neurons in the previous layer through a specific algorithm: the amplitude and phase at each point of the pulse after the time-lens are computed from all points of the previous pulse. Based on this algorithm, an optical neural network based on the time-lens is designed. Each neural network layer is formed by two segments of dispersive fiber and a second-order phase factor. Adjacent layers are connected through an intensity or phase modulator. In backward propagation, each modulation factor is optimized by the gradient descent algorithm to obtain the best architecture.

Time-Lens Principle and Simulation Results
Analogous to the process by which a thin lens can image an object in space, a time-lens can image sequences in the time domain, such as laser pulses and sound sequences. In this section, we will introduce the principle of a time-lens starting from the propagation of narrow-band light pulses.
Assuming that the propagation area is infinite, the electric field envelope E(x, y, z, t) of a narrow-band laser pulse with a center frequency of ω₀, propagating in space coordinates (x, y, z) and time t, satisfies

E(x, y, z, t) = A(x, y) \exp[i(\beta(\omega)z - \omega t)], (1)

where A(x, y) is the electric field envelope of the input light pulse, β(ω) is the dispersion coefficient, and ω represents the angular frequency. Expanding the dispersion coefficient β(ω) in a Taylor series around ω₀ and retaining terms up to the second order, the frequency spectrum Λ(z, ω) after Fourier transformation can be described as

\Lambda(z, \omega) = \Lambda(0, \omega) \exp\{ i [\beta(\omega_0) + \beta_1(\omega - \omega_0) + \tfrac{\beta_2}{2}(\omega - \omega_0)^2 ] z \}. (2)

Then, we perform the inverse Fourier transform on (2) to obtain the time-domain pulse envelope:

\frac{\partial A(z,t)}{\partial z} + \frac{1}{V_g}\frac{\partial A(z,t)}{\partial t} - i\frac{\beta_2}{2}\frac{\partial^2 A(z,t)}{\partial t^2} = 0, (3)

where V_g is the group velocity, V_g = dω/dβ. If we establish a new coordinate frame that moves at the group velocity of the light, the corresponding transformation can be described as

T = t - t_0 - \frac{z - z_0}{V_g}, (4)

where t₀ and z₀ are the time and space initial points, respectively. Under this circumstance, (3) can be simplified as

\frac{\partial A(z,T)}{\partial z} = i\frac{\beta_2}{2}\frac{\partial^2 A(z,T)}{\partial T^2}. (5)

Then, we can obtain the signal envelope by Fourier transform:

A(z, T) = \frac{1}{\sqrt{2\pi i \beta_2 z}} \int A(0, \tau) \exp\!\left[ i\frac{(T-\tau)^2}{2\beta_2 z} \right] d\tau, (6)

where τ is the time integration variable and i is the imaginary unit. It can be seen from the time-domain envelope equation (6) that the time-lens algorithm applies a second-order phase modulation to the independent variable T. As with the space lens, the diffraction equation of a paraxial beam in space and the propagation equation of a narrow-band optical pulse in a dispersive medium both modulate their independent variables (x, y, and t, respectively) to second order. The time-lens mainly comprises three parts: the second-order phase modulator and the dispersion media before and after the modulator (Figure 2a).
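The dispersive propagation described above can be checked numerically: multiplying the spectrum of a Gaussian pulse of width T0 by a quadratic phase factor exp(-i*beta2*z*w^2/2) broadens its envelope by the textbook factor sqrt(1 + (beta2*z/T0^2)^2). The sketch below uses arbitrary toy parameters:

```python
import numpy as np

# Propagate a Gaussian pulse A(0,T) = exp(-T^2 / (2*T0^2)) through dispersion
# by multiplying its spectrum with exp(-i*beta2*z*w^2/2), then compare the
# broadened width against T1 = T0 * sqrt(1 + (beta2*z / T0^2)^2).
T0, beta2, z = 1.0, 0.5, 3.0
t = np.linspace(-60, 60, 2**13)
dt = t[1] - t[0]
A0 = np.exp(-t**2 / (2 * T0**2)).astype(complex)

omega = 2 * np.pi * np.fft.fftfreq(t.size, d=dt)
Az = np.fft.ifft(np.fft.fft(A0) * np.exp(-0.5j * beta2 * z * omega**2))

def rms_width(a):
    """RMS width of the intensity profile |a|^2."""
    p = np.abs(a)**2
    p = p / p.sum()
    mu = (t * p).sum()
    return np.sqrt(((t - mu)**2 * p).sum())

ratio = rms_width(Az) / rms_width(A0)
expected = np.sqrt(1 + (beta2 * z / T0**2)**2)
print(abs(ratio - expected) < 1e-3)  # True
```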
In the dispersion medium parts, the pulse passing through a long dispersion fiber is equivalent to the pulse being modulated in the frequency domain by a factor determined by the fiber length and the second-order dispersion coefficient, which can be expressed as

G_i(Z_i, \omega) = \exp\!\left( -i \frac{\beta_{2i} Z_i \omega^2}{2} \right), (7)

where Z_i and β_{2i} represent the length and the second-order dispersion coefficient of fiber i, respectively. When passing through the time-domain phase modulator, the phase factor satisfying the imaging condition of the time-lens is a quadratic function of time τ,

\varphi_{timelens}(\tau) = \exp\!\left( i \frac{\tau^2}{2 D_f} \right), (8)

where D_f is the focal length of the time-lens satisfying its imaging conditions. By analogy with the space-lens imaging condition, the time-lens imaging condition is

\frac{1}{\beta_{2a} Z_1} + \frac{1}{\beta_{2b} Z_2} = \frac{1}{D_f},

and its magnification can be expressed as M = -\beta_{2b} Z_2 / (\beta_{2a} Z_1) (see Appendix A).
Figure 2b shows a comparison of the duration of a group of soliton pulses and their output from the time-lens at M = 2.5; the peak position and normalized intensity of the pulses are marked to verify the magnification. In summary, after passing through the time-lens, the pulse is 1/√M times as large in amplitude and M times as large in duration, and a second-order phase modulation is added to its phase.
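The imaging condition and magnification can be exercised numerically. The sketch below uses toy parameters (D = β₂Z for each fiber); the sign of the quadratic phase is chosen to match NumPy's FFT conventions and may differ from the text's, but the magnification behavior is the same:

```python
import numpy as np

# Numerical check of the time-lens imaging condition 1/D1 + 1/D2 = 1/Df
# (with D = beta2*Z) and the magnification M = -D2/D1, on a Gaussian pulse.
def disperse(field, D, omega):
    """Dispersion with total group-delay dispersion D = beta2 * Z."""
    return np.fft.ifft(np.fft.fft(field) * np.exp(-0.5j * D * omega**2))

t = np.linspace(-60, 60, 2**13)
dt = t[1] - t[0]
omega = 2 * np.pi * np.fft.fftfreq(t.size, d=dt)

D1 = 1.0                            # input dispersion beta2a * Z1
M = 2.5                             # target magnification
D2 = -M * D1                        # output dispersion from M = -D2/D1
Df = 1.0 / (1.0 / D1 + 1.0 / D2)    # focal GDD from the imaging condition

pulse = np.exp(-t**2 / 2).astype(complex)   # Gaussian input, T0 = 1
x = disperse(pulse, D1, omega)              # first fiber segment
x = x * np.exp(-0.5j * t**2 / Df)           # time-lens quadratic phase
out = disperse(x, D2, omega)                # second fiber segment

def rms_width(a):
    p = np.abs(a)**2
    p = p / p.sum()
    mu = (t * p).sum()
    return np.sqrt(((t - mu)**2 * p).sum())

print(rms_width(out) / rms_width(pulse))  # ~ 2.5, i.e. |M|
```

The output envelope is the input stretched by |M| (with a residual quadratic phase that does not affect the intensity profile), mirroring the soliton example of Figure 2b.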

Mathematical Analysis of TL-ONN
In this section, we analyze the transmission process of the input data across two adjacent time-lens layers. Suppose that the input pulse can be expressed as A(0, t), that is, the initial intensity in time of the pulse entering the first dispersion fiber of the time-lens. The intensity of the input data at each time point is mapped to all time points according to a specific algorithm after the two segments of dispersion fiber in the time-lens and the second-order phase modulation in the time domain. Equation (9) shows the algorithm's result; its derivation can be found in Appendix A. In the neural network based on this algorithm, M represents the magnification factor of the time-lens, β_b and β_f are the second-order dispersion coefficients of the two segments of dispersion fiber, Z_1 and Z_2 are the lengths of the two segments of dispersion fiber, l represents the layer number, and t_k represents all neurons that contribute to the neuron t_i in the lth layer.
The intensity and phase of the neuron t_i in layer l are determined by both the input pulse from layer l − 1 and the modulation coefficient in layer l. For the lth layer of the network, the information on each neuron can be expressed as follows: $m_{t_i}^{l} = \sum_k n_{k,t_i}^{l-1}$ is the input pulse to neuron t_i of layer l, where $n_{k,t_i}^{l-1}$ represents the contribution of the kth neuron of layer l − 1 to the neuron t_i of layer l. $h_{t_i}^{l}$ is the modulation coefficient of the neuron t_i in layer l; the modulation coefficient of a neuron comprises amplitude and phase terms, i.e., $h_{t_i}^{l} = a_{t_i}^{l} \exp(j \varphi_{t_i}^{l})$. The forward model of our TL-ONN architecture is illustrated in Figure 1 and notated as

n_{t_i}^{l} = h_{t_i}^{l} \, m_{t_i}^{l} = h_{t_i}^{l} \sum_k n_{k,t_i}^{l-1},

where t_i refers to a neuron of the lth layer, and k refers to a neuron of the previous layer, connected to neuron t_i by optical dispersion. The input pulse $n_{k}^{0}$, which is located at layer 0 (i.e., the input plane), is in general a complex-valued quantity and can carry information in its phase and/or amplitude channels.
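The per-neuron rule above can be sketched with hypothetical values: the neuron output is the sum of dispersive contributions m multiplied by a complex modulation coefficient h = a·exp(jφ).

```python
import numpy as np

# Toy sketch of the per-neuron forward rule: output = h * m, where
# m = sum_k n_k is the accumulated input and h = a * exp(j*phi) is the
# complex modulation coefficient. All values here are hypothetical.
def neuron_output(contributions, a, phi):
    m = np.sum(contributions)        # m = sum_k n_k (input to the neuron)
    h = a * np.exp(1j * phi)         # complex modulation coefficient
    return h * m

contrib = np.array([0.3 + 0.1j, 0.2 - 0.1j, 0.5 + 0.0j])  # sums to 1 + 0j
n = neuron_output(contrib, a=2.0, phi=np.pi / 2)
print(n)  # ~ 2j: amplitude doubled, phase rotated by pi/2
```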
Assuming that the TL-ONN design is composed of N layers (excluding the input and output planes), the data transmitted through the architecture are finally detected by a photodetector (PD): detectors are placed at the output plane to measure the intensity of the output data. If the bandwidth of the PD is much narrower than the output signal bandwidth, the PD serves not only as an energy-transducing device but also as a pulse-energy accumulator. The final output of the architecture can be expressed as

s_{t_i}^{N+1} = w_{t_i} \left| n_{t_i}^{N} \right|^2,

where $n_{t_i}^{N}$ represents the neuron t_i of the output layer N, and $w_{t_i}$ is the energy accumulation coefficient of the PD on the time axis of the data.
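The detector readout can be sketched as an intensity accumulation. This is a toy example; the uniform weighting w is an assumption.

```python
import numpy as np

# Toy sketch of the PD readout: a narrow-bandwidth photodetector accumulates
# the optical intensity |n|^2 over time, weighted by an (assumed uniform)
# accumulation coefficient w.
def detector_energy(n, w):
    return np.sum(w * np.abs(n)**2)

n_out = np.array([0.0 + 0j, 1.0 + 1.0j, 2.0 + 0j, 0.0 + 0j])  # output field
w = np.ones(n_out.size)                                        # uniform weights
print(detector_energy(n_out, w))  # 0 + 2 + 4 + 0 = 6.0
```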
To train the TL-ONN design, we used the error back-propagation algorithm along with the stochastic gradient descent optimization method. A loss function was defined to evaluate the performance of the network parameters, and the network was trained to minimize it. Without loss of generality, here we focus on our classification architecture and define the loss function E using the cross-entropy error between the output plane intensity $s^{N+1}$ and the target $g^{N+1}$:

E = -\sum_{t_i} g_{t_i}^{N+1} \log s_{t_i}^{N+1}.

In the network based on a time-lens algorithm consisting of N time-lens layers, the data characteristics of the previous layer with α neurons are extracted into the current layer with β neurons, where β = α·k_{L−1,L} and k_{L−1,L} represents the scaling multiple between the (L − 1)th layer and the Lth layer. The time-lens algorithm has a function similar to the pooling layer in a conventional ANN: it removes redundant information and compresses features. The characteristics carried by the input data emerge and are highlighted layer by layer as they are transmitted through this classification architecture, and finally evolve into the labels of the corresponding category.
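A minimal sketch of the two quantities just introduced, with toy values (the normalization of the detector intensities to a probability distribution is an assumption):

```python
import numpy as np

# Cross-entropy between normalized detector intensities s and a one-hot
# target g, plus the layer-size scaling beta = alpha * k between layers.
def cross_entropy(s, g, eps=1e-12):
    p = s / s.sum()                  # normalize detector intensities
    return -np.sum(g * np.log(p + eps))

alpha, k = 1000, 0.6                 # neurons in layer L-1, scaling factor
beta = round(alpha * k)              # neurons in layer L
s = np.array([0.1, 0.9])             # intensities at detectors D1 and D2
g = np.array([0.0, 1.0])             # target: the "Hi, Miya!" class
print(beta, round(float(cross_entropy(s, g)), 4))  # 600 0.1054
```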

Results
In order to verify the effectiveness of the system in time-domain information classification, we used numerical methods to simulate the TL-ONN and realize the recognition of specific sound signals. We used a dataset containing 18,000 training samples and 2000 test samples drawn from an intelligent speech database [33] to evaluate the performance of the TL-ONN. The speech dataset contains the wake phrase "Hi, Miya!" in English and Chinese, collected in a real home environment using a microphone array and a Hi-Fi microphone. The test subset provides paired target/non-target answers to evaluate the verification results. In general, we used this binary classification problem to test the classification performance of two systems: the TL-ONN and a conventional DNN.
We first constructed a TL-ONN composed of five time-lens layers to verify the classification feasibility of this architecture. Figure 3a shows the training results of the TL-ONN in the case of k_{L−1,L} = 0.6. The accuracy of the TL-ONN over a total of 2000 test samples is above 98% (Figure 3a, top), which is close to the accuracy of the DNN (Figure 3a, bottom). The horizontal axis represents the number of training steps in one training batch (batch size = 50). The test accuracy fluctuates greatly in the first few steps, then reaches over 98% at about 17 steps and remains stable. In contrast, it was difficult for a five-layer DNN under the same conditions to achieve stable accuracy and training loss in one epoch (Figure 3a). When the number of training epochs was set to 10, the test accuracy and training loss still changed suddenly at the 10th epoch, which might be due to gradient explosion, overfitting, or another cause. We define the accuracy as the proportion of output labels that match the target label out of the total number of test samples. Using the same 2000-sample test set to evaluate the two network architectures, the accuracy rates reached 95.35% (Figure 3b) and 93.2% (Figure 3c), respectively. In general, the TL-ONN has significant advantages over the DNN in classification performance.
To observe the changes of the two types of voice information in each layer of the TL-ONN, we extracted two sets of inputs with typical characteristics. Figure 4a shows the layer structure of this network, which contains multiple time-lens layers, where each time point on a given layer acts as a neuron with a complex dispersion coefficient. Figure 4b,c shows the data evolution at each layer when the two types of speech are input to the network. At the input layer, we can distinguish the two types of input data by the shape of the waveform: the waveform containing "Hi, Miya!" has higher continuity, while the waveform of random speech has quantized characteristics and always has a value along the time axis. At the second layer of the network, the "Hi, Miya!" input changes into several sets of pulses through the time-lens layer, while the other type of information spreads over the entire time axis. After being transmitted by multiple time-lens layers, the two inputs eventually take the shapes shown for Layer 6, evolving into impulse-function-like shapes at different time points. As shown in Figure 4b,c, D1 and D2 correspond to detectors for the different input types: the random speech eventually responds at D1, while the input containing "Hi, Miya!" responds at D2.
Figure 3b,c show the number of correct (green squares) and incorrect (grey squares) output labels after the training of the two network architectures is completed. We define the accuracy rate as the percentage of correct results out of the total test set (2000 samples).
To eliminate experimental contingency, we set up a series of networks consisting of three to eight layers to test the influence of the number of time-lens layers on classification performance. Figure 5 shows the test results of TL-ONN architectures composed of different numbers of time-lens layers: 33, 30, and 17 steps are needed in the TL-ONN with three, four, and five layers, respectively, to reach an accuracy of 98% (Figure 5a). When the number of time-lens layers is increased to six or more, the accuracy stabilizes at 98-99% after about 10 training steps; however, increasing the number of time-lens layers indefinitely does not make the training results indefinitely better. For example, compared with the six-layer network, the TL-ONNs with seven or eight layers require more steps to achieve stable accuracy. Overall, the network with six time-lens layers has the best classification performance. All the results discussed above were obtained within one training epoch, whereas at least a few epochs were needed for a conventional DNN to achieve stable classification accuracy on the same dataset. The TL-ONN thus has the obvious advantages of faster convergence and stable classification accuracy. Similarly, we reversed the order of the phase modulators W1 and W2 and trained with the same training set. Figure 5b shows the classification results under this architecture; the time-scaling multiple between two layers is still 0.6.
Under the same conditions, a series of networks consisting of three to eight layers was constructed to test the classification performance. To achieve an accuracy of 98%, 55 and 12 steps are needed in the TL-ONN with three and four layers, respectively. The accuracy stabilizes at 98-99% after about 10 training steps when the number of time-lens layers is increased to five or more. As with the previous results, the TL-ONNs with seven or eight layers require more steps to achieve stable accuracy than the six-layer network. Overall, the network structure with six time-lens layers has the best classification performance, consistent with the results of the former architecture.
At the detector/output plane, we measured the intensity of the network output and used its mean square error (MSE) against the target output as the loss function to train the classification TL-ONN. The classifier was trained via the modulator (W2), where we aimed to maximize the normalized signal of each target's corresponding detector region while minimizing the total signal outside all detector regions. We used Adam [34], a stochastic gradient descent algorithm, to back-propagate the errors and update the layers of the network to minimize the loss function. The classifier TL-ONN was trained with the speech dataset [33] and achieved the desired mapping functions between the input and output planes after five steps; the training batch size was set to 50 for the speech classifier network. To verify the feasibility of the TL-ONN architecture, we used the Python language to establish a simulation model for theoretical analysis. The networks were implemented using Python 3.8.0 and PyTorch 1.4.0. On a desktop computer (GeForce GTX 1060 GPU, Intel Core i7-8700 CPU @ 3.20 GHz, and 64 GB of RAM, running the Windows 10 operating system), the PyTorch-based TL-ONN design outlined above took approximately 26 h to train for the classifier networks.
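For reference, the Adam update rule [34] used to optimize the modulator weights can be sketched as follows. This is a standalone toy minimization, not the paper's training code; the learning rate and step count are illustrative, while the remaining hyperparameters are the usual Adam defaults.

```python
import numpy as np

# Minimal Adam optimizer applied to a toy quadratic loss ||w - w*||^2,
# standing in for the TL-ONN loss over the modulator weights.
def adam_minimize(grad_fn, w0, lr=0.05, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    w = w0.copy()
    m = np.zeros_like(w)   # first-moment (mean) estimate
    v = np.zeros_like(w)   # second-moment (uncentered variance) estimate
    for step in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**step)    # bias-corrected moments
        v_hat = v / (1 - beta2**step)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w_star = np.array([1.0, -2.0, 0.5])            # toy optimum
w = adam_minimize(lambda w: 2 * (w - w_star),  # gradient of ||w - w*||^2
                  np.zeros(3))
print(np.max(np.abs(w - w_star)) < 0.1)  # True: converged near w_star
```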
Compared with conventional DNNs, the TL-ONN is not only a physical, optical neural network but also has a unique architecture. First, the time-lens algorithm applied at each layer of the network can refine the features of the input data, removing redundant information and compressing features much like a pooling layer; the time-lens method can be regarded as the photonic counterpart of pooling. Second, the TL-ONN can handle complex values, such as the complex nonlinear dynamics in passively mode-locked lasers. The phase modulators can modulate different physical parameters separately, and once the modulator parameters are determined, a passive all-optical neural network can essentially be realized. Third, the output of each neuron is coupled to the neurons in the next layer with particular weights through the dispersion effect of the optical fiber, thereby providing a unique interconnection within the network.

Discussion
In this paper, we proposed a new optical neural network based on the time-lens method. The forward transmission of the neural network is realized by the time-lens, which enlarges or reduces the data in the time dimension; the signal features extracted by the time-lens algorithm are modulated with an amplitude or phase modulator to realize the weight matrix optimization process of the linear operation. After the time signal is compressed and modulated by multiple time-lens layers, it eventually evolves into the corresponding target output, realizing the classification function of the optical neural network. To verify the feasibility of the network, we trained it with a speech dataset and obtained a test accuracy of 95.35%. The accuracy is clearly more stable and converges faster compared with a DNN with the same number of layers.
Our optical architecture implements a feedforward neural network through a time-stretching method; thus, when completing high-throughput data processing and large-scale tasks, it proceeds essentially at the speed of light in the optical fiber and requires little additional power. The system has a clear correspondence between the theoretical neural network and the actual optical component parameters; thus, once each parameter in the network is optimized, it can essentially be realized completely by optical devices, which makes it possible to build an all-optical neural network test system composed of optical fibers, electro-optic modulators, and other components.
Here, we verified the feasibility of the proposed TL-ONN by numerical simulation, and we will work to build a test system realizing an all-optical TL-ONN in the future. Experiments are often accompanied by noise and loss; we conservatively expect that such noise may reduce the classification accuracy of the architecture. On the other hand, to counteract the influence of loss, an optical amplifier is generally added to improve the signal-to-noise ratio. The nonlinear effects of the optical amplifier function similarly to the activation function in a neural network and may play an important role in all-optical neural networks in the future.
The emergence of ONNs provides a solution for real-time online processing of high-throughput timing information. By fusing the ONN with the photonic time-stretching test system, not only can real-time data processing be achieved, but the system's dependence on broadband high-speed electronics can also be significantly reduced. In addition, cost and power consumption can be reduced, and the system can be used in medicine and biology, green energy, physics, and optical communication information extraction, among other applications. This architecture is expected to provide breakthroughs in the identification of rare events such as the initial screening of cancer cells and to be widely used in high-throughput data processing such as early cell screening [22], drug development [23], cell dynamics [21], and environmental improvement [35,36], as well as in other fields.