An Improved Algorithm of Drift Compensation for Olfactory Sensors

Abstract: This research studies semi-supervised learning on data from different domains in machine olfaction, a problem also known as sensor drift compensation. For this kind of problem, it is usually difficult to obtain good recognition results by applying a semi-supervised learning algorithm directly. For this reason, we propose a domain transformation semi-supervised weighted kernel extreme learning machine (DTSWKELM) algorithm, which first transforms the data into a common domain and then classifies them with the SWKELM algorithm, thereby converting the semi-supervised classification problem of different-domain data into a semi-supervised classification problem of same-domain data.


Introduction
Machine olfaction is widely used in gas classification and in accurate concentration estimation. In food safety, it is used to detect the purity and quality of food [1][2][3]. In environmental protection, it is used for wide-area air quality monitoring [4,5]. In medicine, it is used to detect diseases [6]. Research on sensor drift compensation can effectively improve detection accuracy in all of these applications.
In the face of sensor drift, in order to avoid tedious calibration tasks and save costs, many researchers have studied drift compensation algorithms for many years, proposing different solutions [7][8][9][10][11][12]. The most important methods can be divided into three types: the first is the component correction method; the second is the adaptive method; the third is the machine learning method.
For example, Wold S. et al. proposed the orthogonal signal correction (OSC) method. This method removes from the original signal the part that is linearly unrelated to the target matrix, so that the corrected signal retains as much useful information as possible [13]. Feng et al. used the OSC method to preprocess the data and then optimized an RBF network with the particle swarm optimization algorithm, obtaining good results in the detection of wound infection [14]. Artursson et al. proposed a component correction principal component analysis (CCPCA) algorithm based on the OSC algorithm [15]. This algorithm assumes that the drift has a preferred direction in the measurement space rather than a random distribution and finds that direction through principal component analysis (PCA). The drift direction is then removed from the measurement matrix, which discards irrelevant information in the data and thereby increases the stability and the generalization ability of the classification model.
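The component-correction idea can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation: it assumes the drift direction is estimated as the first principal direction of repeated measurements of a reference gas, and the function names and the synthetic data are hypothetical.

```python
import numpy as np

def drift_direction(X_ref):
    """First principal direction (unit vector) of the reference measurements."""
    centered = X_ref - X_ref.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def remove_direction(X, p):
    """Subtract from each row of X its component along the unit vector p."""
    return X - np.outer(X @ p, p)

# Synthetic reference-gas readings (assumption): drift grows along one direction.
rng = np.random.default_rng(0)
true_dir = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
drift = np.linspace(0.0, 5.0, 40)[:, None] * true_dir
X_ref = drift + 0.05 * rng.normal(size=(40, 3))

p = drift_direction(X_ref)
X_corr = remove_direction(X_ref, p)
# Alignment of the estimated direction with the true one, and the residual
# component of the corrected data along the estimated direction.
print(abs(p @ true_dir), np.abs(X_corr @ p).max())
```

The correction only helps when the drift really does follow one dominant direction, which is the assumption stated above for CCPCA.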
The adaptive method is a passive drift compensation method that matches the trained model to the current sensor output by modifying the parameters of the classification algorithm [16,17]. There are two main methods: adaptive resonance theory (ART) and self-organizing feature mapping (SOM). Distante C. et al. combined adaptive resonance theory with neural networks for gas identification, proposing different approaches to the drift problem for overlapping and non-overlapping clusters [18]. Distante C. et al.

Datasets
The dataset used in this study is the publicly available data collected by A. Vergara et al. with gas sensors and published on the UCI Machine Learning Repository [22,28]. It contains 6 different gases and 13,910 data samples, which makes it well suited to the study of classification algorithms. Most importantly, the data were collected in different batches at different times. Because sensors are prone to aging, poisoning, and similar effects, they drift over time, so the 10 collected batches tend to have different data distributions. The DTSWKELM algorithm introduced in this study is designed to solve the problem of long-term sensor drift, so this dataset is selected to test its effect. Table 1 shows the details of the dataset.

Table 1. Details of the dataset (number of samples of each gas per batch).

Batch     Month    Acetone  Acetaldehyde  Ethanol  Ethylene  Ammonia  Toluene  Total
Batch 1   1-2      90       98            83       30        70       74       445
Batch 2   3-10     164      334           100      109       532      5        1244
Batch 3   11-13    365      490           216      240       275      0        1586
Batch 4   14, 15   64       43            12       30        12       0        161
Batch 5   16       28       40            20       46        63       0        197
Batch 6   17-20    514      574           110      29        606      467      2300
Batch 7   21       649      662           360      744       630      568
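The per-gas counts also show how unevenly the classes are represented from batch to batch. The short check below uses only the rows of Table 1 that are fully listed here (Batches 1 to 6); the variable names are illustrative.

```python
gases = ["Acetone", "Acetaldehyde", "Ethanol", "Ethylene", "Ammonia", "Toluene"]

# Sample counts per gas, copied from the fully listed rows of Table 1.
counts = {
    "Batch 1": [90, 98, 83, 30, 70, 74],
    "Batch 2": [164, 334, 100, 109, 532, 5],
    "Batch 3": [365, 490, 216, 240, 275, 0],
    "Batch 4": [64, 43, 12, 30, 12, 0],
    "Batch 5": [28, 40, 20, 46, 63, 0],
    "Batch 6": [514, 574, 110, 29, 606, 467],
}

# Row totals, and the gases that are absent from each batch.
totals = {batch: sum(row) for batch, row in counts.items()}
absent = {batch: [g for g, n in zip(gases, row) if n == 0]
          for batch, row in counts.items()}

for batch in counts:
    print(batch, totals[batch], absent[batch])
```

Batches 3 to 5 contain no toluene samples, so a classifier evaluated on them never sees that class in the target domain.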

Maximum Mean Discrepancy
Maximum mean discrepancy (MMD) is an efficient measure of the distance between two distributions. We mainly use it to compare the target domain data with the source domain data, and then find a new domain by minimizing the MMD. The maximum mean discrepancy is based on the idea of finding a function whose expected value differs between the two distributions. A continuous function F on the sample space is chosen, the mean of its values is computed over the samples of each distribution, and the difference of the two means gives the mean deviation of the two distributions with respect to F. The MMD is obtained by choosing the F for which this deviation is largest. Finally, the MMD is taken as the test statistic for judging whether the two distributions are the same: if the value is small enough, the two distributions are considered the same. The same value also serves as a measure of the similarity between the two distributions. In transfer learning, F is generally induced by the RBF kernel function, and MMD can be expressed by Equation (1) [29].
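In the usual notation of the kernel two-sample test, the supremum described above can be written as follows. The notation is assumed here and may differ in layout from Equation (1): f denotes a candidate function and \mathcal{F} the set it is drawn from (the text uses F for both).

```latex
\mathrm{MMD}(\mathcal{F}, p, q)
  = \sup_{f \in \mathcal{F}}
    \left( \mathbb{E}_{x \sim p}\,[f(x)] - \mathbb{E}_{y \sim q}\,[f(y)] \right)
```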
where F is the function being sought, x and y are samples of two random variables, p is the distribution of x, and q is the distribution of y. MMD^2(F, p, q) = 0 if and only if p = q. For the unsupervised domain adaptation problem, two different domains are considered: the source domain S and the target domain T, whose probability distributions are P_S and P_T, respectively. The available data are the source domain data X_S = [x_S1, x_S2, ..., x_SN_S] with the source domain labels Y_S = [y_S1, y_S2, ..., y_SN_S], and the unlabeled target domain data X_T = [x_T1, x_T2, ..., x_TN_T], where N_S and N_T are the number of samples in the source domain and the number of samples in the target domain, respectively. Generally speaking, the probability distributions P_S and P_T are different. The distance between the source domain and the target domain is measured as the Euclidean distance between the two domains after they are mapped by a specific function ϕ(·) into the reproducing kernel Hilbert space (RKHS), as shown in Equation (2):
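With a kernel k(·,·) in place of the explicit mapping ϕ(·), the squared distance between the two mapped sample means expands into three averages of kernel values: the mean of k over source pairs, plus the mean over target pairs, minus twice the mean over source-target pairs. The sketch below computes this biased empirical estimate with an RBF kernel. The function names, the kernel width gamma, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for every pair of rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(X_s, X_t, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two samples."""
    k_ss = rbf_kernel(X_s, X_s, gamma).mean()
    k_tt = rbf_kernel(X_t, X_t, gamma).mean()
    k_st = rbf_kernel(X_s, X_t, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

# Synthetic samples (assumption): one from the source distribution again,
# one shifted to imitate drift.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 4))
same = rng.normal(0.0, 1.0, size=(200, 4))
drifted = rng.normal(1.5, 1.0, size=(200, 4))

print(mmd2(source, same, gamma=0.1), mmd2(source, drifted, gamma=0.1))
```

The estimate is close to zero for two samples from the same distribution and grows as the target sample drifts away from the source.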

Sensor Drift Compensation Algorithm
This section proposes the DTSWKELM algorithm, which transforms the source domain data and the target domain data so that the two data distributions become close. The semi-supervised classification problem of different-domain data is thereby converted into a semi-supervised classification problem of same-domain data, and the semi-supervised classification task is then carried out by the SWKELM algorithm. The algorithm has the advantages of a good classification effect, strong generalization ability, and no need for labeled target domain data.
In the dataset, labeled source domain data are accessible, so only unlabeled target domain data and labeled source domain data can be used to build data reconstruction models. Through data transformation, it is desirable to keep the source domain data unchanged as much as possible, while making the distribution of the drifting target data close to the distribution of the source data. Figure 1 is a flow chart of the DTSWKELM algorithm:

As can be seen from Figure 1, the source domain data and the target domain data are passed through a kernel mapping to obtain the hidden layer. Under two constraints, two sets of new domain data are obtained and sent to the SWKELM classifier for model training, and finally the new target domain data are predicted. The calculation flow of the algorithm is given below.
The algorithm in this paper can be defined as the optimization problem shown in Equation (3). The first term in the formula represents the distribution difference between the source domain data and the target domain data, where ϕ(·) denotes the associated mapping. The second term is the loss function, which prevents useful information in the source domain data from being lost during data transformation. The third term is the regularization term used to avoid overfitting. This paper chooses the maximum mean discrepancy to describe the distribution difference between the source domain and the target domain, as shown in Equation (4). The loss function in the second term can be expressed as Equation (5). The domain transformation algorithm can then be defined as Equation (6), where β represents the output layer matrix, C and λ are the trade-off parameters for adjusting the model, h(x_Si) is the i-th point obtained by passing the source data through the single hidden layer, and h(x_Tj) is the j-th point obtained by passing the target data through the single hidden layer. x_Si^T represents the transpose of a source domain data sample, and N_S and N_T are the number of source domain samples and the number of target domain samples, respectively.
Transforming the constrained optimization problem in Equation (6) into an unconstrained optimization problem yields Equation (7), where Λ_C is a diagonal matrix whose main-diagonal elements all equal the parameter C, and Tr denotes the trace of a matrix.
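As a point of reference, one unconstrained objective that matches the three terms described above is sketched here. It is an assumed form, not a transcription of Equation (7): H_S and H_T denote the hidden-layer outputs of the source and target data, \mathbf{1} is an all-ones vector, and the constant factors are chosen for convenience.

```latex
J(\beta) =
  \left\| \frac{1}{N_S}\mathbf{1}_{N_S}^{\top} H_S \beta
        - \frac{1}{N_T}\mathbf{1}_{N_T}^{\top} H_T \beta \right\|^{2}
  + \frac{1}{2}\,\mathrm{Tr}\!\left[ (X_S - H_S\beta)^{\top} \Lambda_C \,(X_S - H_S\beta) \right]
  + \frac{\lambda}{2}\,\mathrm{Tr}\!\left( \beta^{\top}\beta \right)
```

The first term is the squared distance between the transformed source and target means (the MMD term), the second penalizes changes to the source data with weight C, and the third is the regularizer weighted by λ.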
Domain adaptation algorithms that minimize reconstruction error are different from traditional autoencoders or Boltzmann machines, which use backpropagation to update parameters to learn the weights of the input and output layers. Domain adaptation algorithms focus on the mapping of source and target domains to new domains rather than feature extraction.
Equation (4) can be calculated from Equation (8). Defining H = [H_S; H_T], Equation (8) can be rewritten as Equation (9), where D ∈ R^((N_S+N_T)×(N_S+N_T)) is the MMD matrix, which can be defined in the form of Equation (10). To sum up, Equation (7) can be rewritten as Equation (11). For convenience, the source domain data and the target domain data are combined as X*, whose first N_S rows are the source domain data X_S and whose last N_T rows are 0. Λ = diag(C, C, ..., C, 0, ..., 0), where the number of entries equal to C is the number of source domain samples and the number of entries equal to 0 is the number of target domain samples. In this way, Equation (11) can be transformed into Equation (12). Equation (12) is a convex optimization problem; computing its gradient and setting it equal to 0 gives Equation (13). Finally, we get the output layer matrix, as shown in Equation (14).
Because the input data are mapped to the hidden layer using the kernel function, the mapped data H cannot be obtained explicitly, and the output layer matrix β cannot be calculated directly. However, the kernel matrix K = HH^T can be calculated directly, which gives the data after domain transformation, as shown in Equations (15) and (16), where X = [X_S^T, X_T^T]^T is the combination of the source domain data and the target domain data.
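The closed-form solution and the kernel trick described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact Equations (10) to (16): the objective is taken to be Tr(β^T H^T D H β) + (1/2) Tr((X* − Hβ)^T Λ (X* − Hβ)) + (λ/2) Tr(β^T β), D is the standard MMD matrix built from the weights 1/N_S and −1/N_T, and all function names are hypothetical. Setting the gradient to zero gives (λI + H^T(2D + Λ)H) β = H^T Λ X*, and because K = HH^T the transformed data Hβ equal K(λI + (2D + Λ)K)^(-1) Λ X*, which needs only the kernel matrix.

```python
import numpy as np

def mmd_matrix(n_s, n_t):
    """MMD matrix D = e e^T with e = [1/n_s, ..., 1/n_s, -1/n_t, ..., -1/n_t]."""
    e = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])
    return np.outer(e, e)

def rbf_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for every pair of rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def domain_transform(K, X_s, n_t, C=1.0, lam=1.0):
    """Transformed source and target data computed from the kernel matrix only.

    K is the (n_s + n_t) x (n_s + n_t) kernel matrix of the stacked data
    [X_s; X_t]; the assumed objective is the one stated in the lead-in.
    """
    n_s, d = X_s.shape
    n = n_s + n_t
    D = mmd_matrix(n_s, n_t)
    Lam = np.diag(np.concatenate([np.full(n_s, C), np.zeros(n_t)]))
    X_star = np.vstack([X_s, np.zeros((n_t, d))])   # source rows, then zeros
    M = 2.0 * D + Lam
    alpha = np.linalg.solve(lam * np.eye(n) + M @ K, Lam @ X_star)
    Z = K @ alpha                                   # equals H @ beta
    return Z[:n_s], Z[n_s:]

# Synthetic source and drifted target data (assumption).
rng = np.random.default_rng(0)
X_s = rng.normal(0.0, 1.0, size=(30, 5))
X_t = rng.normal(1.0, 1.0, size=(40, 5))
X = np.vstack([X_s, X_t])
K = rbf_kernel(X, X, gamma=0.1)

Z_s, Z_t = domain_transform(K, X_s, n_t=len(X_t), C=10.0, lam=0.1)
gap_before = np.linalg.norm(X_s.mean(axis=0) - X_t.mean(axis=0))
gap_after = np.linalg.norm(Z_s.mean(axis=0) - Z_t.mean(axis=0))
print(gap_before, gap_after)
```

Solving one linear system in the kernel matrix avoids forming H or β explicitly, which is the point of the kernel formulation; the trade-off parameters C and lam here are arbitrary illustrative values.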
The new source domain data and target domain data are then fed into the SWKELM model to train the semi-supervised classifier. The trained classifier is used to predict the X_IT data, and finally the accuracy is calculated.

Results
The DTSWKELM algorithm introduced in this paper is designed to solve the problem of long-term sensor drift, so this dataset is selected to test its effect. This part is divided into three experimental analyses. The first is an analysis of the data distribution, comparing the distribution before and after data transformation and observing its changes. The second compares the recognition effects of different algorithms on the dataset. The third analyzes the hyperparameters. All experiments were carried out on a Windows 10 system, and PyCharm 2020.1.3 was selected as the platform for algorithm implementation.

Experimental Data Distribution Analysis
The difference between the data distributions of different batches is the clearest manifestation of the sensor drift problem. To show these differences more intuitively, the PCA method is used to reduce the data to two dimensions, and the reduced data points are presented as scatter plots. Figure 2 shows all batches of the dataset after PCA dimensionality reduction. It can clearly be seen from Figure 2 that the data distributions differ between batches as a result of sensor drift; the difference is especially obvious when comparing Batch1 with Batch4, Batch5, Batch8, and Batch9. In addition, the change in distribution follows no regular pattern, which indicates that the sensor drift is random rather than in a fixed direction. For these reasons, a classification model trained on one dataset often has a poor recognition effect on a new dataset.
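The two-dimensional projection used for such plots can be sketched as below. This is a minimal example on synthetic batches, not the code behind Figure 2. It fits the principal directions on the pooled data so that every batch is projected onto the same axes; the function names and the data are assumptions.

```python
import numpy as np

def fit_pca(X, n_components=2):
    """Return the mean and the leading principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project the rows of X onto the given principal directions."""
    return (X - mean) @ components.T

# Synthetic stand-ins for two batches (assumption): the second one is shifted.
rng = np.random.default_rng(0)
batches = {
    "Batch1": rng.normal(0.0, 1.0, size=(50, 8)),
    "Batch2": rng.normal(0.8, 1.2, size=(60, 8)),
}

mean, comps = fit_pca(np.vstack(list(batches.values())))
points = {name: project(X, mean, comps) for name, X in batches.items()}

for name, P in points.items():
    # Each row of P is one (x, y) point of the scatter plot for that batch.
    print(name, P.shape, P.mean(axis=0).round(2))
```

Plotting each entry of `points` in a different color then shows how far the batch clouds sit from one another in the shared two-dimensional space.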
The main idea of the DTSWKELM algorithm proposed in this study is to find, by minimizing the MMD, a new domain onto which the source domain data and the target domain data are mapped, so that the new source domain data become more similar to the target domain data; model training and recognition are then performed on the new data. Figure 3 shows the distribution of the different batches after domain transformation, where Batch1 is selected as the source domain data and each of the other batches is used in turn as the target domain data.
In the DTSWKELM algorithm, the information of the source domain data is preserved as much as possible during domain transformation. In this experiment, Batch1 is used as the source domain data throughout, so its distribution changes little, and the figure shows only one diagram of Batch1. From the figure, we can see that the distribution of the Batch2-10 data changes significantly after domain transformation and becomes closer to the distribution of Batch1; the change is most obvious for Batch2 and Batch8. This shows that the domain transformation part of the algorithm works as intended: it effectively reduces the distribution difference between different batches of data, so that the new source domain data are more similar to the target domain data. Thus, the semi-supervised learning problem in different domains caused by sensor drift is transformed into a semi-supervised learning problem in the same domain.

Sensor Drift Algorithm Comparison Experiment
In the comparative experiment, the recognition effect of the DTSWKELM algorithm is compared with that of other algorithms on this dataset to verify its effectiveness. Two sets of comparative experiments are set up. The first uses Batch1 data as the source domain data and Batch2-10 data as the target domain data. The second uses adjacent batches, that is, Batch N−1 data as the source domain data and Batch N data as the target domain data. Seven commonly used sensor drift compensation algorithms were selected for comparison, namely the SVM-rbf, SVM-comgfk, ML-comgfk, ELM-rbf, DAELM-S(5), domain transfer broad learning system (DTBLS), and TDACNN algorithms [30,31]; the SWKELM algorithm is also included. Table 2 shows the recognition effects of eight different algorithms in Experiment 1. Bold data mark the highest recognition effect for each batch. To display the comparison more clearly, the data in Table 2 are converted into a histogram in Figure 4. Observing Table 2 and Figure 4, we first compare the recognition effect of DTSWKELM with those of the commonly used sensor drift compensation algorithms. Under the conditions set in Experiment 1, the DTSWKELM proposed in this paper achieves the best recognition effect in four groups of tasks. In particular, when the target domain data are the Batch6 data, the recognition accuracy of DTSWKELM reaches 96.31%, which is 13.11 percentage points higher than DTBLS and 68.05 percentage points higher than SVM-rbf. Although DTSWKELM achieves the highest recognition accuracy only in Batch5, Batch6, and Batch10, its accuracy on the other batches is not far from that of the best-performing algorithm.
In addition, in terms of the average recognition accuracy over the 9 tasks, DTSWKELM has the best average, so on the whole DTSWKELM performs better. Next, we observe SWKELM, which also achieves relatively good results, although its overall average recognition accuracy is lower than those of TDACNN, DTBLS, and DTSWKELM. Moreover, it shows better results than DTBLS and TDACNN in Batch6. It can be seen that in some scenarios, traditional semi-supervised learning algorithms can also achieve good results on semi-supervised classification problems between different-domain data.
Table 3 shows the recognition effects of eight different algorithms in Experiment 2. Bold data again mark the highest recognition effect for each batch. To display the comparison more clearly, the data in Table 3 are also converted into a histogram in Figure 5. Looking at Table 3 and Figure 5, conclusions similar to those of Experiment 1 can be drawn.
Compared with the seven commonly used sensor drift compensation algorithms, DTSWKELM performs better overall, with an average accuracy of 88.30%, which is 6.82 percentage points higher than TDACNN. Compared with the SWKELM algorithm, it also has a better recognition effect, with an average accuracy 7.25 percentage points higher, which reflects the effectiveness of the domain transformation process. In addition, the recognition effect of each algorithm in Experiment 2 is generally higher than in Experiment 1. This is mainly because adjacent batches are relatively less affected by sensor drift, so the distribution difference between them is relatively small. In summary, the two sets of experiments lead to the same conclusion: DTSWKELM shows the best recognition effect.
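The two experimental settings differ only in how source and target batches are paired. The helper below enumerates the pairs for both settings as described above; the function name is hypothetical, and the batch indices follow the ten batches of the dataset.

```python
def experiment_pairs(n_batches=10):
    """Return (source, target) batch-index pairs for the two settings.

    Setting 1: Batch 1 is always the source; Batches 2..n are the targets.
    Setting 2: Batch N-1 is the source and Batch N is the target.
    """
    setting_1 = [(1, k) for k in range(2, n_batches + 1)]
    setting_2 = [(k - 1, k) for k in range(2, n_batches + 1)]
    return setting_1, setting_2

setting_1, setting_2 = experiment_pairs()
print(setting_1)
print(setting_2)
```

Each setting yields nine tasks, which is why the average accuracies are taken over 9 tasks.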

Parameter Influence and Analysis
In the DTSWKELM algorithm, MMD is used to describe the distance between two data distributions, and manifold regularization is used to relate labeled data to unlabeled data. These two parts play a crucial role in the algorithm. In this section, the trade-off parameters λ_1 and λ_2 of these two parts in the optimization problem are analyzed and discussed. The random search method is used to determine the optimal hyperparameters of the DTSWKELM model, and each of the two hyperparameters is then analyzed while the other hyperparameters are fixed. Figure 6 shows the influence of the two hyperparameters on the recognition effect of the algorithm under the conditions of Experiment 1. Figure 6a shows the influence of the trade-off parameter λ_1 of the MMD part when the other hyperparameters are fixed, with lg(λ_1) taken from [−4, −3, −2, −1, 0, 1, 2, 3, 4]. As can be seen from the figure, the recognition effect is relatively stable for lg(λ_1) in [−4, 0], and λ_1 should be selected within this range. When λ_1 increases beyond it, the recognition effect of the algorithm decreases. We speculate that this is because the MMD part then accounts for too much of the optimization problem, so that too much information is lost in the domain-transformed data. Figure 6b shows the influence of the trade-off parameter λ_2 of the manifold regularization part when the other hyperparameters are fixed. Similarly, lg(λ_2) is taken from [−4, −3, −2, −1, 0, 1, 2, 3, 4]. The recognition effect is less stable with respect to λ_2 than with respect to λ_1, but better results are achieved for lg(λ_2) in [−2, 0]. When λ_2 is too large, the accuracy of the algorithm is low.
This is because when the manifold regularization part occupies a large proportion in the optimization problem, the useful label information will be weakened, and semi-supervised learning will degenerate into unsupervised learning, resulting in low recognition accuracy.
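The one-at-a-time sweep described above can be sketched as follows: one trade-off parameter is varied over the logarithmic grid 10^-4 to 10^4 while the other is held at a fixed value. The `evaluate` callable and the toy scoring function are placeholders introduced for illustration; they do not reproduce the DTSWKELM model or the accuracies in Figure 6.

```python
import math

def sweep(evaluate, base, name, exponents=range(-4, 5)):
    """Vary one hyperparameter over 10**e, keeping the others at their base values."""
    results = {}
    for e in exponents:
        params = dict(base, **{name: 10.0 ** e})
        results[e] = evaluate(**params)
    return results

def toy_score(lam1, lam2):
    """Placeholder score (NOT the paper's model): penalizes large lam1 and
    values of lam2 far from 0.1."""
    return 1.0 / (1.0 + max(math.log10(lam1), 0.0) + abs(math.log10(lam2) + 1.0))

base = {"lam1": 1e-2, "lam2": 1e-1}          # assumed fixed values
curve_lam1 = sweep(toy_score, base, "lam1")  # analogous to Figure 6a
curve_lam2 = sweep(toy_score, base, "lam2")  # analogous to Figure 6b

for e in curve_lam1:
    print(e, round(curve_lam1[e], 3), round(curve_lam2[e], 3))
```

In practice `evaluate` would train the model with the given parameters and return its recognition accuracy on the target batch.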

Discussion
Sensor drift is caused by the sensor's own material, its processing method, or the external environment. This paper studies the resulting semi-supervised classification problem of different-domain data, that is, the sensor drift compensation problem. It proposes a domain transformation semi-supervised weighted kernel extreme learning machine (DTSWKELM) algorithm, which defines the benchmark dataset as the source domain data and the drift dataset as the target domain data. The source domain data and the target domain data are mapped to a new domain, semi-supervised learning is performed on the new domain dataset, and the target domain data are then predicted. Through domain transformation, the algorithm converts the semi-supervised classification problem of different-domain data into a semi-supervised classification problem of same-domain data. Compared with the DAELM algorithm, it removes the need for a certain amount of labeled target domain data and reduces the instability caused by random hidden layer mapping. Experiments show that the proposed algorithm can effectively compensate for long-term sensor drift.
The DTSWKELM algorithm is a sensor drift compensation algorithm for single-source domain data. Although it has achieved good results, in some cases there are multiple source domains, and the algorithm cannot combine them. Reasonable and effective use of data from multiple source domains would make it possible to learn the characteristics of the data better and to address sensor drift more fully, which is an important problem for future research on machine olfaction.

Conclusions
Inspired by the DAELM algorithm, this study combines the domain transformation algorithm with the semi-supervised learning algorithm and proposes the DTSWKELM algorithm to compensate for sensor drift. First, MMD is used to represent the distance between two distributions; by minimizing the MMD, a new domain is found onto which the source domain data and the target domain data are mapped, thereby reducing the distribution difference between them. The new domain data are then sent to the SWKELM model, the semi-supervised classifier is trained, and finally the target domain data are identified.

In the analysis of the experimental results, three groups of comparative experiments are set up. First, the PCA method is used to compare the distribution of different batches of data before and after domain transformation, which shows the impact of sensor drift more intuitively and verifies the effectiveness of the domain transformation process. Next, two experiments test the performance of the DTSWKELM algorithm. The first sets Batch1 as the source domain data and the data of Batch2-Batch10 as the target domain data to be predicted. The second sets Batch N−1 as the source domain data and Batch N as the target domain data to be predicted. In both experiments, seven commonly used sensor drift compensation algorithms and the SWKELM algorithm are used as control algorithms. Compared with these algorithms, the DTSWKELM algorithm proposed in this study has a better recognition effect and deals better with long-term sensor drift. The last part analyzes the hyperparameter settings of the model; comparative experiments with different hyperparameter values show how strongly the hyperparameters affect the model.