Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data

The Adaptive Boosting (AdaBoost) algorithm is a widely used ensemble learning framework that achieves good classification results on general datasets. However, it is challenging to apply the AdaBoost algorithm directly to imbalanced data, since it is designed mainly to handle misclassified samples rather than samples of minority classes. To better process imbalanced data, this paper introduces the Area Under Curve (AUC) indicator, which reflects the comprehensive performance of the model, and proposes an improved AdaBoost algorithm based on AUC (AdaBoost-A), which improves the error calculation of the AdaBoost algorithm by jointly considering the misclassification probability and the AUC. To prevent the redundant or useless weak classifiers generated by the traditional AdaBoost algorithm from consuming too many system resources, this paper proposes an ensemble algorithm, PSOPD-AdaBoost-A, which can re-initialize parameters to avoid falling into a local optimum and optimizes the coefficients of the AdaBoost weak classifiers. Experimental results show that the proposed algorithm is effective for processing imbalanced data, especially data with relatively high imbalance.


Introduction
Since imbalanced data can be found in any area, effective classification of imbalanced data has become critical for many applications. The classification results that existing algorithms produce on imbalanced data are usually dominated by the majority class, resulting in low classification accuracy for the minority class. For example, a sensor network can accurately achieve target recognition under the assumption of a balanced data distribution. However, in practical applications, the field environment is complex and variable, and the difficulty of obtaining samples differs between classes, which results in imbalanced data. In this case, samples of the minority class are easily ignored, resulting in incorrect classification. In an intrusion alarm application, misclassification of minority class samples means a false alarm of the system, which can cause very serious consequences.
Existing approaches to processing imbalanced data can be generally divided into two categories [1,2]. The first category is based on resampling at the data level, which either (i) increases the number of samples using upsampling by synthesizing new data or copying the original data, or (ii) reduces the number of samples using subsampling by extracting a small amount of data. Although resampling can improve the accuracy of minority class classification, there are some challenges. It is impossible to properly interpret the synthetic new data generated by upsampling. In addition, important information may be lost during the subsampling process. The second category is based on ensemble learning and iteration: the weights of samples vary with the coefficients of the weak classifiers, and the coefficients of the classifiers are calculated from the error. As a result, the AdaBoost algorithm can increase the weight of the misclassified samples and decrease the weight of the correctly classified samples. In the next iteration, the classifier will focus more on the misclassified samples. Finally, all the generated weak classifiers are merged using a linear combination to form a strong classifier. The steps of the AdaBoost algorithm [19] are as follows:
Input: Training data set $T = \{(x_1, y_1), (x_2, y_2), \cdots, (x_N, y_N)\}$, where $x_i \in \mathbb{R}^n$, $y_i \in Y = \{-1, +1\}$, and a weak learning algorithm.

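1. Initialize the weight distribution of the training samples following Equation (1), where N represents the number of samples.
2. For each iteration m = 1, 2, ..., M:
a. Following Equation (2), get the weak classifier $G_m(x)$ based on the weight distribution $D_m$.
b. Calculate the classification error rate of $G_m(x)$ on the training data set following Equation (3).
c. Calculate the coefficient of $G_m(x)$ following Equation (4).
d. Update the weight distribution of the training samples for the next iteration, increasing the weights of the misclassified samples and decreasing the weights of the correctly classified samples.
3. Build a linear combination of basic classifiers and get the final classifier G(x) following Equations (8) and (9).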
$$G(x) = \operatorname{sign}(f(x)) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right) \quad (9)$$
Sensors 2019, 19, 1476
The advantages of the AdaBoost algorithm are summarized as follows. (1) The AdaBoost algorithm can use various weak classifiers without filtering features. In addition, it delivers high execution efficiency, and can avoid overfitting issues. (2) The AdaBoost algorithm trains the weak classifiers without requiring prior knowledge. The synthesized strong classifier can significantly improve the classification accuracy, and it is suitable for classification of most types of data. (3) Training rough weak classifiers is much easier than training an accurate strong classifier. AdaBoost trains multiple weak classifiers to form a strong classifier with better classification performance.
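The AdaBoost steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the textbook forms of the error rate, coefficient, and weight update (the numbered equations are not reproduced in this text), uses a brute-force decision stump as the weak learner, and the function names `fit_stump`, `stump_predict`, `adaboost_fit`, and `adaboost_predict` are hypothetical.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    """Predict +1/-1 by thresholding feature j; pol flips the direction."""
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def fit_stump(X, y, w):
    """Brute-force search for the stump with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = stump_predict(X, j, thr, pol)
                err = np.sum(w[pred != y])
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best

def adaboost_fit(X, y, M=10):
    """Train M weighted stumps (assumed textbook AdaBoost updates)."""
    N = len(y)
    w = np.full(N, 1.0 / N)                      # step 1: uniform weights
    ensemble = []
    for _ in range(M):                           # step 2
        j, thr, pol, err = fit_stump(X, y, w)    # 2a, 2b: weak classifier and its error
        err = np.clip(err, 1e-10, 1 - 1e-10)     # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # 2c: coefficient
        pred = stump_predict(X, j, thr, pol)
        w = w * np.exp(-alpha * y * pred)        # 2d: raise weights of mistakes
        w /= w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(X, ensemble):
    """Step 3: sign of the weighted vote, as in Equation (9)."""
    f = sum(a * stump_predict(X, j, thr, pol) for a, j, thr, pol in ensemble)
    return np.where(f >= 0, 1, -1)
```

Note that the error in step 2b depends only on the weights of the misclassified samples, regardless of which class they belong to, which is the limitation on imbalanced data discussed in this paper.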

PSO
PSO was proposed by James Kennedy and Russell Eberhart in 1995 [20]. The algorithm is derived from the study of the predation behavior of birds, and it is an iterative method. Imagine a scene where there is a piece of food in a certain area and a group of randomly distributed birds are searching for the food. They know their distances from the food, but not its specific location. The best strategy for locating the food is for each bird to change its flight path based on the current location of the bird closest to the food and on its own flight experience.
The PSO algorithm considers each solution as a bird, called a particle. Each particle has an adaptive value that represents the current state of its own solution. In each iteration, each particle adjusts its moving direction and velocity based on the global optimal solution and the optimal solution found by the particle itself, and gradually approaches the optimal particle.
The basic principle of the standard particle swarm algorithm is as follows [21].
Suppose that there are m particles searching for the optimal solution in an N-dimensional target space, and randomly initialize the position and velocity of each particle following Equations (10)-(12), where the vector $U_i$ represents the position of particle i, and the vector $V_i$ represents the flight speed of particle i.
The current best position $P_i$ found by particle i is given by Equation (13), and the current best position $P_{gbest}$ found by all particles is given by Equation (14). The position and velocity of particle i are then updated following Equations (15) and (16),
where ω is the inertia weight, $c_1$ and $c_2$ are two positive constants called the acceleration factors, $v_{in}^{k+1}$ represents the nth-dimensional velocity component generated by the (k+1)th iteration of the ith particle, and $u_{in}^{k+1}$ represents the nth-dimensional position component generated by the (k+1)th iteration of the ith particle. The velocity update formula is divided into three parts. The first part is the inertia part, which indicates the particle's degree of trust in its own speed. The second part is the self-cognitive part, which indicates the particle's degree of trust in its own experience. The third part is the social cognitive part, which indicates the degree of trust in the best adaptive particle [22].
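To make the three-part update concrete, the following is a minimal sketch of a standard PSO minimizer. It assumes the commonly used form of the update (inertia term, plus a self-cognitive term scaled by $c_1$ and a random factor, plus a social term scaled by $c_2$ and a random factor); the exact Equations (15) and (16) are not reproduced in this text. The function name `pso_minimize`, the default parameter values, and the search bounds are illustrative assumptions.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=20, iters=100, omega=0.7,
                 c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize f over a box using a standard PSO (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    U = rng.uniform(lo, hi, (n_particles, dim))               # positions U_i
    V = 0.1 * rng.uniform(lo - hi, hi - lo, (n_particles, dim))  # velocities V_i
    P = U.copy()                                              # personal bests P_i
    P_fit = np.array([f(u) for u in U])
    g = int(np.argmin(P_fit))
    P_gbest, g_fit = P[g].copy(), P_fit[g]                    # global best
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        V = (omega * V                        # inertia part
             + c1 * r1 * (P - U)              # self-cognitive part
             + c2 * r2 * (P_gbest - U))       # social cognitive part
        U = U + V
        fit = np.array([f(u) for u in U])
        improved = fit < P_fit                # update personal bests
        P[improved] = U[improved]
        P_fit[improved] = fit[improved]
        g = int(np.argmin(P_fit))
        if P_fit[g] < g_fit:                  # update global best
            P_gbest, g_fit = P[g].copy(), P_fit[g]
    return P_gbest, g_fit
```

For example, `pso_minimize(lambda u: float(np.sum(u ** 2)), dim=2)` searches for the minimum of a simple quadratic bowl. Because every particle is attracted to the same global best, the swarm can collapse onto a local optimum, which is the weakness listed among the characteristics of PSO.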
The characteristics of the PSO algorithm can be summarized as follows [23]:

1. It is possible to quickly approximate the optimal solution and achieve effective optimization of parameters.
2. It is suitable for searching within a continuous range and solving the maximum and minimum problems of continuous functions.
3. It is easy to implement with low complexity and requires a small number of parameters.
4. It is easy to fall into local optimum.

Area Under Curve (AUC)
The confusion matrix is a common way to reflect the performance of a classification model. Taking a two-class model as an example, the confusion matrix of this model is calculated as shown in Table 1,
where TP is the number of true positives (positive samples correctly classified as positive), FN is the number of false negatives (positive samples classified as negative), TN is the number of true negatives (negative samples correctly classified as negative), and FP is the number of false positives (negative samples classified as positive). The TP, FP, TN, and FN measures can be used to construct a Receiver Operating Characteristic (ROC) curve, which takes the true positive rate (TPR) as the ordinate and the false positive rate (FPR) as the abscissa. The formulas for TPR and FPR are shown in Equation (21).
The value of AUC is the area under the ROC curve. Suppose 1 − s and r are the probabilities of FP and TP, respectively. The AUC is estimated by Equation (22). AUC is a comprehensive evaluation of classification models, which can provide more useful information than the accuracy measurement.
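As a small worked illustration of these quantities, the sketch below computes the confusion counts, the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN), and an AUC estimate for a classifier that outputs hard labels. The AUC here is the trapezoidal area under the ROC curve through the points (0, 0), (FPR, TPR), and (1, 1); this is an assumption made for illustration and may differ in form from Equation (22), which is not reproduced in this text. The function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, TN, FP) for a two-class problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def rates(tp, fn, tn, fp):
    """True positive rate and false positive rate."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

def single_point_auc(tpr, fpr):
    """Trapezoidal area under the ROC through (0,0), (fpr,tpr), (1,1)."""
    return (1.0 + tpr - fpr) / 2.0
```

Under this estimate, a classifier that assigns every sample to the negative class has TPR = 0 and FPR = 0, so its AUC is 0.5 no matter how high its accuracy is, which is why AUC is more informative than accuracy on imbalanced data.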

The AdaBoost-A Algorithm
Although the AdaBoost algorithm can be directly applied to imbalanced data, the ensemble algorithm pays more attention to the misclassified samples than to samples of the minority class. According to the error calculation formula of the AdaBoost weak classifier, the error is only related to the weight and the number of misclassified samples. There is no additional processing for the misclassified samples of the minority class, so the AdaBoost ensemble algorithm is not well suited for processing imbalanced data [24]. To solve this challenge, we propose an improved AdaBoost algorithm (AdaBoost-A) that introduces the AUC [25] into the error function calculation. At the algorithm level, the error rate metric cannot properly reflect the performance of the classifier. For example, suppose there are 90 samples in class A and 10 samples in class B. If a classifier assigns all test samples to class A, its error rate is 10%. However, it is clear that this classifier is useless. As the area under the ROC curve, AUC can effectively reflect the comprehensive performance of the classifier. If the classifier is biased towards the majority class, the AUC of the classifier will be small, and 1-AUC will be large. The error is determined by the product of the classification error rate and 1-AUC, which can effectively improve the classification accuracy of AdaBoost. The improved error calculation is shown in Equation (23),
where $e_m$ represents the error rate of the mth weak classifier, $G_m(x)$ is the mth weak classifier, $y_i$ represents the actual label of the ith sample, and $w_{mi}$ represents the weight corresponding to the ith sample in the mth iteration.
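A minimal sketch of this modified error is given below. It follows the verbal description (weighted misclassification error multiplied by 1-AUC); the exact form of Equation (23) is not reproduced in this text, so the product form, the single-point AUC estimate for hard-label predictions, the choice of +1 as the minority (positive) class, and the function names are all assumptions.

```python
import numpy as np

def hard_label_auc(y_true, y_pred):
    """AUC estimate for +1/-1 predictions via one ROC point (assumed form)."""
    pos = y_true == 1
    neg = y_true == -1
    tpr = float(np.mean(y_pred[pos] == 1)) if pos.any() else 0.0
    fpr = float(np.mean(y_pred[neg] == 1)) if neg.any() else 0.0
    return (1.0 + tpr - fpr) / 2.0

def adaboost_a_error(y_true, y_pred, w):
    """Weighted error of a weak classifier scaled by (1 - AUC)."""
    weighted_err = float(np.sum(w * (y_pred != y_true)))  # standard AdaBoost error
    return weighted_err * (1.0 - hard_label_auc(y_true, y_pred))
```

In an AdaBoost loop, this value would take the place of the plain weighted error when the coefficient of the mth weak classifier is computed, so that the coefficient depends on both the misclassification weight and the AUC of that classifier.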

The PSOPD-AdaBoost-A Ensemble Algorithm
Although the AdaBoost algorithm can combine multiple weak classifiers into one strong classifier, the coefficients of the weak classifiers are determined during the iteration process. These coefficients cannot be changed later, so redundant or useless weak classifiers with large weights are inevitably generated. This can significantly affect the readability of the classifiers and increase system overhead. To overcome these shortcomings, our approach uses the PSO algorithm to optimize the weights of the weak classifiers of AdaBoost-A. This algorithm assigns large weights to the weak classifiers with high accuracy, and small weights to the redundant or useless weak classifiers, further improving the accuracy and readability of the AdaBoost classifier.
PSO is an optimization algorithm with a small number of parameters and fast convergence, but it easily falls into a local optimum [26]. Therefore, this paper proposes an ensemble algorithm in which an improved PSO based on population diversity optimizes AdaBoost-A (PSOPD-AdaBoost-A). It further optimizes the coefficients of the weak classifiers of AdaBoost-A by performing re-initialization when the swarm falls into a local optimum. The proposed improvements focus on using the error function of AdaBoost-A as the fitness function, and adopting the standard PSO algorithm to optimize the weights of the weak classifiers of AdaBoost-A. If the optimal particle does not change for ten consecutive iterations, the optimal particle is retained, and the position and velocity of the other particles are reinitialized. The iteration continues until the configured number of iterations is reached. When the optimal particle does not change over multiple iterations, the swarm has likely fallen into a local optimum. Re-initialization enlarges the search range of the particles and enhances the population diversity. At the same time, the optimal particle is retained during re-initialization to avoid losing the best solution found by the population.
The PSOPD-AdaBoost-A ensemble algorithm is described as follows:
1. Use the AdaBoost-A algorithm to generate M weak classifiers; the coefficients of the weak classifiers are expressed following Equation (24), where $a_k$ represents the weight coefficient of the kth weak classifier.
2. Set the population size to m and randomly initialize the position and velocity of each particle following Equations (25)-(27).
3. Use the position components of each particle as the weight coefficients of the weak classifiers of AdaBoost-A. As Equation (28) shows, the error rate $e_i$ of AdaBoost-A is calculated as the fitness value of each particle, where Q represents the number of samples, $e_i$ represents the error rate of the ith particle, and $y_s$ represents the true class label of the sth sample.
4. For each particle, the fitness value generated by each iteration is compared with the fitness value of the optimal position passed by the particle. If the fitness value is greater than the fitness value of the optimal position, the current position is taken as the optimal location passed by the particle, recorded as $P_i$.
5. For each particle, the fitness value generated by each iteration is compared with the fitness value of the optimal position passed by all particles. If the fitness value is greater than the fitness value of the optimal position of all particles, the current position is taken as the global optimal location, recorded as $P_{gbest}$.
6. Update the position and velocity of the particle in the following iteration based on Equations (15) and (16).
7. When the maximum number of iterations is reached or the error is small enough, the iteration stops. Otherwise, check the number of consecutive iterations in which the optimal particle has remained unchanged. If it reaches the threshold (10 is used in our configuration), the optimal particle is retained, and the position and velocity of the other particles are reinitialized. If it is less than the threshold, no action is performed. Then continue to execute steps 4-6.
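The procedure above can be sketched as follows. This is an illustration under several assumptions rather than the authors' implementation: the fitness is taken to be the plain misclassification rate of the weighted vote (the exact Equation (28) is not reproduced in this text), a lower error is treated as a better fitness in the comparisons of steps 4 and 5, the personal bests of reinitialized particles are reset to their new positions, and the names `ensemble_error` and `psopd`, the default parameters, and the bounds are hypothetical.

```python
import numpy as np

def ensemble_error(a, H, y):
    """Misclassification rate of the weighted vote.

    H is a (Q, M) matrix of weak classifier outputs in {-1, +1};
    a holds the M candidate coefficients (one particle position).
    """
    pred = np.where(H @ a >= 0, 1, -1)
    return float(np.mean(pred != y))

def psopd(fitness, dim, n_particles=20, iters=200, patience=10,
          omega=0.7, c1=1.5, c2=1.5, bounds=(0.0, 1.0), tol=0.0, seed=0):
    """PSO with re-initialization when the global best stalls."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds

    def init(n):
        pos = rng.uniform(lo, hi, (n, dim))
        vel = 0.1 * rng.uniform(lo - hi, hi - lo, (n, dim))
        return pos, vel

    U, V = init(n_particles)                          # step 2
    P = U.copy()
    P_fit = np.array([fitness(u) for u in U])         # step 3
    g = int(np.argmin(P_fit))
    gbest, gbest_fit = P[g].copy(), P_fit[g]
    stall = 0
    for _ in range(iters):                            # step 7: iteration limit
        if gbest_fit <= tol:                          # step 7: error small enough
            break
        r1, r2 = rng.random((2, n_particles, dim))
        V = omega * V + c1 * r1 * (P - U) + c2 * r2 * (gbest - U)   # step 6
        U = U + V
        fit = np.array([fitness(u) for u in U])
        better = fit < P_fit                          # step 4: personal bests
        P[better] = U[better]
        P_fit[better] = fit[better]
        g = int(np.argmin(P_fit))                     # step 5: global best
        if P_fit[g] < gbest_fit:
            gbest, gbest_fit = P[g].copy(), P_fit[g]
            stall = 0
        else:
            stall += 1
        if stall >= patience:                         # step 7: likely local optimum
            others = np.arange(n_particles) != g      # keep the optimal particle
            U[others], V[others] = init(int(others.sum()))
            P[others] = U[others]                     # assumed: reset personal bests
            P_fit[others] = [fitness(u) for u in U[others]]
            g = int(np.argmin(P_fit))
            if P_fit[g] < gbest_fit:
                gbest, gbest_fit = P[g].copy(), P_fit[g]
            stall = 0
    return gbest, gbest_fit
```

A typical call would be `psopd(lambda a: ensemble_error(a, H, y), dim=H.shape[1])`, where `H` collects the predictions of the M weak classifiers produced in step 1. Because the global best is stored separately from the swarm, re-initialization can only widen the search and never discards the best coefficients found so far.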

Test Data
We evaluate the proposed algorithm using the Vehicle, Horse Colic, Ionosphere, and Statlog imbalanced datasets from the UCI repository and the KC1, JM1, PC3, PC5, and CM1 imbalanced datasets from NASA. The weak classifiers are generated by Decision-Stump. Table 2 lists the details of the nine imbalanced datasets used in the evaluation. In Ionosphere, the label bad is treated as the minority class and the label good as the majority class. In Statlog, the label 1 is treated as the minority class and all other labels as the majority class. In Vehicle, the label van is treated as the minority class and the labels saab, bus, and opel as the majority class.

Analysis of the AdaBoost-A Algorithm
The Vehicle dataset is split into training and test sets at a ratio of 7:3. The standard AdaBoost algorithm is used to classify the samples in the training set. As the number of weak classifiers increases, the growth trend of AUC is shown in Figure 1. When the number of weak classifiers reaches 10, the increase of the evaluation index AUC significantly slows down, indicating that increasing the number of weak classifiers hardly improves the AUC. Therefore, the number of weak classifiers in the experiments is set to 10. Figure 2 shows the comparison of accuracy, precision, recall, and F1 value of the standard AdaBoost algorithm and the AdaBoost-A algorithm on the Vehicle test set. Results show that the AdaBoost-A algorithm achieves 92.9% accuracy, 84.8% precision, 83% recall, and 83.8% F1 value, and the standard AdaBoost algorithm achieves 91.0% accuracy, 83.4% precision, 79.5% recall, and 81.4% F1 value. The proposed algorithm not only improves the overall accuracy, but also reduces the error of minority class classification.
To eliminate the impact of data division and guarantee valid results, 10-fold CV is employed to evaluate the classification performance. The detailed comparison results for the AdaBoost-A algorithm and the AdaBoost algorithm on the Vehicle dataset in terms of error and AUC are shown through box plots in Figures 3 and 4, respectively. Figure 3 shows that the maximum, minimum, and average error of the AdaBoost-A algorithm are lower than those of the AdaBoost algorithm. Figure 4 shows that the maximum, minimum, and average AUC of the AdaBoost-A algorithm are higher than those of the AdaBoost algorithm.
The KC1 dataset is split into training and test sets at a ratio of 7:3. The standard AdaBoost algorithm is used to classify the samples in the training set. As the number of weak classifiers increases, the growth trend of AUC is shown in Figure 5. When the number of weak classifiers reaches 10, the increase of the evaluation index AUC significantly slows down. Therefore, the number of weak classifiers in this experiment is set to 10. Figure 6 shows the comparison of accuracy, precision, recall, and F1 value of the standard AdaBoost algorithm and the AdaBoost-A algorithm on the KC1 test set. Results show that the AdaBoost-A algorithm achieves 76.2% accuracy, 45.8% precision, 30.2% recall, and 35.3% F1 value, and the standard AdaBoost algorithm achieves 74.9% accuracy, 58.2% precision, 17.2% recall, and 26% F1 value.
The detailed comparison results of the 10-fold CV for the AdaBoost-A algorithm and the AdaBoost algorithm on the KC1 dataset in terms of error and AUC are shown through box plots in Figures 7 and 8, respectively. Figure 7 shows that the maximum, minimum, and average error of the AdaBoost-A algorithm are lower than those of the AdaBoost algorithm. Figure 8 shows that the maximum, minimum, and average AUC of the AdaBoost-A algorithm are higher than those of the AdaBoost algorithm.
The above experiments show that the proposed AdaBoost-A algorithm is more effective than the AdaBoost algorithm on these datasets.

Analysis of the PSOPD-AdaBoost-A Ensemble Algorithm
The coefficients of the AdaBoost-A weak classifiers are optimized by the improved PSO based on population diversity and by the standard PSO on five imbalanced datasets. We compare their classification performance using 5-fold CV. The detailed classification results of the AdaBoost, PSO-AdaBoost-A, and PSOPD-AdaBoost-A algorithms, based on the average of 100 runs, are shown in Figures 9-13.
As shown in Figures 9-13, the classification performance of the PSO-AdaBoost-A and PSOPD-AdaBoost-A ensemble algorithms is much higher than that of the AdaBoost algorithm. This illustrates that optimizing the weight coefficients of the AdaBoost weak classifiers can significantly improve the performance of the classifiers. The PSOPD-AdaBoost-A algorithm achieves 80.4% accuracy, 63.2% precision, 84.1% recall, and 72.1% F1 value on the Horse Colic dataset, which is higher than that of the PSO-AdaBoost-A classifier. The PSOPD-AdaBoost-A algorithm achieves 92.0% accuracy, 80.2% precision, 65.8% recall, and 72.2% F1 value on the Ionosphere dataset, which is higher than that of the PSO-AdaBoost-A classifier. The PSOPD-AdaBoost-A algorithm achieves 82.3% accuracy, 84.2% precision, 99.0% recall, and 91.0% F1 value on the JM1 dataset, which is higher than that of the PSO-AdaBoost-A classifier. The PSOPD-AdaBoost-A algorithm achieves 77.5% accuracy, 50.6% precision, 35.3% recall, and 41.6% F1 value on the KC1 dataset, which is higher than that of the PSO-AdaBoost-A classifier in terms of accuracy, recall, and F1 value. The PSOPD-AdaBoost-A algorithm achieves 98.9% accuracy, 99.5% precision, 99.7% recall, and 99.3% F1 value on the Statlog dataset, which is higher than that of the PSO-AdaBoost-A classifier in terms of precision, recall, and F1 value.
The experimental results presented above show that the improved PSO algorithm based on population diversity can effectively avoid falling into local optimum and achieve higher classification accuracy, and that the PSOPD-AdaBoost-A algorithm is effective in processing imbalanced data.

Comparison of PSOPD-AdaBoost-A and Other Improved Algorithms
To solve the imbalance problem, researchers have proposed many approaches to improve ensemble algorithms, but most of the improved methods are still sensitive to relatively high imbalance rates. Next, we compare the classification performance of our PSOPD-AdaBoost-A approach with three groups of algorithms by performing 5-fold CV on the Vehicle, PC3, PC5, and CM1 datasets: boosting algorithms, including G-AdaBoost based on a genetic algorithm [17], B-AdaBoost based on label modification, and D-AdaBoost based on weight limitation [18]; bagging algorithms, including Random Forest and Extra Trees; and a sampling method, Smote-based AdaBoost. For a fair comparison, the number of weak classifiers of each algorithm in these experiments is set to 10, and the weak classifiers are generated by Decision-Stump. Results show that the PSOPD-AdaBoost-A ensemble algorithm is effective on datasets with relatively high imbalance rates.
The mean of Accuracy, Precision, Recall, F1, AUC, and Error of the four datasets are summarized in Tables 3-6, respectively. The largest values are highlighted in bold for each performance measure in each table. To further verify the effectiveness of PSOPD-AdaBoost-A ensemble algorithm for processing imbalanced data, the AUC values of each run are showed through box plots in Figures 14-17. Table 3 shows that the PSOPD-AdaBoost-A method achieves the highest performance of the seven comparison algorithms in terms of accuracy, F1 value, and AUC classifying the Vehicle dataset, its precision is slightly lower than the G-AdaBoost algorithm, and its recall is slightly lower than the Smote method. Figure 14 shows that the maximum, minimum, and average of PSOPD-AdaBoost-A algorithm is the highest among seven algorithms in terms of AUC, demonstrating the effectiveness of the PSOPD-AdaBoost-A algorithm in classifying the Vehicle dataset. To solve the imbalance problem, researchers have proposed many approaches to improve the ensemble algorithms, but most of the improved methods are still sensitive to the relatively high imbalance rate. Next, we compare classification performance of our PSOPD-AdaBoost-A approach and boosting algorithms including G-AdaBoost based on genetic algorithm [17], B-AdaBoost based on label modification and D-AdaBoost based on weight limitation [18], bagging algorithms including Random Forest and Extra Trees, sampling method including Smote-based AdaBoost by performing 5-fold CV on the Vehicle, PC3, PC5, and CM1 datasets. For a fair comparison, the number of weak classifiers of algorithms for experiment mentioned above is set to 10, and the weak classifier is generated by Decision-Stump. Results show that the PSOPD-AdaBoost-A ensemble algorithm is effective on datasets with relatively high imbalance rates.
Table 4 shows that, on the PC3 dataset, PSOPD-AdaBoost-A achieves the highest accuracy, precision, F1 value, and AUC among the seven compared algorithms, while its recall is lower than that of the Smote method. Figure 15 shows that the maximum, minimum, and average AUC of PSOPD-AdaBoost-A are the highest among the seven algorithms, demonstrating the effectiveness of PSOPD-AdaBoost-A in classifying the PC3 dataset.
Table 5 shows that, on the PC5 dataset, PSOPD-AdaBoost-A achieves the highest precision, F1 value, and AUC among the seven compared algorithms; its accuracy is slightly lower than that of the Extra Trees algorithm, and its recall is slightly lower than that of the Smote method. Figure 16 shows that the maximum, minimum, and average AUC of PSOPD-AdaBoost-A are the highest among the seven algorithms, demonstrating the effectiveness of PSOPD-AdaBoost-A in classifying the PC5 dataset. Table 6 shows that, on the CM1 dataset, PSOPD-AdaBoost-A achieves the highest accuracy, precision, F1 value, and AUC among the seven compared algorithms, while its recall is lower than that of the Smote method. Figure 17 shows that the maximum, minimum, and average AUC of PSOPD-AdaBoost-A are the highest among the seven algorithms, demonstrating the effectiveness of PSOPD-AdaBoost-A in classifying the CM1 dataset.
The above comparative experiments show that the PSOPD-AdaBoost-A ensemble algorithm processes imbalanced data more effectively than the other improved algorithms considered here.

Conclusions
The traditional AdaBoost algorithm focuses on misclassified samples rather than on samples of the minority class. In this paper, we propose an improved AdaBoost algorithm (AdaBoost-A). Since the AUC can effectively reflect the performance of a classifier, we introduce the AUC into the error calculation, making AdaBoost focus more on the classification accuracy of the minority class. Furthermore, the AdaBoost algorithm may generate redundant or useless weak classifiers, significantly affecting the readability of the classifier. We therefore propose an ensemble algorithm, PSOPD-AdaBoost-A, which further optimizes the weights of the weak classifiers. Experimental results show that the AdaBoost-A and PSOPD-AdaBoost-A ensemble algorithms can effectively classify the imbalanced datasets Vehicle, KC1, Horse Colic, Ionosphere, JM1, and Statlog. We also compared the imbalanced-data classification performance of PSOPD-AdaBoost-A with that of the ensemble algorithms G-AdaBoost, B-AdaBoost, D-AdaBoost, Random Forest, and Extra Trees and the sampling method Smote on four datasets with relatively high imbalance rates: Vehicle, PC3, PC5, and CM1. The results show that the PSOPD-AdaBoost-A ensemble algorithm is effective in processing data with relatively high imbalance rates compared with the other improved algorithms. Our future work is dedicated to applying the proposed algorithm to the field of sensors, to achieve accurate target classification from the imbalanced data that sensors acquire.
Author Contributions: K.L. and G.Z. proposed the ensemble algorithm and conceived and designed the experiments; G.Z. performed the experiments; J.Z., F.L. and M.S. analyzed the data; G.Z. wrote the paper; K.L. and J.Z. contributed to defining the important intellectual content of the manuscript and to its revision; K.L. approved the final version of the manuscript.
Funding: This work was supported by grants from the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province, China (No. ZR2017MF032).