A Hybrid MultiLayer Perceptron Under-Sampling with Bagging Dealing with a Real-Life Imbalanced Rice Dataset

Classification algorithms have shown exceptional prediction results in the supervised learning area. However, these algorithms are not always efficient on real-life datasets because of their class distributions: datasets for real-life applications are generally imbalanced. Several methods have been proposed to solve the problem of class imbalance. In this paper, we propose a hybrid method combining preprocessing techniques with ensemble learning. The original training set is under-sampled by evaluating the samples with a stochastic measurement (SM) and then training a Multilayer Perceptron on the selected samples to return a balanced training set. The training set balanced by MLPUS (Multilayer Perceptron Under-Sampling) is aggregated using the bagging ensemble method. We applied our method to the real-life Niger_Rice dataset and forty-four other imbalanced datasets from the KEEL repository. We also compared our method with six other methods from the literature: the MLP classifier on the original imbalanced dataset, MLPUS, UnderBagging (combining random under-sampling and bagging), RUSBoost, SMOTEBagging (Synthetic Minority Oversampling Technique with bagging), and SMOTEBoost. The results show that our method is competitive with the other methods. On the real-life Niger_Rice dataset, our proposed method obtains 75.6, 0.73, 0.76, and 0.86, respectively, for accuracy, F-measure, G-mean, and ROC. In contrast, the MLP classifier on the original imbalanced Niger_Rice dataset gives 72.44, 0.82, 0.59, and 0.76, respectively, for accuracy, F-measure, G-mean, and ROC.

Author Contributions: Conceptualization, M.D., S.X.; methodology, M.A.E.; software, E.D.E.; validation, A.F., M.D., S.X.; formal analysis, M.D.; investigation, A.O.A.; resources, M.D.; data curation, S.X.; writing—original draft preparation, M.D.; writing—review and editing, S.X.; visualization, M.A.E.; supervision, S.X.; project administration, E.D.E.; funding acquisition, A.F.


Introduction
Demographic growth in West Africa in general, and in Mali in particular, requires abundant agricultural production. However, agricultural production in this region is traditional, i.e., dependent on the weather, and climate change has a considerable impact on it due to high temperatures [1]. Exploring machine learning technologies to predict agricultural production is an exciting challenge in this climatically unstable region [2]. Since machine learning gives significant prediction results in many areas, including recommendation systems, social media, finance, image processing, spam and anti-spam filtering, text classification, speech recognition, medicine, and the environment [3], we explore those technologies here. Predicting crop production with machine learning methods using features such as climate data is a significant challenge. Rice is the most produced and consumed cereal in Mali. In this paper, we study the prediction of rice production from climate data in Mali, in the irrigated area managed by the Niger Office. We use classification algorithms to predict rice production in the Niger Office area. Classification methods usually solve qualitative problems, while rice production is quantitative; however, this adaptation is possible because the Niger Office company uses a threshold to qualify rice production as good or bad. This threshold is 6.2 tonnes per hectare: if the production is below this threshold, it is qualified as bad, and if it is greater than or equal to this threshold, it is qualified as good. Classification algorithms are used for supervised problems. Traditional classification algorithms are efficient when the training dataset has a certain representativeness and balance between the labels [4]. However, these algorithms are not efficient on an imbalanced dataset [4].
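As a concrete illustration of the thresholding described above, here is a minimal sketch; the constant and function names are ours, not the Niger Office's, and only the 6.2 t/ha threshold comes from the text:

```python
# Hypothetical illustration: turning a quantitative rice yield into the
# binary "good"/"bad" label using the 6.2 t/ha threshold described above.
GOOD_YIELD_THRESHOLD = 6.2  # tonnes per hectare (Niger Office threshold)

def label_yield(yield_t_per_ha: float) -> str:
    """Qualify a rice yield as 'good' or 'bad' against the threshold."""
    return "good" if yield_t_per_ha >= GOOD_YIELD_THRESHOLD else "bad"

print(label_yield(7.1))  # a yield above the threshold: "good"
print(label_yield(5.0))  # a yield below the threshold: "bad"
```

This is the step that turns the quantitative prediction problem into a supervised classification problem.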
After constructing our real-life Niger_Rice dataset, which qualifies rice production using climatic data, it appears that this dataset is imbalanced according to [5]. The paper [5] defines an imbalanced dataset as one in which some classes have few observations compared to others. We use the Niger_Rice real-life dataset in this study, together with forty-four other imbalanced datasets.
Several methods have been proposed to overcome the limitations of traditional classification algorithms on imbalanced datasets. The paper [6] groups them into four categories: data-level, algorithm-level, cost-sensitive, and ensemble learning methods. The data-level methods are preprocessing methods that use techniques (under-sampling, over-sampling, or hybrid) to balance the training set before using traditional classification algorithms. The algorithm-level methods modify a particular algorithm to adapt it to the imbalanced dataset. The cost-sensitive methods are the link between the data-level and algorithm-level methods: the cost of misclassification is modified according to the learning algorithm. The ensemble learning methods combine several traditional classification algorithms.
This study uses a hybrid method combining a preprocessing technique (under-sampling) and ensemble learning. The under-sampling technique is MLPUS (Multilayer Perceptron Under-Sampling), which comprises three key steps: clustering, SM (stochastic measurement) evaluation, and training an MLP on the evaluated samples. Clustering groups the samples of the majority class to select the essential ones. The stochastic measurement evaluation is used for sample selection, and the last step is training the MLP on the samples selected by the SM. The ensemble learning method used is bagging, which takes as training data the datasets balanced by MLPUS. To evaluate the performance of our method, we use forty-four other datasets and six other methods in this paper.
The contributions made in this paper are as follows: (1) the collection of climatic and rice production data from 1990 to 2020 for the Niger Office area and their fusion into the Niger_Rice dataset; (2) the combination of the MLPUS and bagging methods into a hybrid method for solving imbalanced dataset problems; and (3) the combination of the MLPUS and boosting methods into a further hybrid method for solving imbalanced dataset problems.
The remainder of this paper is organized as follows: Section 2 gives the related works. The method used is detailed in Section 3. Section 4 provides the elements of our experiment, and finally, we conclude in Section 5.

Related Works
The details of two categories (data-level and ensemble learning) are presented in this section because the outcome of this paper is based on them. The cost-sensitive algorithms are detailed in [7] with AdaCost and in [8] with AdaC1, AdaC2, and AdaC3. As for the algorithm-level category, it is detailed in [6,9,10].

Sampling Methods
Sampling, or data-level, methods are used to obtain a certain balance between classes in the training set. These methods can be under-sampling, over-sampling, or hybrid methods (combining the previous two).
The under-sampling methods remove instances of the majority class following a particular technique: for example, removing observations of the majority class at random [11], under-sampling based on ENN [12], or under-sampling based on the Tomek link distance method [13].
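The random under-sampling idea of [11] can be sketched in a few lines. This is a toy illustration with a fixed seed; real implementations offer many more strategies:

```python
import random

def random_undersample(majority, minority, seed=0):
    """Randomly drop majority-class samples until both classes have the
    same size: a minimal sketch of random under-sampling [11]."""
    rng = random.Random(seed)
    kept = rng.sample(majority, len(minority))  # keep as many as the minority
    return kept + minority

maj = list(range(100))        # 100 majority-class samples
mino = list(range(100, 115))  # 15 minority-class samples
balanced = random_undersample(maj, mino)
print(len(balanced))  # 30 samples, 15 per class
```

The obvious drawback, which motivates more selective schemes such as MLPUS below, is that potentially informative majority samples are discarded blindly.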
Re-sampling methods based on over-sampling the minority class are the most common in the data-level category. In this category, we have the over-sampling proposed by [14] and the cluster-based over-sampling of [15]. SMOTE (Synthetic Minority Oversampling Technique) is the most widely used in the literature [10]. SMOTE creates synthetic instances of the minority class by interpolating between a given sample of that class and its nearest neighbors [16]. Several techniques have been proposed to improve SMOTE, such as Safe-level SMOTE [17], Borderline-SMOTE [18], and ADASYN [19]. The paper [20] reports 85 variants of SMOTE.
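The core SMOTE step, interpolating between a minority sample and one of its nearest neighbors, can be sketched as follows. This is a simplified illustration of the idea in [16]; the nearest-neighbor search itself is omitted and the points are invented:

```python
import random

def smote_sample(x, neighbors, rng):
    """Create one synthetic minority sample by interpolating between x and
    a randomly chosen nearest neighbor, as in the SMOTE idea [16]."""
    nn = rng.choice(neighbors)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, nn)]

rng = random.Random(0)
x = [1.0, 2.0]
neighbors = [[2.0, 3.0], [0.0, 1.0]]
synthetic = smote_sample(x, neighbors, rng)
# The synthetic point lies on the segment between x and the chosen neighbor.
print(synthetic)
```

Variants such as Borderline-SMOTE or ADASYN change how x and its neighbors are chosen, not this interpolation step.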
Hybrid re-sampling methods combine over-sampling and under-sampling techniques [21]. In the paper [22], a mixed re-sampling method using SMOTE and an under-sampling algorithm was proposed to solve the noise problem. Another hybrid re-sampling approach combines spatiotemporal over-sampling and selective under-sampling to align foreground and background pixels in a video [23]. A mixed re-sampling method using two separate and parallel particle swarm optimization processes was proposed in [24].

Ensemble Methods
According to the paper [25], sampling methods and ensemble methods have effectively resolved class imbalance in recent years. The ensemble methods, often also called ensemble solutions, combine several base classifiers and integrate their results into a single classifier to improve performance. Ensemble solutions generally perform better than individual classifiers [26]. In the literature, bagging and boosting come up most often. This category can also be merged with the previous categories to be effective on the class imbalance problem. Bagging [27] obtains each base classifier from a random sampling of the training data. The ensemble methods are often not adapted to the class imbalance problem on their own [28]; combining them with the other methods (data-level, algorithm-level, and cost-sensitive) adapts them to this specific problem. It is in this context that bagging with under-sampling methods was proposed in [11]. Many ensemble methods have also been implemented specifically to solve the class imbalance problem, such as SMOTEBagging [29], SMOTEBoost [30], RAMOBoost [31], RUSBoost [32], EUSBoost [33], EasyEnsemble [34], and Random balance-boost [35].
SMOTEBagging [29] is a mix of the bagging and SMOTE techniques: SMOTE generates artificial instances of the positive class to build a dataset with balanced categories. SMOTEBoost [30] is a combination of SMOTE and AdaBoostM2. After every boosting round, SMOTE is used to generate new synthetic cases from the minority class. These synthetic data receive the same weights as the original data, while the original data's weights are updated; the instances with significant weights are those that were difficult for the previous classifiers. RAMOBoost [31] is the combination of ADASYN [19] and AdaBoostM2. The only difference between RAMOBoost [31] and SMOTEBoost [30] is the algorithm used to create synthetic instances.
Whereas SMOTEBoost [30] uses SMOTE for these instances, RAMOBoost uses ADASYN [19], whose artificial data are created based on the underlying data distribution. RUSBoost is the combination of random under-sampling and AdaBoostM2: after each boosting round, this time, random under-sampling is applied. EUSBoost [33] is the union of AdaBoostM2 and an evolutionary algorithm; the under-sampling uses an evolutionary algorithm to remove instances of the majority class. EUSBoost uses different subsets of majority class instances on each iteration to promote diversity when training each classifier. EasyEnsemble [34] subdivides the majority class into several subsets, then trains an AdaBoost ensemble on each subset and combines the classifiers' outputs. Random balance-boost is a combination of SMOTEBoost [30] and RUSBoost [32]: after each boosting round, a hybrid re-sampling is performed, in which the SMOTE technique over-samples the minority class and the majority class is under-sampled randomly. The paper [36] categorizes these approaches into four principal families: UnderBagging [11], OverBagging [29], (hybrid) UnderOverBagging [29], and IIVotes [37].
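The UnderBagging family mentioned above can be sketched as follows. This is a toy illustration only; `train_fn` is a placeholder base-learner trainer of our own, not part of any cited method:

```python
import random

def under_bagging_fit(majority, minority, n_estimators, train_fn, seed=0):
    """Sketch of the UnderBagging idea [11]: each base learner is trained
    on a balanced set made of the full minority class plus a random
    under-sample of the majority class."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        maj_sub = rng.sample(majority, len(minority))  # fresh under-sample
        models.append(train_fn(maj_sub + minority))
    return models

# Toy base "learner": just records the size of its balanced training set.
models = under_bagging_fit(list(range(50)), list(range(50, 60)),
                           n_estimators=5, train_fn=len)
print(models)  # each base model saw 20 samples (10 + 10)
```

Each base learner sees a different majority subset, which supplies the diversity that bagging relies on.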

Research Materials

Proposed Hybrid MLPUS with Bagging Method
As mentioned in the introduction, our proposed hybrid method combines a re-sampling method with ensemble learning: MLP under-sampling followed by bagging. MLPUS is an under-sampling method [38] that brings together three key concepts: clustering (grouping) the majority class samples, using the stochastic measurement (SM) evaluation to select the most informative samples, and training the MLP on the sample set resulting from the SM evaluation. Algorithm 1 shows the pseudo-code of the existing MLPUS method [38]. The clustering method used by MLPUS is K-means; the number of clusters n is set to √N_p for each class, where N_p denotes the number of samples of the minority class. The sample closest to each cluster's centroid is selected and added to the training set, so that the initial MLP training uses an equal number of samples for each class. The value p is constant over the iterations. The majority class samples are grouped into N_p clusters so that only the most significant representatives participate in the under-sampling and the data distribution is preserved; this yields the same number N_p of candidate samples for each class, on which the SM is computed. Among the samples closest to the cluster centers, only the n samples with the highest SM are selected. The same procedure is applied to the samples of the minority class. The MLP is then trained on the balanced dataset of these 2n samples. After i iterations we have 2in samples, where i cannot be greater than n. Samples are iteratively removed from the original imbalanced dataset as long as the number of minority samples remains greater than n.
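The "closest to the centroid" selection step described above can be sketched as follows. This is a simplified illustration; the paper obtains the centroids with K-means, which is omitted here, and the points are invented:

```python
import math

def nearest_to_centroids(points, centroids):
    """For each cluster centroid, pick the sample closest to it: the
    'most representative samples' selection step of MLPUS (sketch)."""
    chosen = []
    for c in centroids:
        chosen.append(min(points, key=lambda p: math.dist(p, c)))
    return chosen

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (4.9, 5.2)]
centroids = [(0.0, 0.1), (5.0, 5.1)]  # e.g. produced by K-means
print(nearest_to_centroids(points, centroids))
```

In MLPUS these representatives are then ranked by their SM values, and only the n highest-SM samples per class survive.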
The MLP (Multilayer Perceptron) is a standard neural network architecture that uses backpropagation to train its model. It has at least three layers: the input layer, the output layer, and one or more intermediate (hidden) layers. In this architecture, the initial connection weights are random, and a learning rate is chosen as a function of the constraints on the MLP: if this rate is too low, MLP learning is slow, and if it is too high, learning will not go well. If we define the inputs as x_1, x_2, ..., x_n and the corresponding weights as w_1, w_2, ..., w_n, then the output of each neuron is computed from the weighted sum x_1 w_1 + x_2 w_2 + ... + x_n w_n. For each layer unit, the outcome is propagated, and its error is calculated as error = predicted_output − actual_output. The function that defines the MLP output can be written as y_j = f(Σ_{i=1}^{m} w_ij f(Σ_k w_ki x_k)), where w_ki is the connection weight between input neuron k and hidden-layer neuron i, w_ij is the connection weight between hidden-layer neuron i and output neuron j, m is the number of hidden neurons [38], and f(x) is the activation function. Activation functions (AFs) are applied to the weighted sum of inputs and biases in neural networks and decide whether a neuron is triggered or not. AFs can be the main functions or their variants. Here are some main activation functions:
• The sigmoid AF, also called the logistic function [39], is defined as f(x) = 1/(1 + e^(−x)).

• The tanh AF is the smoother hyperbolic tangent function, centered on zero with a range between −1 and 1 [40], given by f(x) = (e^x − e^(−x))/(e^x + e^(−x)).
• The rectified linear unit (ReLU) AF [41] applies a threshold operation to each input element, setting negative values to zero: f(x) = max(0, x).
• The Swish AF [42] is defined by f(x) = x/(1 + e^(−x)).
• The exponential linear unit (ELU) AF [43] is given by f(x) = x for x > 0 and f(x) = α(e^x − 1) for x ≤ 0.
• The exponential linear squashing (ELiSH) AF [44] is given by f(x) = x/(1 + e^(−x)) for x ≥ 0 and f(x) = (e^x − 1)/(1 + e^(−x)) for x < 0.
In the paper [38], the SM computed with the MLP is the main criterion for under-sampling. The SM of a sample is the square of the difference between the MLP's output for that sample and its value in the original dataset, SM(x) = (ŷ(x) − y(x))², so the greatest values are assigned to the hard-to-learn samples, which are added iteratively to the training set; the MLP will then not misclassify these samples. The SM evaluation in [38] also involves a Halton point, with m the number of hidden neurons and f(x) the sigmoid function.
We use the bootstrap aggregation method, also called bagging, introduced by Breiman [27]. This method relies on the idea of training different classifiers on random replicates of the original training dataset (keeping the size of the original training set), so that diversity is achieved through re-sampling. The class of an unknown instance is deduced from each individual classifier's prediction by majority or weighted vote. Its simplicity and good generalizability have enabled bagging to deal with data imbalance problems in many approaches. Algorithm 2 shows the pseudo-code of our method, and the framework of our proposed method is shown in Figure 1.
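As a sketch of the formulas above, the activation functions and the MLP forward pass can be written directly in code. The weights and inputs below are illustrative values of our own, not the paper's:

```python
import math

def sigmoid(x): return 1.0 / (1.0 + math.exp(-x))   # logistic AF [39]
def tanh_af(x): return math.tanh(x)                  # tanh AF [40]
def relu(x):    return max(0.0, x)                   # ReLU AF [41]
def swish(x):   return x * sigmoid(x)                # Swish AF [42]
def elu(x, a=1.0):                                   # ELU AF [43]
    return x if x > 0 else a * (math.exp(x) - 1.0)
def elish(x):                                        # ELiSH AF [44]
    return x * sigmoid(x) if x >= 0 else (math.exp(x) - 1.0) * sigmoid(x)

def mlp_forward(x, w_in, w_out, f=sigmoid):
    """Forward pass matching the formula above: each hidden neuron i
    computes f(sum_k w_ki * x_k); the output neuron combines the m
    hidden activations with weights w_ij."""
    hidden = [f(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    return f(sum(w * h for w, h in zip(w_out, hidden)))

y = mlp_forward([0.5, -1.0], w_in=[[0.2, 0.4], [-0.3, 0.1]], w_out=[0.7, -0.5])
error = y - 1.0  # error = predicted_output - actual_output
print(round(y, 3))
```

Swapping `f` exchanges one activation function for another without touching the network structure, which is how the alternatives listed above would be explored.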
Algorithm 1: MLPUS
Input: imbalanced training set D
Output: balanced training set D′
1. D_maj ← the majority class samples
2. D_min ← the minority class samples
3. n = √N_p, where N_p is the number of clusters for both D_min and D_maj
4. G_0 ← centroid of D_min
5. H_0 ← centroid of D_maj
6. Step 2: train the MLP using D
7. While N_p > n do
8. Step 3: select the most essential samples from D_maj; N_p becomes the number of clusters for D_maj
9. Step 5: add the samples from C with the largest SM to the sets G_d and H_d, respectively
10. Step 7: train the MLP using the balanced training set
Algorithm 2 then applies bagging for t = 1 to T over the balanced training set.
Since, as shown in [36], bagging associated with re-sampling methods has provided approaches that can address imbalanced data, our proposed method combines a re-sampling (under-sampling) method with bagging to deal with data imbalance. Our proposed method belongs to the UnderBagging family: it uses the MLP to under-sample the initially imbalanced training dataset until it is balanced, and then applies bagging. Our approach consists of three main steps, as shown in Figure 1. The first step is to initialize and then train the MLP on the imbalanced training set to find the most representative samples. The second step is to assess the SM of the most representative samples to provide a balanced training set. The third and last step is to bootstrap the dataset obtained after balancing.
The balanced training set obtained after applying MLPUS is then used for bootstrapping. As the ensemble learning method is bagging, the different base classifiers are built as shown in Figure 1. Each base bagging classifier draws a bootstrap replicate (sampling with replacement) of the balanced training set, which ensures that none of the base classifiers is affected by the imbalance. In the end, the base classifiers are combined into a bagged classifier H(x).
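The final majority-vote aggregation into the bagged classifier H(x) can be sketched as follows. This is a minimal illustration; the base models here are stand-ins, not trained MLPs:

```python
from collections import Counter

def bagged_predict(models, x):
    """Aggregate base-classifier predictions by majority vote to form the
    bagged classifier H(x) described above. `models` are any callables
    returning a class label (a sketch assumption)."""
    votes = [m(x) for m in models]
    return Counter(votes).most_common(1)[0][0]

# Three toy base classifiers disagreeing on the same input:
models = [lambda x: "good", lambda x: "bad", lambda x: "good"]
print(bagged_predict(models, [0.1, 0.2]))  # majority vote: "good"
```

With weighted voting, each vote would simply be multiplied by the base classifier's weight before counting.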

Datasets
This paper uses forty-five imbalanced datasets, of which one is the real-life Niger_Rice dataset, and the other forty-four come from the KEEL dataset repository [45]. The real-life dataset (Niger_Rice) is a rice production dataset available at the URL (https://github.com/moussdiall/Imbalanced_dataset_Niger_Rice, accessed on 21 July 2021). This dataset has as attributes the total precipitation and the averages of the maximum, average, and minimum temperatures over the six months (June to November) of a regular season, according to the Niger Office in Mali. The Niger Office is a Malian parastatal company that manages one of the largest and oldest irrigated areas in West Africa. Table 1 describes the features of the Niger_Rice dataset. The datasets coming from the KEEL repository are not wholly independent: several of them are only variants of the original datasets. For example, we have twenty variants of the yeast dataset, fourteen variants of the glass dataset, eight variants of the ecoli dataset, four variants of the vehicle dataset, two variants each of the new-thyroid, shuttle, abalone, and page-blocks datasets, and the other datasets are present with only one variant. In Table 2, we give the details of each dataset. The header Att designates the number of attributes of the dataset. The header NI indicates the number of instances contained in the dataset. P and N represent, respectively, the number of positive and negative class samples (minority and majority). IR (Imbalanced Ratio) designates the quotient between the majority class (negative class) and the minority class (positive class). The IR of the Niger_Rice dataset is 3.43, while the IR of the other datasets ranges from 1.82 to 129.44. The following subsections provide details of the design of the Niger_Rice dataset.
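The IR definition above can be checked directly, using the Niger_Rice class counts of 48 negative (bad-yield) and 14 positive (good-yield) records reported later in the paper:

```python
def imbalance_ratio(n_negative, n_positive):
    """IR as defined above: majority (negative) class count divided by
    minority (positive) class count."""
    return n_negative / n_positive

# Niger_Rice: 48 bad-yield records vs 14 good-yield records
print(round(imbalance_ratio(48, 14), 2))  # 3.43, as stated in the text
```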

Niger_Rice Dataset Study Area
The research area is in Mali, in the Niger River's inner delta. Mali, like the other Sahel nations, has a diverse environment, according to [2]. With over 100,000 ha, the Niger Office comprises seven irrigated areas: Kolongo, Niono, N'Debougou, M'Bewani, Macina, Molodo, and Kouroumari.

Niger_Rice Dataset Data Collection
Climate data and agricultural yields from the records of Mali's National Meteorological Agency (MALI METEO) and the Niger Office company make up the Niger_Rice dataset. Rice data for the two areas (named Casier and Hors-Casier) were collected from the Niger Office's Planning and Statistics Department from 1990 to 2020. In addition, Mali Météo provided climate data that could affect agricultural production over the same period, from 1990 to 2020. Precipitation and minimum, average, and maximum temperatures are among the climate records. Agricultural yield is measured as the quantity of agricultural output per cultivation area: agricultural production is measured in tonnes, the cultivation area in hectares, and the ratio of agricultural output to cultivated area (tonnes/hectare) is known as the yield.

Niger_Rice Dataset Preprocessing
To make these data usable by machine learning technologies, we preprocess them with the following steps:
Step 1: Collect monthly climate data (precipitation, maximum, minimum, and average temperatures) for the regular agricultural season (June to November) recorded in the Niger Office area by Mali Météo from 1990 to 2020.
Step 2: Calculate the total precipitation and the average maximum, minimum, and average temperatures for the season (June to November) in the two zones (Casier and Hors-Casier) of the Niger Office from 1990 to 2020.
Step 3: Collect data on the area under cultivation, production, and agricultural yield from 1990 to 2020, from the Planning and Statistics Department of the Niger Office.
Step 4: Gather these raw data in a Microsoft Excel sheet composed of the following column headings: No, year, name of the zone, precipitation, maximum temperature, average temperature, minimum temperature, crop area, production of the site, and yield.
Step 5: For proper data preparation before applying data mining technologies, remove the columns that are not needed: No, year, and name of the zone.
Step 6: As the yield is calculated from the crop area and the production, delete these two columns. Since the yield determines the output quality (good or bad), we define the yield as the class label.
Step 7: Sort the dataset by yield to label the records as "good" or "bad". A bad yield is less than 6.2 tonnes/hectare, and a good yield is 6.2 tonnes/hectare or more. The bad class has 48 records, while the good class has 14 records.
Step 8: Save the final file of this dataset in CSV format to apply machine learning techniques. The final dataset file has five columns: rainfall, maximum temperature, average temperature, minimum temperature, and crop yield.
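The preprocessing of Steps 4 to 8 can be sketched on a miniature example. The two data rows below are invented for illustration; only the column layout and the 6.2 t/ha threshold follow the text:

```python
import csv
import io

# Hypothetical miniature of the raw Excel export described in Steps 4-8.
raw = """No,year,zone,rainfall,t_max,t_avg,t_min,crop_area,production,yield
1,1990,Casier,620.5,34.1,28.3,22.6,40000,260000,6.5
2,1990,Hors-Casier,598.2,34.8,28.9,23.1,15000,88500,5.9
"""

rows = list(csv.DictReader(io.StringIO(raw)))
processed = []
for r in rows:
    processed.append({
        # Step 5/6: drop No, year, zone, crop_area, and production
        "rainfall": float(r["rainfall"]),
        "t_max": float(r["t_max"]),
        "t_avg": float(r["t_avg"]),
        "t_min": float(r["t_min"]),
        # Step 6/7: the yield becomes the class label via the 6.2 t/ha threshold
        "label": "good" if float(r["yield"]) >= 6.2 else "bad",
    })
print([p["label"] for p in processed])  # ['good', 'bad']
```

Writing `processed` back out with `csv.DictWriter` would produce the final CSV file of Step 8.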

Performance Evaluation
A popular performance tool for classification is the confusion matrix, a table that shows the model's predictions against the actual labels (see Table 3). The rows of this confusion matrix correspond to the instances of an actual class and the columns to the instances of the predicted label.
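From the confusion matrix counts, the metrics used in this paper can be computed with the standard formulas. The counts below are invented for illustration:

```python
import math

def metrics_from_confusion(tp, fn, fp, tn):
    """Compute the evaluation metrics used in this paper from the
    confusion matrix counts: accuracy, F-measure (F1), and G-mean."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return accuracy, f_measure, g_mean

acc, f1, gm = metrics_from_confusion(tp=10, fn=4, fp=5, tn=43)
print(round(acc, 3), round(f1, 3), round(gm, 3))
```

On an imbalanced dataset, accuracy can stay high while the minority class is ignored, which is why the F-measure and G-mean (the geometric mean of the two class-wise rates) are preferred here.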

Experimental Setting
The parameters to set are n, k, and m. The n is the number of clusters for each class in the MLPUS method. The k is the number of subdivisions of the training set used for bagging after balancing the training set, and m is the number of hidden neurons of the MLP model trained by backpropagation. These parameters are defined by the user and must be known before running the experiment. To determine the significance of the results, we use a significance level alpha = 0.05 when comparing the classifiers' results. To run these experiments, we used a laptop computer with an Intel Core i7-4720HQ (2.59 GHz) microprocessor, 8.00 GB of RAM, and a 64-bit operating system. We use MATLAB for the preprocessing methods and the Weka machine learning tools to create the models. Two-fold cross-validation with five iterations is used to train the models.
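The 5×2 split scheme described above can be sketched as follows, as a minimal illustration of two-fold cross-validation repeated five times:

```python
import random

def five_by_two_cv(data, seed=0):
    """Yield the 5x2 cross-validation splits used in the experiments:
    five repetitions of a random half/half split, each half serving
    once as training set and once as test set."""
    rng = random.Random(seed)
    for _ in range(5):
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        a, b = shuffled[:half], shuffled[half:]
        yield a, b  # train on a, test on b
        yield b, a  # train on b, test on a

splits = list(five_by_two_cv(list(range(10))))
print(len(splits))  # 10 train/test pairs
```

Each metric in Tables 4 to 7 is then the mean over these ten train/test pairs, with the standard deviation reported in parentheses.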

Experimental Results and Discussion
The different experimental results of our method and six other methods on the 45 imbalanced datasets of Table 2 are summarized in this section. The results in this experiment are obtained by two-fold cross-validation with five iterations, with the standard deviation (std) between parentheses. The six other methods mentioned earlier as baselines are the MLP classifier [46] on the original imbalanced dataset, MLPUS (the MLP Under-Sampling preprocessing method) [38], SMOTEBagging [29], SMOTEBoosting [30], Under-Bagging [11], and RUS_Boost [32]. Beyond our proposed method, MLPUS with bagging, we also performed MLPUS with boosting in the experiment. In total, we compare eight methods for a better analysis. Table 4 shows the accuracy (std) of the different techniques on the 45 datasets of Table 2. Because accuracy is not a powerful metric on an imbalanced dataset, we also consider three other metrics: F-Measure, G-Mean, and the ROC curve (Receiver Operating Characteristic). Table 5 shows the F-Measure (std) of the different methods on the 45 datasets of Table 2. The G-Mean (std) results of each technique on each dataset of Table 2 are shown in Table 6. Finally, Table 7 shows the outcomes of each method's area under the ROC curve (std) on the 45 datasets of Table 2. The results are marked significantly better with the letter "v", significantly weak with the symbol "*", or not significant, at a confidence level of alpha = 0.05, i.e., a 5% risk of error. In Table 4, the accuracy of our method on the 45 datasets is twenty times significantly better than the MLP classifier on the original imbalanced dataset, while it is significantly weak only three times. The proposed method also wins, respectively, eighteen, two, three, twenty-two, twenty-three, and ten times against the MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, RUSBoost, and MLPUS_Boost methods.
It loses three, four, four, two, and two times against the MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, and RUSBoost methods, and it loses only once against the MLPUS_Boost technique.
The different F-measure results on the 45 datasets, shown in Table 5, indicate that our method wins, respectively, twenty-eight, nine, twenty, eighteen, sixteen, and sixteen times against the MLP classifier on the original imbalanced dataset and the MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, and RUSBoost methods. However, it wins only one time against the MLPUS_Boost method. At the same time, our proposed method loses, respectively, four, fourteen, four, five, seven, six, and six times against the MLP, MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, RUSBoost, and MLPUS_Boost methods on the F-measure metric.
In Table 6, the G-mean results on the 45 imbalanced datasets reveal that our method wins, respectively, twenty-seven, eleven, fourteen, thirteen, eighteen, and fourteen times against the MLP classifier on the original imbalanced dataset and the MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, and RUSBoost methods. However, MLPUS_Bagging did not win against the MLPUS_Boost technique. At the same time, the proposed method loses, respectively, five, fourteen, five, eight, six, seven, and six times against the MLP, MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, RUSBoost, and MLPUS_Boost methods in terms of G-mean.
The area under the ROC results in Table 7 show that, on the 45 imbalanced datasets, our proposed method wins thirteen, ten, four, nine, sixteen, eighteen, and three times, respectively, against the MLP, MLPUS, SMOTEBagging, SMOTEBoosting, Under-Bagging, RUSBoost, and MLPUS_Boost methods. MLPUS_Bagging loses two, five, four, three, and two times, respectively, against the MLP, MLPUS, SMOTEBagging, SMOTEBoosting, and Under-Bagging methods, while it loses only once against each of the RUSBoost and MLPUS_Boost methods.
Apart from accuracy, which is not an adequate metric for dealing with imbalanced datasets, our proposed method has shown significant results on the Niger_Rice dataset with the other metrics: F-Measure, G-mean, and the area under the ROC. The F-measure results on the Niger_Rice dataset show that only one method is significantly better considering the p-value of 0.05, while our proposed method is considerably better than five of the methods at this p-value (see Table 5). Thus, the F-Measure results for the Niger_Rice dataset are 0.73, 0.82, 0.57, 0.76, 0.76, 0.55, 0.54, and 0.69, respectively, for our proposed method and the MLP classifier on the original imbalanced dataset, MLPUS, SMOTE_Bagging, SMOTE_Boost, Under-Bagging, RUS_Boost, and MLPUS_Boost methods.
Beyond the F-measure, the results of our method on the Niger_Rice dataset are much better for the G-Mean metric, with three methods significantly weak and no method better than our proposed method at the p-value of 0.05 (see Table 6). This result on the Niger_Rice dataset means that the true positive and true negative rates are well balanced with our proposed method. Thus, the G-mean results for the Niger_Rice dataset are 0.76, 0.59, 0.60, 0.77, 0.76, 0.60, 0.60, and 0.22, respectively, for our proposed method and the MLP classifier on the original imbalanced dataset, MLPUS, SMOTE_Bagging, SMOTE_Boost, Under-Bagging, RUS_Boost, and MLPUS_Boost methods.
These results are even better for the area under the ROC metric on the Niger_Rice dataset. Our approach provides better results than the other methods (see Figure 2). The area under the ROC curve results show that, whatever the threshold, the true positive rate against the false positive rate is significantly better with our method than with the others, except for the SMOTE_Bagging method. The ROC curve of our proposed method is far superior to the other curves; only the ROC curves of SMOTEBagging and SMOTEBoost are competitive with it.
Despite these promising results, our method has some limitations. The first is the reduced number of samples in the minority class: once this sample is reduced and selected, taking the square root of this reduced number can discard essential samples from both classes. Another drawback is choosing the best activation function for training the model; the sigmoid function was used in this experiment, and other activation functions will be explored in future experiments. The results also show that the imbalance ratio (IR) is not a factor that influences the model's training, as long as the minority-class samples are not overly reduced.

Conclusions
This paper has proposed a hybrid method combining under-sampling and ensemble learning to deal with the class imbalance problem on a real-life Niger_Rice dataset. The under-sampling step selects samples by evaluating them with the stochastic measure while training a Multilayer Perceptron. The ensemble methods used in this research are bagging and boosting. The proposed method, MLPUS_Bagging, aggregates the different training sets produced by under-sampling the original training set. To assess our method, we compared it with six other hybrid methods that combine preprocessing and ensemble methods, as well as with the combination of our MLPUS with boosting. Beyond the real-life Niger_Rice dataset, forty-four other datasets were used in this study to gauge the impact of our method on well-known imbalanced datasets. The results clearly show that, across the 45 imbalanced datasets, our method outperforms the other methods on F-measure, G-mean, and the ROC curve at a significance level of 0.05. On the Niger_Rice real-life dataset, our method achieves 75.6, 0.73, 0.76, and 0.86 for accuracy, F-measure, G-mean, and ROC, respectively.
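As a rough illustration of the undersample-then-bag scheme only: the sketch below uses plain random undersampling in place of the paper's stochastic-measure MLPUS selection, building one balanced subset per base learner (each subset would then train one MLP, with predictions aggregated by majority vote):

```python
import random
from collections import Counter

def under_bag_subsets(X, y, n_estimators, seed=0):
    """Build n_estimators balanced training subsets by randomly
    undersampling the majority class (label 0) down to the size of the
    minority class (label 1). Stand-in for MLPUS selection."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    subsets = []
    for _ in range(n_estimators):
        sampled = rng.sample(majority, len(minority))
        idx = minority + sampled
        subsets.append(([X[i] for i in idx], [y[i] for i in idx]))
    return subsets

# Toy imbalanced data: 2 minority vs 6 majority samples
X = list(range(8))
y = [1, 1, 0, 0, 0, 0, 0, 0]
for Xb, yb in under_bag_subsets(X, y, n_estimators=3):
    print(Counter(yb))  # each subset is balanced: two samples per class
```

Each base learner sees a different majority-class sample, which is what gives the bagged ensemble its diversity while keeping every training set balanced.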
In comparison, the MLP classifier on the original imbalanced Niger_Rice dataset gives 72.44, 0.82, 0.59, and 0.76 for accuracy, F-measure, G-mean, and ROC, respectively. Our hybrid method combining under-sampling and ensemble learning thus gave convincing results. In future work, we will study an oversampling method that likewise evaluates samples with the stochastic measure while training the Multilayer Perceptron, as well as hybrid methods combining that oversampling with ensemble learning. We will also explore how our proposed method can handle the multi-class imbalance problem.
