Classification is the problem of identifying which of a set of categories (sub-populations) an observation belongs to, on the basis of a training set of observations whose category membership is known. For example, suppose we have to predict whether a given electricity consumer belongs to consumer segment (class) 0, 1, 2, or 3, after training on a set of ECPB data. Solving this problem requires a classification technique. In this case, a classifier is used to predict the class label, that is, consumer segment 0, 1, 2, or 3, of each electricity consumer.

As a result of K-means clustering, a new attribute named consumer-segment is added to the data set. The data set now contains ConsumerID, normalized average quantity per month, normalized average number of bills per month, normalized area, normalized tariffNo, and consumer-segment. ECPB with the consumer-segment feature is therefore a labeled data set, and it is used to build the classification model. K-means is used to find clusters within the data set and to test how good the resulting cluster assignment is as a feature. The consumer-segment attribute takes the values 0, 1, 2, or 3, which refer to cluster 0, cluster 1, cluster 2, and cluster 3, respectively. This class attribute determines the consumer segment that a consumer belongs to. The data set is then divided into training and test data sets, and an SVM classifier is applied.

The classification task is as follows. Suppose we have to predict whether a given electricity consumer belongs to consumer segment 0, 1, 2, or 3 on the basis of four variables: the average electricity quantity consumed per month, the average number of prepaid bills per month, the area, and the tariff the consumer belongs to. These variables are called features. To solve this classification problem, we use a set of ECPB observations, called the training data set, which is prepared from the actual electricity prepaid bills together with the class labels produced by the K-means clustering algorithm. A model is trained on this set and used to predict whether a certain consumer belongs to consumer segment 0, 1, 2, or 3. The outcome therefore depends on how well the features map to the class label. The quality of the model is evaluated with statistical and mathematical measures that check to what extent the classifier generalizes the relationship between the features and the outcome: using the optimal parameter values obtained on the training data sets, the accuracy is examined on independent test data sets.
The classification accuracy is the ratio of correct predictions to the total number of predictions made. Applying the SVM classification method to the data sets yields a classification accuracy of 93.4%, which indicates that SVM performs well on this task. In other words, the SVM-based machine learning classifier applied to the consumer electricity prepaid bills data set is more than 93% accurate in predicting whether a consumer belongs to consumer segment 0, 1, 2, or 3.
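
The workflow described above (cluster, label, split, train, score) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and uses synthetic random data in place of the ECPB features; the RBF kernel, the 70/30 split, and the random seeds are assumptions made for illustration and are not stated in the text.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the four normalized features (quantity per month,
# bills per month, area, tariff). This is NOT the real ECPB data.
rng = np.random.default_rng(0)
X = rng.random((400, 4))

# Step 1: K-means assigns each consumer to one of four segments (0..3).
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
consumer_segment = kmeans.fit_predict(X)

# Step 2: split the labeled data into training and test sets
# (70/30 split is an assumption, not taken from the paper).
X_train, X_test, y_train, y_test = train_test_split(
    X, consumer_segment, test_size=0.3, random_state=0, stratify=consumer_segment
)

# Step 3: train an SVM on the training set and predict on the test set
# (the RBF kernel is an assumption).
classifier = SVC(kernel="rbf")
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Accuracy = correct predictions / total predictions.
accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {accuracy:.3f}")
```

Because the data here are random, the printed accuracy says nothing about the 93.4% reported for the real data set; the sketch only shows the order of the steps.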

Classification accuracy alone is not enough, because it can be misleading, particularly if the number of observations in each class is unequal or, as in our case, if there are more than two classes. The confusion matrix addresses this. It summarizes the prediction results of the classification algorithm and gives better information about how the classification model is performing and what types of errors it is making. The confusion matrix is therefore essential in our work, because it shows the performance of the model in more detail. The next section is a detailed discussion of classification performance.
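
A small, hypothetical example shows why accuracy can mislead when classes are unequal in size. The labels below are invented for illustration: a trivial predictor that always outputs the majority class reaches high accuracy while finding none of the minority-class observations.

```python
# Hypothetical labels: 95 observations of class 0 and 5 of class 1.
y_true = [0] * 95 + [1] * 5
# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy = correct predictions / total predictions.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Fraction of class 1 observations that were actually predicted as class 1.
class1_total = sum(t == 1 for t in y_true)
class1_hits = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
class1_found = class1_hits / class1_total

print(accuracy)      # 0.95
print(class1_found)  # 0.0
```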

#### 6.5.2. Discussion of Confusion Matrix

The results of the classification model are shown in Table 3, which gives the confusion matrix after applying the SVM classifier. It is a summary of prediction results for the problem of predicting whether a certain electricity consumer belongs to consumer segment (class) 0, 1, 2, or 3, so it is a four-class confusion matrix. This matrix helps us to understand the types of errors that occur on the training and test data sets. The confusion matrix, $cm$, is a 4 × 4 matrix. Its rows and columns refer to the ground truth and predicted class labels of the data set, respectively. In other words, each element $cm_{ij}$ is the number of observations of class $i$ that were assigned to class $j$ by the SVM classification method. For instance, the number of observations of class 1 that were assigned to class 0 by the SVM classifier is 104. The diagonal of the confusion matrix gives the correct classification decisions ($i = j$). The counts in the matrix show the number of correct and incorrect predictions, broken down by class. They show the ways in which the classification model is confused when it makes predictions, and give insight not only into the errors made by the classifier but also into the types of errors. The confusion matrix is an excellent choice for reporting results in four-class classification problems because it makes the relations between the classifier outputs and the true labels observable.

In a two-class confusion matrix, it is easy to identify the four possible results: true positive (TP), false positive (FP), false negative (FN), and true negative (TN), that is, correctly classified or predicted, incorrectly classified or predicted (type I error), incorrectly rejected (type II error), and correctly rejected, respectively. In a four-class confusion matrix, each element $cm_{ij}$, where $i$ is the row identifier and $j$ is the column identifier, refers to the cases belonging to class $i$ that were classified as class $j$. The total numbers of true positives (TTP), false positives (TFP), false negatives (TFN), and true negatives (TTN) for each class $i$ ($i = 0, 1, 2, 3$) are calculated as follows.
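
With rows as true classes and columns as predicted classes, and consistent with the worked sums given for each total, these quantities can be written as below. This is a reconstruction from the surrounding definitions; the summation indices $j$ and $k$ are chosen here for illustration.

$$TTP_{all} = \sum_{i=0}^{3} cm_{ii}$$

$$TFP_{i} = \sum_{j \neq i} cm_{ji}$$

$$TFN_{i} = \sum_{j \neq i} cm_{ij}$$

$$TTN_{i} = \sum_{j \neq i} \sum_{k \neq i} cm_{jk}$$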

In our case $TTP_{all} = 1351 + 867 + 1940 + 1871 = 6029$. That is, the samples were correctly classified or predicted 6029 times in total.

In our case $TFP_{2} = 32 + 0 + 11 = 43$. That is, 43 observations that do not belong to class 2 were classified or predicted as class 2.

In our case $TFN_{1} = 0 + 0 + 0 = 0$. That is, no class 1 observation was classified or predicted as a class other than class 1.

In our case $TTN_{1} = 1351 + 32 + 26 + 94 + 1940 + 9 + 153 + 11 + 1871 = 5487$. That is, 5487 observations that do not belong to class 1 were not classified or predicted as class 1.
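
The four totals can also be computed programmatically. The following is a minimal sketch, assuming the row = true class, column = predicted class convention; the helper names (`ttp_all`, `tfp`, `tfn`, `ttn`) and the matrix values are hypothetical and are not the values of Table 3.

```python
# Hypothetical 4 x 4 confusion matrix (rows: true class, columns: predicted class).
# These counts are illustrative only and are NOT the values of Table 3.
cm = [
    [50, 2, 3, 1],
    [0, 40, 0, 0],
    [4, 1, 60, 2],
    [5, 0, 1, 70],
]

def ttp_all(cm):
    """Total true positives: the sum of the diagonal."""
    return sum(cm[i][i] for i in range(len(cm)))

def tfp(cm, i):
    """False positives of class i: column i without the diagonal entry."""
    return sum(cm[j][i] for j in range(len(cm)) if j != i)

def tfn(cm, i):
    """False negatives of class i: row i without the diagonal entry."""
    return sum(cm[i][j] for j in range(len(cm)) if j != i)

def ttn(cm, i):
    """True negatives of class i: every entry outside row i and column i."""
    return sum(
        cm[j][k]
        for j in range(len(cm))
        for k in range(len(cm))
        if j != i and k != i
    )

total = sum(sum(row) for row in cm)

print(ttp_all(cm))  # 220
print(tfp(cm, 2))   # 4
print(tfn(cm, 1))   # 0
print(ttn(cm, 1))   # 196
```

For any class, the four quantities partition the data: the diagonal entry plus `tfp`, `tfn`, and `ttn` for that class add up to the total number of cases, which is a convenient sanity check.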

In our case the total number of cases is 6458. To evaluate the overall accuracy of the classifier:
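
The overall accuracy is the number of correctly classified cases divided by the total number of cases. Written out with the values reported in this section (the symbol $N$ for the total number of cases is introduced here for convenience):

$$\text{Overall accuracy} = \frac{TTP_{all}}{N} = \frac{6029}{6458} \approx 0.9336$$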

In our case the overall accuracy of the SVM classifier is 93.4%. The overall classification error is one minus the overall accuracy, which is 6.6%.

The overall accuracy characterizes the classifier as a whole. There are three class-specific measures that describe how well the SVM classifier performs on each class. Firstly, the class recall, $R(i)$, is the proportion of data with true class label $i$ that were correctly assigned to class $i$. In other words, out of all observations that truly belong to class $i$, how many did we predict correctly? It should be as high as possible.
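
In terms of the confusion matrix entries, this definition corresponds to the standard recall formula. This is a reconstruction; $TTP_{i} = cm_{ii}$ denotes the diagonal entry for class $i$, and the summation index $j$ is chosen here.

$$R(i) = \frac{cm_{ii}}{\sum_{j=0}^{3} cm_{ij}} = \frac{TTP_{i}}{TTP_{i} + TFN_{i}}$$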

In our case the recall of class 0, that is, the proportion of class 0 observations correctly assigned to class 0, is 0.99 (99%). Secondly, the class precision, $P(i)$, is the fraction of observations correctly classified to class $i$ out of the total number of observations classified to that class. In other words, out of all observations we predicted as class $i$, how many actually belong to class $i$?
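
In terms of the confusion matrix entries, this definition corresponds to the standard precision formula. This is a reconstruction; $TTP_{i} = cm_{ii}$ denotes the diagonal entry for class $i$, and the summation index $j$ is chosen here.

$$P(i) = \frac{cm_{ii}}{\sum_{j=0}^{3} cm_{ji}} = \frac{TTP_{i}}{TTP_{i} + TFP_{i}}$$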

In our case the precision of class 2, that is, the fraction of observations correctly classified to class 2 out of all observations classified to class 2, is 0.993 (99.3%). Thirdly, the class specificity, $S(i)$, answers the question: out of all observations that do not belong to class $i$, how many were correctly not assigned to class $i$?
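
Specificity is commonly defined as $S(i) = TTN_{i} / (TTN_{i} + TFP_{i})$. The three class-specific measures can be computed together from a confusion matrix, as in the minimal sketch below. It assumes rows are true classes and columns are predicted classes; the function name `class_metrics` and the matrix values are hypothetical and are not the values of Table 3.

```python
# Hypothetical 4 x 4 confusion matrix (rows: true class, columns: predicted class).
# These counts are illustrative only and are NOT the values of Table 3.
cm = [
    [50, 2, 3, 1],
    [0, 40, 0, 0],
    [4, 1, 60, 2],
    [5, 0, 1, 70],
]

def class_metrics(cm, i):
    """Return (recall, precision, specificity) for class i.

    Assumes class i has at least one true and one predicted observation,
    otherwise a division by zero occurs.
    """
    n = len(cm)
    tp = cm[i][i]
    fn = sum(cm[i][j] for j in range(n) if j != i)  # rest of row i
    fp = sum(cm[j][i] for j in range(n) if j != i)  # rest of column i
    total = sum(sum(row) for row in cm)
    tn = total - tp - fn - fp                       # everything else
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    return recall, precision, specificity

for i in range(len(cm)):
    r, p, s = class_metrics(cm, i)
    print(f"class {i}: R={r:.3f} P={p:.3f} S={s:.3f}")
```

Computing the true negatives as the total minus the other three counts avoids a double loop and gives the same result as summing every entry outside row $i$ and column $i$.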

Finally, we conclude that the SVM classifier can be used to predict, with a high degree of accuracy, whether a certain electricity consumer belongs to consumer segment (class) 0, 1, 2, or 3.