3.2.1. Creating a Database
During signal analysis, each signal was divided into sequences of 100,000 elements. For each sequence, the mean value was calculated for the signals coming from the accelerometer and the microphone, and the mean absolute value for the signals from the current transformer. This process produced new sequences representing the averages of the individual signal channels.
The data set contains 3738 records.
Table 2 presents the distribution of the number of cases for each layer in this data set.
The differences in the number of cases between layers reflect the variation in samples from different stages of the process, which is an important aspect in terms of analyzing the signals and understanding their characteristics depending on the material layer.
In order to develop an effective classifier for identifying the layer of material being processed, the data set was divided into a training set, comprising 80% of the cases, and a test set, comprising the remaining 20%. The test set consists of 748 instances.
Let $D = \{(x_i, y_i)\}_{i=1}^{n}$ denote the data set. The elements of each vector $x_i$ are the values of basic statistics, such as the mean value and standard deviation, determined on the basis of 10 s readings from the sensors. The labels $y_i$ belong to one of the four classes (Layer 1–Layer 4), and the vectors $x_i$ are the predictors for the classification models. Two methods were used to construct the classifier for identifying the layer of processed material: logistic regression (Section 3.2.2) and a gradient boosting classifier (Section 3.2.3).
These classification models are designed to recognize layers of processed material based on signal analysis. Their effectiveness is assessed on the test set. This approach supports a better understanding of the machining process and improves the precision of identifying material layers, which is a key element of optimizing production processes.
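The averaging step described above can be sketched as follows. This is a minimal illustration assuming each channel is stored as one long 1-D array; the variable names and synthetic signals are assumptions for the example, not the paper's data.

```python
import numpy as np

SEQ_LEN = 100_000  # sequence length used in the paper

# Synthetic stand-ins for the recorded channels (illustrative only).
rng = np.random.default_rng(0)
accel = rng.normal(size=SEQ_LEN * 3)     # accelerometer-like signal
current = rng.normal(size=SEQ_LEN * 3)   # current-transformer-like signal

def sequence_means(signal, seq_len=SEQ_LEN, absolute=False):
    """Split a signal into consecutive seq_len-element sequences and
    average each one; absolute=True reproduces the mean absolute values
    used for the current-transformer channel."""
    n_seq = len(signal) // seq_len
    chunks = signal[: n_seq * seq_len].reshape(n_seq, seq_len)
    if absolute:
        chunks = np.abs(chunks)
    return chunks.mean(axis=1)

accel_means = sequence_means(accel)                     # mean values
current_means = sequence_means(current, absolute=True)  # mean absolute values
```

Each call collapses every 100,000-sample sequence to a single summary value, yielding one row of the feature table per sequence and channel.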
3.2.2. Logistic Regression
Let $(\Omega, \mathcal{F}, P)$ be the probabilistic space, and let $Y$ be a random variable taking values in the set of classes $\{1, 2, 3, 4\}$. Logistic regression describes the distribution of the probability of realizing the random variable $Y$ based on the realization of the independent variables $x = (x_1, \ldots, x_m)$ [29,30]. To apply logistic regression for each of the classes $k = 1, \ldots, 4$, modify the training set to $D_k = \{(x_i, z_i)\}_{i=1}^{n}$, where $z_i = 1$ if $y_i = k$ and $z_i = 0$ otherwise, and the value $P(Y = k \mid x)/(1 - P(Y = k \mid x))$ means the odds of class $k$.

In logistic regression, we analyze the linear dependence of the logarithm of the odds based on the realization $x$:

$$\ln \frac{P(Y = k \mid x)}{1 - P(Y = k \mid x)} = \beta_0^{(k)} + \sum_{j=1}^{m} \beta_j^{(k)} x_j + \varepsilon, \quad (3)$$

where $\varepsilon$ is a random variable with a normal distribution $N(0, \sigma^2)$ and $\beta^{(k)} = \left(\beta_0^{(k)}, \beta_1^{(k)}, \ldots, \beta_m^{(k)}\right)$ is the vector of structural parameters. From Formula (3), we have:

$$P(Y = k \mid x) = \frac{\exp\left(\beta_0^{(k)} + \sum_{j=1}^{m} \beta_j^{(k)} x_j\right)}{1 + \exp\left(\beta_0^{(k)} + \sum_{j=1}^{m} \beta_j^{(k)} x_j\right)}. \quad (4)$$

For each class $k$, we solve the task

$$\max_{\beta^{(k)}} L\left(\beta^{(k)}\right), \quad (5)$$

where the likelihood function $L$ is given by:

$$L\left(\beta^{(k)}\right) = \prod_{i=1}^{n} p_i^{z_i} (1 - p_i)^{1 - z_i}, \qquad p_i = P(Y = k \mid x_i). \quad (6)$$

To estimate the structural parameters of (3) for the $k$th class, task (5) is replaced by an auxiliary task:

$$\max_{\beta^{(k)}} \ln L\left(\beta^{(k)}\right) = \max_{\beta^{(k)}} \sum_{i=1}^{n} \left[ z_i \ln p_i + (1 - z_i) \ln (1 - p_i) \right], \quad (7)$$

where $\ln L$ denotes the log-likelihood function. When the predictors are collinear, then ELASTICNET regularization [31,32] is additionally used. The parameters of model (7) are then determined by solving the problem:

$$\min_{\beta^{(k)}} \left\{ -\frac{1}{n} \ln L\left(\beta^{(k)}\right) + \lambda P_{\alpha}\left(\beta^{(k)}\right) \right\}, \quad (8)$$

where $\alpha \in [0, 1]$, $\lambda \geq 0$, and $P_{\alpha}$ means the penalty given by the formula:

$$P_{\alpha}\left(\beta^{(k)}\right) = \sum_{j=1}^{m} \left[ \frac{1}{2}(1 - \alpha)\left(\beta_j^{(k)}\right)^2 + \alpha \left|\beta_j^{(k)}\right| \right]. \quad (9)$$

For each class, the parameters $\hat{\beta}_0^{(k)}, \hat{\beta}_1^{(k)}, \ldots, \hat{\beta}_m^{(k)}$ were estimated. For the realization of independent variables $x$, the probability of belonging to the $k$th class was assessed as follows:

$$\hat{P}(Y = k \mid x) = \frac{\exp\left(\hat{\beta}_0^{(k)} + \sum_{j=1}^{m} \hat{\beta}_j^{(k)} x_j\right)}{1 + \exp\left(\hat{\beta}_0^{(k)} + \sum_{j=1}^{m} \hat{\beta}_j^{(k)} x_j\right)}. \quad (10)$$
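This per-class (one-vs-rest) estimation scheme can be sketched with scikit-learn's elastic-net logistic regression. The synthetic data and the hyperparameter values (`l1_ratio`, playing the role of $\alpha$, and `C`, the inverse regularization strength) are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the four-layer sensor-feature data set.
X, y = make_classification(n_samples=600, n_features=8, n_informative=6,
                           n_classes=4, random_state=0)

# One binary elastic-net logistic model per class: z_i = 1 when y_i = k.
models = {}
for k in range(4):
    z = (y == k).astype(int)
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=0.5, C=1.0, max_iter=5000)
    models[k] = clf.fit(X, z)

# Assess class membership from the per-class probabilities and pick
# the class with the largest estimated probability.
proba = np.column_stack([models[k].predict_proba(X)[:, 1] for k in range(4)])
pred = proba.argmax(axis=1)
```

Taking the argmax over the four per-class probabilities converts the four binary models into a single four-class layer predictor.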
The results of the analysis of the use of logistic regression for the test set were presented using a confusion matrix and the overall accuracy of the classifier (
Table 3). The overall accuracy of the model is 90.78%. The classifier successfully identifies Layer 1, achieving 254 correct predictions and only seven errors. However, as the number of layers increases, errors increase, especially for Layer 2 and Layer 3, which may require further optimization of the model. It is worth paying attention to cases where errors occur, which may suggest difficulties in distinguishing between certain classes.
Additionally, characteristics were estimated for each class (
Table 4). The analysis of the logistic regression classifier’s recognition characteristics demonstrates the effectiveness of the model in classifying individual layers. High values for sensitivity (Recall) indicate that the model effectively identifies positive cases in all layers. Specificity is also high, indicating effective detection of negative cases. The Precision (positive predictive values—Pos Pred Value) exhibits variability, with the highest being seen in Layer 1 and the lowest in Layer 4. Furthermore, the negative predictive values (Neg Pred Value) generally remain high, particularly for Layer 4. Precision and Recall demonstrate satisfactory outcomes, although there is some variation in F1 score among the classes. Prevalence denotes the distribution of categories in the training set, with Layer 1 and Layer 2 being predominant. Balanced Accuracy persists at a high level across all layers.
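All of the per-class characteristics discussed above can be derived directly from the confusion matrix. The sketch below uses an illustrative 4×4 matrix summing to the 748 test instances; the cell counts are assumptions for the example, not the paper's exact values.

```python
import numpy as np

# Illustrative confusion matrix (rows: true layer, cols: predicted layer).
cm = np.array([[254,   5,   2,   0],
               [  6, 180,  10,   4],
               [  1,  12, 150,   9],
               [  0,   3,   8, 104]])

def class_characteristics(cm, k):
    """Sensitivity, specificity, precision, NPV, and balanced accuracy
    for class k, treating class k as positive and the rest as negative."""
    tp = cm[k, k]
    fn = cm[k].sum() - tp          # true k predicted as another class
    fp = cm[:, k].sum() - tp       # other classes predicted as k
    tn = cm.sum() - tp - fn - fp
    sens = tp / (tp + fn)          # Recall
    spec = tn / (tn + fp)
    return {"Sensitivity": sens,
            "Specificity": spec,
            "PosPredValue": tp / (tp + fp),
            "NegPredValue": tn / (tn + fn),
            "BalancedAccuracy": (sens + spec) / 2}

stats = {k: class_characteristics(cm, k) for k in range(4)}
accuracy = np.trace(cm) / cm.sum()   # overall accuracy
```

Balanced Accuracy is simply the mean of sensitivity and specificity, which is why it stays high even for the less prevalent layers.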
Additionally, a ROC curve was determined for each class, and AUC values were estimated (
Figure 15), with the best results obtained for Layer 1—AUC = 0.9923.
ROC curves and AUC values confirm the overall quality of the classifier in distinguishing between classes. This analysis is an important tool for assessing the effectiveness of the model and indicates areas that may require further optimization [
33].
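The per-layer AUC has a useful interpretation: it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative one (the Mann-Whitney formulation). A minimal one-vs-rest sketch, with made-up scores rather than the model's outputs, is:

```python
import numpy as np

def auc_ovr(scores, labels):
    """One-vs-rest AUC via the Mann-Whitney formulation: the fraction of
    positive/negative pairs in which the positive case outranks the
    negative one (ties count 1/2)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Illustrative scores for one layer treated as the positive class.
scores = np.array([0.95, 0.80, 0.70, 0.40, 0.20, 0.10])
labels = np.array([1, 1, 0, 1, 0, 0])
auc = auc_ovr(scores, labels)  # 8 of 9 pairs correctly ordered -> 8/9
```

An AUC of 1.0 therefore corresponds to perfect separation of the layer from the remaining classes, as nearly achieved for Layer 1.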
3.2.3. Gradient Boosting Classifier
Boosting is one of the ensemble learning methods. It was originally developed for the classification problem. The idea of the boosting method consists in combining a set of 'weak' classifiers to produce a 'powerful' classifier [34,35].
Thus, the gradient boosting classifier [31] relies on the definition of a sequence of trees $T_1, \ldots, T_M$, where the classification-boosted model based on trees is created as follows:

$$f_M(x) = \sum_{m=1}^{M} T_m(x; \Theta_m). \quad (11)$$

To identify the layer of processed material, we determine the boosted model $f_M$ based on the data set $D = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^p$ but $y_i \in \{1, 2, 3, 4\}$ for $i = 1, \ldots, n$. The boosting trees (11) are determined by applying a forward stagewise procedure. The classification tree $T_m$ is forced to concentrate on observations that are misclassified by the boosted model $f_{m-1}$. In every step $m$, the tree is defined as follows:

$$T_m(x; \Theta_m) = \sum_{j=1}^{J_m} \gamma_{jm} \, \mathbb{1}(x \in R_{jm}), \quad (12)$$

where $\Theta_m = \{R_{jm}, \gamma_{jm}\}_{j=1}^{J_m}$ denotes a set of separable regions $R_{1m}, \ldots, R_{J_m m}$ and the constants assigned to them. From the above, the sequence $\Theta_m$ identifies the parameters of the $m$th classification tree. For the $m$th step ($m = 1, \ldots, M$), the parameters of tree (12) were estimated by the solution of the task:

$$\hat{\Theta}_m = \arg\min_{\Theta_m} \sum_{i=1}^{n} L\big(y_i, f_{m-1}(x_i) + T_m(x_i; \Theta_m)\big), \quad (13)$$

where $L$ denotes the loss function, and $f_m(x) = f_{m-1}(x) + T_m(x; \Theta_m)$, $f_0(x) = 0$. In this case, the K-class exponential loss function was used [36].
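A boosted-tree classifier of this general form can be sketched with scikit-learn's `GradientBoostingClassifier`. The data and hyperparameters below are illustrative assumptions; note also that scikit-learn fits multiclass boosting by minimizing the multinomial deviance, not the K-class exponential loss cited in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the four-layer sensor-feature data set.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=6,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0)

# Forward-stagewise boosted trees: n_estimators plays the role of M,
# and max_depth bounds the number of regions in each tree.
gbc = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbc.fit(X_tr, y_tr)
acc = gbc.score(X_te, y_te)  # overall accuracy on the held-out 20%
```

Each added tree fits the current residual structure of the ensemble, which is the stagewise refit described above.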
The results of the analysis of the application of the gradient boosting classifier for the test set are presented in the confusion matrix (
Table 5).
The values in each cell represent the number of cases assigned to the class. For example, the number 261 in the first row and first column means that the classifier correctly predicted Layer 1 for 261 of the true Layer 1 instances.
The overall accuracy of the classifier on the test set is 97.46%. The classifier is effective in discriminating between the different layers, especially Layer 1 and Layer 2, where it achieves very high accuracy. It is worth noting that the absence of errors for Layer 1 may suggest that this class is relatively easy for the classifier to recognize.
However, it is also important to note individual error cases, such as the seven cases in which Layer 3 was misclassified as Layer 4. Analysis of these cases may be important to understand the specifics of the classifier’s errors and potential areas for further optimization.
The obtained results suggest that the gradient boosting classifier performs well in classifying the different layers, offering high overall accuracy, but it is worth analyzing the error cases in detail for possible model improvement.
Additionally, characteristics were estimated for each class, and their values are presented in
Table 6.
Analysis of the results for the gradient boosting classifier reveals the classifier’s excellent ability to effectively identify individual classes. The sensitivity (Recall) for each of the layers (Layer 1–4) remains at an impressive level of over 95%, demonstrating the classifier’s ability to effectively identify instances of a particular class from among all actual instances of that class.
Specificity for most layers also remains at a very high level, demonstrating the ability of the classifier to correctly detect cases that do not belong to a given class among all cases that do not belong to that class.
The Precision (Pos Pred Value) for most of the layers approaches 1, indicating effective detection of true positive cases. The Neg Pred Value also remains high, indicating that cases classified as negative rarely turn out to be false negatives.
Comparing the layers, it can be seen that Recall, Specificity, and Precision are highest for Layer 1, suggesting that for this class, the classifier achieves the highest performance. Layer 3 achieves the lowest Precision, but it still maintains a satisfactory level.
Balanced Accuracy for each layer remains very high, confirming the overall effectiveness of the classifier in the context of the balanced classification problem.
Additionally, as for the previous model, a ROC curve was determined for each class, and the AUC value was estimated (
Figure 16).
Comparing the recognition of layers, the gradient boosting classifier (97.46%) has a better overall accuracy on the test set than the logistic regression (90.78%). Precision (Pos Pred Value—the proportion of correctly recognized positive cases among all positively classified cases) and Recall (sensitivity—the proportion of correctly recognized positive cases among all actual positive cases) describe the accuracy of recognizing the individual layers (treated as positive cases).
Moreover, comparing the Precision values, it can be seen that regardless of the layer, better results were obtained for gradient boosting classifier. A new cutter was used for Layer 1, and the cases for Layer 1 were accurately recognized by the gradient boosting classifier (i.e., Precision = 1). For logistic regression, Precision is 0.9732. For the remaining layers, the Precision in the gradient boosting classifier is at least 0.8, whereas for logistic regression, it is less than 0.66 for Layer 4.
The Recall for the gradient boosting classifier is at least 0.95 regardless of the layer, whereas for logistic regression, it is below 0.95. Analyzing the area under the ROC curve, for the gradient boosting classifier it is above 0.99 regardless of the layer, whereas for logistic regression, it exceeds 0.99 only for Layer 1.
Therefore, analyzing the above layer detection characteristics (where the layer corresponds to the degree of cutter wear), the gradient boosting classifier identifies the cutter condition accurately.