1. Introduction
The tunnel boring machine (TBM), with its advantages of high efficiency and safety, has recently been widely used in long and deep tunnels. Accurate and valid investigation of the geological condition of the surrounding rock is essential for selecting proper tunneling parameters to ensure tunneling safety and efficiency, and it has become a critical research focus in tunnel construction. However, due to the constraints imposed by the intricate mechanical structure of the TBM and the confined working space, conventional in situ testing techniques for rock mass parameters, which are readily applicable to surface excavations, face significant challenges in underground environments. Consequently, numerous scholars have directed their efforts towards developing suitable methodologies for rock mass parameter determination. Naeimipour et al. devised a Rock Strength Boring Probe (RSBP) to acquire geological data, including uniaxial compressive strength and tensile strength, by analyzing the scratch depth measurements induced by the probe on the rock surface [1]. Wang et al. developed a True Triaxial Rock Drilling Test System (TRD) capable of characterizing rock mass parameters under various lithological conditions [2]. In addition, Goh et al. employed the Spectral Analysis of Surface Waves (SASW) method to analyze shear wave velocity for assessing the Rock Quality Designation (RQD) [3]. Furthermore, Kong et al. [4] and Liu et al. [5] utilized point load test results to evaluate rock compressive strength.
The aforementioned studies provide valuable frameworks for acquiring geological data and assessing rock mass conditions within TBM-driven tunnels. Nevertheless, a common limitation among these approaches is the time-intensive nature of testing and analytical procedures, which inherently leads to latency in geological data acquisition. This makes it impractical to obtain data in real time at the pace of TBM excavation. To address this issue, establishing a correlation between real-time TBM tunneling data and rock mass parameters through statistical methodologies, including traditional regression analysis and data mining techniques, offers a viable solution for overcoming these latency constraints. Such research typically employs diverse datasets, obtained either from field measurements or from laboratory tests, as training inputs, with the known values of the prediction or evaluation targets serving as the outputs. Through regression or data mining algorithms, the functional relationships between inputs and outputs are established and subsequently utilized as the basis for evaluation. When newly acquired data are fed into these models, the generated outputs represent the predicted values of the target parameters.
Mikaeil et al. [6], Hassanpour et al. [7], Samaei et al. [8], Nelson et al. [9], Grima et al. [10], and Entacher et al. [11] proposed models to characterize the evolutionary relationship between rock mass properties and TBM tunneling parameters. Concurrently, advancements in computer technology have facilitated the integration of machine learning approaches, which exhibit exceptional capacity for handling regression problems involving large-scale datasets and complex nonlinear patterns. Specifically, Armaghani et al. [12], Zare et al. [13,14], Mahdevari et al. [15], Liu et al. [16], Yagiz et al. [17], and Minh et al. [18] applied various machine learning algorithms, such as artificial neural networks, particle swarm optimization, fuzzy logic, and gene expression programming, to construct rock–machine interaction models, yielding favorable predictive outcomes. In the field of rock mass parameter prediction, machine learning algorithms are increasingly replacing traditional regression.
To facilitate their application in actual tunnel projects, researchers have proposed several rock mass classification methods, such as Q_TBM, RMR, RME, and GSI, which have been verified in actual tunnel projects and demonstrated effective practical applications [19,20,21]. Among them, Hydropower Classification (HC) is a widely used rock mass classification method, particularly in the construction of hydraulic tunnel projects [20]. The main factors considered by HC are the uniaxial compressive strength and the integrity index of the surrounding rock. In addition, the discontinuity of the structural plane, the attitude of the major discontinuity plane, and groundwater conditions are also used to modify the classification results.
In essence, rock mass classification methods, including the HC method, are effective indexes to characterize the rock mass parameters, and their prediction belongs to the classification problem in machine learning, which is different from the prediction of rock mass parameters. Supervised classifiers, such as support vector machines (SVMs) and artificial neural networks (ANNs), are widely used to solve this kind of task. In contrast, unsupervised algorithms, such as K-means clustering and fuzzy C-means clustering, only group samples into unlabeled categories and therefore cannot be directly used in this task [22,23].
To bridge the gap between traditional unsupervised algorithms and supervised geotechnical tasks, this paper introduces the Spearman-Weighted Supervised Prototype Classifier (SW-SPC), an application-specific supervised prototype optimization approach, for evaluating the surrounding rock HC in TBM tunnels. Specifically, the real-time TBM tunneling parameters are utilized as input variables, and partial HC is the output. In the training set, the Spearman’s correlations between each input and the output are calculated and used as the weight of the corresponding input to measure distances between samples. On the basis of the weighted distance, samples are grouped into multiple prototype-defined clusters. Further, a multiple regression of the HC with the input of tunneling data is proposed to evaluate the expected HC values of the prototypes of categories and assign labels to them.
When developing SW-SPC based on evaluating HC, improvements are mainly made in two aspects. Firstly, in supervised tasks, the Spearman’s correlation between input and output is known by sample labels, which has a positive effect on the evaluating accuracy according to validation by field samples. Then, assigning labels to clustering categories by multiple regression effectively integrates the discriminative power of supervised learning into the prototype-based classification framework.
2. Data Collection
2.1. Original Data
This research is conducted based on a hydraulic tunnel project located in Northeastern China. The length of the tunnel is 23 km, and the main landforms consist of valleys and hills. The dominant lithologies in the project area are granite and limestone, which occupy about 38% and 30% of the total length of the tunnel, respectively. Fractures in the surrounding rock are relatively well developed, and the surrounding rock primarily consists of class III and IV rock masses. Especially in the limestone area, groundwater is abundant, and frequent seepage occurs on the tunnel face.
The tunnel is excavated by an open-type TBM, whose cutterhead is 3.95 m in radius. The TBM is equipped with 56 cutters with diameters of 19 inches, and the cutter spacing is about 84 mm. During tunneling, nearly 200 tunneling parameters are recorded by the acquisition system of the TBM, including mechanical, electrical, and hydraulic data. The sampling frequency of these data is 1 Hz. In addition, the HC is recorded manually for every tunneling cycle. The tunneling data and HC constitute the original dataset for this research.
2.2. Division of the Dataset
Due to the complexity of the original data, they cannot be directly used to establish the HC prediction model. Firstly, the nearly 200 features of the tunneling data should be screened. In this research, redundant tunneling features that are not closely related to the results are removed [16,24], and only five features, namely the revolutions per minute of the cutterhead, the torque, the cutterhead power, the penetration, and the thrust, are used to implement the SW-SPC model. These five parameters are directly related to the tunneling load or the energy for rock-breaking and are commonly regarded as the main controlling parameters in TBM tunneling research [25,26]. Previous research investigated the relationship between the rock mass and tunneling parameters using statistical methods, such as principal component analysis (PCA), and indicated that variables directly related to the tunneling process have a higher correlation than other tunneling parameters, such as electrical and hydraulic parameters [27].
In addition, the project comprises thousands of tunneling cycles, corresponding to thousands of samples, which is computationally prohibitive for the prototype optimization process. In this research, adjacent tunneling cycles with the same HC are considered as homogeneous segments and are merged into a single representative sample. In other words, a continuous tunneling section with the same HC, which consists of multiple tunneling cycles, is regarded as one sample.
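The merging of adjacent cycles described above can be sketched as a run-length grouping. This is a minimal illustration and not the authors' code: the function name `merge_homogeneous_cycles` and the ten-cycle HC sequence are hypothetical.

```python
from itertools import groupby


def merge_homogeneous_cycles(cycle_classes):
    """Collapse consecutive cycles sharing one HC into segments.

    Returns a list of (hc, first_cycle_index, last_cycle_index) tuples.
    """
    segments = []
    start = 0
    for hc, run in groupby(cycle_classes):
        length = len(list(run))
        segments.append((hc, start, start + length - 1))
        start += length
    return segments


# Hypothetical HC record for ten consecutive tunneling cycles.
cycles = ["III", "III", "III", "IV", "IV", "III", "V", "V", "V", "V"]
segments = merge_homogeneous_cycles(cycles)
print(segments)
```

Each returned segment would then become one sample whose features are averaged over the cycles it spans.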
By merging homogenized samples, thousands of tunneling cycles are transformed into a total of 275 samples. The 275 samples are divided into two sub-datasets. The first part includes 200 samples, making up the training set to establish the SW-SPC-based prediction model of HC. The other 75 samples make up the testing set to validate the trained model. Generally, the distribution of the training and testing samples should be different. The proportions of the different HCs of the surrounding rock in the training and testing sets are shown in
Figure 1.
3. Data Pretreatment
The inherent discrepancy between rock mass characteristics and TBM tunneling data poses a significant challenge in the field of predicting rock mass parameters or rock classification. In detail, a tunneling area with a rock classification value might correspond to tens of thousands of tunneling data points. Directly using the average values to represent the multiple tunneling data points is unreasonable, because the tunneling data recorded by the TBM acquisition system includes invalid data, such as the tunneling data in stoppage and trial tunneling, which always leads to an underestimation of the tunneling parameters. Therefore, the key of data pretreatment is to distinguish the invalid tunneling data generated by the TBM stoppage or trial tunneling. Currently, there is no universal distinguishing standard, and tunneling data pretreatment is often based on subjective experience.
Take the selected tunneling features recorded from 0:00:00 to 18:00:00 on 20 October 2015 as an example, which are listed in
Figure 2a. As shown in the data, during most of the listed time segment, such as the segment from 0:00:00 to 5:09:58, all the selected features are zero. These data are obviously from the TBM stoppage, and they should be removed in data pretreatment. In addition, there are short segments, in which only the cutterhead power and torque have positive values, and the other three tunneling data features are still zero, such as from 5:09:59 to 5:10:11. Meanwhile, in this kind of segment, the cutterhead power and torque always maintain low values which are obviously lower than during a normal tunneling cycle. In these segments, the TBM cutterhead is always idling, and it is not in contact with the tunnel face, which explains why the thrust and penetration rate remain at zero. This kind of segment is called trial tunneling, and its tunneling data should also be removed in data pretreatment.
The above sections represent obvious cases of data exclusion in pretreatment. However, in a whole tunneling cycle, there is still a portion of the data that should be removed. Take the tunneling cycle from 13:30:00 to 13:48:40 as an example, which is shown in
Figure 2b. In this tunneling cycle, the tunneling data can be clearly divided into three stages. Firstly, in the stage from 13:30:00 to 13:34:16, the tunneling data sharply increase from 0 to their maximum, which is called the increasing stage. Then, in the stage from 13:34:17 to 13:46:36, the tunneling data are relatively stable and fluctuate near their maximum, which is called the stable stage. Finally, in the stage from 13:46:37 to 13:48:40, the tunneling data decrease from the maximum to 0, which is called the decreasing stage. Previous research indicates that the trends of tunneling data in the increasing or decreasing stage are influenced not only by the rock mass condition, but also by the operating habits of the TBM workers. Therefore, in data pretreatment, only the tunneling data in the stable stage should be retained [28].
According to the above analysis, the target of data pretreatment is to exclude tunneling data from stoppage, trial tunneling, and the increasing/decreasing stages, while utilizing stable-stage data to construct the final tunneling dataset.
Based on the characteristics of the tunneling data, a total of three steps were conducted to pretreat the tunneling data and build the tunneling dataset. Firstly, the tunneling data in stoppage and trial tunneling stages were removed. Then, the tunneling data in the increasing and decreasing stages were removed, and the tunneling data in the stable stage were screened. Finally, for each sample, the average value of tunneling data within the corresponding mileage was used as the tunneling feature.
These three steps are a general method to handle TBM tunneling data in the field of predicting rock mass parameters. Among them, the first and last steps, namely removing the stoppage data and calculating the average value of the tunneling data, are relatively simple. In contrast, there is no widely recognized criterion for screening tunneling data in the stable stage, because data trends differ between TBM specifications. In this research, the penetration rate, defined as the product of penetration and rotation speed, is adopted as the criterion for screening the stable stage. In each tunneling cycle, when the penetration rate stays higher than 10 mm/min for a continuous duration of 10 s, this is judged as the beginning of the stable stage. Similarly, when the penetration rate subsequently stays lower than 10 mm/min for a continuous duration of 10 s, this is judged as the end of the stable stage. Take the tunneling data shown in
Figure 2a as an example. The pretreated results are shown in
Figure 3. The pretreated data were used for training and testing the SW-SPC.
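The stable-stage criterion above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes 1 Hz sampling (so 10 s equals 10 samples) and assumes that the stage boundaries are placed at the first sample of each qualifying 10 s run, which the text does not specify. The function name and the synthetic series are hypothetical.

```python
def find_stable_stage(pen_rate, threshold=10.0, hold=10):
    """Return (start, end) sample indices of the stable stage, or None.

    pen_rate: penetration rate in mm/min, one value per second.
    start: first index of the first run of `hold` samples above threshold.
    end:   first index of the next run of `hold` samples below threshold
           (exclusive), or len(pen_rate) if no such run follows.
    """
    start = None
    run = 0
    for i, value in enumerate(pen_rate):
        if start is None:
            run = run + 1 if value > threshold else 0
            if run == hold:
                start = i - hold + 1
                run = 0
        else:
            run = run + 1 if value < threshold else 0
            if run == hold:
                return start, i - hold + 1
    return (start, len(pen_rate)) if start is not None else None


# Synthetic cycle: idle, short ramp-up, 40 s of stable boring, ramp-down, idle.
series = [0.0] * 5 + [4.0, 8.0] + [30.0] * 40 + [6.0, 2.0] + [0.0] * 12
stage = find_stable_stage(series)
stable = series[stage[0]:stage[1]]
mean_rate = sum(stable) / len(stable)
```

The average of each tunneling feature over the retained indices would then serve as the feature value for that cycle.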
4. Formulation of the Spearman-Weighted Supervised Prototype Classifier (SW-SPC)
4.1. Mechanics of the Prototype Optimization Classifier
On the basis of the pretreated and refined data, a baseline prototype optimization classifier using unweighted distance is developed as a supervised framework, which is conceptually aligned with the K-means iterative refinement process, to categorize tunneling areas according to their tunneling parameters. It has the advantage of a fast calculation speed and good interpretability. A small number of samples is sufficient for a relatively stable stratum without sharp changes. Hence, the rock mass dataset usually contains tens to hundreds of samples, and the prototype optimization classifier performs sufficiently well for such a volume of data.
Before executing the unweighted distance-based prototype optimization classifier, the number of target categories n should be determined based on experience or trial-and-error. On this basis, n samples are randomly selected as the prototype of each category, and recorded as x_1, x_2, …, x_n. Each selected sample represents a category (c_1, c_2, …, c_n). The Euclidean distances between each sample and category prototype can be calculated as shown in Equation (1).
$$d = \sqrt{\sum_{j} d_j^{2}} \qquad (1)$$
where d_j is the difference in the jth feature between the sample and the prototype. On this basis, each sample is assigned to the nearest prototype, and the updated prototypes are recalculated as [29,30]:
$$C_{ij} = \frac{1}{n_i} \sum_{k=1}^{n_i} x_{ijk} \qquad (2)$$
where C_i and n_i are, respectively, the prototype and total number of samples of the ith category, C_{ij} is the jth feature of C_i, and x_{ijk} is the value of the jth feature of the kth sample in the category. The prototype of each category can then be calculated, and this is regarded as the new prototype. The calculation of prototypes and division of samples is performed iteratively until the results are stable.
According to the principle of prototype optimization and the structure of the dataset, the prediction code for HC is developed. The code consists of six parts, including data import, prototype initialization, distance calculation, sample classification, iteration, and results export. The structure and synopsis of the code are shown in
Figure 4.
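The iteration described in this subsection can be sketched as below. This is a minimal illustration of the unweighted baseline, not the code shown in Figure 4. For reproducibility, the initial prototypes are passed in explicitly instead of being drawn at random, and the six two-feature samples are hypothetical.

```python
import math


def euclidean(a, b):
    """Unweighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def optimize_prototypes(samples, initial_prototypes, max_iter=100):
    """Alternate nearest-prototype assignment and mean update until stable."""
    prototypes = [list(p) for p in initial_prototypes]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(prototypes)), key=lambda i: euclidean(s, prototypes[i]))
            for s in samples
        ]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        for i in range(len(prototypes)):
            members = [s for s, lab in zip(samples, labels) if lab == i]
            if members:  # an empty category keeps its previous prototype
                prototypes[i] = [sum(col) / len(members) for col in zip(*members)]
    return prototypes, labels


# Hypothetical normalized samples forming two well-separated groups.
samples = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (1.0, 1.0), (1.2, 1.0), (1.0, 1.2)]
prototypes, labels = optimize_prototypes(samples, [samples[0], samples[3]])
```

Note that the resulting category indices carry no HC meaning by themselves, which is the gap that the weighting and labeling steps in the following subsections address.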
K-means is a typical unsupervised algorithm, which can only divide samples into different categories, and the categories do not have labels. However, evaluating HC is a supervised task, whose results include labels [31]. Therefore, K-means cannot be directly used to evaluate HC. In this research, K-means clustering is only used as a baseline for prototype optimization with unweighted distance. Further, the classical distance metric of K-means is refined by Spearman-derived non-uniform weights, thereby reforming the algorithm into a strictly supervised prototype learning framework. The weighted distance and the assignment of ordered values are introduced in
Section 4.2 and
Section 4.3.
4.2. Distance Metric Refinement Based on Spearman Ranks
In clustering problems, the classification results are determined by multiple inputs, and the influence of each input feature on the classification outcome varies. A stronger correlation between an input feature and the target variable signifies a greater influence, so the distance contribution of that feature should receive a higher weight. In conventional K-means clustering, the input features are equally weighted, because the algorithm is primarily designed for unsupervised learning, in which the influence of the input features on the output classification is unknown. However, in the evaluation of HC, the inputs and outputs have clear physical significance and a data basis, and the influence of each input on the output can be analyzed statistically. Given that HC is an ordinal variable, Spearman’s rank correlation is employed instead of Pearson’s correlation as a reference to modify the distance in K-means clustering.
Before calculating Spearman’s correlation from the 200 samples in the training set, the values of each tunneling feature are ranked in ascending order from 1 to 200 and recorded as x_i, where i represents the sample number. In addition, the HC values are also ranked in ascending order and recorded as y_i. According to Figure 1a, samples with class II are ranked from 1 to 26, and the average rank, 13.5, is recorded as these samples’ rank data y_i. Similarly, the rank data y_i of class III, IV, and V samples are calculated as 60, 131.5, and 185. According to the data x_i and y_i, the Spearman correlation can be calculated by Equation (3).
In Equation (3), n represents the total number of samples in the training set, which is 200. The distribution of each tunneling feature with respect to the HC is shown in Figure 5.
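The tied ranks quoted above can be checked with a short calculation. The class counts used here are inferred from the text (26 class II samples, 31 class V samples, and 33.5% and 38% of 200 for classes III and IV); the function name is hypothetical.

```python
def average_ranks_by_class(counts):
    """Map each class to the mean of the consecutive ranks it occupies.

    counts: list of (class_label, sample_count) in ascending class order.
    """
    ranks = {}
    first = 1
    for label, count in counts:
        last = first + count - 1
        ranks[label] = (first + last) / 2
        first = last + 1
    return ranks


# Training-set class counts inferred from the reported proportions.
counts = [("II", 26), ("III", 67), ("IV", 76), ("V", 31)]
tied_ranks = average_ranks_by_class(counts)
print(tied_ranks)
```

Assigning every member of a class the mean of its rank range is the usual way of handling ties when ranking an ordinal variable.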
As Figure 5a–e show, there is a significant difference in the distribution of the selected tunneling data under different HCs, which supports the relevance between tunneling data and the rock mass classification. Take the distribution of revolutions per minute as an example, which is shown in Figure 5a; with an increase in the HC value, the integrity of the surrounding rock decreases, and the selected rotation speed decreases to ensure that the volume of muck generated during tunneling does not exceed the load-bearing capacity of the belt conveyor. Accordingly, the statistical results show that the average pretreated revolutions per minute under surrounding rock with classes II, III, and IV are 0.77, 0.35, and 0.26, respectively. Similarly, penetration increases with an increase in HC, as shown in Figure 5d. According to the training data and Equation (3), the Spearman’s correlations between the selected five tunneling features and the HC and the corresponding p values are calculated and listed in Table 1.
As shown in Table 1, the absolute values of Spearman’s correlation for rotation speed and thrust are notably higher than those of the other three features, which suggests that these two features have a stronger influence on the HC than the other three features. In particular, the p values of the five features are lower than 0.01, implying that the null hypothesis is rejected, which indicates a statistically significant correlation between these features and the HC of the surrounding rock. In addition, a sensitivity analysis is conducted to further test the stability of Spearman’s correlation using 1000 bootstrap resamples. The testing results are shown in Figure 6, and the 95% confidence intervals of the five tunneling features, in the order shown in Table 1, are [−0.58, −0.35], [−0.38, −0.10], [−0.47, −0.21], [0.10, 0.36], and [−0.63, −0.42].
Accordingly, features with a stronger correlation should receive higher distance weights in the prototype optimization classifier. Therefore, the absolute values of Spearman’s correlation listed in
Table 1 were used as the distance weights, and the expression of distances (Equation (1)) is modified as shown in Equation (4).
where d in Equation (4) denotes the modified distance and w is the sum of the absolute values of the Spearman’s correlation coefficients for the five selected features, which is 1.82 according to Table 1. The distance weights of the five tunneling features, in the order of Table 1, are 0.46, 0.24, 0.34, 0.24, and 0.54.
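One possible reading of the weighted distance is sketched below. The exact form of Equation (4) is not reproduced here, so the sketch assumes that each squared feature difference is scaled by its weight divided by w; this scaling and the names `WEIGHTS` and `weighted_distance` are assumptions made for illustration only.

```python
import math

# Absolute Spearman correlations in the order of Table 1:
# revolutions per minute, torque, cutterhead power, penetration, thrust.
WEIGHTS = [0.46, 0.24, 0.34, 0.24, 0.54]
W_SUM = sum(WEIGHTS)  # the normalizing constant w


def weighted_distance(sample, prototype, weights=WEIGHTS):
    """Distance with each squared difference scaled by weight / sum(weights).

    ASSUMPTION: this normalized form is illustrative; the paper's
    Equation (4) may combine the weights differently.
    """
    total = sum(weights)
    return math.sqrt(
        sum((w / total) * (s - p) ** 2 for w, s, p in zip(weights, sample, prototype))
    )


origin = [0.0] * 5
unit_thrust = weighted_distance([0, 0, 0, 0, 1.0], origin)
unit_torque = weighted_distance([0, 1.0, 0, 0, 0], origin)
```

Under this reading, a unit deviation in thrust moves a sample further from a prototype than a unit deviation in torque, which matches the intent of giving more influential features more weight.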
4.3. Assigning Ordered Values to the Prototype
Before evaluating the HC of the surrounding rock by prototype optimization, the problem of assigning the values of the class prototypes should be solved. On the basis of the field data, the HC is an ordered, known quantity, and the evaluation of HC for surrounding rock is treated as a supervised classification problem [31]. In
Section 4.1, the prototype optimization process leverages the iterative logic inherent in K-means clustering, which is a classical algorithm primarily utilized in unsupervised learning scenarios and cannot be directly used to solve the supervised problem. In other words, samples can be assigned to a specific category, but their corresponding HCs remain unknown. In this section, a method for assigning values to categories obtained by the prototype optimization is introduced to effectively integrate the discriminative power of supervised learning into the prototype-based classification framework.
To assign values to each category, the most probable HC for each prototype must be determined. For this purpose, a normalized training set is used to establish a multiple regression equation for HC using the least squares method. Its basic form is shown in Equation (5).
where the left-hand side is the HC value calculated by the multiple regression equation and n is the number of categories, which is set to 4 in this research. Each input enters through its calculated weight, x_i is the normalized ith input variable, and c is a constant. The 200 training samples are used to search for the optimal combination of weights by iterations. In each iteration, the calculated error is evaluated using Equation (6).
where HC represents actual HC values in the training data and m is the number of training samples, which is 200. According to the variable order listed in Table 1, x_1 to x_5 represent revolutions per minute, torque, cutterhead power, penetration, and thrust, respectively, and the corresponding coefficients are calculated as −1.38, −0.97, −0.84, 0.88, and −1.64, respectively. The constant c is calculated as 5.35. According to the calculated coefficients and c, the multiple regression equation is established and used to assign the HC values of categories obtained by K-means clustering. Notably, standard linear regression struggles to adapt to complex tunneling field data and provide satisfactory evaluation results. Therefore, the multiple regression (Equation (5)) is only used as a rough indexing tool to order the prototypes.
It should be noted that applying multiple linear regression (MLR) to ordinal HC is a statistical approximation, as the intervals between the classes may not be strictly uniform. Future work may explore ordinal regression models or nonlinear mapping techniques to further refine the classification boundary.
4.4. Training of the Prediction Model of HC
The SW-SPC is applied to the 200 training samples. Given that the training set comprises four classes of surrounding rock, the number of clusters, k, is set to 4. Through the alternating calculation of prototypes and their corresponding distances, four clusters are successfully identified. The iterative convergence process of the distances is illustrated in
Figure 7.
As shown in
Figure 7, after 10 iterations, the prototypes of the four categories reach a stable state, and the classification results remain unchanged. The prototypes of the four categories after normalizations are listed in
Table 2.
Table 2 shows the prototypes of the four obtained categories. Further, the HC values of the four categories should be assigned by the multiple regression equation introduced in Section 4.3. Substituting the prototype data listed in Table 2 into Equation (5), the calculated HC values for Categories 1 to 4 are 3.03, 4.08, 2.06, and 4.90. Ranking these four categories in ascending order by their calculated HC values yields the sequence: Category 3, 1, 2, and 4, which correspond to rock mass classes II, III, IV, and V, respectively.
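The labeling step can be sketched as follows, assuming Equation (5) is a linear combination of the normalized inputs plus a constant. The coefficients and the four regressed values are those reported in the text; the function names are hypothetical, and the Table 2 prototype values are not reproduced here.

```python
# Regression coefficients in the order of Table 1 and the constant c.
COEFFS = [-1.38, -0.97, -0.84, 0.88, -1.64]
CONSTANT = 5.35


def regress_hc(features):
    """Linear estimate of HC from five normalized tunneling features (assumed form)."""
    return CONSTANT + sum(b * x for b, x in zip(COEFFS, features))


def label_prototypes(regressed_hc, classes=("II", "III", "IV", "V")):
    """Order categories by regressed HC and assign classes in ascending order."""
    order = sorted(regressed_hc, key=regressed_hc.get)
    return dict(zip(order, classes))


# Regressed HC values reported for Categories 1 to 4.
regressed = {1: 3.03, 2: 4.08, 3: 2.06, 4: 4.90}
labels = label_prototypes(regressed)
print(labels)
```

Only the ordering of the regressed values is used, so moderate regression error does not change the labels as long as the ranking of the prototypes is preserved.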
By the SW-SPC, the 200 samples in the training set are divided into four categories, and the HC value of clustering is given. The given HC is compared to the actual investigated HC value, and the results are shown in
Figure 8. In addition, precision and recall, as widely adopted evaluation indices for classification problems, are utilized to assess the classification performance, which can be calculated by Equations (7) and (8).
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (7)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (8)$$
In Equations (7) and (8), TP represents the number of samples correctly classified into a specific category. FP denotes the number of samples incorrectly assigned to this category from other classes. FN represents the number of samples belonging to this category but incorrectly assigned to other classes. Precision and recall are employed to evaluate the performance of each specific category rather than the overall classification accuracy. Take the class IV samples as an example; the TP is 68, the FP is 1 + 4 + 6 = 11, and the FN is 1 + 0 + 7 = 8. Therefore, the precision and the recall of the samples with the HC of IV are 68/79 = 86.1% and 68/76 = 89.5%, respectively.
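The class IV figures can be reproduced with the standard definitions of precision and recall. The function name below is hypothetical; the counts are those given in the text.

```python
def precision_recall(tp, fp, fn):
    """Per-class precision TP / (TP + FP) and recall TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


# Class IV in the training set: TP = 68, FP = 1 + 4 + 6, FN = 1 + 0 + 7.
precision, recall = precision_recall(tp=68, fp=1 + 4 + 6, fn=1 + 0 + 7)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```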
As shown in
Figure 8, the SW-SPC yielded correct classification results for the majority of samples in the training set. For instance, regarding class II rock mass, the SW-SPC correctly identified 21 out of 26 samples, and the precision and recall were 87.5% and 80.8%. The precision and recall of the four classes of samples in the training set are listed in Table 3. The precision and recall of nearly all four classes are higher than 80%, with their average values reaching 86.4% and 85.5%. Only the precision of class V rock is below 80%, at 78.1%. This discrepancy may be attributed to the imbalance of the training set, in which there were only 31 samples with an HC of V. Overall, the classification results are acceptable, and the prediction model of HC based on the SW-SPC is trained. In other words, the prototypes listed in
Table 2 are regarded as the correct ones, and they are applied to the test set to further validate its prediction performance.
4.5. Results
Section 4.4 details the training process of the prediction model based on the SW-SPC, which demonstrates a satisfactory performance on the training set. To further validate the model’s predictive capability, it is applied to a testing set comprising 75 samples. The testing samples are collected from the same project, which is introduced in
Section 2 of this paper.
A comparison between
Figure 1a,b reveals a significant difference in the distribution of the four surrounding rock classes between the training and testing sets. Notably, the testing set exhibits a more balanced class distribution. Samples with HCs of II and V, whose proportions in the training set are only 13% and 15.5%, occupy 18.7% and 26.7% in the testing set. Based on the prototypes listed in
Table 2, the 75 testing samples are directly categorized, yielding the prediction results of the SW-SPC. The comparison results between the SW-SPC and the actual HC are shown in
Figure 9a. The precision and recall of the samples in the four categories are listed in
Table 4.
The average precision and recall on the testing set reached 84.6% and 81.7%, respectively, with each class exhibiting a precision and recall exceeding 75%. Although these metrics are 1.8 and 3.8 percentage points lower than those on the training set, the results remain satisfactory. These findings demonstrate that the method is helpful for predicting the HC of the tunnel surrounding rock.
To further validate the effectiveness of the proposed method, a random classifier was employed as a baseline on the test set. In the random classifier, the probabilities of the samples being evaluated as II, III, IV, and V are set as 13%, 33.5%, 38%, and 15.5%, respectively, which equal the corresponding probabilities in the training set shown in
Figure 1a. As illustrated in
Figure 9b, the average precision and recall achieved only 24.5% and 24.2%, respectively. These values are significantly lower than those of the proposed method, further demonstrating the superior effectiveness of our approach.
6. Conclusions
The main conclusions of this paper are summarized as follows.
1. This paper introduces a method to predict the HC of tunnels excavated by a TBM. For this purpose, hundreds of samples with matched HC and tunneling data are needed. Using the tunneling data as input, the SW-SPC categorizes the data into distinct groups corresponding to the specific HC of the surrounding rock.
2. Based on the field tunneling data distribution, the SW-SPC based on the Spearman’s correlation between HC and tunneling data is introduced. In the prototype optimization with unweighted distance, all of the distance weights of different input features are selected as 1, which means the input features have the same influence on the output. Because the correlation between each tunneling feature and the HC is known, Spearman’s correlation is used as the weighting factor, which results in tunneling features with a higher correlation with the HC having higher weights. This approach demonstrates a positive impact on the prediction accuracy.
3. Based on a tunnel project located in Northeast China, the tunneling data recorded by the acquisition system equipped on the TBM and the corresponding HC data were collected. After pretreatment, a total of 275 samples with matched tunneling data and HC are obtained. Among them, 200 samples made up the training set to establish the prediction model and calculate the prototypes, and the other 75 samples made up the testing set to verify the performance of the model. The SW-SPC model achieved an accuracy of 82.7% (precision: 84.6%, recall: 81.7%) on the static test set. Furthermore, 5-fold cross-validation yielded an average accuracy of 82.0% (precision: 59.3–61.0%, recall: 58.3–66.5%), demonstrating robust global performance while reflecting sensitivity to class distribution variance inherent in field-measured tunneling data.