Correlation Stability Problem in Selecting Temperature-Sensitive Points of CNC Machine Tools

Abstract: In the thermal-error compensation of CNC machine tools, temperature-sensitive points (TSPs) are used to predict the thermal error and therefore need to be highly correlated with it. The stability of the correlation between the TSPs and the thermal error is the key to long-term prediction accuracy. In this paper, an uncertainty-calculation method for the correlation coefficient is proposed to measure the stability of the correlation, and the reasons that affect the stability of the correlation of TSPs are analyzed. Then, the uncertainty-correlation coefficient is proposed, which comprehensively evaluates both the correlation and the stability of the correlation between TSPs and the thermal error. Long-term experimental verification shows that, compared with current TSP-selection algorithms, the uncertainty-correlation coefficient helps to select more stable TSPs and improves the long-term prediction accuracy of the thermal error.


Introduction
During the working process of the CNC machine tool, the thermal deformation of the structure will cause the tools to move, resulting in thermal error. The influence of the thermal error on machine-tool accuracy cannot be ignored [1,2]. Thermal-error compensation is a widely used method for solving the thermal error problem. The principle is to send the thermal error into the CNC system as a feedback signal. Then the CNC system controls the tool to move the same value in the opposite direction of the thermal error to achieve compensation [3]. The origin offset function is a built-in module of the CNC system of the machine tool that can automatically complete the entire compensation process without affecting the normal processing of the machine tool [4,5]. Therefore, the focus of thermal-error compensation is how to obtain accurate thermal error values during machining. In the machining process of machine tools, it is difficult to install sensors to measure high-speed rotating tools, and it is necessary to use temperature to predict thermal error. Therefore, how to establish an accurate model between the temperature and thermal error is the key to thermal-error compensation. Typically, the model is called a thermal-error-compensation model.
The thermal error originates from the thermal deformation of the machine tool's structure, so some studies have used the finite element method (FEM) to simulate thermal deformation. Through FEM simulation, Ni [6] found that, if a constraint is imposed on the front end of the spindle (near the tool) and the rear end is kept free to expand, the thermal deformation can be guided to the rear end of the spindle, thereby reducing the thermal deformation at the tool position. Wang [7] analyzed the spindle deformation caused by cutting force and temperature through FEM simulation and found that thermal deformation accounts for 90% of it, so the thermal error is the main error source of the machine tool. Wei [8] analyzed the thermal deformation of the shaft components in the axial direction and found that the temperature points that maintain a linear relationship with the thermal error are more stable for thermal-error prediction. Simulation allows the generation mechanism of the thermal error to be analyzed in depth, but, owing to the complexity of the overall machine-tool structure, current research can only simulate a certain part of the machine tool, and some necessary simplifications have to be made. Simulating a single part cannot reflect the influence of the assembly method on thermal deformation, and simplification may cause simulation errors. At present, simulation can only provide qualitative conclusions and cannot quantitatively calculate the thermal error of the machine tool. Therefore, the data-driven method is often used in thermal-error-compensation modeling. This method requires installing sensors to measure the thermal error and the temperature of multiple points near the heat sources of the machine tool during the idling state. A thermal-error-compensation model can then be built from the measurement data [9,10]. The thermal-error-measurement method follows the international standard ISO 230-3 [11].
It obtains the tool-position change by measuring the repeated positioning error of the machine tool after heating, that is, the thermal error. This method is well-established and commonly used in the current research [12][13][14]. Since the machine tool cannot work during thermal-error measurement, frequent measurements can lead to a significant reduction in machining efficiency. Furthermore, the accuracy of the model needs to be stable for a long time. Therefore, existing research focuses on the modeling algorithm, which needs to be selected according to the thermal-error characteristics. Usually, the relationship between the thermal error and temperature is mainly linear, so multiple linear regression is a commonly used modeling algorithm [15,16]. Neural network algorithms are often used to fit the nonlinear relationship between thermal error and temperature [17,18]. There are also studies that use the LSTM algorithm to model thermal error lags relative to temperature [19]. Although a highly nonlinear model can achieve higher fitting accuracy, it is also easier to fit the interference information of nonlinear components into the model, resulting in overfitting. Therefore, when the operating environment of the machine tool is more complex, the stability of the prediction accuracy of the linear model is better [20].
In addition to the modeling algorithm, temperature is the source of information for the model's predictions of thermal errors. If the temperature measurement points are not properly selected, they cannot provide effective temperature information for thermal-error prediction, and an effective thermal-error-compensation model cannot be established by the modeling algorithm alone. Therefore, it is very important to select reasonable temperature measurement points as the model input. The selected temperature points are usually called temperature-sensitive points (TSPs). Lo [21] proposed a selection strategy of correlation ranking and clustering. First, the initial temperature measurement points are grouped; then, from each group, the point with the highest correlation with the thermal error is selected as the TSP. In this way, clustering can solve the collinearity problem of TSPs and reduce the probability of overfitting of the thermal-error-compensation model [22]. Most studies have followed this strategy. Abdulshahed [23] and Miao [24] used the fuzzy clustering algorithm for grouping and used gray correlation to calculate the correlation between temperature measurement points and thermal error. Li [25], Fu [26], Zhang [27], and Yin [28] used the fuzzy clustering algorithm for grouping and used the correlation coefficient to calculate the correlation between temperature measurement points and the thermal error. The difference between the above studies lies in the modeling algorithm; the most commonly used are multiple regression and neural networks. This study focuses on the selection of TSPs, so the modeling algorithm is not discussed in depth. Liu [20] found that the temperature measurement points that have a strong correlation with thermal errors often have high collinearity, so they are grouped into the same group, and the TSPs selected from the other groups are then only weakly correlated with the thermal error.
TSPs with weak correlations will introduce a large amount of interference in the thermal error compensation model, and this will greatly reduce the long-term prediction accuracy of thermal errors. Furthermore, in Reference [20], TSPs were directly selected by correlation ranking without the clustering, and ridge regression was used to solve the collinearity problem, which greatly improved the long-term stability of the thermal-error-prediction accuracy.
In summary, the high correlation with the thermal error is the key to selecting TSPs. A premise of the existing research is that the correlation-coefficient calculation result is credible. However, is it really credible? Existing research shows that abnormal data will interfere with the calculation results of correlation coefficients. Additionally, the interference of abnormal data can be reduced by improving the calculation method of the correlation coefficient. Niven [29] proposed the LOOT correlation coefficient in which the test data are expanded into multiple segments through re-sampling, and the correlation coefficient of each segment is calculated; finally, the weighted average is calculated as the result of the correlation coefficient. The Spearman correlation coefficient [30] and Kendall correlation coefficient [31] are rank correlation coefficients, and the correlation is calculated by sorting the data. The above research can solve the calculation error of the correlation coefficient caused by the abnormal data. However, for thermal errors, the instability problem is another reason why the correlation between the thermal error and temperature measurement points is not credible. Temperature measurement points will be affected by a variety of heat sources, such as changes in ambient temperature and temperature changes caused by human activities [32]. There is no guarantee that the influence of these heat sources on the thermal error is stable, nor can it be guaranteed that the correlation calculation result of measurement data is stable [33].
Through a large number of experiments, this study found that unstable temperature measurement points cannot maintain a long-term high correlation with thermal errors. If they are selected as TSPs, a large amount of interference information will be introduced into the thermal-error-compensation model, resulting in a rapid decline in the accuracy of thermal-error prediction. This paper analyzes the temperature characteristics of unstable temperature measurement points and proposes an uncertainty-calculation method for the correlation coefficient to measure the stability of the correlation. Then, combined with the correlation coefficient, a new correlation calculation method is proposed, which is called the uncertainty-correlation coefficient in this article. This algorithm enables a comprehensive evaluation of the correlation and the stability of the correlation. Therefore, during the selection of TSPs, unstable temperature measurement points that are only temporarily correlated with thermal errors can be filtered out, which improves the stability of the TSPs. Extensive experimental verification shows that, compared with the current TSP-selection method and the current correlation calculation methods, the uncertainty-correlation coefficient selects better TSPs, which improves the thermal-error-prediction accuracy and the long-term stability of the prediction accuracy.

Principle of Thermal Error Compensation
Figure 1 shows the working principle of thermal-error compensation. Thermal errors cause the tool to deviate from the ideal position and cause machining errors. After the thermal error occurs, the thermal-error-compensation model is used to predict it based on the TSPs. The predicted value is then sent to the CNC system as the feedback control signal. Finally, the CNC system moves the tool by the same value in the direction opposite to the thermal error, thus reducing the thermal error. The accuracy of thermal-error compensation therefore depends only on the prediction accuracy of the thermal-error-compensation model. Figure 2 shows the principle of the data-driven method for establishing a thermal-error-compensation model. As seen in Figure 2, first, the machine tool is run in an idling state. Second, the thermal error and the initial temperature points are measured simultaneously. Third, the TSPs are selected as the model input, and the output is the thermal error. Finally, the thermal-error-compensation model is established from the measured data. The thermal-error-measurement method follows the international standard [11]. The principle is to measure the repeated positioning error of the machine tool after heating. First, the tool is replaced with a test bar, and displacement sensors are installed around the bar. Then the position change of the test bar is measured when the machine tool moves to the same coordinate after heating up. When the thermal error is not considered, the repeated positioning accuracy of machine tools is high [34], so this method can guarantee the accuracy of the thermal-error measurement.

Correlation-Stability Problem in TSP Selection
The key step in the selection of TSPs is to calculate the correlation between the thermal error and each temperature measurement point. However, this study found that the correlation between some temperature measurement points and the thermal error is not stable, especially for measurement points with small temperature changes. This section gives a theoretical analysis of this problem and proposes an uncertainty-calculation method for the correlation coefficient to measure the stability of the correlation. Finally, the uncertainty-correlation coefficient is proposed to select more stable TSPs.

Proof.
The correlation coefficient between the variables $X$ and $Y$ is shown in (4).
The correlation coefficient is the cosine of the angle between the vectors $X^*$ and $Y^*$. Therefore, evaluating the influence of interference on the angle between the vectors is equivalent to evaluating its influence on the correlation coefficient.
If the changes in the data $X$ and $Y$ are very small during the measurement, the elements of the vectors $X^*$ and $Y^*$ are close to 0. Moreover, if the spatial direction of one vector changes due to interference, the angle between the two vectors changes too. The influence of interference on the angle between the two vectors can therefore be evaluated through its influence on the direction of one vector.
Take the variable $X$ as an example. Suppose that an element of the vector $X^*$, such as $x_i^*$, becomes $x_i^* + \Delta x_i^*$ under interference; then the cosine of the angle through which the vector $X^*$ turns, $\cos(\Delta\angle X^*)$, is given by (5).
Here $|X^*|$ represents the modulus of the vector $X^*$; the closer the elements of $X^*$ are to 0, the smaller $|X^*|$ is. The derivative of $\cos(\Delta\angle X^*)$ with respect to $|X^*|$ is given in (6).
According to (6), if the conditions (2) are met, (6) is greater than 0, so $\cos(\Delta\angle X^*)$ increases with $|X^*|$. Because $\cos(\Delta\angle X^*)$ decreases as $\Delta\angle X^*$ increases, $\Delta\angle X^*$ decreases as $|X^*|$ increases. The proof for the variable $Y$ is the same as for $X$. Therefore, if the conditions (2) are met and the changes in the data $X$ and $Y$ are small during the measurement, the correlation coefficient is easily affected by interference.
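For reference, the standard Pearson correlation coefficient can be written in the vector form that the argument above relies on. The notation here (sample means $\bar{x}$ and $\bar{y}$, centered vectors $X^*$ and $Y^*$, angle $\theta$) is an assumption made for illustration and may differ from the symbols used in (4).

```latex
\rho_{XY}
  = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}
         {\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}
  = \frac{X^{*}\cdot Y^{*}}{|X^{*}|\,|Y^{*}|}
  = \cos\theta,
\qquad x_i^{*} = x_i-\bar{x},\quad y_i^{*} = y_i-\bar{y}.
```

In this form, a temperature point that barely changes has a short centered vector, so a disturbance of fixed size turns it through a larger angle.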
For the thermal-error experiments in this study, each measurement series is long, so the conditions (2) are easily met. Therefore, if the change in a temperature point during the measurement is small, the correlation between this temperature point and the thermal error becomes very sensitive to interference. For a temperature point, the interference is not only due to random errors but also includes the influence of other heat sources. If the interference shows the same trend as the thermal error in the experimental data, it increases the calculated correlation coefficient. Therefore, a temperature point that changes only slightly not only has a lower correlation coefficient but also an unstable one.
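The sensitivity described above can be illustrated numerically. The following is a minimal sketch on synthetic data, not the experimental data of this study: the exponential warm-up curves, the amplitudes, the noise level, and the function name `correlation_spread` are all assumptions chosen for illustration. It compares how much the correlation coefficient scatters under the same random interference for a temperature point with a large rise and one with a small rise.

```python
import numpy as np

def correlation_spread(amplitude, noise_std=0.1, n=300, trials=200, seed=0):
    """Mean and standard deviation of Pearson r over repeated noisy measurements.

    amplitude: total temperature rise of the (hypothetical) measurement point, in degC.
    noise_std: standard deviation of the interference added to the temperature, in degC.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n)
    warm_up = 1.0 - np.exp(-3.0 * t)          # assumed warm-up shape
    error = 50.0 * warm_up                    # hypothetical thermal error, in um
    clean_temp = amplitude * warm_up          # hypothetical temperature change, in degC
    r_values = []
    for _ in range(trials):
        noisy_temp = clean_temp + rng.normal(0.0, noise_std, size=n)
        r_values.append(np.corrcoef(noisy_temp, error)[0, 1])
    return float(np.mean(r_values)), float(np.std(r_values))

# Same interference, different size of temperature change.
mean_large, spread_large = correlation_spread(amplitude=10.0)
mean_small, spread_small = correlation_spread(amplitude=0.2)
print(f"10.0 degC rise: mean r = {mean_large:.3f}, spread = {spread_large:.4f}")
print(f" 0.2 degC rise: mean r = {mean_small:.3f}, spread = {spread_small:.4f}")
```

Under these assumptions, the point with the small temperature rise shows both a lower average correlation and a larger batch-to-batch scatter, which matches the two effects named in the text.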

Uncertainty-Correlation Coefficient
A high correlation coefficient might be caused by the instability of the calculation results. Therefore, this paper proposes the use of uncertainty to measure the stability of the correlation coefficient. According to the synthesis method of uncertainty [35], the uncertainty of the correlation coefficient is calculated as in (7) to (9).
The inputs to this calculation are the measurement uncertainties of the sensors for the variables $X$ and $Y$, respectively. For the temperature sensors, the uncertainty is 1 °C. For the eddy current sensor used for thermal-error measurement, the uncertainty is 2 μm. The higher the uncertainty, the lower the stability of the calculation results.
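As an illustration of how sensor uncertainty can be propagated into the correlation coefficient, the sketch below applies the standard first-order law of propagation of uncertainty to the Pearson coefficient. It is not necessarily identical to the formulas of this paper: it assumes independent errors with the same standard uncertainty on every sample, and the function name `r_uncertainty` and the synthetic signals are hypothetical. The sensor uncertainties of 1 °C and 2 μm are the values stated above.

```python
import numpy as np

def r_uncertainty(x, y, u_x, u_y):
    """Pearson r and its first-order propagated standard uncertainty.

    Assumption: every sample of x (y) carries an independent error with the
    same standard uncertainty u_x (u_y).
    """
    xc = x - x.mean()
    yc = y - y.mean()
    nx, ny = np.linalg.norm(xc), np.linalg.norm(yc)
    r = float(xc @ yc / (nx * ny))
    # Sensitivity coefficients dr/dx_i and dr/dy_i of the Pearson coefficient.
    dr_dx = (yc / ny - r * xc / nx) / nx
    dr_dy = (xc / nx - r * yc / ny) / ny
    variance = u_x**2 * np.sum(dr_dx**2) + u_y**2 * np.sum(dr_dy**2)
    return r, float(np.sqrt(variance))

t = np.linspace(0.0, 1.0, 200)
error = 40.0 * t + 3.0 * np.sin(8.0 * t)          # hypothetical thermal error, um
temp_large = 8.0 * t + 0.5 * np.cos(5.0 * t)      # hypothetical temperature change, degC
temp_small = 0.1 * temp_large                     # same shape, ten times smaller change

r_large, u_large = r_uncertainty(temp_large, error, u_x=1.0, u_y=2.0)
r_small, u_small = r_uncertainty(temp_small, error, u_x=1.0, u_y=2.0)
print(f"large change: r = {r_large:.3f}, u(r) = {u_large:.4f}")
print(f"small change: r = {r_small:.3f}, u(r) = {u_small:.4f}")
```

Under these assumptions, the sums of squared sensitivities collapse to $u(r)^2 = (1-r^2)\,(u_x^2/|X^*|^2 + u_y^2/|Y^*|^2)$, so two temperature points with the same correlation coefficient can have very different uncertainties, and the one with the smaller temperature change is the less stable.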
To improve the stability of the correlation coefficient, the correlation coefficient and the uncertainty are combined, and this combination is called the uncertainty-correlation coefficient (UCC) in this article.
In this combination, a non-negative weight sets the contribution of the uncertainty to the calculation. For the machine-tool thermal-error and temperature-measurement data, in addition to the sensor error, the random variation of factors such as the ambient temperature and the machine-operating parameters also contributes to the uncertainty of the correlation coefficient. Many factors are involved here. The ambient temperature is related to unpredictable factors, such as climate change and human activities. The operating parameters of the machine tool affect its heat-generation characteristics, and thereby affect the thermal error by changing the thermal-deformation characteristics of the structure. Research on this is very difficult. As the structure of the machine tool is too complex, current research only simulates part of the structure, and some assumptions need to be made. As a result, research on thermal deformation can draw some qualitative conclusions, but these are difficult to apply quantitatively [6][7][8]. Therefore, the effects of random changes in these factors cannot be accurately calculated. When calculating the uncertainty-correlation coefficient, this paper sets the weight to 2; that is, the uncertainty caused by the sensor error is doubled to cover the influence of random changes in these factors. The experiments show that stable TSPs can be selected with a weight of 2 (see Section 5.2). The uncertainty-correlation coefficient lies in [−1, 1]; it increases with the correlation coefficient and decreases as the uncertainty increases. Therefore, when the uncertainty-correlation coefficient is used for the selection of TSPs, the temperature measurement points with a stable correlation are selected first.

Thermal-Error-Measurement Experiment
The stability of the correlation between the thermal error and the TSPs requires long-term experimental verification. A vertical machining center was chosen for thermal-error-measurement experiments over six months. The main parameters of the machine are as follows.
Model: Vcenter-55, Victor Taichung Machinery Works Co., Ltd. Stroke: X, 550 mm; Y, 460 mm; Z, 460 mm. Size: 1955 × 2350 × 2500 (height) mm. Maximum spindle speed: 8000 rpm. Positioning accuracy: 5 μm. Figure 3 shows the measurement of the thermal errors, using three eddy current sensors in the x-, y-, and z-axis, respectively. A total of 16 temperature sensors were installed near the heat sources of the machine tool to measure the initial temperature points (Figure 4). The measurement data are given in (11), which contains the thermal errors in the x-axis, y-axis, and z-axis and the changes in the 16 initial temperature points, $\Delta T_i$, $i = 1, \ldots, 16$. The temperature change is used because the thermal error is the position change of the tool after heating, which is a relative error, so only the temperature change after the machine tool has started running needs to be considered.
Each experiment lasted 4 to 5 h. On different dates, a total of 27 batches of experiments were carried out, denoted as D1~D27. The ambient temperature of each experiment is shown in Table 1. The temperatures and the thermal error of experiments D1, D13, and D27 are shown in Figure 5. They represent the data at the beginning, middle, and end of the experiment, respectively. In order not to make the paper too long, other data are not shown. The x-axis thermal error change was very small, due to the symmetrical structure of the machine tools. Therefore, this article mainly focuses on the y-axis and z-axis.

Analysis of Correlation Stability of TSPs
According to the commonly used correlation coefficient, the two temperature points with the highest correlation with the thermal error were selected as the TSPs, as shown in Table 2. The number of TSPs follows previous engineering experience [36]. In Table 2, for the y-axis, the most frequent TSP combination is T3 and T5, but the selection changes easily, and T6, T7, T10, T14, and T15 also appear in some data. For the z-axis, the TSPs are mainly T1 and T4, while T7, T8, T10, and T13 appear in some data. We calculated the correlation coefficient between these TSPs and the thermal error in each batch of measurement data, and the results are shown in Figure 6. In Figure 6, the TSPs can be divided into two categories: stable (the correlation remains stable across different batches of measurement data) and unstable. The correlation between unstable TSPs and the thermal error changes easily and sometimes even exceeds that of stable TSPs. An unstable correlation means that the TSPs are easily disturbed, and this seriously damages the prediction accuracy of the thermal errors. The accuracy was analyzed as follows.

[Figure 6. Panels: y-axis, unstable; y-axis, stable; z-axis, stable; z-axis, unstable. Axis: correlation coefficient.]
The stable TSPs and the unstable TSPs were each used to establish thermal-error-compensation models, and the thermal-error-prediction accuracy of the two kinds of temperature-sensitive points was compared. The TSPs involved in the comparison are shown in Table 3.

Table 3. TSPs for model-accuracy comparison.
(1) Choose a TSP combination and establish the thermal-error-compensation models for the y-axis and z-axis, respectively. The models established from experiments D1~D27 are referred to as M1~M27. The modeling algorithm is ridge regression; the ridge parameter was taken from the previous study [20].
(2) Choose a model and substitute the TSP data of one experiment into the model to obtain the predicted thermal error. Then calculate the root-mean-squared error (RMSE) by (12) to measure the prediction accuracy. The bigger the RMSE, the worse the prediction accuracy [37].
$$S_{D_j \to M_i} = \sqrt{\frac{1}{n_j}\sum_{k=1}^{n_j}\left(e_{j,k}-\hat{e}_{j,k}\right)^2} \qquad (12)$$
where $S_{D_j \to M_i}$ is the RMSE obtained by substituting the temperature data of experiment $D_j$ into the model $M_i$, $n_j$ is the length of the measurement data of $D_j$, $e_{j,k}$ is the $k$-th thermal-error measurement value of $D_j$, and $\hat{e}_{j,k}$ is the corresponding thermal-error prediction value.
(3) For each model, use the average RMSE over all batches of experiment data to evaluate the model accuracy. The calculation method is shown in (13).
$$S_{M_i} = \frac{1}{27}\sum_{j=1}^{27} S_{D_j \to M_i} \qquad (13)$$
The $S_{M_i}$ of each model is shown in Figure 7. According to Figure 7, the prediction accuracy of the stable TSPs is significantly better than that of the unstable TSPs. This means that stable TSPs are very important for thermal-error prediction.
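The three evaluation steps above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the implementation used in this study: the closed-form ridge solver, the function names, the ridge value of 1.0 (the actual ridge parameter comes from [20] and is not given here), and the three synthetic batches are all assumptions.

```python
import numpy as np

def fit_ridge(temps, error, ridge=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    t_mean, e_mean = temps.mean(axis=0), error.mean()
    tc = temps - t_mean
    gram = tc.T @ tc + ridge * np.eye(tc.shape[1])
    coef = np.linalg.solve(gram, tc.T @ (error - e_mean))
    return coef, e_mean - t_mean @ coef

def rmse(model, temps, error):
    """RMSE of one model's prediction on one batch."""
    coef, intercept = model
    residual = error - (temps @ coef + intercept)
    return float(np.sqrt(np.mean(residual**2)))

def cross_batch_scores(batches, ridge=1.0):
    """Fit one model per batch, then score every model on every batch.

    Returns (scores, table): table[i, j] is the RMSE of model i on batch j,
    and scores[i] is the row average, i.e. the mean RMSE of model i.
    """
    models = [fit_ridge(temps, error, ridge) for temps, error in batches]
    table = np.array([[rmse(m, temps, error) for temps, error in batches]
                      for m in models])
    return table.mean(axis=1), table

# Three synthetic batches with two hypothetical TSPs each.
rng = np.random.default_rng(0)
batches = []
for _ in range(3):
    temps = np.cumsum(rng.uniform(0.0, 0.1, size=(100, 2)), axis=0)   # degC
    error = 4.0 * temps[:, 0] + 1.5 * temps[:, 1] + rng.normal(0.0, 0.5, size=100)  # um
    batches.append((temps, error))

scores, table = cross_batch_scores(batches, ridge=1.0)
print(np.round(table, 3))
print(np.round(scores, 3))
```

Averaging each row of the table over all batches, rather than reporting only the fit on the training batch, is what makes the score reflect long-term prediction accuracy.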

[Figure 7. Panels: y-axis; z-axis. Axis: S_Mi (μm).]

TSP Selection Result Based on Uncertainty-Correlation Coefficient
To test the effect of the uncertainty-correlation coefficient (UCC) on the selection of TSPs, this study compared it with the commonly used TSP-selection algorithm and with the widely used improved correlation-coefficient calculation methods. The specific TSP-selection methods involved in the comparison are shown in Table 4. The methods in Table 4 are divided into two types:
(1) Directly select the two temperature points with the highest correlation with the thermal error as TSPs.
(2) Based on the fuzzy clustering algorithm [8], divide the temperature points into two groups; then, from each group, select the one with the highest correlation with the thermal error as the TSP.
For each method, the average $S_{M_i}$ of all models was used for accuracy evaluation. The calculation method is as follows:
(1) Build the thermal-error-compensation models from the measurement data of D1~D27, referred to as M1~M27, respectively. The modeling algorithm is ridge regression.
(2) Calculate the average RMSE of each model, $S_{M_i}$, according to (12) and (13), and then calculate the average $S_{M_i}$ of all models to measure the accuracy of the TSP-selection method, as shown in (14).
For ease of presentation, the specific TSP-selection results for each batch of experimental data are not provided; instead, the distribution of the TSPs in the selection results of the 27 batches of data is displayed. The results are shown in Figures 8 and 9. The accuracy evaluation of each TSP-selection method is shown in Table 5. According to Figures 8 and 9, the TSPs selected by the UCC are concentrated in T1~T5. According to Table 5, the UCC is the best, and the UCC with clustering (UCC-C) is the best among all the cluster methods.
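The two selection types compared in this section, direct ranking and cluster-then-rank, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the scores stand for whichever correlation measure is being tested (for example, the correlation coefficient or the UCC), and the group labels are assumed to come from a separate clustering step such as fuzzy clustering, which is not implemented here.

```python
import numpy as np

def select_direct(scores, k=2):
    """Indices of the k temperature points with the highest score."""
    order = np.argsort(scores)[::-1]          # highest score first
    return sorted(order[:k].tolist())

def select_per_group(scores, groups):
    """From each cluster label, pick the temperature point with the highest score."""
    scores = np.asarray(scores)
    groups = np.asarray(groups)
    chosen = []
    for label in np.unique(groups):
        members = np.flatnonzero(groups == label)
        chosen.append(int(members[np.argmax(scores[members])]))
    return sorted(chosen)

# Hypothetical scores for four temperature points; points 0 and 1 are both
# strongly correlated with the thermal error and fall into the same cluster.
scores = [0.95, 0.93, 0.60, 0.40]
groups = [0, 0, 1, 1]

print(select_direct(scores, k=2))        # two best points overall
print(select_per_group(scores, groups))  # best point of each cluster
```

In this toy case, clustering forces the second TSP to come from the weakly correlated group, which is the effect reported by Liu [20] for collinear, strongly correlated temperature points.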

Discussion
The TSPs selected by UCC were concentrated in T1~T5, indicating that only the TSPs near the spindle can have an excellent thermal error prediction effect. The reasons for this are as follows.
(1) The spindle motor generates a large amount of heat and is close to the tool, so it contributes the most to the thermal error. The heat source of the other temperature points is the feed motor. Compared with the spindle motor, the feed motor runs at a lower speed, generates less heat, and contributes less to the thermal error. If the effective information of these temperature points is less than the interference, including them decreases the prediction accuracy. Therefore, abandoning these temperature points has little effect on the model and can even help to improve the accuracy. (2) Heat is conducted through the machine-tool structure, which leads to correlation between the temperature points of different parts of the machine tool. Therefore, the temperature-measurement data of the spindle include temperature information on other heat sources.
The following are the explanations of questions that readers may have regarding the experimental results.
Q1: For the y-axis, why can the UCC-C method, which has only one TSP near the spindle, achieve the same effect as the UCC? There are two main reasons for this: (1) For the TSPs selected by the UCC-C method, the model coefficient of the temperature point near the spindle is much larger than that of the other one (which is close to 0). This means that, for the y-axis, one TSP can also provide sufficient temperature information for thermal-error prediction.
(2) Collinearity between TSPs causes the model prediction accuracy to decrease. Collinearity refers to the correlation between TSPs. High collinearity can easily cause overfitting of the model [38] and a decrease in prediction accuracy. The purpose of ridge-regression modeling in this study was to suppress collinearity. Ridge regression relies on the ridge parameter. If the ridge parameter is too big, the model coefficients tend to 0 and the model loses accuracy. If the ridge parameter is too small, collinearity cannot be suppressed. The UCC-C clusters first, so TSPs with low collinearity are preferred, and the influence of collinearity is easier for the modeling algorithm to suppress. In addition, the ridge parameter in this paper was taken from a large amount of previous experimental experience [20] and is not necessarily ideal for these data, so it is normal for the UCC-C to perform better than the UCC. If the ridge parameter were optimized for these data, a UCC model more accurate than the UCC-C model might be found. However, the accuracy difference between the two algorithms is too small to make this optimization necessary.
Q2: How is the number of TSPs determined? At present, it mainly relies on experience. This study followed previous research experience [36] and found that choosing two TSPs works well. Further research on this is needed.

Conclusions
(1) This article studied the selection of temperature-sensitive points (TSPs) for the thermal-error-compensation model. The principle is to select temperature points that are highly correlated with the thermal error. Long-term experiments showed that the correlation between some temperature measurement points and the thermal error may be unstable and changes with the temperature conditions of the machine tool. Therefore, widely used methods may not be able to select TSPs that have a stable, high correlation with the thermal error. Unstable TSPs introduce a large amount of interference into the thermal-error-compensation model, thus greatly reducing the long-term prediction accuracy of thermal errors.
(2) This article proposed an uncertainty-calculation method for the correlation coefficient to measure the stability of the correlation, and, based on it, the uncertainty-correlation coefficient. This coefficient combines the correlation coefficient and its uncertainty: it increases with the correlation coefficient and decreases as the uncertainty increases. Therefore, when it is applied to the selection of TSPs, a temperature point that shows only a temporary high correlation with the thermal error is eliminated because of its high uncertainty, and TSPs that can maintain a long-term stable correlation with the thermal error are selected first. According to the long-term prediction accuracy of the thermal error, the uncertainty-correlation coefficient has obvious advantages over the existing TSP-selection methods.
(3) The uncertainty-correlation coefficient is an improvement over the traditional correlation coefficient. The main improvement lies in the uncertainty calculation of the correlation coefficient, which can evaluate the stability of the correlation. Therefore, it can also provide a reference for other problems similar to the selection of TSPs of machine tools, namely problems in which decisions are made based on the correlation between multiple variables and the stability of the correlation cannot be ignored. For example, in medical testing, it is necessary to calculate the correlation of paired data (data measured on the same object by different instruments) for clinical diagnosis. Uncertainty can measure the impact of errors in the data and provides auxiliary information for decision-makers [39].