Deep Learning-Based Diagnosing Structural Behavior in Dam Safety Monitoring System

Collecting a large amount of prototype data through various types of monitoring sensors plays a vitally important role in many aspects of dam safety, such as grasping the safety state in real time, exposing hidden dangers, and inspecting design and construction. However, current prediction methods perform poorly on long-term node sequences that contain missing or abnormal values. Moreover, limitations caused by the apparatus, environmental factors, and network transmission can lead to deviation and inconsistency in the diagnosis and evaluation of a local region. In this paper, we consider the correlation of data on nodes across the entire monitoring network. To avoid the deviation caused by noise and missing values in a single-node data sequence, we calculate the correlation between multiple sequences. A single-node assessment model based on multiple relevant sequence (SAM) is proposed to improve the accuracy of single-node assessment. Given that the different nodes of a local region have varying impacts on the evaluation results, a local region evaluation algorithm based on node credibility (LREA) is presented, which models the credibility of nodes in order to alleviate inconsistent evaluation results in a local region of the dam. LREA assesses the dam's operation state by considering the variations in credibility and the coordination of multiple nodes. The experimental results illustrate that LREA can reveal trends in the monitoring values in a timely and accurate way, which improves the accuracy of dam safety evaluation results.


Introduction
Reservoirs and dams now play a pivotal role in regulating and optimizing the temporal and spatial distribution of water resources. They also provide considerable social and economic benefits such as flood control, power generation, irrigation, water supply, transportation, and tourism. As colossal engineering structures, dams are usually built in strategic locations with fragile ecology, which makes their internal and external conditions very complex. To guarantee safe operation, the following two types of measures can be adopted. (1) Engineering measures: engineering technologies are used to reinforce and maintain a dam at fixed intervals of time. (2) Non-engineering measures: these include flood forecasting and safety monitoring, among other means. Nowadays, various types of devices (e.g., sensor nodes) are deployed in large dam projects to measure different physical quantities, such as deformation, stress, and pressure, and to sense their changes in various regions of the structure. Figure 1a-c shows three types of devices: the static level, the inverted vertical meter, and the seam measurement meter, respectively. They are deployed to measure the deformation, stress, and pressure of the dam. Collecting a large amount of data through various types of monitoring sensors plays an important role in many aspects of dam safety, such as grasping the safety state in real time, exposing hidden dangers, and inspecting design and construction. Therefore, in addition to diagnosing and evaluating the working behavior of a dam, it is also important to accurately analyze the data from its safety nodes.
At present, dam safety evaluation is mainly based on periodic post-evaluation. After a dam operates for some time (a week, a month, a quarter, a year, or a periodic inspection), the monitoring data on the dam's operating status during this period are collated and examined [1][2][3][4]. For the crucial nodes of a key dam region, which fall under different categories of monitoring items such as deformation, seepage-pressure, stress-strain, and temperature, specialized statistical analysis models are constructed to analyze and predict the changing trend of single-node data and then evaluate the working conditions of the key dam region. The overall safety evaluation of a dam basically remains at the level of data evaluation and trend analysis for the key nodes of its local regions [5].
Local regions of the dam under monitoring are mainly partitioned and formed by a spatial arch monitoring grid which is composed of longitudinal monitoring sections and horizontal monitoring sections, combining the concrete dam-specific monitoring design specifications, structural calculation results, and previous engineering experience. In the early phase, dam safety monitoring only paid attention to a small number of key nodes. At that time, the monitors classified the key monitoring data and judged whether the data was abnormal by the 3σ principle. With the continuous upgrade of monitoring technologies and means, the automatic dam monitoring systems have gone online, and many nodes got connected to them consequently. According to different types of monitored physical quantities, these monitoring systems can be divided into several categories of level-1 monitoring items, such as displacement, pressure, strain, and crack. Monitoring items are classified according to different monitoring technologies or means and then assigned to different nodes. As a result, they form a tree structure that consists of many layers with an entire dam at the top and nodes at the bottom. Experts assign weights to each layer of the structure and judge the overall operation state of the dam with these weights in a bottom-up approach. Besides, experience-driven assessment methods also include structural mechanics models, finite element models, spectrum analysis, statistical models, and wavelet analysis [6]. Furthermore, deformation monitoring data analysis turns out to be another means to diagnose the dam's health status. According to the dam engineering specifications, the multi-scale fuzzy C-means-based data mining algorithm [7] is used to cluster the deformation monitoring data of a dam, compare and analyze the differences between its conventional deformation distribution diagram and deformation clustering diagram, and ultimately diagnose the operation state of the dam. 
The dam monitoring data contain a variety of information on effect quantities. Based on the Dempster-Shafer (DS) theory of evidence, a multi-effect quantities fusion reasoning model for dam behavior [8] is built to obtain reasoning results on the dam's operation characteristics. Although these traditional methods fit a single short-time sequence well, they are weak in predicting the long-term sequences of key nodes with missing or abnormal values, especially those playing a critical role in dam safety evaluation.
Moreover, due to the limitations of equipment, environment, network transmission, and other factors, the comprehensive diagnostic evaluation of nodes deployed in the same local area is prone to deviations, resulting in inconsistent diagnostic evaluation results. In view of these deficiencies in the safety assessment of a dam monitoring region, a single-node assessment model based on multiple relevant sequence (SAM) is proposed to avoid the deviation caused by noise and missing values in the single-node data sequence and thus improve the accuracy of single-node assessment. To alleviate inconsistent evaluation results among the nodes of the same local region, a local region evaluation algorithm based on node credibility (LREA) is proposed to model the credibility of the nodes and achieve consistent local region evaluation results through the cooperation of multiple nodes.
The rest of this paper is organized as follows. We introduce the related work in Section 2. The preliminary and problem statement are presented in Section 3. In Section 4, we present a local region evaluation algorithm based on node credibility for structural behavior diagnosing. Finally, we evaluate the performance of the proposed SAM and LREA with large-scale real dam monitoring data in Section 5, and conclude the work in Section 6.

Related Work
Various types of nodes are deployed in a dam's local region to monitor deformation, seepage-pressure, stress-strain, and temperature, among other dam engineering aspects. Therefore, the monitoring data gathered from the nodes can be used to assess the operation state of the local region where such nodes are deployed. Specifically, each node monitors the effect quantity of the dam's working condition at the place where it is installed, and the way its time-sequence data change reflects the operation state of the corresponding region. By analyzing the actual monitoring data from single nodes, one can learn the conditions at the places where the nodes are separately deployed. Next, multiple nodes can cooperate to assess the local region's operation state together.

Analysis of Single Node Values
The analysis of single-node monitoring data primarily predicts the time-sequence data of nodes, analyzes the differences between the measured (real) values and the predicted values, and identifies changes in the specific physical characteristics of the local regions. The existing single-node prediction methods can be roughly divided into regression analysis methods (such as the partial least squares regression method, ridge regression method, principal component regression method, and Lasso regression), time-sequence analysis (stationary time sequence [9], autoregressive-moving average (ARMA) model [10], and autoregressive integrated moving average (ARIMA) model [11]), the artificial neural network method, the spectrum analysis method, the Kalman filter method [12,13], and the gray theory analysis method. Although these traditional methods fit a single short-time sequence well, they are weak in predicting long-time sequences with missing and abnormal values. With the development and prevalence of machine learning and deep learning in recent years, the following methods have increasingly been used to process and predict time sequences: Kernel Method [14,15], Online Learning [16], Support Vector Machine [17], Recurrent Neural Network (RNN) [18], and Long Short-Term Memory (LSTM) [19,20]. Both RNN and LSTM can remember the historical state of a time sequence. In comparison, LSTM adds a forgetting gate, an input gate, and an output gate to the RNN structure, which effectively mitigates the vanishing or exploding gradients that affect RNN. Although LSTM demonstrates impressive learning ability in processing a single time sequence with long cycles and large fluctuations, its final prediction quality is severely degraded by non-smooth, unstable time sequences.
To address the prediction and assessment errors caused by noisy or missing values in the time sequence of single nodes, we compute the correlation between node sequences and harness multiple correlated sequences to raise the accuracy of single-node assessment.

Collaborative Diagnosis with Multiple Nodes
For the evaluation of a region, many existing methods combine the assessment results of multiple nodes to conclude how the local region works. Hierarchical Aggregate Classification (HAC) [21] constructs a hierarchical tree of nodes: focal points aggregate all nodes' classification results to achieve an overall assessment of the entire network of sensors. The majority node voting scheme [22] uses the cluster head nodes to compile statistics on the node classification results in the cluster and thus identify abnormal states in the cluster. In these methods, the classification results of the majority of nodes represent those of the focal points or clusters. Enhanced Naive Bayes Classifier (ENBC) [23] resorts to unsupervised learning to improve the probability distribution of multi-node classification results, thereby increasing the accuracy of event classification in the network of sensors. Majority of Methods Voting Scheme (MMVS) [24,25] integrates multiple methods and models, usually outperforming single methods. Because of their greater model complexity, ENBC and MMVS can both increase the accuracy of multi-node classification. The current methods primarily rely on the majority of nodes or on increased model complexity to enhance the accuracy of collaborative multi-node classification, without considering the credibility of the single-node classification results. However, because of the apparatus, environmental factors, and network transmission, the evaluation results of different nodes in the same local region are inconsistent. Under this circumstance, treating all nodes without distinction will lead to deviations. Therefore, we introduce a node credibility model, which describes how differently each node affects the overall evaluation results of a local region and uses the differences in credibility to unify the local region's evaluation results produced by different nodes.

System Model
The regional partition algorithm (RPA) divides a structure into several single regions according to the spatial-temporal characteristics of nodes, since the evaluation result of the physical state of a single region can showcase the local operation characteristics of the structure [26]. In a single region, many types of physical quantities can be observed from multiple nodes. According to the data gathered from single nodes, the changing trend of a certain physical quantity in the region can be analyzed. For example, when a single region's stress value changes suddenly, the stress sensors will capture the change. At the same time, changes in the various types of physical quantities in a single region are correlated with one another: sudden changes in stress values will lead to sharp fluctuations in displacement values. Whether changes in physical quantities are abnormal can be determined from the time sequence revealed by the nodes' data. Therefore, we can predict the time sequence of single nodes and then compare the measured and predicted values to assess the state of the physical quantities observed at single nodes. The spatial and temporal characteristics of nodes within a single region are highly correlated, which renders the time sequences of the nodes highly relevant to one another. In practical applications, single nodes' sequence data are generally non-smooth and unstable, with noisy and missing values, which interferes with the prediction and assessment results of single nodes. Data changes at a single node are often accompanied by synchronous data changes across multiple correlated nodes. In that case, using multiple related node sequences to track and predict the data of a single node collaboratively is expected to deliver better assessment performance.
A single node's assessment results can be deemed a reflection of how the local region operates with respect to a certain physical quantity. In this sense, it is feasible to coordinate all the nodes to evaluate a single local region's physical state. However, because of apparatus failure, the external environment, and other factors, the monitoring sequence data from a single node are not always reliable, so different nodes exert different impacts on the region's overall evaluation. Multiple sequences can be used to improve the credibility of single-node assessment results. However, suppose multiple highly correlated time sequences all become abnormal. In that case, there will be a situation where "two negatives make a positive," that is, multiple abnormal time sequences are all regarded as normal ones, which leads to deviations in the single-region evaluation results. To boost the accuracy of single-region evaluation results, the credibility of each single node is modeled so that it describes the impact of that node on the single-region evaluation results. Based on the credibility, the assessment results of different nodes are then coordinated and integrated so that a single region can be evaluated more accurately, staying away from the nodes with low credibility and close to those with high credibility wherever possible.

Problem Formulation
The local region assessment includes two phases: first, predict the time sequence of a single node and then use the differences between the predicted values and the real values to assess the operation state of the node; second, coordinate multiple nodes to assess the operation state of the corresponding local region. The detailed process by which the region $r_j$ is evaluated at the time point $T$ is elaborated in Figure 2.


[Figure 2: evaluation of the region $r_j$ at the time point $T$. Recoverable labels: LSTMs, Dense, Softmax; Phase 1: predict.]

(1) Single node diagnosis process

During single node diagnosis, we predict and assess the node $x_i^j$ with its historical time-sequence data. At the time point $T$, the historical sequence $T_{x_i^j} = [t_1, t_2, \cdots, t_\lambda]$ of length $\lambda$ is selected to train the single-node prediction model, which learns a mapping from the historical sequence to the predicted value. According to this model, the predicted value of the node $x_i^j$ at the time point $T$ is computed as $\hat{t}_T$, and every historical moment before the time point $T$ also receives a predicted value, giving $\hat{T}_{x_i^j} = [\hat{t}_1, \hat{t}_2, \cdots, \hat{t}_\lambda]$. The real values $T_{x_i^j}$ and the predicted values $\hat{T}_{x_i^j}$ are combined to assess the operation state of the node, yielding $d_i^j$, the assessment result of the single node as a probability vector. For instance, the node $x_2^1$ in the region $r_1$ may have its operation state assessed as any of the following preset results: "Good," "Normal," "Checked," "Abnormal." After prediction and assessment, it is obtained that $d_2^1 = [0.65, 0.2, 0.1, 0.05]$, which means the above four operation states correspond to the probabilities of 65%, 20%, 10%, and 5%, respectively. With the maximum probability adopted, the node $x_2^1$ has its operation state assessed as "Good."

(2) Collaborative evaluation with multiple nodes

For the collaborative evaluation with multiple nodes, we predict and assess each node in a local region to obtain $\{d_i^j \mid i = 1, \cdots, len(r_j)\}$. The operation states of multiple nodes are then integrated to assess how the corresponding region operates. The evaluation of a local region maps these node results to $d^j$, the consistent evaluation result of the region reached by all available nodes. For instance, when all nodes available in the region $r_1$ are coordinated to produce the evaluation result $d^1 = [0.8, 0.1, 0.09, 0.01]$, the region is assessed as "Good" (with a probability of 80%).
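The maximum-probability rule used in the two examples above can be sketched in a few lines. This is only an illustration: the function name `decode_state` and the input check are assumptions made for the sketch, not part of the paper.

```python
import numpy as np

# Preset operation states, in the order used by the probability vectors.
STATES = ("Good", "Normal", "Checked", "Abnormal")

def decode_state(prob_vector):
    """Return (state label, probability) for the most probable state.

    `decode_state` is a hypothetical helper; the paper only states that
    the maximum probability is adopted.
    """
    p = np.asarray(prob_vector, dtype=float)
    if p.shape != (len(STATES),):
        raise ValueError("expected one probability per preset state")
    if not np.isclose(p.sum(), 1.0):
        raise ValueError("probabilities should sum to 1")
    k = int(np.argmax(p))
    return STATES[k], float(p[k])

# Node-level and region-level vectors quoted in the text.
node_state = decode_state([0.65, 0.2, 0.1, 0.05])
region_state = decode_state([0.8, 0.1, 0.09, 0.01])
```

The same decoding applies to a single node result $d_i^j$ and to a region result $d^j$, since both are probability vectors over the same preset states.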

Single Node Diagnosis
Given that single-node time-sequence data are non-smooth and unstable, with noisy and missing values in actual operation, we propose the single-node assessment model based on multiple relevant sequence (SAM). Within SAM, the time-sequence data of nodes are pre-processed and normalized so that cosine similarity can be used to compute the relevancy of the sequences; multiple correlated sequences are selected as inputs, and an attention layer assigns them attention weights; an LSTM layer makes the prediction, and the assessment of a single node is completed through the Softmax layer. The structure of SAM is shown in Figure 3. The single-node evaluation process includes two stages.
(1) Pre-process the time sequence of a node, calculate the correlation coefficients between the node and the other nodes' time sequences to select the inputs, and use the attention layer and LSTMs to complete the single-node prediction based on multiple correlated sequences.
(2) Based on the sequences of real and predicted values produced by a node, obtain $d_i^j$ by assessing the state of the physical quantities observed at the single node $x_i^j$ through the Dense layer and the Softmax layer.


Time Sequence Normalization
In a local region, various nodes are deployed to monitor different physical quantities. As a result, the related data come in varying units and ranges. Before single nodes are predicted and assessed, their sequences should be normalized. The original time sequence $T_{x_i^j} = [t_1, t_2, \cdots, t_\lambda]$ of the node $x_i^j$ is normalized with the following equation:

$$t_k' = \frac{t_k - \mu}{\sigma}, \quad k = 1, \cdots, \lambda$$

where $\mu$ and $\sigma$ represent the mean value and standard deviation of the sequence, respectively. Any null value (NULL, NAN) existing in the sequence will be set as "0." Then the correlation sequences of $T_{x_i^j}$ are computed and the SAM inputs are constructed, with the process shown in detail in Figure 4.

When data are normally distributed or undergo standardized processing, the Pearson correlation coefficient, cosine similarity, and squared Euclidean distance can be considered equivalent. For the sake of calculation convenience, cosine similarity is used to compute the correlation between the two nodes $x_A^j$ and $x_B^j$:

$$\cos\left(T_{x_A^j}, T_{x_B^j}\right) = \frac{T_{x_A^j} \cdot T_{x_B^j}}{\lVert T_{x_A^j} \rVert \, \lVert T_{x_B^j} \rVert}$$

After computing the correlation coefficients between all the nodes available in a local region, we select the sequences $\{T_0, T_1, \cdots, T_{N-1}\}$, consisting of the sequence of $x_i^j$ itself (denoting $T_{x_i^j}$ as $T_0$) and the sequences of the $N-1$ nodes featuring the highest correlation with it. Here $N$ should be a natural number equal to or greater than "1." When $N$ is equal to "1," only the sequence of $x_i^j$ itself will be selected. If $N$ is equal to "3," two correlated sequences should be selected. The selection of the $N$ value will be verified in the following experiments.
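The pre-processing described above (standardization, null handling, cosine similarity, and selection of the $N-1$ most correlated sequences) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the mean and standard deviation are computed over the non-null values, and nulls are set to 0 after standardization (so that they coincide with the sequence mean); the paper does not specify these details.

```python
import numpy as np

def normalize(seq):
    """Z-score a sequence; null (NaN) entries become 0 afterwards (assumption)."""
    x = np.asarray(seq, dtype=float)
    valid = ~np.isnan(x)
    if not valid.any():
        return np.zeros_like(x)
    mu = x[valid].mean()
    sigma = x[valid].std()
    if sigma == 0:
        return np.zeros_like(x)          # constant sequence carries no variation
    z = (x - mu) / sigma
    return np.where(np.isnan(z), 0.0, z)

def cosine_similarity(a, b):
    """Cosine of the angle between two equally long sequences."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def select_inputs(target, others, n):
    """Return ([T_0, ..., T_{N-1}], similarities of the selected sequences).

    T_0 is the normalized target; the rest are the N-1 sequences from
    `others` with the highest cosine similarity to it.
    """
    if n < 1:
        raise ValueError("N must be a natural number >= 1")
    t0 = normalize(target)
    normed = [normalize(s) for s in others]
    sims = [cosine_similarity(t0, s) for s in normed]
    order = np.argsort(sims)[::-1][: n - 1]   # indices of the most similar sequences
    return [t0] + [normed[i] for i in order], [sims[i] for i in order]

target = [1.0, 2.0, 3.0, 4.0, 5.0]
others = [[2.0, 4.0, 6.0, 8.0, 10.0], [5.0, 4.0, 3.0, 2.0, 1.0]]
inputs, sims = select_inputs(target, others, n=2)
```

With `n=1` the function returns only the target's own sequence, matching the case $N = 1$ in the text.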


Single Node Diagnosis Model
After the sequences of nodes are pre-processed, we compute the correlation of multiple sequences from different nodes and obtain $T_0, T_1, \cdots, T_{N-1}$, which serve as the input of the diagnosis model. The diagnosis of single nodes is based on multiple nodes with spatial-temporal correlation; the specific procedure can be seen in Figure 5.
(1) Prediction of single node. The prediction is completed by using the attention layer and the LSTM layer. Considering that each input sequence has a different degree of importance in the prediction process, the attention layer is employed to assign each sequence a weight: $W_0, W_1, \cdots, W_{N-1}$. With the attention weights applied to the input sequences, we obtain $W_0 T_0, W_1 T_1, \cdots, W_{N-1} T_{N-1}$ as the inputs to the LSTM layer. The weights are updated through error backpropagation. The LSTM layer produces the predicted values $\hat{T}_{x_i^j}$. The model is trained with a composite loss $Loss_{SAM}$ made up of the prediction error (RMSE) and the assessment error (CE). The composite error $Loss_{SAM}$ can alleviate the situation where a single error (RMSE or CE) is optimized excessively, thereby protecting the model from over-fitting.
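The two ideas in this step, weighting the $N$ input sequences and combining a prediction error with an assessment error, can be illustrated with a small NumPy sketch. Several choices here are assumptions made only for illustration and are not taken from the paper: the attention weights are produced by a softmax over raw scores, the two errors are combined with a mixing factor `alpha`, and the function names are hypothetical. No LSTM is trained here; the sketch only shows the data flow around it.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax."""
    v = np.asarray(v, dtype=float)
    e = np.exp(v - v.max())
    return e / e.sum()

def apply_attention(sequences, scores):
    """Scale each input sequence T_k by a weight W_k.

    Assumption: W = softmax(scores); in a trained model the scores would be
    learned by backpropagation rather than supplied by hand.
    """
    seqs = np.asarray(sequences, dtype=float)     # shape (N, lambda)
    weights = softmax(scores)                     # shape (N,)
    return weights[:, None] * seqs, weights

def composite_loss(real, predicted, probs, label, alpha=0.5):
    """Illustrative composite of RMSE (prediction) and CE (assessment).

    `alpha` is a hypothetical mixing factor; the paper's exact form of
    Loss_SAM is not reproduced here.
    """
    real = np.asarray(real, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((real - predicted) ** 2))
    p_true = np.clip(np.asarray(probs, dtype=float)[label], 1e-12, 1.0)
    ce = -np.log(p_true)                          # cross-entropy for one sample
    return float(alpha * rmse + (1.0 - alpha) * ce)

weighted, w = apply_attention([[1.0, 2.0], [3.0, 4.0]], scores=[0.0, 0.0])
loss = composite_loss([1.0, 2.0], [1.5, 2.5], [0.65, 0.2, 0.1, 0.05], label=0)
```

Because both terms contribute to the gradient, neither the sequence fit nor the state classification can be driven to an extreme at the expense of the other, which is the motivation the text gives for a composite error.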
Sensors 2021, 21

Local Region Evaluation Model with Multiple Nodes
All nodes in a local region can be assessed through prediction and diagnosis, and their assessment results can then be integrated to evaluate how the corresponding region operates. However, single nodes may differ in credibility because of factors such as network transmission, environmental impact, and the nodes themselves. If all nodes are treated without distinction, integrating their assessment results yields evaluation results that deviate from reality. To obtain more accurate evaluation results for a local region, a local region evaluation algorithm based on node credibility (LREA) is proposed. Different nodes in the same local region produce inconsistent assessment results and have different credibility. The local region evaluation is therefore converted into the problem of jointly optimizing the node credibility and the evaluation result. On that premise, the coordinate descent method is used to calculate the local region evaluation result. The processing flow is shown in Figure 6.

Node Credibility
The credibility $\omega_i^j = \{x_i^j, r^j\}$ describes how credibly the node $x_i^j$ contributes to the evaluation of the local region $r^j$, and measures the degree of influence the node exerts on that region. In physical terms, the credibility $\omega_i^j$ denotes the role that the physical quantities monitored by node $x_i^j$ play in the evaluation of the local region $r^j$. Suppose the middle part of an arch dam is the local region under monitoring and the most obvious deformation and displacement occur there; the nodes deployed to monitor deformation and displacement will then play a bigger part in the local region evaluation than other nodes. The credibility $\omega_i^j$ also indicates the difference between the single node assessment result $d_i^j$ and the local region evaluation result $d^j$: the higher $\omega_i^j$ is, the closer $d_i^j$ and $d^j$ tend to be. Consequently, the local region evaluation result $d^j$ approaches the assessment results produced by the most credible nodes.
All nodes in the same local region should have their credibility subject to a common constraint to facilitate comparison and calculation. An intuitive choice is $\sum_i \omega_i^j = 1$ with $\omega_i^j \in [0, 1]$, where all credibilities are non-negative and sum to 1. This choice raises the following problems.
(1) When $\omega_i^j \gg \max\{\omega_{i'}^j \mid i' = 1, \cdots, len(r^j);\ i' \neq i\}$, one node has credibility far greater than that of any remaining node. For example, one node's credibility is 1 and that of every other node is 0. The assessment result of that node then overrides those of all other nodes, leading to $d^j \approx d_i^j$, that is, a locally optimal solution.
(2) $\omega_i^j \in [0, 1]$ means each credibility is a fraction. When there are many nodes, some nodes are likely to have credibility $\omega_i^j \approx 0$, which causes larger numerical errors or a division by zero in the calculation.
(3) The function $f(x) = x$ has a constant derivative, which is not conducive to a gradient-based solution.
To avoid these problems, $\sum_i e^{-\omega_i^j} = 1$ is selected as the credibility constraint. The function $g(x) = e^{-x}$ has the derivative $-g(x)$, which is conducive to obtaining a closed-form solution. For special and boundary values, $g(0) = 1$, $g(1) = e^{-1}$, and $g(\infty) = 0$, which avoids calculation errors.
At the beginning, the node credibility is initialized randomly and is then constantly corrected with the measured data. Because the node time sequence data change dynamically, the node credibility distribution can track the actual operation state of the monitored local region. On this basis, credibility can be used to coordinate multiple nodes for local region evaluation.
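As a quick sanity check of the exponential constraint (a worked example added for illustration, not taken from the original derivation), consider the special case in which all $n = len(r^j)$ nodes share the same credibility $\omega$:

```latex
\sum_{i=1}^{n} e^{-\omega} = n\,e^{-\omega} = 1
\quad\Longrightarrow\quad
\omega = \ln n .
```

For a region with 11 nodes this gives $\omega = \ln 11 \approx 2.40$. More generally, with at least two nodes every term satisfies $e^{-\omega_i^j} < 1$, so each credibility is strictly positive under this constraint.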

Region Evaluation
Define the assessment result of the single node $x_i^j$ as $d_i^j$, and let $d^j$ be the local region evaluation result obtained by integrating multiple nodes. Because nodes vary in credibility, they affect the local region evaluation result to different extents. For the final evaluation result, higher node credibility should correspond to a smaller evaluation error, and lower node credibility to a greater evaluation error. The local region evaluation error is defined as:

$$Loss(r^j) = \sum_{i=1}^{len(r^j)} \omega_i^j \left\| d_i^j - d^j \right\|^2 \quad (8)$$

where $Loss(r^j)$ is the credibility-weighted sum of all node assessment errors, and $len(r^j)$ is the total number of nodes. In Equation (8), $\omega_i^j$ weights the squared deviation between the node assessment result $d_i^j$ and the region result $d^j$, so a deviation from a highly credible node is penalized more heavily. Therefore, minimizing $Loss(r^j)$ keeps the local region evaluation result $d^j$ close to the highly credible nodes (when $\omega_i^j$ is relatively large) while allowing it to lie farther from the poorly credible nodes (when $\omega_i^j$ is relatively small). Considering the node credibility constraint and the assessment errors, the local region evaluation is converted into the following optimization problem:

$$\min_{\omega^j,\, d^j} \; Loss(r^j) = \sum_{i=1}^{len(r^j)} \omega_i^j \left\| d_i^j - d^j \right\|^2 \quad \text{s.t.} \quad \sum_{i=1}^{len(r^j)} e^{-\omega_i^j} = 1, \quad d_i^j \geq 0, \; \left| d_i^j \right| = 1 \quad (9)$$

The objective of the optimization is to minimize the local region evaluation error $Loss(r^j)$.
The constraint conditions are $\sum_i e^{-\omega_i^j} = 1$ and $d_i^j \geq 0$, $|d_i^j| = 1$. The former restrains the node credibility, while the latter makes sure that the local region evaluation result $d^j$ is non-negative and that all its component probabilities sum to 1. The optimization problem in Equation (9) searches the solution space for the node credibility $\omega^j = \{\omega_1^j, \omega_2^j, \cdots, \omega_{len(r^j)}^j\}$ and the local region evaluation result $d^j$. This is a non-convex optimization problem to which the gradient method cannot be applied directly. The coordinate descent method is therefore adopted. At each iteration, the search is carried out along one direction, so that the node credibility and the local region evaluation result are updated alternately.

Iterative Solution
The process of the iterative solution with the coordinate descent method takes place in two steps.
Step 1: fix the node credibility and update the local region evaluation result.
Step 2: fix the local region evaluation result and update the node credibility. Each round of iteration pushes the optimization problem closer to the optimal solution. The above two steps need to be executed alternately until the final convergence is realized.
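(1) Update the evaluation result. With the credibility $\omega_i^j$ of every node $x_i^j$ fixed, the single region evaluation result $d^j$ is updated to minimize the overall evaluation error. Since $\omega_i^j$ is known, the partial derivative of $Loss(r^j)$ with respect to $d^j$ is calculated directly:

$$\frac{\partial Loss(r^j)}{\partial d^j} = -2 \sum_{i=1}^{len(r^j)} \omega_i^j \left( d_i^j - d^j \right) \quad (10)$$

Setting Equation (10) to 0 gives:

$$d^j = \frac{\sum_{i=1}^{len(r^j)} \omega_i^j \, d_i^j}{\sum_{i=1}^{len(r^j)} \omega_i^j} \quad (11)$$

Note that $\sum_i \omega_i^j \neq 1$ in general (the node credibility constraint is $\sum_i e^{-\omega_i^j} = 1$), so the normalization by $\sum_i \omega_i^j$ is required. Besides, since every single node assessment result meets the constraint $d_i^j \geq 0$, $|d_i^j| = 1$ and the credibilities are non-negative, the single region evaluation result given by Equation (11) is a weighted average of the node results and is bound to satisfy the same constraint.
(2) Update the credibility of nodes. Next, we fix the single region evaluation result $d^j$ and update the node credibility $\omega_i^j$. The Lagrange method is adopted to rewrite the optimization objective as Equation (12):

$$L(\omega^j, \alpha) = \sum_{i=1}^{len(r^j)} \omega_i^j \left\| d_i^j - d^j \right\|^2 + \alpha \left( \sum_{i=1}^{len(r^j)} e^{-\omega_i^j} - 1 \right) \quad (12)$$

where $\alpha$ is the Lagrange multiplier. The partial derivative with respect to $\omega_i^j$ is:

$$\frac{\partial L(\omega^j, \alpha)}{\partial \omega_i^j} = \left\| d_i^j - d^j \right\|^2 - \alpha \, e^{-\omega_i^j} \quad (13)$$

Setting Equation (13) to 0 gives:

$$e^{-\omega_i^j} = \frac{\left\| d_i^j - d^j \right\|^2}{\alpha} \quad (14)$$

With the node credibility constraint, summing both sides of Equation (14) over all nodes gives $\sum_i e^{-\omega_i^j} = 1$, so the Lagrange multiplier is $\alpha = \sum_i \left\| d_i^j - d^j \right\|^2$. Substituting it into Equation (14) yields:

$$\omega_i^j = -\log \left( \frac{\left\| d_i^j - d^j \right\|^2}{\sum_{i'=1}^{len(r^j)} \left\| d_{i'}^j - d^j \right\|^2} \right) \quad (15)$$

Equations (11) and (15) are adopted to update the single region evaluation result $d^j$ and the node credibility $\omega_i^j$ alternately.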

Local Region Evaluation Algorithm Based on Node Credibility
To address the inconsistent evaluation results of a local region that arise from varying levels of node credibility, the paper proposes a local region evaluation algorithm based on node credibility (LREA, shown as Algorithm 1). LREA takes the assessment results of multiple nodes within a single region, $\{d_i^j \mid i = 1, 2, \cdots, len(r^j)\}$, and the iteration error threshold as inputs, and outputs the single region evaluation result $d^j$ and the node credibility $\omega^j$.

Algorithm 1 Local region evaluation algorithm based on node credibility (LREA)
Input: multiple node assessment results $\{d_i^j \mid i = 1, 2, \cdots, len(r^j)\}$, the iteration error threshold
Output: single region evaluation result $d^j$, node credibility $\omega^j$
INITIALIZE $\omega^j$, $d^j(0)$, $d^j(1)$ // Initialize node credibility and the single region evaluation result
t = 1 // Rounds of iteration
WHILE the iteration error between $d^j(t)$ and $d^j(t-1)$ exceeds the threshold DO
    t = t + 1
    Update $d^j(t)$ by Equation (11)
    Update $\omega^j(t)$ by Equation (15)
    // Save single region evaluation result
END WHILE
RETURN $\omega^j$, $d^j$

In Algorithm 1, the node credibility $\omega^j$ is initialized using the historical credibility, and the initial single region evaluation results $d^j(0)$ and $d^j(1)$ are initialized randomly. $d^j(t)$ and $\omega_i^j(t)$ represent the single region evaluation result and the node credibility in the $t$-th round of iteration, respectively. The iteration error is computed from the single region evaluation results of two consecutive rounds, $d^j(t)$ and $d^j(t-1)$. Then $d^j$ and $\omega^j$ are updated alternately by Equations (11) and (15). LREA iterates until the iteration error falls below the threshold, and then returns the single region evaluation result and the node credibility.
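The alternating updates can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the authors' code: Equation (11) is taken to be the credibility-weighted average of the node results, and Equation (15) the negative logarithm of each node's share of the total squared error. The uniform initialization, the small `eps` guard, the `max_iter` cap, and the sample data are all hypothetical choices.

```python
import numpy as np

def lrea(node_results, tol=1e-6, max_iter=100, eps=1e-12):
    """Coordinate descent over the region result d and the node credibility w.

    node_results: array of shape (n_nodes, n_states); each row is one node's
    probability vector over the assessment states.
    """
    D = np.asarray(node_results, dtype=float)
    n = D.shape[0]
    if n < 2:
        raise ValueError("at least two nodes are required")
    # Assumption: uniform start w_i = ln(n), which satisfies sum(exp(-w)) = 1.
    w = np.full(n, np.log(n))
    d = (w[:, None] * D).sum(axis=0) / w.sum()
    for _ in range(max_iter):
        # Credibility update: w_i = -log(err_i / sum of errors).
        err = ((D - d) ** 2).sum(axis=1) + eps  # eps avoids log(0); an assumption
        w = -np.log(err / err.sum())
        # Region update: credibility-weighted average of the node results.
        d_new = (w[:, None] * D).sum(axis=0) / w.sum()
        converged = np.linalg.norm(d_new - d) < tol
        d = d_new
        if converged:
            break
    return d, w

# Hypothetical node results: three nodes favor state 2, one outlier favors state 0.
sample = [[0.1, 0.2, 0.6, 0.1],
          [0.1, 0.3, 0.5, 0.1],
          [0.2, 0.2, 0.5, 0.1],
          [0.7, 0.1, 0.1, 0.1]]
region_result, credibility = lrea(sample)
```

Because the region result is a weighted average of probability vectors, its components stay non-negative and sum to 1, and the returned credibilities satisfy the exponential constraint by construction. In this toy input the outlier node receives the lowest credibility.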

Datasets
To show that the proposed local region evaluation algorithm based on node credibility (LREA) works well on real monitoring data, we use a real dam safety monitoring dataset. The dataset comes from the highest arch dam in the world and covers 964 sensor nodes from 1 January 2017 to 31 December 2017. It contains 350,000 data items recording the sensor types, spatial coordinates, time slots, and observed data. According to the design specification of dam safety monitoring systems, the dam is divided into 34 areas, which are distributed as shown in Figure 5. When existing models are used to predict the displacement change of Node P04618 in the dam safety monitoring system, the predicted values tend to be volatile and lag behind the real values. Therefore, Node P04618 is selected as the object of the single node prediction and assessment experiment to assess the performance of SAM. Multiple nodes are coordinated to evaluate the local region $r^1$ collaboratively and to analyze the performance of LREA.
Within the data measured from 1 January 2017 to 31 December 2017, the local region $r^1$ is chosen from the 34 local regions of the dam for assessment, as shown in Figure 7. The local region $r^1$, located in the middle part of the dam, consists of 31 nodes. According to the design specifications and engineering experience in dam safety monitoring, the 31 nodes are grouped into 11 categories (C1-C11). Nodes in the same category are highly similar, and one Key Node is selected to represent the entire category. The mapping between Key Nodes and categories is shown in Table 1. Take Category C1 as an example: it contains three nodes, namely P04616, P04617, and P04618, of which P04618 is chosen as the Key Node. The experimental results show that the 11 Key Nodes produce the same assessment results as the 31 nodes. The 11 Key Nodes are therefore used to present the engineering instance below.

Table 1. Mapping between categories and Key Nodes (columns: Category, Intra-Category Nodes, Key Node). Recoverable row: intra-category nodes P06706, P06707, P06708; Key Node P06706.
The dam is a double-curvature arch dam. The local region $r^1$, located in the upper part of the crown cantilever, has a complicated force-bearing structure and therefore shows the most significant cumulative displacement. Take the key nodes P04618 (displacement), P04776 (stress), and P07588 (stress) as examples. The time sequence curves of the three key nodes in 2017 are shown in Figure 8. The three curves differ vastly in amplitude of variation and volatility.
A single node assessment result or a single region evaluation result may be "Good," "Normal," "Checked," or "Abnormal" (corresponding to 0, 1, 2, and 3, respectively). Key Nodes are predicted and assessed using SAM. LREA is then employed to coordinate multiple nodes and evaluate a single region collaboratively. Taking 12 September 2017 as an example, the single node assessment results and single region evaluation results are shown in Table 2.
In Table 2, Column 4 displays the node credibility, which meets the credibility constraint $\sum_{i=1}^{11} e^{-\omega_i^1} = 1$. The single region evaluation result is $d^1 = [0.315, 0.211, 0.229, 0.245]$, corresponding to the operation state of "Checked" and consistent with the actual evaluation result. Specifically, the key nodes P04618, P06706, P07045, and P08421 have their operation state assessed as "2," and the key nodes P04776, P05257, P06152, and P07857 have an assessment result of "1." The former group has higher credibility than the latter, so the operation state of the single region is assessed as "2." Moreover, since node P04618 is in the middle part of the single region, its displacement pattern plays a decisive role in the region's operation state, and the node accordingly has high credibility. Node P08252 has a credibility of 1.770, the lowest among all nodes. This can be attributed to missing or noisy data in the node sequence.
With the node assessment error defined as $err(x_i^j) = \left\| d_i^j - d^j \right\|^2$, the credibility and assessment error of the key nodes are summarized in Figure 9. In the figure, the horizontal axis indicates the serial numbers of nodes. The dotted scatter line, corresponding to the left vertical axis, gives the credibility of each node. The dotted line parallel to the horizontal axis shows the average credibility, and the columns, read against the right vertical axis, indicate assessment errors. As shown in Figure 9, the assessment error is inversely related to the credibility of a node: the higher the one, the lower the other. Nodes P04618, P05257, P06152, and P07045 are examples. Even though the credibility of node P06706 falls below the average level and its assessment error is quite high, its predicted assessment result is consistent with the final evaluation result of the single region. The node's assessment error can be corrected by other nodes with higher credibility.
Figure 9. Credibility analysis of key nodes.

Baselines
In dam structural behavior evaluation, the commonly used single-node prediction and assessment methods include the auto-regressive integrated moving average model (ARIMA), support vector machine (SVM), exponential fitting, and polynomial fitting. In dam safety operation and maintenance practice, the dynamic optimal combined model (CM) is adopted for single-node prediction and assessment. CM represents the best prediction and assessment performance of ARIMA, SVM, and the fitting models. SAM and CM are compared in the single-node prediction and assessment experiments.
At present, the multiple node coordination methods include hierarchical agglomerative clustering (HAC) [21], the enhanced Naive Bayes classifier (ENBC) [24], and the majority of methods voting scheme (MMVS) [25]. MMVS integrates multiple traditional methods, including the Bayes classifier, decision tree, BP neural network, support vector machine, and k-nearest neighbor. LREA, HAC, ENBC, and MMVS are compared and analyzed in the single region evaluation experiment.

Evaluation Metrics
(1) Single node prediction experiment: SAM and CM are used to predict the daily lateral displacement of node P04618 from 1 January 2017 to 31 December 2017. With the values plotted as curves, we show how the trends of the predicted and real values change, and then use the root mean squared error (RMSE) to measure the accuracy and stability of the methods.
(2) Single node assessment experiment: SAM and CM are employed to assess nodes from their predicted and true values. In actual engineering work, a node's operation state falls into one of four categories of assessment results: "Good," "Normal," "Checked," or "Abnormal." We then compute the accuracy rates of the assessment results produced by SAM and CM, both as a whole and for each category. The results are presented in the form of a confusion matrix.
(3) Experiment on the impact of the number of correlation sequences and the correlation coefficient threshold on SAM performance: SAM uses multiple sequences to make predictions and assessments, so it is necessary to verify how the number of correlation sequences and the correlation coefficient threshold affect the model's accuracy and stability. The assessment indicator is RMSE.
(4) LREA convergence analysis: the analysis verifies the impact of the number of nodes on LREA convergence and the changes in iteration error.
(5) Single region evaluation experiment with LREA, HAC, ENBC, and MMVS: the accuracy rates of the local region evaluation results produced by LREA and the three other currently adopted methods are compared and analyzed.
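The two kinds of metrics listed above can be computed as in the following sketch. The function names and the sample labels are hypothetical; the state coding 0 to 3 follows the order "Good," "Normal," "Checked," "Abnormal" used in this paper, and the choice to report per-state accuracy as the diagonal count divided by the row total is an assumption.

```python
import numpy as np

STATES = ["Good", "Normal", "Checked", "Abnormal"]  # coded 0, 1, 2, 3

def rmse(predicted, real):
    """Root mean squared error between predicted and real values."""
    predicted = np.asarray(predicted, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.sqrt(np.mean((predicted - real) ** 2)))

def confusion_matrix(real_states, predicted_states, n_states=len(STATES)):
    """Rows index the real state, columns the predicted state."""
    cm = np.zeros((n_states, n_states), dtype=int)
    for r, p in zip(real_states, predicted_states):
        cm[r, p] += 1
    return cm

def accuracies(cm):
    """Overall accuracy and per-state accuracy (diagonal count / row total)."""
    overall = np.trace(cm) / cm.sum()
    row_totals = cm.sum(axis=1)
    # States that never occur get NaN instead of triggering a division by zero.
    per_state = np.divide(np.diag(cm), row_totals,
                          out=np.full(len(cm), np.nan), where=row_totals > 0)
    return float(overall), per_state

# Hypothetical daily states (illustrative only).
real = [0, 0, 1, 1, 2, 2, 2, 1]
pred = [0, 1, 1, 1, 2, 2, 1, 1]
cm = confusion_matrix(real, pred)
overall, per_state = accuracies(cm)
```

In this toy input 6 of 8 days are assessed correctly, and the "Abnormal" row is empty, so its per-state accuracy is left undefined.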

Single Node Prediction Experiment
The displacement of node P04618 in the single region $r^1$ is subject to sharp fluctuations. SAM and CM are used to predict the daily displacement of the node from 1 January 2017 to 31 December 2017, with the results shown in Figure 10. The legends SAM, CM, and Real in the right corner of the figure denote the curve of values predicted with SAM, the curve of values predicted with CM, and the curve of real values, respectively. Overall, the SAM curve almost overlaps the Real curve, while the CM curve clearly diverges from it.
In Figure 11a, as the Real curve plunges on both 28 May 2017 and 4 June 2017, the CM curve shows a mild increase, whereas the SAM curve promptly drops with the Real curve. As shown in Figure 11b, the Real curve reaches a prominent peak from 10 October 2017 to 28 November 2017. Over the same period, the SAM curve keeps up with the Real curve in time, while the CM curve fluctuates considerably and lags far behind the changes of the Real curve.
SAM uses multiple sequences for prediction. It does not jump or fluctuate suddenly in response to a sharp change in the Real curve, and its predicted values tend to be relatively stable. The currently used CM, however, cannot reveal the changing pattern of the Real curve promptly. Dividing the period from 1 January 2017 to 31 December 2017 into 15-day periods, we compute the average RMSE values produced by SAM and CM, respectively. The details are shown in Figure 12. In the figure, the SAM RMSE curve always lies below the CM RMSE curve. In periods when the Real curve flattens, such as 16 January 2017 to 1 May 2017, the two RMSE curves are relatively close to each other. But from 10 October 2017 to 28 November 2017, when the Real curve fluctuates sharply, the CM RMSE curve rises markedly while the SAM RMSE curve remains steady.
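The per-period RMSE described above can be sketched as follows. The function name and the synthetic data are hypothetical, and letting the final period be shorter than 15 days is an assumption, since a 365-day year does not divide evenly into 15-day periods.

```python
import numpy as np

def windowed_rmse(predicted, real, window=15):
    """RMSE of each consecutive `window`-day period; the last period may be shorter."""
    predicted = np.asarray(predicted, dtype=float)
    real = np.asarray(real, dtype=float)
    values = []
    for start in range(0, len(real), window):
        diff = predicted[start:start + window] - real[start:start + window]
        values.append(float(np.sqrt(np.mean(diff ** 2))))
    return values

# Synthetic year of daily values with a constant prediction error of 1 (illustrative only).
real_year = np.zeros(365)
pred_year = np.ones(365)
period_rmse = windowed_rmse(pred_year, real_year)
```

For 365 daily values this yields 24 full periods plus one 5-day remainder, and plotting one such list per model gives curves comparable to those in Figure 12.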

Single Node Diagnosis
SAM and CM assess a single node's operation state using the predicted values and real values of the node. We take the predicted values and real values of node P04618 from 1 January 2017 to 28 February 2017 as a historical sample to assess the node's operation state from 1 March 2017 to 31 December 2017 (the assessment result may be any of the four options "Good," "Normal," "Checked," or "Abnormal"). The assessment results with SAM and CM are presented as confusion matrices in Figure 13a,b. The number of days under assessment totals 306. The horizontal and vertical axes represent the predicted state and the real state, respectively.
Node P04618 is assessed as "3" (Abnormal) on only a few days throughout the year. This limited quantity leads to very poor prediction accuracy with both SAM and CM, so the abnormal state is not analyzed. The diagonal of a confusion matrix represents consistency between the predicted state and the real state. For the remaining three states ("Good," "Normal," and "Checked"), the assessment accuracy rates with SAM are 0.75, 0.67, and 0.66, respectively, while those with CM are 0.36, 0.43, and 0.27. SAM therefore outperforms CM in the accuracy rate of each of the three states. The overall assessment accuracy is 70.0% with SAM and 37.6% with CM, a gap of roughly 32 percentage points. Because SAM makes fewer errors than CM in the assessment of each state, it holds a clear lead in overall assessment accuracy.

Impact on the SAM Performance
(1) The Number of Correlation Sequences
When SAM is used for prediction and assessment, the number of correlation sequences $N$ affects the prediction accuracy and the number of convergence epochs. In the prediction of the 31 nodes within the single region $r^1$ from 1 January 2017 to 31 December 2017, we increase the number of correlation sequences $N$ from 1 to 12 and compute the average RMSE and average number of convergence epochs over the nodes. The results are shown in Figure 14, where the horizontal axis represents the number of correlation sequences, the left y-axis gives the RMSE (dot legend), and the right y-axis gives the average number of iteration epochs (triangle legend).

As the number of correlation sequences increases, the average RMSE first moves down and then up, and the number of iteration epochs goes up as a whole except for a slight drop when there are fewer than four correlation sequences. Multiple correlation sequences ($N \leq 4$ as shown in Figure 14) introduce additional information compared with a single sequence. This helps reduce not only the single node prediction error but also the number of model convergence epochs. However, when there are too many correlation sequences ($N > 4$ as shown in Figure 14), irrelevant or disruptive information among these sequences may cause more error. Furthermore, the larger input scale leads to more convergence epochs. As indicated in Figure 14, when the number of correlation sequences is 4, both the RMSE and iteration epoch curves are at their trough. Therefore, in the subsequent experiments, we select four correlation sequences ($N = 4$) to conduct single node prediction and assessment for the single region $r^1$. In other words, the sequences of the three nodes most correlated with the target node are chosen as inputs alongside the target sequence.
(2) Correlation Coefficient Thresholds
When the time sequence of a node has a relatively low correlation with that of every other node, the $N - 1$ selected sequences may all have a very low correlation coefficient (Coef) with the node, which undermines the single node prediction accuracy. The correlation coefficients of the sequences of the 31 nodes in the local region $r^1$ are calculated and presented as a matrix in Figure 15. In the figure, both the horizontal and vertical axes list the nodes; the darker a box is, the higher the correlation coefficient of the two node sequences.
When the time sequence of a node bears a relatively low correlation with that of any other node, the sequence 1  may all project a very low correlation coefficient (Coef) with the node, a situation which will undermine the single node prediction accuracy. The correlation coefficients of the sequence of the 31 nodes available in the local region 1 r are calculated and presented in the form of a matrix as shown in Figure 15. In the figure, both horizontal and vertical axes encompass nodes; the darker a box is, the higher the twonode sequence's correlation coefficient will become.  Figure 16 shows that the sequence of the nodes P04618 and P07590 has a relative high correlation coefficient than that of P08423 and vice versa. With other nodes ranked by their correlation coefficients with P08423 in descending order, we select three correlation node sequence through the sliding window (namely, the number of correlation sequence 4  ) every time to make statistics of the average RMSE as shown in Figure 16.   Figure 16 shows that the sequence of the nodes P04618 and P07590 has a relative high correlation coefficient than that of P08423 and vice versa. With other nodes ranked by their correlation coefficients with P08423 in descending order, we select three correlation node sequence through the sliding window (namely, the number of correlation sequence N = 4) every time to make statistics of the average RMSE as shown in Figure 16.
other node, the sequence 1  may all project a very low correlation coefficient (Coef) with the node, a situation which will undermine the single node prediction accuracy. The correlation coefficients of the sequence of the 31 nodes available in the local region 1 r are calculated and presented in the form of a matrix as shown in Figure 15. In the figure, both horizontal and vertical axes encompass nodes; the darker a box is, the higher the twonode sequence's correlation coefficient will become.  Figure 16 shows that the sequence of the nodes P04618 and P07590 has a relative high correlation coefficient than that of P08423 and vice versa. With other nodes ranked by their correlation coefficients with P08423 in descending order, we select three correlation node sequence through the sliding window (namely, the number of correlation sequence 4  ) every time to make statistics of the average RMSE as shown in Figure 16.  In Figure 16, the horizontal and vertical axes represent the correlation coefficient (Coef ) and the average RMSE, the legends of the solid circle, triangle, and rhombus correspond to the nodes P04618, P07590, and P08423, respectively. Overall, the average RMSE curve goes down as the Coef increases. As displayed by the grey column in the figure, when the Coef threshold ranges between 0.8 and 0.825, the average RMSE will dive. Since more sequence Coef s are input, the single node prediction will see a more significant enhancement. However, at the time when the Coef threshold grows to a certain number, for instance about 0.9, the average RMSE will decrease at an increasingly flattened range. Therefore, the Coef threshold of 0.82 is set for the dam monitoring data set used in the experiment.
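The selection of correlated input sequences described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: it assumes that Coef is the Pearson correlation coefficient, and the function names `pearson` and `select_correlated` as well as the node identifiers in the example are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_correlated(target, candidates, n_sequences=4, coef_threshold=0.82):
    """Return up to n_sequences - 1 (node_id, coef) pairs, highest Coef first.

    Assumption: candidates whose Coef with the target falls below the
    threshold are discarded before ranking.
    """
    scored = [(node_id, pearson(target, seq)) for node_id, seq in candidates.items()]
    kept = [(node_id, coef) for node_id, coef in scored if coef >= coef_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[: n_sequences - 1]


# Toy data only; these are not measurements from the monitored dam.
target_sequence = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate_sequences = {
    "node_a": [2.0, 4.0, 6.0, 8.0, 10.0],
    "node_b": [1.0, 2.0, 3.0, 4.0, 6.0],
    "node_c": [5.0, 1.0, 4.0, 2.0, 3.0],
    "node_d": [1.1, 2.0, 2.9, 4.2, 5.0],
}
selected = select_correlated(target_sequence, candidate_sequences)
print([node_id for node_id, _ in selected])
```

With $N = 4$, at most three candidate sequences are kept, and a weakly correlated candidate such as `node_c` is filtered out by the threshold before ranking.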

Multiple Node Convergence Analysis with LREA
LREA obtains the diagnosis results through an iterative solution, so the number of iterations affects the real-time performance of the algorithm. In this experiment, the number of nodes ranges from 2 to 31, the iteration threshold is set to $1 \times 10^{-8}$, and the algorithm is run 100 times repeatedly to record how the average number of convergence steps changes with the number of nodes. The details can be seen in Figure 17.
In Figure 17, the horizontal and vertical axes represent the number of nodes (2-31) and the average number of convergence steps, respectively. With 2 nodes, the average number of convergence steps is 6.42; with 3 nodes, it reaches its peak of 57.92; as the number of nodes grows from 3 to 10, it drops rapidly, approaching the average value of 16.34; beyond 10 nodes, it falls slowly and steadily and hits its lowest level of 11.48 when there are 30 or 31 nodes. Since dam safety monitoring systems usually contain a large number of nodes (more than 10), LREA can converge fast.
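To make the role of the iteration threshold concrete, the following sketch counts the steps a generic credibility-weighted iteration needs before its update falls below $1 \times 10^{-8}$. It is not the LREA update rule, which is defined earlier in the paper; the Gaussian re-weighting, the `bandwidth` parameter, the function name, and the toy scores are all assumptions introduced purely for illustration.

```python
import math


def iterate_credibility(scores, tol=1e-8, max_steps=1000, bandwidth=0.2):
    """Generic fixed-point iteration with a tolerance-based stopping rule.

    Each node reports a score assumed to lie in [0, 1]. Nodes close to the
    current consensus gain credibility; the loop stops once the consensus
    changes by less than `tol` between two steps.
    """
    n = len(scores)
    credibility = [1.0 / n] * n
    consensus = sum(c * s for c, s in zip(credibility, scores))
    for step in range(1, max_steps + 1):
        # Assumed re-weighting: Gaussian kernel on the distance to the consensus.
        raw = [math.exp(-((s - consensus) ** 2) / (2 * bandwidth ** 2)) for s in scores]
        total = sum(raw)
        credibility = [r / total for r in raw]
        updated = sum(c * s for c, s in zip(credibility, scores))
        error = abs(updated - consensus)
        consensus = updated
        if error < tol:
            return consensus, credibility, step
    return consensus, credibility, max_steps


# Toy scores: four consistent nodes and one deviating node.
consensus, credibility, steps = iterate_credibility([0.90, 0.92, 0.88, 0.91, 0.20])
print(steps, round(consensus, 3))
```

Repeating such a run many times for each node count and averaging the returned step counts would give the kind of statistic plotted in Figure 17.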
Suppose that the iteration threshold is $1 \times 10^{-8}$ and the number of iterations is 100. The average iteration error then changes as shown in Figure 18. In the figure, the horizontal and vertical axes represent the number of iteration steps (1-13) and the iteration error, and the average number of convergence steps is 13. The iteration error plunges as the number of iteration steps increases from 1 to 3, which shows that LREA stabilizes and converges rapidly.
(1) Impact on the accuracy
Using the 11 Key Nodes in place of the 31 nodes available in the single region $r_1$, we examine how the number of Key Nodes affects the stability and accuracy of the local region evaluation results with LREA, HAC, ENBC, and MMVS. With two key nodes, the average accuracy rate is calculated over the $C_{11}^{2} = 55$ possible combinations. In general, $k$ key nodes give $C_{11}^{k}$ combinations, over which the average evaluation accuracy rate is computed.
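The averaging over key-node combinations can be sketched as follows. The `evaluate` callback is a hypothetical stand-in for any of the four evaluation methods, and the toy scorer and node identifiers exist only to make the example runnable; they do not reproduce the reported accuracy rates.

```python
from itertools import combinations
from math import comb


def average_accuracy(node_ids, k, evaluate):
    """Average `evaluate` over every k-node subset of `node_ids`.

    Returns the mean accuracy and the number of subsets visited.
    """
    subsets = list(combinations(node_ids, k))
    mean_accuracy = sum(evaluate(subset) for subset in subsets) / len(subsets)
    return mean_accuracy, len(subsets)


# Eleven hypothetical key-node identifiers.
key_nodes = [f"K{i:02d}" for i in range(1, 12)]


def toy_evaluate(subset):
    """Placeholder scorer: accuracy grows with the subset size."""
    return len(subset) / len(key_nodes)


mean_accuracy, subset_count = average_accuracy(key_nodes, 2, toy_evaluate)
print(subset_count, comb(11, 2))
```

For $k = 2$ the loop visits 55 subsets, matching $C_{11}^{2}$; the same function covers every $k$ from 2 to 11.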
In Figure 19, the horizontal and vertical axes represent the number of key nodes and the evaluation accuracy rate, respectively. As the number of key nodes grows, the accuracy rate of the local region evaluation results rises accordingly. The four curves in the figure show the evaluation results with LREA, MMVS, HAC, and ENBC from top to bottom. The LREA curve stays above the other three curves throughout, because the introduction of node credibility allows a single node's evaluation error to be corrected by other correlated nodes. Among the three traditional methods, MMVS achieves the best values and thus performs better than HAC and ENBC. Because they fail to consider the assessment error of different nodes, HAC and ENBC perform worse than the other two methods.
(2) Accuracy Analysis of LREA, HAC, ENBC, and MMVS
With the real data on the dam under monitoring from 1 January 2017 to 31 December 2017, we compare LREA with the three methods currently in use: HAC, ENBC, and MMVS. The monthly average evaluation accuracy rates of the local region $r_1$ attained with the four methods are illustrated in Figure 20, and the annual average evaluation accuracy rates of the 34 local regions ($r_1$ to $r_{34}$) with the four methods can be seen in Figure 21.
In Figure 20, the horizontal axis lists the months, the vertical axis indicates the monthly average evaluation accuracy rate, and the tops of the four colored columns represent the accuracy rates of the four methods. Taking July 2017 as an example, the four columns from high to low correspond to LREA, MMVS, HAC, and ENBC, with accuracy rates of 93.3%, 82.3%, 72.5%, and 69.1%, respectively. Overall, the four methods rank by accuracy rate in descending order as LREA, MMVS, HAC, and ENBC. The annual accuracy rates of the evaluation results obtained with LREA, MMVS, HAC, and ENBC are 90.5%, 82.5%, 76.0%, and 67.1%, respectively.
In Figure 21, the horizontal and vertical axes represent the single region and the annual evaluation accuracy rate, respectively. Across all local regions under monitoring, the average evaluation accuracy rates obtained with LREA, MMVS, HAC, and ENBC are 84.2%, 78.6%, 71.6%, and 63.5%, respectively. From Figures 19 and 20, it is clear that LREA is superior to MMVS, HAC, and ENBC in terms of evaluation accuracy rate.
(3) The application system for evaluating the dam operation state
To further illustrate the evaluation results of the dam operation state, we have developed and implemented an application system for evaluating the dam operation state, as shown in Figure 22.

Experiment Conclusions
In SAM, data from multiple sequences are used to complete single-node prediction and assessment, which substantially avoids the prediction and assessment errors caused by missing and abnormal values in a single node. The results of Experiment 1 show that SAM promptly reflects the fluctuations in the node sequence, which enables it to outperform CM in prediction performance. In Experiment 2, SAM and CM achieve overall assessment accuracy rates of 70.0% and 37.6%, respectively, which illustrates that SAM outperforms CM in terms of assessment stability and accuracy. Experiment 3 analyzes how the number of correlation sequences $N$ and the correlation coefficient threshold Coef affect the performance of SAM. Based on the measured data set of the dam under monitoring, we set $N = 4$ and Coef = 0.82 in the experiment. LREA builds a node credibility model so that multiple nodes can be coordinated to evaluate the operation state of a local region through variations in credibility. Experiment 4 verifies how the number of nodes affects the number of convergence steps of LREA. The experimental results indicate that LREA converges rapidly when there are more than 10 nodes. Compared with HAC, ENBC, and MMVS, LREA achieves higher monthly average evaluation accuracy rates for the local region.



Conclusions
To provide accurate and real-time evaluation of the dam safety state, this paper addressed region evaluation with multiple relevant time sequences from the nodes deployed for dam safety monitoring. We consider the correlation of data on nodes across the entire monitoring network. To avoid the deviation caused by noise and missing values in a single-node data sequence, we calculate the correlation between multiple sequences. A single-node assessment model based on multiple relevant sequences (SAM) is proposed to improve the accuracy of single-node assessment. Given that different nodes of a local region have varying impacts on the evaluation results, a local region evaluation algorithm based on node credibility (LREA) is presented to model the credibility of the nodes and thereby alleviate inconsistent evaluation results in a local region of the dam. LREA assesses the dam's operation state by considering variations in credibility and the coordination of multiple nodes. The experimental results show that LREA can reveal the trends of changes in the monitoring values in a timely and accurate way, which improves the accuracy of dam safety evaluation results.
The measured values of nodes in dam safety monitoring systems are affected by many environmental factors, such as the water level and temperature. In future work, the evaluation of the single-node state can take more influencing factors into account. Moreover, we will study in depth how the reliability of measurement points changes over time, and optimize the iterative process of evaluation.