Framework for Validation of Permanently Installed MEMS-Based Acquisition Devices Using Soft Sensor Models

Asset integrity and predictive maintenance models require field data for an accurate assessment of an asset’s condition. Historically these data have been collected periodically in the field by technicians using portable units. The significant investment in inexpensive microelectromechanical (MEMS) sensors mounted on untethered (energy-harvesting or battery-powered) microprocessors communicating wirelessly to the cloud is expected to change the way we collect asset health data. Permanently installed MEMS-based sensing units will enable near-real time data collection and reduce the safety exposure of technicians by eliminating the need to manually collect field data. With hundreds of MEMS-based sensing units expected to be installed at a single site, it is vital to assure the data they produce and to maintain them cost effectively. An asset management framework for validation of MEMS-based sensing units for condition monitoring and structural integrity (CM&SI) applications is proposed. An integral part of this framework is the proposed use of soft sensor models to replace technician inspections in the field. Soft sensor models are used in the process industry to stabilize product quality and process operations but there are few examples in asset management applications. The contributions of this paper are twofold. Firstly, we use an interdisciplinary approach drawing on electronics, process control, statistics, machine learning, and asset management fields to describe the emerging field of permanently installed MEMS-based sensing units for CM&SI. Secondly, we develop a framework for assuring validation of the data these sensing units generate.


Introduction
There are safety and economic benefits if appropriate corrective action can be taken before asset failure occurs [1]. Condition monitoring and structural integrity (CM&SI) programs are integral to failure prevention. Both are cyclic asset management processes involving program design, data collection and analysis, execution of maintenance recommendations and program review. The key decisions for engineers involved in CM&SI programs are what, where, and when to collect asset health data [2]. Once these are decided, engineers are often reliant on technicians using portable devices to collect data at pre-set intervals in specific locations. This exposes the technicians to hazards as well as being expensive due to the need for trained and certified technicians and specialist equipment. The historical reliance on technicians with portable devices is partially due to the high costs of power and communication infrastructure associated with permanently installed systems [3]. Any move to have more permanently installed sensors necessitates the development of validation and maintenance programs for the sensing network. Currently validation inspections on permanently installed sensing units are usually conducted at fixed intervals, regardless of any indication the sensing unit has a fault. We might be sending technicians to test sensing units that are healthy [4], thus exposing them to unnecessary risk and incurring considerable manpower costs if the sensor population is significant. In many sectors, for example oil and gas and mining, there has been a move in the last decade to unmanned facilities. While reducing the exposure of employees to operational risks, this situation also complicates the maintenance effort involved in sensor calibration as trips to the remote sites by technicians are infrequent.
The implementation of many inexpensive, permanently installed microelectromechanical (MEMS) sensing units creates a critical challenge to ensure that the near-real time data coming from individual MEMS-based sensing units is valid. Motivated by this we propose the use of on-line mathematical models to identify out-of-calibration or malfunctioning sensing units. This will facilitate a move to an on-condition approach to managing the integrity of the permanently installed CM&SI sensing units.
The objective of this paper is to present a framework for sensor validation management in CM&SI applications using permanently installed MEMS-based sensing units. The paper draws on topics from electronics, process control, statistics and machine learning, asset management and reliability engineering fields. We start with an overview of the elements in a MEMS-based sensing unit and their reliability. A description of on-line monitoring (also known as soft sensor) models follows, with specific examples of developments in sensor validation. The steps necessary to manage maintenance of the model(s), ensure confidence in the analysis, and assure cost effectiveness of the program are then described. Experiences of the process industry in using soft sensor models are used throughout the paper. Finally, we identify opportunities to improve sensor validation practice for the CM&SI community.

MEMS-Based Sensing Systems
A MEMS-based sensing node has five functional elements as shown in Figure 1. The key physical components of a MEMS-based sensing node are the sensing device (II) and the edge device (III), generally consisting of a microcontroller with an inbuilt communications system. Each requires power (IV) and firmware/software to operate. The sensing node also needs to be physically mounted to the asset (I), and needs to communicate to the cloud (V). From a reliability perspective this sensing node is a series system: failure of any one element leads to failure of the node.
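The series-system statement can be illustrated with a short calculation. The sketch below assumes the five elements fail independently, so the node reliability is the product of the element reliabilities. The function name and the reliability values are hypothetical placeholders chosen for illustration, not measured data.

```python
from math import prod

def series_reliability(element_reliabilities):
    """Reliability of a series system: every element must survive.

    Assumes element failures are statistically independent.
    """
    values = list(element_reliabilities)
    for r in values:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must lie in [0, 1]")
    return prod(values)

# Hypothetical one-year reliabilities for elements I-V (illustrative only).
node_elements = {
    "I mounting": 0.99,
    "II sensing device": 0.97,
    "III edge device": 0.98,
    "IV power": 0.95,
    "V communications": 0.96,
}

node_reliability = series_reliability(node_elements.values())
print(f"Node reliability: {node_reliability:.3f}")
```

Under these assumed values the node is less reliable than its weakest element, which is the practical consequence of the series structure.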
A typical sensing device comprises a MEMS sensor, an analogue to digital converter (ADC), filters and associated logic. The sensing device communicates with the edge device. The edge device performs multiple functions including on-board power management, data storage, signal processing and wireless data transmission. It can have on-board power supplies such as rechargeable or replaceable batteries or energy harvesting components and extra ports for additional sensors. Settings such as sampling triggers, sampling rates and on-off-sleeping modes are adjustable through software. With advances in microprocessor capabilities, the ability to perform on-board signal processing and data compression, called edge computing, is increasing rapidly. Microprocessors can support a range of wireless communication technologies including RFID, Bluetooth, WiFi, Zigbee, Ultra-wideband and Wireless USB [5]. There is also a range of low-power wide-area network (LPWAN) protocols, such as LoRa and Sigfox, which are becoming increasingly available. The cost of the commercial off-the-shelf (COTS) parts for one of these sensing nodes is presently around US $100. An example of the size of a single COTS capacitive MEMS accelerometer sensor and an edge device is shown in Figure 2. We envisage many sensing nodes being permanently installed on assets. For example, as many as 13 could be installed on a single motor-pump system if we adopt recommended guideline ISO 13371 for condition monitoring of machines [6]. Data are therefore collected from multiple nodes in near-real time. To manage the challenge of ensuring the quality of these data we propose a framework for validation of these multiple redundant MEMS-based sensing nodes.
In developing this framework we draw on guidelines from ISO 17359 for condition monitoring and diagnostics of machines and ISO 19902 for in-service inspection and integrity management of steel structures [2,7]. Figure 3 shows a proposed closed loop process starting with defining the scope of the management system, design of the process, data collection and analysis, execution of actions, review of their technical and cost effectiveness and then ongoing review of the scope and process.

Scope of the Management System
Determining the scope of the validation management system involves defining the number, location and function of each sensing system. A sensing system is a group of sensing nodes on a single asset. The function of the sensing system is to detect events or changes in the monitored asset and accurately transmit information about these changes in an appropriate format and at an appropriate frequency to the cloud. Functional failures can occur in any of the elements of an individual sensing node shown in Figure 1.
Each element in the sensing node requires a functional statement against which a failure modes and effects analysis (FMEA) for the specific node can be conducted. Practical issues such as how the sensing node is attached to the asset can be of vital importance and consideration needs to be given to how the mounting method responds to temperature, moisture, and UV degradation, for instance [3].
Drawing on the work of Kullaa [8] we suggest that an initial list of functional failures for the sensor element is bias, drift, scaling and hard failure. These are described mathematically as follows [9].
The variable $y_f(t)$ represents the faulty sensor readout, $y(t)$ is the nominal sensor measurement and $\eta(t)$ corresponds to inherent noise. A bias fault, $y_f(t) = y(t) + \alpha + \eta(t)$, is characterized by a constant offset $\alpha$. A drift-type failure, $y_f(t) = y(t) + \beta(t) + \eta(t)$, is represented by a time-varying offset $\beta(t)$. Bias and drift faults can also be categorized as additive-type sensor failures. Scaling, gain or precision degradation failures occur when the nominal sensor outputs are multiplied by a factor $g(t)$, as in $y_f(t) = g(t)\,y(t) + \eta(t)$. Finally, the hard-fault type occurs when the sensor readings are stuck at a particular constant value $\delta$, as in $y_f(t) = \delta + \eta(t)$. A complete loss of signal can also be represented by this type of fault by assuming that the output from the sensor is zero (i.e., $\delta = 0$).
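The four fault models can be made concrete by applying them to a synthetic signal. The following is a minimal sketch only: the sinusoidal nominal signal, the linear form of the drift, the constant gain, and all parameter values and names are assumptions introduced for illustration.

```python
import math
import random

def simulate_faults(n=200, alpha=0.5, drift_rate=0.01, gain=1.5,
                    delta=0.0, noise_sd=0.05, seed=0):
    """Return a nominal signal y(t) and four faulty versions y_f(t)."""
    rng = random.Random(seed)

    def eta():
        # Inherent measurement noise eta(t), assumed Gaussian here.
        return rng.gauss(0.0, noise_sd)

    # Nominal measurement y(t): an arbitrary sinusoid for illustration.
    y = [math.sin(2 * math.pi * k / 50) for k in range(n)]
    return {
        "nominal": y,
        # Bias: constant offset alpha.
        "bias": [y[k] + alpha + eta() for k in range(n)],
        # Drift: time-varying offset, assumed linear, beta(t) = drift_rate * t.
        "drift": [y[k] + drift_rate * k + eta() for k in range(n)],
        # Scaling: multiplicative factor, assumed constant, g(t) = gain.
        "scaling": [gain * y[k] + eta() for k in range(n)],
        # Hard fault: output stuck at delta (delta = 0 models loss of signal).
        "hard": [delta + eta() for _ in range(n)],
    }

signals = simulate_faults()
print({name: round(values[100], 3) for name, values in signals.items()})
```

Synthetic faults of this kind are one way to exercise a validation model before any field failures have been observed.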
There has been considerable research into specific failure mechanisms of individual types of MEMS sensors [10][11][12][13][14][15][16]. However information regarding operational failure modes in MEMS and how the mechanisms described in the literature link to the functional failures (bias, drift, scaling and hard failure) is lacking [16]. Additionally, there are few FMEA examples for components used in MEMS sensing devices [5] and we located none that provide a comprehensive examination of the links between operational failure modes at the sensing system level and failure mechanisms in the constituent node elements (sensor, sensor mounting, MEMS sensing device, edge device, power and communications elements).
A key element of the sensing system is wireless communications and the wireless sensor network. Technical issues to be considered include time synchronisation between the sensor nodes, network topology and the ability to scale. This is an active research field that could be useful in developing a FMEA for the wireless element of the sensing system, for instance [3].
The specific context of the asset and the nature of the monitoring need to be considered. For example, structural response requires accurate and synchronized measurements from different points on the asset, whereas analysis of slowly varying phenomena requires low frequency sampling over long periods [17]. Other design considerations, especially important in an untethered system, are what data to process on the edge device, how often and what to communicate, and the effect of these decisions on the sensing node's power budget [5,18].

Validation System Design
The first stage of the framework's design process (Figure 3) is the selection of suitable indicators for the failure modes of the sensing system. This process is informed by the FMEA described previously. Once indicators are selected a sensing system can be installed to acquire data. The data are then reviewed against performance thresholds. In a simple system this can be performed manually but once there are tens or hundreds of sensing systems this will need to be actioned on-line in an automated way. To manage this process we propose the use of soft sensor models.

Soft Sensor Models
Soft sensor models, also known as inferential models and virtual sensors, refer to mathematical models developed to infer indicators from real-time measurable variables. The concept of soft sensors is superficially simple. The measured responses of multiple redundant sensors on a single structure are naturally correlated as the data they measure is related to a common generating process. Consider the example of the motor and pump with 13 vibration sensors referred to earlier. The development of a fault with the pump, for example due to a loose hold-down bolt, will result in an increase in amplitude in the vibration of all the vertically mounted sensing nodes.
To assure that all the sensor nodes are functioning to specification we need to be able to detect when any single sensing node (a 'target') has a fault. We achieve this by developing a mathematical model of the relationship between a target node and the other sensing nodes on the structure. Once this relationship has been established the model can be used to infer when the target sensing node is performing as expected, or not.
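The idea of inferring a target node from the other nodes can be sketched as follows. This is a deliberately simplified illustration, not a recommended model: it assumes the neighbouring nodes can be collapsed to their mean and related to the target by a straight line, and it flags the target when its reading departs from the inferred value by more than k residual standard deviations. All function names, the threshold k and the synthetic data are hypothetical.

```python
import random
from statistics import mean, pstdev

def fit_line(x, y):
    """Ordinary least squares fit y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def train_soft_sensor(neighbours, target):
    """Fit target ~ mean of neighbours on healthy data; keep the residual spread."""
    x = [mean(row) for row in neighbours]
    a, b = fit_line(x, target)
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, target)]
    return {"a": a, "b": b, "sigma": pstdev(residuals)}

def flag_target(model, neighbour_row, target_value, k=4.0):
    """True when the target is more than k sigma from its inferred value."""
    predicted = model["a"] + model["b"] * mean(neighbour_row)
    return abs(target_value - predicted) > k * model["sigma"]

# Synthetic healthy training data: three neighbours and one target node
# respond to a common source, each with its own measurement noise.
rng = random.Random(1)
source = [rng.uniform(1.0, 3.0) for _ in range(300)]
neighbours = [[s + rng.gauss(0, 0.05) for _ in range(3)] for s in source]
target = [0.8 * s + 0.1 + rng.gauss(0, 0.05) for s in source]
model = train_soft_sensor(neighbours, target)

healthy_reading = 0.8 * 2.0 + 0.1
print(flag_target(model, [2.0, 2.0, 2.0], healthy_reading))        # reading consistent with neighbours
print(flag_target(model, [2.0, 2.0, 2.0], healthy_reading + 1.0))  # reading with an added bias of 1.0
```

Because the neighbours respond to the same asset behaviour as the target, a genuine change in the asset moves both sides of the relationship, whereas a fault in the target node alone shows up as a residual.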
There are only limited examples of soft sensor models for sensor validation in the CM&SI context. This situation is understandable due to the historical reliance in the CM&SI community on periodic, manually collected inspection and condition monitoring data. Conversely, soft sensor models rely on the availability of on-line data from permanently installed sensors. To move forward we propose the CM&SI community consider the experience the process industry has in developing soft sensor models for estimating difficult to measure process variables and product quality [19][20][21][22][23] and for validation of process sensors such as temperature, pressure and other process sensors [24][25][26][27].

Model Development Process
The development and maintenance of soft sensor models are categorised by [28] into four stages, as depicted in Figure 4. These are data acquisition, data pre-processing, model design and model maintenance. The remainder of this section discusses what we consider to be the most important aspects of the first three stages. Attention is given to model maintenance in the next section. The first stage in model development is data acquisition from the permanently installed sensing nodes. Consider a single target sensor and denote by $Y_t = (y_1, y_2, \ldots, y_t)$ the data arriving sequentially in time from that sensor. A soft sensor model is concerned with modelling $Y_t$. The data collection stage is often overlooked, but we believe it is of vital importance. Previous routine operations and field testing can help identify the signatures of $Y_t$ that may indicate failure mechanisms, asset behaviour and sensing system issues. Data inspection identifies and resolves issues such as missing measurements, outliers, multi-rate data, measurement delay and drift. It can take time to assess the quality, variability and coverage of $Y_t$, but if the wrong signatures of the sensor are being measured, and/or the quality is sub-par, a soft sensor model will be ineffective. A description of approaches to handle these issues in the sensor validation context is provided in [29].
There are often reasons to believe other factors, say environmental conditions, may help explain failure modes and hence be used to model $Y_t$. Denote by $X_t = (x_1, x_2, \ldots, x_t)$ the data that encode these other factors. Each $x_i$ in $X_t$ is a vector of inputs to be used for modelling $Y_t$. The same acquisition and inspection issues exist for $X_t$ as for $Y_t$, along with some additional ones. For example, good explanatory power may require the spatial and temporal alignment of $X_t$ with $Y_t$. This sounds obvious, but our experience suggests it is a real headache in industrial data sets, and makes the mathematical modelling that much harder. Early collaboration between the modeller and the technicians can help avoid this problem.
Given that the data acquisition stage is satisfactorily carried out, the development of a suitable data analytic model can commence. Data pre-processing and model design in Figure 4 are concerned with the question 'How is $Y_t$ related to $X_t$?'. Simple, interpretable, computationally efficient models should be used where possible, conditional on producing results at an acceptable level of accuracy. All models should be cross validated on independent data or scrutinized in terms of their predictive coverage of future measurements when they arrive.
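Predictive coverage can be checked with very little machinery. The sketch below counts the fraction of held-out measurements that fall inside the model's prediction intervals and compares it with the nominal level. The constant-mean model, the interval of plus or minus 1.96 standard deviations, and the data values are assumptions made purely for illustration.

```python
from statistics import mean, stdev

def predictive_coverage(observed, lower, upper):
    """Fraction of observations that fall inside their prediction intervals."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)

# Hypothetical hold-out check: a constant-mean model fitted on training data,
# with a nominal 95% interval of mean +/- 1.96 sample standard deviations.
train = [1.02, 0.97, 1.01, 0.99, 1.03, 0.98, 1.00, 1.04, 0.96, 1.00]
holdout = [1.01, 0.99, 1.05, 0.97, 1.20]  # the last value is anomalous
m, s = mean(train), stdev(train)
lower = [m - 1.96 * s] * len(holdout)
upper = [m + 1.96 * s] * len(holdout)

cov = predictive_coverage(holdout, lower, upper)
print(f"Empirical coverage on hold-out data: {cov:.2f}")
```

Empirical coverage well below the nominal level suggests that either the model is over-confident or the new data differ from the training conditions, and either case warrants review.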
Conceptually, models can be considered as white-box, black-box and grey-box. White-box models use physical knowledge of the process, for example based on differential and algebraic equations or derived from numerical models, to construct a function over $X_t$ to imitate the behaviour of $Y_t$. Black-box methods are approximations to non-parametric models and are used when physical models do not exist or are too complex. They are mostly associated with computational/artificial intelligence models where classification rules, estimated using $(X_t, Y_t)$, inform us about the current or future states of the process. However, there are many statistical models that also fit the black-box criteria.
For example, Gaussian processes are methods for flexible data fitting and have been used for sensor validation in the nuclear industry [30,31].
In practice many soft sensor models fall into the grey-box category. These grey-box models start with the identification of what is known about the process and synthesize this with information obtained from the observed data and expert knowledge. Being parametric, these tend to be statistical. Regression techniques have proved useful for soft sensor models, for example ridge regression when collinearity exists [32], principal component analysis (PCA) and partial least squares regression (PLS) in higher dimensional problems [33] and kernel regression to approximate non-linear relationships in process industry applications [31]. Where a linear relationship cannot be assumed, artificial neural networks and neuro-fuzzy systems have been popular. For a review of PCA, PLS and neural network models applied to soft sensors for process industry applications, see [28,34].
While the above grey-box approaches have been effective, they are not well equipped to take full advantage of physical knowledge of the system. Here, we take the view that Bayesian statistical models are the most adept at combining physical, often non-linear, models with statistical ones. There is increasing interest in, and examples of, first principles and data driven soft sensor models being constructed in a Bayesian framework [35]. There is another advantage to Bayesian approaches in that they have a structure well suited to updating the predictive densities that represent our uncertainty of, say, $y_{t+1}$ as new data arrive. This allows the calculation of maximized expected utilities for formal decision making based on soft sensor models.
Kalman and particle filter approaches are examples of a Bayesian modelling approach used both in data-driven and first principles contexts. Kalman filters are used when the mean process evolves linearly, and the observational errors are Gaussian. Extended Kalman filters are an analytical approximation method used when the underlying process is non-linear but the noise distributions can be assumed Gaussian. Particle filters are used when the process evolves non-linearly and the observational error is potentially non-Gaussian. An accessible description of Bayesian methods for soft sensor models is provided by [29] with example process industry applications in [23,35,36]. However, examples of Kalman or particle filter models for sensor validation in CM&SI contexts are scarce.
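For the linear, Gaussian case, the filtering idea can be sketched with a scalar Kalman filter whose normalised innovations (measurement minus one-step-ahead prediction, divided by its predicted standard deviation) serve as a fault indicator. The random-walk state model, the variances q and r, the burn-in of 10 samples, the alarm threshold of 4 and the synthetic data are all assumptions chosen for illustration, not values taken from the cited work.

```python
import random

def kalman_innovations(measurements, q=1e-4, r=0.01, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed with Gaussian noise.

    Assumed model: x_t = x_{t-1} + w_t and y_t = x_t + v_t,
    with Var(w_t) = q and Var(v_t) = r.
    Returns the normalised innovation for each measurement.
    """
    x, p = x0, p0
    scores = []
    for y in measurements:
        p_pred = p + q              # predict: state unchanged, uncertainty grows
        innovation = y - x          # measurement minus one-step-ahead prediction
        s = p_pred + r              # predicted innovation variance
        scores.append(innovation / s ** 0.5)
        gain = p_pred / s           # Kalman gain
        x = x + gain * innovation   # update the state estimate
        p = (1.0 - gain) * p_pred   # update the state variance
    return scores

rng = random.Random(2)
healthy = [1.0 + rng.gauss(0, 0.1) for _ in range(100)]
faulty = [2.0 + rng.gauss(0, 0.1) for _ in range(5)]  # sudden bias of 1.0
scores = kalman_innovations(healthy + faulty)

# Ignore the first samples while the filter converges from its initial guess.
first_alarm = next(
    (i for i, z in enumerate(scores) if i > 10 and abs(z) > 4.0), None
)
print("First alarm at sample:", first_alarm)
```

An abrupt bias produces a large innovation immediately, whereas a slow drift is partly absorbed by the random-walk state, so in this simple form the filter is better suited to abrupt faults than to gradual ones.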

Maintenance of the Validation System
A soft sensor model must be maintained and, if necessary, tuned over its life cycle (Figure 4). In practice the life span of a soft sensor model is limited as most processes and assets do not operate in a stable state for extended periods of time [37]. There is usually no objective measure for assessing model performance. A judgement as to whether the model is working is often dependent on the subjective perception of the engineer [34].
Once a fault with a sensing node is suspected there are two options. The first is to manually investigate the faulty sensing node, and the second is to exclude the faulty sensing node from the soft sensor model, update the model and continue to operate the sensor network. If gradual degradation occurs then the failing node may not be immediately detected but, once it is, its influence on the model parameter estimates can be reduced. The viability of using weights based on a posterior probability of failure was explored by [27].
The potential advantage of using inspection to provide confirmation that the fault was indeed correctly identified and that the nature of the fault was as-diagnosed is considerable [38]. The disadvantage is the cost of sending a technician into the field to inspect and, if necessary, replace the faulty sensor. In some situations this work can only be executed safely when equipment is taken off-line.
If a technician is sent to investigate, the following data should be recorded clearly and consistently: the date and time of the validation, the as-found condition, the action taken and the node's as-left condition [39]. Discussions between the authors and industry personnel suggest that data collection quality is highly variable and that a successful sensing system validation program is highly dependent on the motivation and training of personnel and a culture of adherence to scheduled maintenance procedures and plans. This is in line with previous work on maintenance data quality [40,41].
In response to the challenges of tracking calibration data for permanently installed sensors, instrumentation vendors have been working to improve the capability of sensing nodes to self-check and to develop supervisory systems. These developments have the potential to improve calibration fault detection. However, interviews by the authors with industry personnel indicate that integrating these external systems into existing business processes, particularly transferring the meta data that describe the failure mode (complete failure, bias, drift etc.), remains a challenge.

Confidence in the Analysis
CM&SI sensing nodes are installed as part of a risk management program to identify and prevent asset failures. It is therefore paramount to have confidence in the data generated by the sensing systems. This requires all elements in the framework shown in Figure 3 to function appropriately and for their function to be assessed regularly.
The soft sensor models in the framework should theoretically produce few false negatives (fail to detect a fault when there is one) and few false positives (suggest there is a fault when there is not one). Both of these situations undermine confidence of the decision makers in the sensing system. The main concerns with soft sensor models are that they (1) do not have an accurate reference but compare the measured value to a calculated reference that itself is less accurate compared to the simulated input used in the traditional calibration process, (2) do not provide accuracy traceable to Standards, and (3) do not allow frequent physical inspection of the instrument or allow technicians to observe instrument anomalies [42].
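False positive and false negative rates can be tracked once some ground truth is available, for example from inspections that confirm or refute an alarm. The sketch below shows one way to compute them. The function name and the example records are hypothetical.

```python
def alarm_rates(truth, alarms):
    """Return (false_positive_rate, false_negative_rate).

    truth[i]  is True when node i was actually faulty (e.g. confirmed by inspection).
    alarms[i] is True when the soft sensor model flagged node i.
    """
    if len(truth) != len(alarms):
        raise ValueError("truth and alarms must have the same length")
    false_pos = sum(1 for t, a in zip(truth, alarms) if a and not t)
    false_neg = sum(1 for t, a in zip(truth, alarms) if t and not a)
    n_healthy = sum(1 for t in truth if not t)
    n_faulty = sum(1 for t in truth if t)
    fpr = false_pos / n_healthy if n_healthy else float("nan")
    fnr = false_neg / n_faulty if n_faulty else float("nan")
    return fpr, fnr

# Hypothetical validation records for six sensing nodes.
truth = [False, False, False, False, True, True]
alarms = [False, True, False, False, True, False]
fpr, fnr = alarm_rates(truth, alarms)
print(f"False positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```

Reporting both rates separately matters because their consequences differ: false positives drive unnecessary field visits, while false negatives leave faulty nodes feeding the CM&SI analysis.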
Given the stochastic nature of asset deterioration there is also a need for decision makers to understand uncertainties in the process, from data collection through modelling and diagnosis. There is presently considerable focus, particularly in the US nuclear industry, on Bayesian-based soft sensor models that can quantify uncertainty [31].

Cost Effectiveness
We consider cost effectiveness from two perspectives. First, is a single model for a single sensing system cost effective? Secondly, what is the cost of the management system for many models and many sensing systems? The cost elements of supporting a single soft sensor model are depicted in Figure 5. The requirements listed on the left hand side of the balance are responsiveness, isolability, novelty identification, robustness, adaptability, classification error estimate and multiple fault identifiability [43]. An ideal model would be responsive to faults without being too sensitive and generating false alarms. A system should be robust to different noise contributions. As discussed earlier in Section 4.2 the model might need to be adaptive to changes in the underlying system it is monitoring and to conditions such as sensor drift. Finally, model developers should provide an a priori estimate on the expected error measures so that the model user can compare actual with anticipated performance. These requirements should be balanced against the costs of developing and maintaining the model. These costs are shown on the right hand side of the balance and include computational requirements, modelling requirements and model transparency.
Direct cost elements relevant to the deployment of a permanently installed, untethered, MEMS-based sensing system include sensing system purchase, installation and maintenance costs, soft sensor model development and maintenance costs described above. Other costs are those relating to dealing with false alarms, as well as false negatives associated with the failure to detect faulty sensors. These need to be balanced against the costs of periodic manual data collection by qualified technicians, either internal or (more commonly) external to the organisation.
Periodic manual approaches to sensor validation are expensive and time consuming, and can result in longer outages, increased maintenance cost, and additional safety exposure for technicians [44]. A single sensor calibration in the nuclear industry can cost US $3000-$6000 [45]. Assuming a plant has 1000 sensors with a mean of two calibration checks per year, the annual cost is at least US $6 m (2000 checks at US $3000 each). Reviews of calibration logs found that 90 percent of sensing systems in a nuclear plant did not exceed their calibration acceptance criteria over a single fuel cycle (presently 1.5 years) [46]. Studies have found that calibration can be counter-productive, introducing errors in the calibration of previously fault-free sensors [47]. Fixed interval calibration is required for all safety-related sensors, and validation has emerged as a critical path item for shortening outage duration in some plants [45]. Furthermore, fixed interval calibration practice involves only periodic assessment of the calibration status. Therefore, a sensor could potentially operate out of calibration for periods up to the recalibration interval.
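The annual cost figure follows from simple multiplication, shown here for both ends of the quoted per-calibration cost range. The function name is illustrative; the input values are those given in the text.

```python
def annual_calibration_cost(n_sensors, checks_per_year, cost_per_check):
    """Annual cost of fixed-interval calibration, in the currency of cost_per_check."""
    return n_sensors * checks_per_year * cost_per_check

low = annual_calibration_cost(1000, 2, 3000)   # lower per-check cost
high = annual_calibration_cost(1000, 2, 6000)  # upper per-check cost
print(f"Annual cost range: US ${low:,} to US ${high:,}")
```

With the upper per-check cost the same plant would spend twice as much, so the US $6 m figure sits at the lower end of the range implied by the quoted costs.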

Evolution of Soft Sensor Models for Sensing Node Validation
To understand how soft sensor validation practices have evolved we conducted a search in the Science Direct database for papers on Soft, Virtual or Inferential sensors since 1984. This search identified over 880 papers. We filtered the results for papers on using soft sensor models for validation. This resulted in a much reduced number. Further filtering to identify soft sensor validation papers for CM&SI applications with associated industry and/or laboratory tests, and for process applications with an industry example, resulted in the 18 papers reported in Table 1. All of the early models in Table 1 are for process sector examples. The interest from structural integrity practitioners started in the mid 2000s and has mainly been focussed on bridges and on using piezoelectric accelerometers. There is an example of using MEMS accelerometers on a bridge in [61], but examples of MEMS-based permanently installed sensors in CM with a real asset are scarce. We note also that the papers identified in Table 1 concentrate on the soft sensor modelling part of the process, with only passing mention given to issues associated with managing the process of sensor validation over the life cycle of the soft sensor model installation.

Lessons Learned
Sensor validation is vital to ensure appropriate data for CM&SI diagnostics and prognostics programs. There is much that the CM&SI community can learn from the process industry community about soft sensor models and their use for sensor validation. Many issues, such as the need for models to be adaptive and the need to quantify uncertainty, are being actively explored in the process and nuclear industries.
Research into wireless, untethered (battery or self-powered), MEMS-based sensing networks is very active but there are, as yet, few practical applications in CM&SI applications. As products emerge and are shown to contribute to improved CM&SI practices through reduced cost and near-real time data collection, there is expected to be growing interest in how to assure the data they are generating.
There are a number of stages involved in sensor validation as shown in the framework in Figure 3. However most of the focus in the literature is on soft sensor model development, and very little on the other steps in the process such as scoping, maintenance and cost. Understanding the end-to-end process and ensuring that there are competent people and an organisational structure to support it is necessary. Maintenance of the soft sensor models needs to be cost-effective for the risks that are being managed [19,21].
The development of FMEA(s) and collection of failure data on the sensing system as a whole including mechanical, electrical and software elements is necessary. Data should include performance of sensing nodes in the field and meta data associated with their operational environment and maintenance practices. FMEAs will inform selection of appropriate variables to measure for validation assurance. While vendors of the COTS components may publish results of specific accelerated life tests on their MEMS sensors or microcontrollers, the data rarely include failure mode and mechanism information or in-service estimates of failure rates. We suggest that industry considers supporting a reliability data handbook for components used in MEMS-based sensing systems.
While the academic literature has a tendency to develop complex models our review finds that these are seldom used in industry. A survey of engineers in 21 Japanese organisations responsible for 439 process-focussed soft sensor applications [21] found that the major modelling method used was multiple regression analysis (67%) followed by partial least squares regression (21%). More complex models, such as ANN, were found to be rarely used and it was concluded that this is "an interesting gap between theory and practice".
In an assurance process, managing uncertainty is a crucial consideration. It is apparent from developments in the nuclear sector that deciding if you need to quantify uncertainty or not should be discussed at the start of the model selection process. If quantifying uncertainty is important then we suggest using a Bayesian framework for the modelling.

Concluding Remarks
The attraction of low cost MEMS-based sensing nodes connected wirelessly to a cloud platform is obvious for condition monitoring and structural health applications. These sensing nodes offer the opportunity for real time data, and much greater physical coverage of the asset at a fraction of the cost of current practice. As the cost of MEMS-based systems reduces and their reliability improves, there is a risk that there will be a rush to install these systems without due consideration for the costs of validation. An efficient, safe and cost effective validation program will require the use of soft sensor models. These have been widely applied in the process industries for many years and there is much the CM&SI community can learn from this sector. Despite the plethora of models reported in the academic literature, reports from the process industry indicate that simple models are dominant in industry applications and that model maintenance is a critical issue. The cost and safety benefits of soft sensors could be substantial but will need appropriate management and a validation-led approach to sensing systems for CM&SI applications.