A Cyber Physical Interface for Automation Systems—Methodology and Examples



Motivation
Cyber-physical systems (CPS) are engineered systems that are built from and depend upon the synergy of computational and physical components. Emerging CPS will be coordinated, distributed, and connected, and must be robust and responsive. CPS will transform the interaction of engineered systems, just as the Internet transformed the way people interact with information. In manufacturing, CPS can improve productivity and quality through smart prognostics and diagnostics that utilize big data from different networked sensors, machines, and systems. Each physical component and machine will have a twin model in cyber space. Each component and machine can predict and prevent potential failures and, further, can be self-aware, self-predicting, and self-comparing, and can self-reconfigure and self-optimize for robust intelligence and performance. With these capabilities, future products and systems can become more intelligent and more resilient to dynamically changing environments. What if a machine could learn from its own history and also from other machines? What if a wind turbine could learn from its peers within the same wind farm so that its condition and maintenance requirements could be quickly generated? A machine could also self-assess its component degradation so that machining parameters can be adjusted to prevent part quality problems.
What if a machine could learn from human knowledge and operations to improve its intelligence for error prevention? A self-aware and self-maintaining machine system is defined as a system that can self-assess its own health and degradation and further use similar information from its peers to make smart maintenance decisions that avoid potential issues. Smart analytics for achieving such intelligence will be used at the individual machine level and also at the fleet level (see Figure 1).
The purpose of this paper is to address the research gaps in developing a self-aware and self-maintaining framework for automation applications and related machines, so that these systems can be self-aware, self-comparing, and self-predicting, and can further make self-prioritized and self-optimized decisions. The key objective is to design a unified CPS platform by leveraging recent advances in information systems and cloud-based analytics.

Cyber Physical Interface
The CPS research area has been addressed by the American government since 2007, as part of a new development strategy [1,2]. Applications of CPS include, but are not limited to, manufacturing, security and surveillance, medical devices, environmental control, aviation, advanced automotive systems, process control, energy control, traffic control and safety, and smart structures [3]. Janos Sztipanovits et al. [4] identified heterogeneity as one of the most challenging and important factors in implementing cyber-physical systems in real-life applications. Heterogeneity demands cross-domain modeling of the interactions between physical and cyber (computational) components, and ultimately results in the requirement for a framework that is model-based, precise, and predictable, so that the CPS behaves acceptably. When designing a CPS for different machine systems in an industrial environment, it is difficult to attain the highest level in one step, since a CPS involves different levels of technical implementation. To better evaluate the current status and design a CPS in a systematic way for manufacturing industries, in this paper we propose a hierarchical architecture for CPS value propositions (see Figure 2). In this architecture, a CPS can be implemented starting from the lowest connection and data-to-information levels, and can then increase its value to users by adding advanced analytical and resilient functions at higher levels. Compared with the DIKW Pyramid presented in [5,6], the five-level architecture specifically focuses on how to enable physical machines to utilize Data and Information to create Knowledge and Wisdom.
The five levels are described as follows:
I. Smart Connection Level: At the machine or component level, the first concern is how to acquire data in an efficient and reliable way. This level may include a local agent (for data logging, buffering and streamlining) and a communication protocol for transmitting data from the local machine system to a remote central server. Previous research has investigated and designed robust factory network schemes based on well-known tether-free communication methods, including ZigBee, Bluetooth, Wi-Fi, UWB, etc. [7][8][9]. To make machine systems smarter, data transparency is the first step.
II. Data-to-Information Conversion Level: In an industrial environment, data may come from different sources, including controllers, sensors, manufacturing systems (ERP, MES, SCM and CRM systems), maintenance records, and so on. These data or signals represent the condition of the monitored machine systems; however, they must be converted into meaningful information for real-world applications, including health assessment and fault diagnostics.
III. Cyber Level: Once information can be harvested from machine systems, how to utilize it is the next challenge. The information extracted from the monitored system may represent the system condition at a given point in time. If it can be compared with information from other similar machines, or from the same machines at different points in their time histories, users can gain more insight into system variation and life prediction. It is called the cyber level because the information is used to create cyber avatars for physical machines and to build a large knowledge base for each machine system.
IV. Cognition Level: The previous levels of the CPS provide the means to convert machine signals into health information and to compare that information with other instances. At the cognition level, the machine itself should take advantage of this online monitoring system to diagnose its potential failures and become aware of its potential degradation in advance. Based on adaptive learning from the historical health evaluations, the system can then use specific prediction algorithms to predict potential failures and estimate the time remaining before a certain level of failure is reached.
V. Configuration Level: Since the machine can track its health condition online, the CPS can provide early failure detection and send health monitoring information to the operation level. This maintenance information can be fed back to the business management system so that operators and factory managers can make the right decisions. At the same time, the machine itself can adjust its workload or manufacturing schedule in order to reduce the loss caused by a machine malfunction and eventually achieve a resilient system.
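Because each level builds on the ones below it, the current status of an installation can be summarized as the highest level whose prerequisites are all in place. The following is a minimal sketch of that bookkeeping; the names `CPSLevel` and `highest_level_reached` are illustrative assumptions and are not part of the proposed architecture.

```python
from enum import IntEnum


class CPSLevel(IntEnum):
    """The five levels of the architecture, ordered from lowest to highest."""
    SMART_CONNECTION = 1
    DATA_TO_INFORMATION = 2
    CYBER = 3
    COGNITION = 4
    CONFIGURATION = 5


def highest_level_reached(implemented):
    """Return the highest level for which all lower levels are also implemented.

    `implemented` is a set of CPSLevel members. Returns None when even the
    smart connection level is missing.
    """
    reached = None
    for level in CPSLevel:  # iterates in definition order, lowest first
        if level not in implemented:
            break
        reached = level
    return reached


# A plant with decision logic but no fleet-wide cyber model is still
# effectively at the data-to-information level.
status = highest_level_reached({
    CPSLevel.SMART_CONNECTION,
    CPSLevel.DATA_TO_INFORMATION,
    CPSLevel.COGNITION,
})
print(status.name)  # DATA_TO_INFORMATION
```

Treating the levels as strictly cumulative is a simplification made for this sketch; in practice, partial functions of a higher level may exist before a lower level is complete.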
Among the five levels described above, much of the existing research effort has been invested in the first and second levels. For the first level, communication protocols such as MTConnect [10] and OPC can help users acquire controller signals. An example of an MTConnect agent for machine tool monitoring was presented by Kao et al. [11]. With the success of the research and development work in [11], it appears that acquiring data from machine tools and other manufacturing equipment will become less of a hurdle. Common data protocols and data format standards are also breaking down the barriers to achieving the connection level, as noted by Koronios et al. [12] in their review of related standards for engineering asset management. However, for more complicated factory systems, such as those in semiconductor manufacturing [13], the integration of heterogeneous sources of data (from different suppliers, with different time stamps and data formats) remains a challenge.
The data-to-information level and related work in prognostics and diagnostics have also received considerable attention. Trendafilova et al. [14] presented a non-linear data analysis method that uses accelerometer measurements to detect backlash in industrial robot joints and estimate its severity. Despite its promise, the method in [14] was validated on induced faults and not on naturally occurring problems in a production environment. Also, the use of accelerometers for monitoring robots could be cost-prohibitive if the method were actually implemented. Liao and Pavel [15] reported promising results for a machine tool feed axis application, in which multiple baselines were used to train a self-organizing map health model. The inputs to the model included signatures from vibration, temperature, and torque measurements, and the method could detect and diagnose the different types of induced faults on the feed axis system. Although the method achieved reliable results, it was noted that machine warm-up and other factors could require a model that adapts or is retrained over time.
The work described above highlights the need to use time histories and algorithms that learn over time in order to reach the cyber level and obtain reliable health information and life estimates for manufacturing equipment and automation systems. Developing health monitoring algorithms that use time histories from different degradation periods and data from similar units in the fleet is not a trivial task, and only recently has research addressed this topic. The work in [16] applied dynamic Bayesian networks (DBN) to estimate cutting tool wear and remaining useful life; a library of cutting tools at different wear stages was used to train the DBN wear estimation algorithm. Although the analysis method was quite sophisticated, for practical use the life prediction method would have to account for varying operating conditions and the effect of maintenance actions on the life estimate. A recent dissertation by Lapira [17] investigated the use of clustering algorithms and a machine-to-machine comparison approach for assessing the health condition of a fleet of assets, such as wind turbines or industrial robots. The fleet-based similarity approach offered more reliable health information than the traditional baseline approach to health monitoring, but it still depends on the assumption that the majority of the units are in a normal health condition.
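The idea of judging a unit against its peers rather than against its own baseline can be illustrated with a very small sketch. It is not the clustering method of [17]; it substitutes a simple median-based score as a stand-in, and the function name, unit names, feature values, and threshold are all hypothetical. Like the approach discussed above, it only works when most units in the fleet are healthy, because the fleet median is taken as the reference.

```python
from statistics import median


def peer_deviation_scores(fleet_features):
    """Score each unit by its distance from the fleet median.

    `fleet_features` maps a unit name to one scalar feature value (for example
    a vibration level). Scores are expressed in multiples of the median
    absolute deviation (MAD), so they are robust to a few degraded units.
    """
    values = list(fleet_features.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # No spread among the majority: any unit off the median is an outlier.
        return {unit: 0.0 if v == center else float("inf")
                for unit, v in fleet_features.items()}
    return {unit: abs(v - center) / mad for unit, v in fleet_features.items()}


# Hypothetical feature values for five similar robots.
fleet = {"robot_1": 1.0, "robot_2": 1.1, "robot_3": 0.9,
         "robot_4": 1.0, "robot_5": 2.5}
scores = peer_deviation_scores(fleet)
suspects = [unit for unit, score in scores.items() if score > 3.0]
print(suspects)  # ['robot_5']
```

The threshold of 3.0 is an arbitrary choice for the illustration and would need to be tuned for a real fleet.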
The cognition level aims to use reasoning and decision-making algorithms to recommend the appropriate maintenance or production actions based on the health information from the monitored equipment. Although several works have focused on maintenance and production decision support systems [18], they are typically based on an assumed equipment health or reliability value. A fully integrated system that incorporates actual equipment health values into the decision-making process is therefore not yet mature, and achieving the cognition level for various industries remains a challenge. The work by Haddad [19] assumes that the health monitoring system provides a remaining useful life (RUL) prediction of the asset as an input, and uses option theory to decide when to take the appropriate maintenance action. Although the modeling approach considered several cost factors and constraints, the maintenance actions did not include the possibility of reducing the load or speed to further extend the life of the asset. The work by Iyer, Goebel, and Bonissone [20] also draws an interesting conclusion about the link between the algorithm-recommended decisions and the human operator who will ultimately decide on the appropriate action. It is noted in [20] that the amount of information that needs to be processed will be beyond the capacity of human decision makers; the value of the decision-making system is thus to recommend the top choices while letting the human operator, engineer, or maintenance worker make the final decision.
The work in [20] highlights that current technology involves a human in the loop, and that having the machine self-adjust or self-configure is not current practice. Examples of machines achieving the final self-configuration level are not mature, and thus there are significant research opportunities for advancing this aspect. Although far from reaching the full concept of self-configuration, there has been some work for rotating machinery on active vibration control and compensation for shaft unbalance [21]. In addition, work related to chatter suppression or control for machine tools [22] can be considered one element of the concept. However, accounting for the degradation of other components, or adjusting the operating conditions to extend the life, has not been realized. Self-configuration for a group of machines or a production line is even less developed, but it is part of the overall vision for the configuration level.

Technical Approach
The interface between the cyber space and the physical asset space cannot be realized without the proper platform and analytics technology. This implies that building the cyber representation of the automation system or asset is not a trivial task; it requires advanced learning algorithms and the use of historical machine states, or "time machines", to make the cyber representation accurate. A high-level view of this cyber physical analytics platform with self-learning capabilities is illustrated in Figure 3. One must initially decide which aspect of the physical world to consider for the cyber physical platform.
In terms of the physical world, one must first select which fleet of assets to build a cyber physical model for, and which attributes of the assets are important and have value to the end user. For a cyber physical machine automation example, one could consider a variety of different assets, such as a fleet of machine tools, a fleet of industrial robots, or a fleet of automated guided vehicles. Based on the needs of the particular plant or end user, the fleet could also be at the component or subsystem level, such as a fleet of spindle bearings or a fleet of machine tool ball screws. After selecting the physical assets to consider and the level of hierarchy, one should also consider which hidden state of the machine is important and should be captured by the cyber representation. For example, is the performance degradation of the equipment a concern (e.g., ball screw position accuracy), is the resultant product quality the main area of interest, or are unexpected failures the most important problem to prevent? Also, from the plant managers' perspective, the overall interest could be focused on the system level, such as achieving improved plant productivity....
Once the appropriate physical space has been chosen, that is, the physical assets to model and the type of information that is of value to the end user, one must then consider the cyber-physical interaction between the physical asset world and the cyber representation. The link between the physical world and the cyber-physical interaction is the machine and factory data. The machine data could be sensory data, data from controllers, maintenance records, repair history, and also human inputs if they are recorded. This set of heterogeneous information from a fleet of assets should be time-stamped, so that one can build the "time machines" that capture the fleet histories over time. The time machines would capture signatures (feature values) from the machine's sensory data, as well as the utilization history, maintenance logs, and other information processed from the raw data. Factory-level and system-level data could also be provided to the time machines; this could consist of product quality data and overall equipment effectiveness (OEE) values over time, among other plant-level information.
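One way to picture a "time machine" is as a time-ordered store of snapshots per asset, from which the most recent known state at any past moment can be recovered. The sketch below is an assumed data layout for illustration only; the class names, field names, and sample values are hypothetical and do not describe the platform's actual storage format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Snapshot:
    """One time-stamped record for one asset."""
    asset_id: str
    timestamp: datetime
    features: dict                           # e.g. {"vib_rms": 0.31}
    maintenance_event: Optional[str] = None  # e.g. "regrease", if any


class TimeMachine:
    """Keeps the snapshots of each asset sorted by time stamp."""

    def __init__(self):
        self._history = {}

    def record(self, snapshot):
        """Store a snapshot, keeping the asset's history in time order."""
        records = self._history.setdefault(snapshot.asset_id, [])
        records.append(snapshot)
        records.sort(key=lambda s: s.timestamp)

    def history(self, asset_id):
        """Return all snapshots of an asset, oldest first."""
        return list(self._history.get(asset_id, []))

    def state_at(self, asset_id, when):
        """Return the latest snapshot at or before `when`, or None."""
        latest = None
        for snapshot in self._history.get(asset_id, []):
            if snapshot.timestamp > when:
                break
            latest = snapshot
        return latest


tm = TimeMachine()
tm.record(Snapshot("ballscrew_01", datetime(2014, 1, 2), {"vib_rms": 0.40}))
tm.record(Snapshot("ballscrew_01", datetime(2014, 1, 1), {"vib_rms": 0.30},
                   maintenance_event="regrease"))
print(tm.state_at("ballscrew_01", datetime(2014, 1, 1, 12)).features)
```

Records that arrive out of order, as in the usage above, are placed by their time stamp rather than by arrival order, which matters when data from several sources is merged.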

Cyber-Physical Interaction
Effectively, the collection of these "time machines", together with the processed machine and factory data, serves as the input to different modeling techniques, from which a cyber representation of a fleet of assets can be created. Each cyber representation of a machine or component is also called a "twin" or "virtual" model of the system: since the cyber model has learned from its own history and from other similar units, it can be considered a cyber twin of the physical asset. The initial step in achieving this cyber representation is the knowledge accumulation module. One of the difficulties in using machine degradation algorithms in practice is that the models can become less accurate after maintenance is performed, since the baseline condition has changed. Another challenge arises when a new operating regime or working condition occurs that was not considered or included in the algorithm training. Thus, adaptive clustering algorithms that can add new working conditions or baseline states to the health monitoring algorithm can be used to improve the modeling over time. In addition, data from similar units in the fleet are not used effectively by conventional health monitoring approaches, so a more robust and reliable model can be developed if it incorporates data from similar assets in the fleet. Lastly, the utilization information and stress factors that affect the degradation rate of the component or subsystem are also an important aspect of the knowledge base. The stress factors and the time history of the feature data (degradation indicators) from a fleet of units can be used to build a utilization life matrix. This utilization life matrix provides the weights and factors for using the cyber model to estimate the asset's remaining useful life.
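The notion of adding new working conditions or baseline states as they appear can be sketched with a simple online nearest-centroid scheme. This is a generic illustration under stated assumptions, not the adaptive clustering algorithm used in the platform: the class name `AdaptiveRegimes`, the fixed `radius` rule, and the sample operating points are all hypothetical.

```python
import math


class AdaptiveRegimes:
    """Online clustering that opens a new regime for unfamiliar samples."""

    def __init__(self, radius):
        self.radius = radius   # maximum distance to still count as a known regime
        self.centroids = []    # one centroid (list of floats) per regime
        self.counts = []       # number of samples absorbed by each regime

    def update(self, sample):
        """Assign `sample` to the nearest regime or create a new one.

        Returns the index of the regime the sample was assigned to.
        """
        best, best_dist = None, math.inf
        for index, centroid in enumerate(self.centroids):
            dist = math.dist(sample, centroid)
            if dist < best_dist:
                best, best_dist = index, dist
        if best is None or best_dist > self.radius:
            # Unseen working condition: store it as a new baseline.
            self.centroids.append(list(sample))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Known regime: move its centroid toward the sample (running mean).
        self.counts[best] += 1
        n = self.counts[best]
        self.centroids[best] = [c + (x - c) / n
                                for c, x in zip(self.centroids[best], sample)]
        return best


# Hypothetical (speed, load) operating points in normalized units.
regimes = AdaptiveRegimes(radius=1.0)
labels = [regimes.update(p) for p in [(0.0, 0.0), (0.2, 0.0),
                                      (5.0, 5.0), (5.1, 4.9)]]
print(labels)  # [0, 0, 1, 1]
```

A change of baseline after maintenance could be handled in the same way, by letting post-maintenance feature vectors open a new regime instead of being scored against the old one.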
This knowledge base, which includes the ability to add new regimes or baselines as well as the utilization life matrix, provides the supporting basis for the self-assessment and self-prediction of the monitored fleet of machines. Various statistical, machine learning, and classification methods can be used to assess the machine health condition by comparing the current data patterns with those accumulated in the knowledge base. The interested reader is referred to [23] for a more comprehensive review of prognostics and health management algorithms for rotating machinery based on this data pattern comparison. The resulting self-awareness and self-prediction information about degradation can be transferred from the cyber space to the physical space, where the health information can be presented to the machine operator, the maintenance technician, or the plant manager. The health information and factory-level data can also be input into a factory decision support system, which can be used to optimize maintenance and production scheduling. More details on the analytics for factory-level decision making and maintenance opportunity windows can be found in [18,24]. In general, though, the factory decision support system is based on dynamic programming and optimization, subject to the production and maintenance constraints. This production and maintenance scheduling information would also interact with the physical world, since it would be provided to plant production and maintenance managers. This completes the interaction between the cyber space and the physical space, and the interaction would continue over time as the cyber model accumulates more knowledge.

Experiments-Ball Screw Case Study
There is an increasing requirement for high precision and high efficiency in manufacturing. The ball screw is widely used in high-precision manufacturing applications, such as machine tools and other precision instruments. It converts rotary motion to linear motion with high accuracy, reversibility, and efficiency. As it is one of the most important components in the machine system, its accuracy directly affects the overall precision of the whole machine. A ball screw is usually preloaded to achieve high rigidity and accuracy; the preload increases the friction resistance, which deteriorates the internal surface after repeated usage. This eventually leads to an undesirable loss of position accuracy in the machine system. Therefore, the accuracy of the ball screw is one of the main concerns for high-precision machines.
The Center for Intelligent Maintenance Systems (IMS) is collaborating with HIWIN Technology, Taiwan, to add functionality to the ball screw system, including algorithms for position error estimation and life prediction. An in-house ball screw test rig has been constructed to perform life testing and to develop and validate the self-aware and life prediction models. A ball screw usually takes years to fail, so it is very time-consuming to understand its behavior under normal operating conditions. Therefore, an accelerated lab experiment is used to expedite the degradation and reveal the different degradation patterns within limited time and cost. The degradation is accelerated by running the ball screw at a high acceleration rate (1.5 g), high speed (3600 rpm), and high load (75 kg) in a vertical test configuration, as shown in Figure 4. External sensors are installed to measure the ball screw performance, including vibration, temperature, speed, and torque. The position error is an important criterion for determining the ball screw's accuracy, so a linear encoder is placed beside the ball screw to measure the actual position that it reaches. The ball screw runs back and forth continuously, which generates high friction and vibration. The heat caused by friction can damage the internal surface, so that backlash develops and precision is eventually lost. In this way, a health monitoring model that uses the vibration signals is established to predict the position error, while the position error signals are used to benchmark the health value calculated from the vibration signals.
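To make the link between vibration signals and position error concrete, the sketch below computes two common vibration features and fits a straight line from one feature to the encoder-measured position error. The health monitoring model actually used in the study is not specified here, so the choice of RMS and kurtosis as features and of a linear fit is an assumption made for illustration; the function names are hypothetical.

```python
import math


def rms(signal):
    """Root-mean-square level of a vibration window."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def kurtosis(signal):
    """Fourth standardized moment; rises when the signal becomes impulsive."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return m4 / (m2 * m2)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_position_error(vibration_window, slope, intercept):
    """Estimate position error from the RMS of a new vibration window."""
    return slope * rms(vibration_window) + intercept
```

In use, `fit_line` would be trained on pairs of (vibration RMS, measured position error) collected while the linear encoder is available, and `predict_position_error` would then be applied to vibration data alone; the encoder readings serve as the benchmark for the estimate.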

Results and Discussion-Ball Screw Case Study
In the physical world, the ball screw can be used in many situations, whether in lab experiments or in manufacturing. Following the methodology described in the previous sections, external sensors are mounted on the ball screw and the motor output is also read out to monitor its performance, as seen in Figure 5. Besides the collected sensor signals, the CPS model input also requires the parameters of the ball screw, the operating conditions, the maintenance records, etc. This information is shown in the configuration interface of the CPS platform, such as the ball screw cloud platform website. Because of the large variety and volume of the acquired sensor data, all of the collected data should be pre-processed locally before being transmitted to the cloud server; the pre-processing includes signal de-noising, outlier removal, feature extraction and selection, and health assessment. This procedure is the Data-to-Information conversion, level II of the CPS; it converts raw data into useful information, which greatly reduces the data quantity and keeps only the health value that represents the ball screw's performance. After the pre-processing, the features or health value are transmitted to the cloud server through a wireless or wired Internet connection, from the physical world to the cyber space in Figure 5. In the cyber space, the health information can be visualized through a cloud-based website or apps on smart mobile devices. The right figures in Figure 4 highlight how the result from the physical world is visualized as information in cyber space. This cloud-based PHM platform organizes the ball screw's information, including manufacturing parameters, working regime, and maintenance records, and also provides the health information and diagnosis result for each ball screw, such as the health risk radar chart.
Additionally, the greatest concern for end users is the loss of position accuracy, which normally cannot be measured during production. The position-error-based life prediction determines the ball screw's life based on position loss. Normally, the ball screw manufacturer provides suggestions for the ball screw's theoretical life, or a position error limit that depends on the application. Because the accelerated life test cannot directly reflect the actual life of the ball screw in a factory environment, a statistical methodology should be applied to convert the accelerated life to real life and to provide a set of standard criteria for calibrating the ball screw's life according to the working regime. The bottom right part of Figure 5 gives the results of the health assessment and life prediction. Each component of the ball screw driven system is marked on the radar chart, and its value on the radar chart indicates its health condition. The right-hand plot shows the remaining useful life (RUL) as a percentage of life. The RUL is a percentage value obtained by comparing the current predicted position error with the position error limit. The experimental result is shown in Figure 6: the top plot shows the measured and predicted position error, while the bottom plot shows the predicted remaining useful life. End users can therefore monitor and predict the position loss using these algorithms and the cyber physical interface, and plan maintenance accordingly during the ball screw's life cycle. This predictive factory system can thus provide significant improvements over traditional maintenance schemes. The technology can be used to detect incipient problems and prevent failures from occurring, and the health information can be used to optimize maintenance and production scheduling.
The diagram in Figure 7 summarizes the structure of the CPS interface for the ball screw system. At the smart connection level, data is collected from the sensor network for the ball screw systems in the manufacturing factory. The raw data is stored locally, and the Watchdog Agent ® Tools are used to check the data quality, extract features, and calculate the health value accordingly. The feature matrix and health information are transferred to the ball screw cloud, shown as level II, Data-to-Information, in Figure 7. At the cyber level, a virtual ball screw has a data analysis model that corresponds to a physical ball screw that is in use. The ball screw might have different types of failure modes, and each failure mode corresponds to a certain type of degradation pattern. The ball screw clusters therefore store these degradation patterns in the cyber space. After diagnosing the incipient failures of the real ball screw, this virtual synthesis in the cyber space can predict its degradation pattern according to these clusters, as more data is collected over time and more degradation patterns and signatures are learned during the component's life cycle. The virtual ball screw can include the self-aware and self-predict capabilities, so one can use this information to take the appropriate maintenance action on the real physical ball screw. Furthermore, through this cyber physical platform, manufacturing managers can use the information to optimize maintenance scheduling. Eventually, the cognition level gives the machine self-compare and self-configure functionality. For example,

Figure 1. Research Gap of Self-aware and Self-Maintenance Machines in a Multi-dimensional Environment.

Figure 3. Cyber Physical Analytics Platform with Self Learning and Time Machine Capabilities.

Figure 4. Setup of the Ball Screw Test Rig.

[Plot residue, apparently from the position error and remaining useful life figure. Panels: "H1.2 Model to Predict H2.1" (position error vs. total hours) and "RUL Predict H2.1 (Limit is 20 um)" (predicted remaining useful life % vs. total hours). Legible H2.1 metrics: score 0.075117, RMSE 0.0049119, mean error -0.0012974, mean absolute error 0.0037436, maximum absolute error 0.018331; the minimum absolute error value is truncated.]

Figure 5. Data to Information Level.