Next Article in Journal
Automatic Detection of Dynamic and Static Activities of the Older Adults Using a Wearable Sensor and Support Vector Machines
Previous Article in Journal
Organic Molecules: Is It Possible to Distinguish Aromatics from Aliphatics Collected by Space Missions in High-Speed Impacts?
Open AccessArticlePost Publication Peer ReviewVersion 1, Original

A Hybrid Approach: Dynamic Diagnostic Rules for Sensor Systems in Industry 4.0 Generated by Online Hyperparameter Tuned Random Forest (Version 1, Original)

Department of Electrical Engineering and Computer Science, Knowledge-Based Systems and Knowledge Management, University of Siegen, 57076 Siegen, Germany
*
Author to whom correspondence should be addressed.
Received: 23 July 2020 / Accepted: 29 July 2020 / Published: 6 August 2020
(This article belongs to the Special Issue Data Science for Industry 4.0. Theory and Applications)
Peer review status: 1st round review Read review reports

Reviewer 1 Carlos Silva Federal University of Paraná (UFPR), Electrical Engineering Department, Curitiba, Brazil Reviewer 2 Yannis Karnavas Democritus Univ Thrace, Dept Elect & Comp Engn, Elect Machines Lab, GR-67100 Xanthi, Greece Reviewer 3 Yinsheng Chen Institute for Bioengineering of Catalonia
Version 1
Original
Approved with revisions
Authors' response
Approved
Authors' response
Approved with revisions
Authors' response
In this work, a hybrid component Fault Detection and Diagnosis (FDD) approach for industrial sensor systems is established and analyzed, to provide a hybrid schema that combines the advantages and eliminates the drawbacks of both model-based and data-driven methods of diagnosis. Moreover, it shines the light on a new utilization of Random Forest (RF) together with model-based diagnosis, beyond its ordinary data-driven application. RF is trained and hyperparameter tuned using three-fold cross validation over a random grid of parameters using random search, to finally generate diagnostic graphs as the dynamic, data-driven part of this system. This is followed by translating those graphs into model-based rules in the form of if-else statements, SQL queries or semantic queries such as SPARQL, in order to feed the dynamic rules into a structured model essential for further diagnosis. The RF hyperparameters are consistently updated online using the newly generated sensor data to maintain the dynamicity and accuracy of the generated graphs and rules thereafter. The architecture of the proposed method is demonstrated in a comprehensive manner, and the dynamic rules extraction phase is applied using a case study on condition monitoring of a hydraulic test rig using time-series multivariate sensor readings. View Full-Text
Keywords: industry4.0; fault detection; fault diagnosis; random forest; diagnostic graph; distributed diagnosis; model-based; data-driven; hybrid approach; hydraulic test rig industry4.0; fault detection; fault diagnosis; random forest; diagnostic graph; distributed diagnosis; model-based; data-driven; hybrid approach; hydraulic test rig
Show Figures

Figure 1

MDPI and ACS Style

Mallak, A.; Fathi, M. A Hybrid Approach: Dynamic Diagnostic Rules for Sensor Systems in Industry 4.0 Generated by Online Hyperparameter Tuned Random Forest. Sci 2020, 2, 61.

Show more citation formats Show less citations formats
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers. See further details here.

Article Access Map by Country/Region

1

Reviewer 1

Sent on 24 Aug 2020 by Carlos Silva | Approved with revisions
Federal University of Paraná (UFPR), Electrical Engineering Department, Curitiba, Brazil

In this work, a hybrid component Fault Detection and Diagnosis (FDD) approach for industrial sensor systems is established and analyzed, to provide a hybrid schema that combines the advantages and eliminates the drawbacks of both model-based and data-driven methods of diagnosis.

 

The paper provides a new contribution where it is possible to create a hybrid FFD for industrial sensors. The contribution is clear and inedita. Some questions are indicated as showed bellow.

 

  1. In the first paragraph, industry 4.0 application examples can be supported by a relevant quote from the area where application examples are used.
  2. A sentence "three main categories; sensor faults, actuator and component/system faults", the authors describe different classes of faults. The sensor and actuator are ok, but about component/system as shown below It is needed to be revised. This type of fault, maybe, must be separated because It is possible having a fault in hardware systems such as a high temperature or low-efficiency processing of CPU. IN the other case, can be a fault due to bad implementation or coding the software.
  3. In "healthy state" it is not common in this case, could use "normal state".
  4. The final of section 2 where is presented a brief of related works it is suggested to insert a comparative table where the authors can light the real contribution of the proposed method, in order to show vantages and disadvantages.
  5. The sentence "In order to understand the mathematics behind RF, it is highly recommended to go through the explanation of decision trees and how they work in the first place." could be deleted.
  6. Equation of table 1 could be better presented as an equation in the text. What is the source citation of these equations?
  7. Algorithm 1 doesn't sound being an algorithm, but a description of steps. It is suggested change for algorithm steps or uses another way presentation.
  8. Section 4 is very short and doesn't bring important information such as (i) why is it used this source data from [27]? (ii) Is it possible to describe this machine plant test as a block diagram?
  9. Subsection 5.1 sounds to be part of the Material and Methods. Each block of figure 1 needs to be in deep described, especially where it is necessary to provide the type information used each one (input and output variables).
  10. At "The following table," the text could be removed.
  11. Based in Table 3, is it possible to affirm the method based on RF to improve techniques to identify faults, but no more than CART? Can the authors clarify what characteristics RF is better, such as, processing data information?
  12. Table 5 and also text present information about the accuracy of RF, but the accuracy is less than 1%. Is it correct? These values sounds are very low. 0.985% or 98.5%?
  13. In Figure 2 must be inserted x and y-label.
  14. Must insert the conclusion section at the end.
  15. At the Abstract and other parts of the manuscript, it is informed SQL queries was used. SQL is a language for database (DB) systems where is possible to create and select information from tables from DB, but It was not described the queries used such as "select from *". What kind of DB used?

 

 

Response to Reviewer 1

Sent on 20 Sep 2020 by Ahlam Mallak, Madjid Fathi

Dear Dr. Silva, Thank you so much for accepting my invitation to review our work. Your valuable comments will be taken into consideration and attention. Below, I will reply to your comments one by one. Please not that your comments are the ones between quotations. Reviewer 1: "A sentence "three main categories; sensor faults, actuator and component/system faults", the authors describe different classes of faults. The sensor and actuator are ok, but about component/system as shown below It is needed to be revised. This type of fault, maybe, must be separated because It is possible having a fault in hardware systems such as a high temperature or low-efficiency processing of CPU. IN the other case, can be a fault due to bad implementation or coding the software." Author: Here the system described is an industry 4.0 mechanical or electronic. We are not talking about symptoms or warnings. Since human coding are counted as errors, not faults nor symptoms. As well as, any machine learning algorithm learns from the data provided by the system. So, we assume the data at least is accurate and coder or human inaccurate results to generate the data are not in our concern. Reviewer 1: "In "healthy state" it is not common in this case, could use "normal state"." Author: We are not using normal state. We are using the full efficiency state of all the system's components which is not normal, but the healthy state considered for this experiment. Reviewer 1: "The final of section 2 where is presented a brief of related works it is suggested to insert a comparative table where the authors can light the real contribution of the proposed method, in order to show vantages and disadvantages." Author: Thanks for the suggestion. It has been added in our research but was not included to the journal paper, since the scope was more applications that survey theory. Reviewer 1: "The sentence "In order to understand the mathematics behind RF, it is highly recommended to go through the explanation of decision trees and how they work in the first place." could be deleted." Author: In our research we have a detailed literature for decision tree and all the mentioned approaches. But it was not necessary for us to add them to the paper, since the journal is a technical journal and we assume that our target reader already knows. Reviewer 1: "Algorithm 1 doesn't sound being an algorithm, but a description of steps. It is suggested change for algorithm steps or uses another way presentation." Author: Thanks for the suggestion. The steps shows a Pseudo code of an algorithm such as random forests, hence it is an algorithmic representation of the method. Reviewer 1: "Section 4 is very short and doesn't bring important information such as (i) why is it used this source data from [27]? (ii) Is it possible to describe this machine plant test as a block diagram?" Author: Thank you for the suggestion. This article is a part of an ongoing research project and there the system and its components and research questions are answered. Besides this, the data was chosen because it is suitable to the research project's scopes and needed information. Reviewer 1: "Based in Table 3, is it possible to affirm the method based on RF to improve techniques to identify faults, but no more than CART? Can the authors clarify what characteristics RF is better, such as, processing data information?" Author: This observation has been already answered in the text right below the table. We have mentioned that we chose RF over CART because that RF is the optimization of CART to overcome overfitting problem. So CART might have a visible 0.4% higher accuracy (almost nothing) but it is highly likely that the results are overfitted to the training data, unlike RF. Reviewer1: "Table 5 and also text present information about the accuracy of RF, but the accuracy is less than 1%. Is it correct? These values sounds are very low. 0.985% or 98.5%?% Author: We have not claimed that the results are in Percentage. It is your personal assumption. the accuracy was 0.985 which is equivalent to 98.5% in percentage system. Which shows high accuracy for the mentioned algorithm. Reviewer 1: "Must insert the conclusion section at the end." Author: The conclusion is obviously been added. Please check the discussion and future work section. Reviewer 1: "At the Abstract and other parts of the manuscript, it is informed SQL queries was used. SQL is a language for database (DB) systems where is possible to create and select information from tables from DB, but It was not described the queries used such as "select from *". What kind of DB used?" Author: We have not claimed to use a database or SQL language in this work. It is left for future work. We show in this work a new architecture to use Random Forest as a hybrid approach between model-based and data driven approaches for diagnosis. If any other researcher or ourselves in the future would like to use this architecture for SQL rules or semantic rules such as SPARQL, it is completely possible. Thank you so much for the feedback. Best Regards, Ahlam Mallak

Reviewer 2

Sent on 15 Sep 2020 by Yannis Karnavas | Approved
Democritus Univ Thrace, Dept Elect & Comp Engn, Elect Machines Lab, GR-67100 Xanthi, Greece

In this work, the architecture of a hybrid FDD method, between model-based and data-driven approaches to achieve FDD for component faults, is introduced. In this hybrid method, the datadriven part is represented by an optimized and hyperparameter tuned RF, in order to generate dynamic, diagnostic graphs that are later converted into a set of diagnostic rules and fed into a predefined system diagnostic model, acting as the model-based part of the proposed FDD system. The proposed approach provides a dynamic solution, unlike other model-based FDD approaches.

The work is technically sound and the paper is well written.

Response to Reviewer 2

Sent on 20 Sep 2020 by Ahlam Mallak, Madjid Fathi

Dear Reviewer 2, Thank you for the time and effort reviewing our paper, and for your positive feedback. It is such an honor. Wish you a pleasant day! Best Regards, Ahlam Mallak

Reviewer 3

Sent on 14 Sep 2020 by Yinsheng Chen | Approved with revisions
Institute for Bioengineering of Catalonia

This paper presents a hybrid method for sensor array fault diagnosis.

1.Industry 4.0 is an important development direction, which is worth studying. It is suggested to explain some content of Industry 4.0 in the Abstract, because industry 4.0 is mentioned in the Keywords section.

2.Can the author explain the difference between the fault diagnosis of traditional sensor system and the fault diagnosis of sensor system in Industry 4.0? Can the traditional sensor fault diagnosis method be applied to the fault diagnosis of sensor system in Industry 4.0?

3.The proposed method is consistent with the fault diagnosis method based on pattern recognition in essence.Data-driven methods can make data selection, such as PCA. The random forest method is equivalent to a classifier to realize the classification of features.What are the differences between the proposed method and pattern recognition?

4.The resolution of some pictures in this paper is low, and the space is too large, please adjust the author.For example, FIg.3 and FIg. 4.

Response to Reviewer 3

Sent on 20 Sep 2020 by Ahlam Mallak, Madjid Fathi

Dear Reviewer 3, Thank you so much for your time and effort in reviewing our work. I will reply to your comments one by one below. ^^

That is a great idea. The reader is expected to know industry 4.0 and its applications because this special issue of Sci Journal is addressed to take technologies in the field of industry 4.0. Thus, the reader is expected to be familiar with the topic.

Fault detection also is known as anomaly detection has well-defined branches and categories such as fault classification, clustering, encoder-decoder,..etc. You can definitely apply the traditional methods to detect faults in the industry 4.0 as any sensor system. However, keep in mind the inputs or modalities used to capture the data to train. If all the features are extracted from sensors, then it is going to be similar to traditional sensor systems. But, if there are other modalities like images, sound, signals..etc the process will differ. (at least the feature acquisition from the source data part will).

Random forest (RF) is an optimization classifier (or regressor) for traditional classification and regression trees (CART). RF is a pattern recognition method, hence the fault detection with it belongs to the data-driven approaches. However, using the extracted rules by RF in a model-based schema can make RF a hybrid fault detection approach as explained in this method.

Thank you! Best Regards, Ahlam Mallak

Back to TopTop