Article
Peer-Review Record

Statistical Deadband: A Novel Approach for Event-Based Data Reporting

by Nunzio Marco Torrisi
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 5 December 2018 / Revised: 11 January 2019 / Accepted: 15 January 2019 / Published: 18 January 2019

Round 1

Reviewer 1 Report

The paper presents a new approach to the Send on Delta (SoD) paradigm based on selecting the Delta-value by statistical methods. The approach is described as targeting industrial data gateways between the fieldbus and the SCADA, but it is not limited to them.


Although the paper is of interest, some remarks should be taken into consideration:


- Several important SoD schemes that are relevant to the article are omitted from the "State of the Art", such as the relative SoD, the linear SoD and the dynamic (network-based) SoD schemes.

- In most of the SoD literature, the "amplitude-sensitive equation" is called the trigger-function. I suggest keeping this nomenclature.

- Please clarify the signal reconstruction process in both SoD strategies. Is it ZOH?

- The figures are not numbered correctly.

- The graphs on pages 9 and 10 are neither very informative nor clear. In fact, I can't see any differences between them. It would be more helpful to present numerical results in a table.

- It would be nice to perform a deeper theoretical analysis of the effectiveness of the proposed approach, for example a first attempt to relate the Delta-value analytically to the bandwidth consumption for both SoD schemes. Further analysis could then build on it.

- It would be nice to present a graph showing the evolution of upperBB, lowerBB and the Delta-value for the BD over a short time period, and similarly for the VD.

- It is difficult for me to understand the specific motivation for using these SoD schemes rather than others (for example, energy-based ones). A comparison with other SoD strategies, and not only with the AD, would be helpful.

- It is not clear to me how the Delta-value is calculated in the VD. As far as I can see, the trigger-function changes (the Volatility Indicator changes with each sample), but I can't deduce whether the Delta-value is calculated in a similar way to the BD.

- No quantitative results for the effectiveness "p" are given. In fact, the graphs show that although the effectiveness of the VD is better in all cases, its fidelity is considerably lower than that of the AD. I suggest creating a "Performance function" that combines the effectiveness and the fidelity.

- I suggest presenting the graphs for only one sampling period (e.g. 210 ms) and numerical values for the others, given the similarity across all cases.


Some other remarks:

L138: Typo: It size is the same ... -> Its size
L142: Typo: samoling schema ...

L144: Where is Table 1?

L200: noindent

Grammar: "In this algorithm is used the absolute value of %B because ..." -> "The absolute value of ... is used in this algorithm ..."


Author Response

Dear Reviewer,

Thank you for your constructive comments concerning my manuscript entitled “Statistical Deadband: A Novel Approach for Event-Based Data Reporting”. I have studied your comments carefully and made major corrections, which I hope meet with your approval. I answer your questions and comments in detail below.

 

Point 1: Several important SoD schemes that are relevant to the article are omitted from the "State of the Art", such as the relative SoD, the linear SoD and the dynamic (network-based) SoD schemes.

Response 1: Added sensors literature related to SoD schemes at the end of Subsection 2.1.

 

Point 2: In most of the SoD literature, the "amplitude-sensitive equation" is called the trigger-function. I suggest keeping this nomenclature.

Response 2: Added the “trigger-function” nomenclature used in the SoD sensors literature.

 

Point 3: Please clarify the signal reconstruction process in both SoD strategies. Is it ZOH?

Response 3: Revised line 81 to detail the default OPC/SCADA reporting strategy (ZOH).

 

Point 4: The figures are not numbered correctly.

Response 4: Fixed the LaTeX cross-reference error in the figure numbering.

 

Point 5: The graphs on pages 9 and 10 are neither very informative nor clear. In fact, I can't see any differences between them. It would be more helpful to present numerical results in a table.

Response 5: Figure 2 on page 9 shows a compact view of the signals regenerated with formula (11). The data commonly acquired by OPC/SCADA systems are pseudo-periodic, so the proposed graphs allow the reader to easily observe the periodic nature of the random signals.

Table 2 was added to present a summary of the normalized numerical results of Figure 3. Some lines of comment on the results in Table 2 were also added.

 

Point 6: It would be nice to perform a deeper theoretical analysis of the effectiveness of the proposed approach, for example a first attempt to relate the Delta-value analytically to the bandwidth consumption for both SoD schemes. Further analysis could then build on it.

Response 6: In future work dedicated to the sensors context, we will investigate the theoretical analysis using the statistical approach.

 

Point 7: It would be nice to present a graph showing the evolution of upperBB, lowerBB and the Delta-value for the BD over a short time period, and similarly for the VD.

Response 7: upperBB and lowerBB adapt dynamically to the variation of the samples, expanding and contracting as volatility increases and decreases. Many graphs illustrating this basic concept can be found in the Bollinger Bands literature, in blogs and on Wikipedia. The effect of the volatility of the sampled signal on the bands is direct and intuitive.
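
As an illustration of the behaviour described in Response 7, the following minimal Python sketch computes the bands over a sliding window. It assumes the conventional Bollinger definition (moving average plus or minus k population standard deviations over n samples, with n=20 and k=2). The function name `bollinger_bands` and the two test signals are hypothetical and are not taken from the manuscript or from the R package.

```python
import statistics


def bollinger_bands(samples, n=20, k=2.0):
    """Return (lowerBB, middle, upperBB) for every full window of n samples.

    Assumption: the middle band is the simple moving average and the band
    half-width is k times the population standard deviation of that window.
    """
    bands = []
    for end in range(n, len(samples) + 1):
        window = samples[end - n:end]
        middle = statistics.fmean(window)
        sigma = statistics.pstdev(window)
        bands.append((middle - k * sigma, middle, middle + k * sigma))
    return bands


# Two hypothetical signals of equal length but different volatility.
calm = [10.0 + 0.1 * (i % 2) for i in range(20)]
volatile = [10.0 + 2.0 * (i % 2) for i in range(20)]

calm_lower, _, calm_upper = bollinger_bands(calm)[0]
vol_lower, _, vol_upper = bollinger_bands(volatile)[0]

# The band width grows with the volatility of the samples in the window.
calm_width = calm_upper - calm_lower
volatile_width = vol_upper - vol_lower
print(calm_width, volatile_width)
```

With these inputs the band around the volatile signal is much wider than the band around the calm one, which is the expanding and contracting effect the response refers to.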

 

Point 8: It is difficult for me to understand the specific motivation for using these SoD schemes rather than others (for example, energy-based ones). A comparison with other SoD strategies, and not only with the AD, would be helpful.

Response 8: The application area of the statistical deadband is SCADA systems collecting different kinds of signals (periodic and random). The proposed deadband algorithm is the first to use the statistical properties of the Bollinger Bands, which are well known in the financial literature and in a few engineering applications. Comparison with the deterministic and analytic approaches used for sensors requires, in some cases, prior assumptions such as control laws or preset intervals.

 

Point 9: It is not clear to me how the Delta-value is calculated in the VD. As far as I can see, the trigger-function changes (the Volatility Indicator changes with each sample), but I can't deduce whether the Delta-value is calculated in a similar way to the BD.

Response 9: The if condition in algorithm 3 was corrected to: |%B| > d. The VD algorithm doesn’t deduce a Delta-value.

 

Point 10: No quantitative results for the effectiveness "p" are given. In fact, the graphs show that although the effectiveness of the VD is better in all cases, its fidelity is considerably lower than that of the AD. I suggest creating a "Performance function" that combines the effectiveness and the fidelity.

Response 10: Added more discussion about the graph in Figure 3: “The fidelity was measured as L2 norm distance. The increasing of the value of L2 norm distance means a decreasing of fidelity because the reconstructed signal(using ZOH) is more distant from the original signal. Therefore the signals reconstructed by used the samples of BD can be considered closer to the original signal than the VD algorithm. From the other side the samples, filtered by the VD algorithm, are quantitatively more then BD as shown in the effectiveness graph”.

A Performance function is a great suggestion for future work. The amount of simulated data generated with the statistical approach for the effectiveness and the fidelity was huge. These two metrics are established in the conventional literature (computer manufacturing and sensors literature).
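
To make the two metrics discussed in Response 10 concrete, here is a minimal Python sketch under stated assumptions: an absolute deadband as the reporting rule, ZOH reconstruction, fidelity as the L2 distance between the original and the reconstructed signal, and effectiveness taken as the fraction of samples that are not reported. That last definition, all function names and the sample data are assumptions made for illustration; the manuscript's exact definitions may differ.

```python
import math


def absolute_deadband(samples, delta):
    """Indices of reported samples: the first one, then every sample that
    differs from the last reported value by more than delta."""
    reported = [0]
    last = samples[0]
    for i in range(1, len(samples)):
        if abs(samples[i] - last) > delta:
            reported.append(i)
            last = samples[i]
    return reported


def zoh_reconstruct(samples, reported):
    """Zero-order hold: keep the last reported value until the next report."""
    reported_set = set(reported)
    current = samples[reported[0]]
    out = []
    for i, x in enumerate(samples):
        if i in reported_set:
            current = x
        out.append(current)
    return out


def l2_distance(a, b):
    """L2 norm of the difference; a larger value means lower fidelity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def effectiveness(total, reported_count):
    """Assumed definition: fraction of samples suppressed by the deadband."""
    return 1.0 - reported_count / total


samples = [0.0, 0.1, 0.2, 1.5, 1.6, 1.4, 3.0]  # hypothetical data
reported = absolute_deadband(samples, delta=1.0)
recon = zoh_reconstruct(samples, reported)
fidelity = l2_distance(samples, recon)
eff = effectiveness(len(samples), len(reported))
print(reported, recon, fidelity, eff)
```

A single score combining the two, as the reviewer suggests, would have to weigh the L2 distance against the suppressed fraction; the sketch deliberately keeps them separate, as the response does.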

 

Point 11: I suggest presenting the graphs for only one sampling period (e.g. 210 ms) and numerical values for the others, given the similarity across all cases.

Response 11: There is a similarity in all cases. All cases represent common setups of an OPC Client group. The default and most common sampling values in an OPC Client group setup are for local area networks, with scan rates from 200 ms to 300 ms. With scan rates below 200 ms, monitoring and control are not suitable for OPC/SCADA over local area networks.

 

Point 12: Some other remarks:

L138: Typo: It size is the same ... -> Its size
L142: Typo: samoling schema ...

L144: Where is Table 1?

L200: noindent

Grammar: "In this algorithm is used the absolute value of %B because ..." -> "The absolute value of ... is used in this algorithm ..."

Response 12: Fixed L138, L142, L144 and L200. The grammar was fixed too.



Reviewer 2 Report

Comments to Author:

The work in this paper introduces two event-based data reporting algorithms, namely the Bollinger deadband (BD), which is based on moving averages, and the volatility deadband (VD), which is based on volatility.


In my opinion, the paper has two important shortcomings:

It describes two approaches namely BD, VD, which are not themselves very novel, and more importantly it lacks any further analysis that is necessary. Like in Ref. [1], this work does not describe the algorithm's influence on statistical properties like mean and variance which would fundamentally alter the reporting and hence the process. Nor does it argue how effectiveness and fidelity address this.

The author doesn't provide any information on how the numerical values of the several parameters involved in the algorithms are chosen or tuned. This is far from straightforward because, as in ref. [28], an AI is used to learn the parameters for a particular type of data. An intuition of how this can be done has to be explored. The simulation results in section 5 also simply use parameters from [28] instead of trials on actual sensor data.


General comments:

The language used in the paper has to be considerably improved.

The if condition in algorithm 3 should be |%B| > d. 

It is unclear what 'n' is in equation (11).

Author Response

Dear reviewer,

Thank you for your constructive comments concerning my manuscript entitled “Statistical Deadband: A Novel Approach for Event-Based Data Reporting”. I have studied your comments carefully and made major corrections, which I hope meet with your approval. I answer your questions and comments in detail below.

 

Point 1: It describes two approaches namely BD, VD, which are not themselves very novel, and more importantly it lacks any further analysis that is necessary. Like in Ref. [1], this work does not describe the algorithm's influence on statistical properties like mean and variance which would fundamentally alter the reporting and hence the process. Nor does it argue how effectiveness and fidelity address this.

Response 1: “….BD, VD, which are not themselves very novel, and more importantly it lacks any further analysis that is necessary.” The submitted paper is the first paper discussing the BD and VD algorithms after the public release of the source code in the related R package (https://cran.r-project.org/web/packages/deadband/deadband.pdf).

The analysis presented in the paper covers the effectiveness and the computational cost in comparison with the classical deadband approach.

 

Point 2: The author doesn't provide any information on how the numerical values of the several parameters involved in the algorithms are chosen or tuned. This is far from straightforward because, as in ref. [28], an AI is used to learn the parameters for a particular type of data. An intuition of how this can be done has to be explored. The simulation results in section 5 also simply use parameters from [28] instead of trials on actual sensor data.

Response 2: Ref. [28] studies the use of AI to find optimal bands for a stop-loss financial target. In the submitted manuscript, the bands are used in algorithms tuned with the ‘d’ parameter to measure the effectiveness over the filtered samples. Any effect of manipulating the bands can be compensated for by adjusting ‘d’. The normal use of the deadband algorithm in SCADA systems requires only the setup of the sampling time and the ‘d’ parameter.

 

Point 3: The language used in the paper has to be considerably improved.

Response 3: We submitted the revised version to the MDPI English editing service.

 

Point 4: The if condition in algorithm 3 should be |%B| > d.

Response 4: Yes, the condition is |%B| > d, as in the public R code (file deadbandVD.R in the package). It was fixed in algorithm 3.
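
The corrected condition can be sketched as follows. This is only an illustrative Python sketch, not the code of deadbandVD.R: it assumes the standard Bollinger definition %B = (x - lowerBB) / (upperBB - lowerBB), bands built from the population standard deviation of a window of previous samples, and that a flat window (zero band width) never triggers a report. The names `percent_b` and `vd_trigger` and the sample window are hypothetical.

```python
import statistics


def percent_b(window, x, k=2.0):
    """Standard %B of sample x relative to the bands of `window`.

    Returns None when the bands have zero width (flat window), because
    %B is undefined in that case.
    """
    middle = statistics.fmean(window)
    sigma = statistics.pstdev(window)
    lower = middle - k * sigma
    upper = middle + k * sigma
    if upper == lower:
        return None
    return (x - lower) / (upper - lower)


def vd_trigger(window, x, d, k=2.0):
    """Report x only when |%B| > d, the condition agreed in the review."""
    b = percent_b(window, x, k)
    return b is not None and abs(b) > d


# Hypothetical window of n=20 samples: mean 11, sigma 1, bands at 9 and 13.
window = [10.0, 12.0] * 10
print(percent_b(window, 11.0))        # sample on the middle band
print(vd_trigger(window, 14.0, 1.0))  # sample above upperBB
print(vd_trigger(window, 11.0, 1.0))  # sample inside the bands
```

Whether the current sample belongs to the window, and how a zero-width band is handled, are design choices made here for the sake of a runnable example.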

 

Point 5: It is unclear what 'n' is in equation (11).

Response 5: ‘n’ in (11) is the generic n-th element of the dataset. At line 182 (formerly 166), the relevant sentence was revised to: “The original dataset, before the sampling used to generate the subset in the deadband package, has built by using the formula 11 to calculate any y_n element.”

 

Round 2

Reviewer 1 Report

The new version of the paper improves on the previous one. Perhaps only point 6, concerning a further theoretical analysis, is not fulfilled, but it seems that the authors reserve that analysis for future work. In any case, the authors could highlight this in the conclusions, because it is important for scientific research.

- Point 5, about graphs. I agree with the authors about the nature of the signals and so on, but Figure 2.a is almost indistinguishable from 2.b, 2.c and 2.e. I suggested that 2.a alone would be enough, with a mention that for sampling at 240, 252 and 300 ms the results look very similar to those for the 210 ms sampling. That would save a lot of space. To compensate for the absence of graphs 2.b, c and d, numerical results would suffice. In any case, it is a minor issue and the final decision rests with the authors.

- Does Table 2 show the average results of the four sampling rates, that is, (sr1 + sr2 + sr3 + sr4)/4? Please clarify.

- style L233: "Table 2 presents the normalized ....

- typo L250: "more then" -> "more than"


Author Response

Dear reviewer,

Thank you for your constructive comments concerning my manuscript entitled “Statistical Deadband: A Novel Approach for Event-Based Data Reporting”. I have considered your comments carefully and made some corrections, which I hope meet with your approval. I answer your questions and comments in detail below.

 

Point 1: (ex Point 5 in round 1), about graphs. I agree with the authors about the nature of the signals and so on, but Figure 2.a is almost indistinguishable from 2.b, 2.c and 2.e. I suggested that 2.a alone would be enough, with a mention that for sampling at 240, 252 and 300 ms the results look very similar to those for the 210 ms sampling. That would save a lot of space. To compensate for the absence of graphs 2.b, c and d, numerical results would suffice. In any case, it is a minor issue and the final decision rests with the authors.

Response 1: I partially agree with the reviewer that Figures 2.a, 2.b, 2.c and 2.d are almost indistinguishable, but the reader can easily spot many small differences visually without zooming into the PDF image. The author considers it important to offer the reader a quick look at the nature of pseudo-periodic random signals, because the study is focused on that type of signal.

Removing three subfigures would save space but would not change the number of pages.

 

Point 2: Does Table 2 show the average results of the four sampling rates, that is, (sr1 + sr2 + sr3 + sr4)/4? Please clarify.

Response 2: First, the effectiveness and fidelity values are calculated for all sampling rates; then the normalized averages are calculated.

In the revised manuscript I proposed this change: “Table 2 presents the averages normalized of effectiveness and fidelity values calculated by using sampling rates of 210, 240, 252 and 300 milliseconds.”

 

Point 3: style L233: "Table 2 presents the normalized

Response 3: Revised as “presents the averages normalized of effectiveness and fidelity values calculated by using sampling rates of 210, 240, 252 and 300 milliseconds.”

 

Point 4: typo L250: "more then" -> "more than".

Response 4: Fixed.


Reviewer 2 Report

Comments to the author:

The author hasn't described how certain parameters like 'n' and 'k' are determined for arbitrary data sets. In this article, the author borrows these parameter values from [26,33]. It would be appropriate and helpful to highlight, to a certain degree, their effect on the effectiveness and fidelity indicators besides the effect of deadband percent 'd'. 


Lines 105-113 provide insight on various SoD approaches; perhaps it is best to relocate them to the start of Section 2.


The author's use of n in (11) remains unclear. Kindly state whether it is the same as the period 'n' in Table 1. If so, mention it explicitly to avoid confusion. Otherwise, use a different variable and replace it accordingly throughout the article, including the algorithms. The simulations suggest a different use of 'n'. Please clarify.


Author Response

Dear reviewer,

Thank you for your constructive comments concerning my manuscript entitled “Statistical Deadband: A Novel Approach for Event-Based Data Reporting”. I have considered your comments carefully and made some corrections, which I hope meet with your approval. I answer your questions and comments in detail below.

 

Point 1: The author hasn't described how certain parameters like 'n' and 'k' are determined for arbitrary data sets. In this article, the author borrows these parameter values from [26,33]. It would be appropriate and helpful to highlight, to a certain degree, their effect on the effectiveness and fidelity indicators besides the effect of deadband percent 'd'.

Response 1: Added a clarification about ‘n’ and ‘k’:

“The two algorithms, BD and VD, adopt the values of n=20 and k=2 as default values such as in other engineering applications using the Bollinger theory[33]. An analytic investigation discussed by Leeds in [34] shows how to derive the bands from a general time series expression by fixing k and n.  Therefore the use in this work of the standard and recommended values of k=2 and n=20 in literature has as primary objective the usability of the standard practical technical analysis concepts”

 

Point 2: Lines 105-113 provide insight on various SoD approaches; perhaps it is best to relocate them to the start of Section 2.

Response 2: Relocated to the start of Section 2.

 

Point 3: The author's use of n in (11) remains unclear. Kindly state whether it is the same as the period 'n' in Table 1. If so, mention it explicitly to avoid confusion. Otherwise, use a different variable and replace it accordingly throughout the article, including the algorithms. The simulations suggest a different use of 'n'. Please clarify.

Response 3: The ‘n’ in (11) is not necessary. The intent of formula (11) is to represent a generator of pseudo-periodic random signals. The ‘n’ was just a formalism to represent a vector of ‘n’ generated samples, but for simulation purposes the number of samples is far larger than in Table 1. The use of ‘n’ as the index of the sample vector in (11) was removed to avoid confusion.

 

