Secure Data Aggregation Using Authentication and Authorization for Privacy Preservation in Wireless Sensor Networks
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The article is devoted to solving the Data Aggregation problem. The topic of the article is relevant. The structure of the article does not correspond to that accepted in MDPI for research articles (Introduction (including analysis of analogues), Models and methods, Results, Discussion, Conclusions). The level of English is acceptable. The article is easy to read. The figures in the article are of acceptable quality. The article cites 43 sources, some of which are not relevant.
The following comments and recommendations can be formulated regarding the material of the article:
1. Often, marking large data sets does not require special expertise, and it is customary to mark them on crowdsourcing platforms. The low threshold for entering such platforms as a performer is the reason for the low quality of the resulting markup. Therefore, an approach based on “overlap” is popular in crowdsourcing - each data sample is marked by several markers. This method allows you to get several versions of the required markup and a more confident result after averaging the results obtained from different markups. In order to bring several markups into one, aggregation methods are used. For example, for the classification task, most crowdsourcing platforms implement a majority opinion function. However, for tasks with more complex markings, such as detection and segmentation, everything is not so obvious, and something like a “majority opinion” function must be implemented independently. This is how it is done in reality. The prefix “Secure” does not exempt authors from using all of the above, it just needs to be done in a closed environment by people who have the appropriate clearance. The authors do aggregation differently. In this case, you need to justify the viability of your campaign.
2. There are three typical marker errors, which then affect the quality of aggregation:
2.1 Marking with boxes: incorrect location, incorrect mark, missing box, invalid marking;
2.2. Marking with time intervals: the same errors as in boxes, except for the last one - often two events can be marked with one interval;
2.3. Marking with segmentation masks: the same mistakes as in boxes, except for the first one - the most common mistake is a mask that is too rough.
What indicators do the authors use to detect such errors?
3. Before talking about aggregation, we must talk about choosing the size of the overlap. The standard value for many problems is an overlap of 3, since two markings are not enough - in the case where they do not match, the confidence in the final marking is 50%. Such confidence is equivalent to random guessing and is insufficient to obtain high-quality markings. Also, dynamic overlap is often used on crowdsourcing platforms: in the case when the aggregation of n labels does not reach a pre-selected confidence threshold, the overlap is increased by 1 (another person tags the sample). This scheme allows you to minimize overlap, thereby reducing the cost of time and financial resources. How did the authors take this into account?
4. Battle aggregation methods have several typical problems:
4.1. There is an insufficient number of influences on the quality of the final markup, of which there are only 2: the first is the choice of a threshold for the metric, the second is the choice of the confidence value in the second modification. As the threshold and confidence value increase, the quality of the final marking will increase (and the share of aggregated results will rapidly fall);
4.2. Huge costs of financial and time resources: additional marking takes time, and for markings that are finally rejected, you have to pay, since there is no knowledge of which of the markers made a mistake;
4.3. Poor quality markings.
How do the authors take this into account?
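The overlap scheme described in comment 3 can be sketched as follows. This is only an illustration of majority-vote aggregation with dynamic overlap as the comment describes it, not the authors' protocol; the function name, the `get_label` callback, and the default threshold and overlap limits are all assumptions made for the example.

```python
from collections import Counter


def aggregate_with_dynamic_overlap(get_label, min_overlap=3, max_overlap=7, threshold=0.6):
    """Majority vote with dynamic overlap (hypothetical helper).

    get_label(i) returns the label given by the i-th annotator.
    Starts with min_overlap labels and adds one annotator at a time
    until the majority share reaches threshold or max_overlap is hit.
    """
    labels = [get_label(i) for i in range(min_overlap)]
    while True:
        winner, votes = Counter(labels).most_common(1)[0]
        confidence = votes / len(labels)  # share of annotators agreeing with the winner
        if confidence >= threshold or len(labels) >= max_overlap:
            return winner, confidence, len(labels)
        # Confidence too low: increase the overlap by 1 (one more annotator).
        labels.append(get_label(len(labels)))


# Assumed example data: five annotators labelling the same sample.
annotators = ["cat", "dog", "bird", "cat", "cat"]
result = aggregate_with_dynamic_overlap(lambda i: annotators[i])
```

With two disagreeing labels the majority share is 0.5, which is why the comment treats an overlap of 2 as insufficient; the sketch therefore starts at 3 and only grows the overlap when the threshold is not met.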
Author Response
Reviewer 1
- Thank you for your constructive feedback. Given the scope of this research, we based the aggregation marking on the state of the art and on the reviewed literature on similar secure data aggregation protocols.
- In this research we used data fragmentation, timestamps, data freshness, and the probability of effective error detection for labeling (marking) healthcare data for privacy.
- We have therefore introduced the probability of detecting erroneous data, and notification of any malicious activity in the network, which validates the sensitivity of healthcare data privacy.
We have tried to address most of the concerns you raised, as set out below. Thank you.
Comments and Suggestions for Authors
The article is devoted to solving the Data Aggregation problem. The topic of the article is relevant. The structure of the article does not correspond to that accepted in MDPI for research articles (Introduction (including analysis of analogues), Models and methods, Results, Discussion, Conclusions). The level of English is acceptable. The article is easy to read. The figures in the article are of acceptable quality. The article cites 43 sources, some of which are not relevant.
The following comments and recommendations can be formulated regarding the material of the article:
- Often, marking large data sets does not require special expertise, and it is customary to mark them on crowdsourcing platforms. The low threshold for entering such platforms as a performer is the reason for the low quality of the resulting markup. Therefore, an approach based on “overlap” is popular in crowdsourcing - each data sample is marked by several markers. This method allows you to get several versions of the required markup and a more confident result after averaging the results obtained from different markups. In order to bring several markups into one, aggregation methods are used. For example, for the classification task, most crowdsourcing platforms implement a majority opinion function. However, for tasks with more complex markings, such as detection and segmentation, everything is not so obvious, and something like a “majority opinion” function must be implemented independently. This is how it is done in reality. The prefix “Secure” does not exempt authors from using all of the above, it just needs to be done in a closed environment by people who have the appropriate clearance. The authors do aggregation differently. In this case, you need to justify the viability of your campaign.
Response: For segmentation and detection:
1. We segmented the network using a fragmentation algorithm, as depicted in Figure 2 (Proposed schematic process for secure data aggregation in WSNs).
2. For detecting errors in healthcare data marking, we have proposed a new attacker analytical model I in Section 3.1 that justifies the detection of new malicious traffic. Thank you.
- There are three typical marker errors, which then affect the quality of aggregation:
2.1 Marking with boxes: incorrect location, incorrect mark, missing box, invalid marking. Response: We used data freshness algorithm markings in our proposed method, which show the sensitivity of data labeling in privacy-preserving healthcare WSN applications.
2.2. Marking with time intervals: the same errors as in boxes, except for the last one - often two events can be marked with one interval. Response: The data freshness algorithm we used relies on timestamps, which ensure that every newly received packet transmitted in the network is free of untrusted sources and errors, so that new messages arrive uncontaminated as marked data.
2.3. Marking with segmentation masks: the same mistakes as in boxes, except for the first one - the most common mistake is a mask that is too rough. Response: Please refer to the data fragmentation process in Section 3.
What indicators do the authors use to detect such errors? Response: We refer to the proposed attacker probability model in Section 3.1, which detects any errors or unwanted network traffic.
- Before talking about aggregation, we must talk about choosing the size of the overlap. The standard value for many problems is an overlap of 3, since two markings are not enough - in the case where they do not match, the confidence in the final marking is 50%. Such confidence is equivalent to random guessing and is insufficient to obtain high-quality markings. Also, dynamic overlap is often used on crowdsourcing platforms: in the case when the aggregation of n labels does not reach a pre-selected confidence threshold, the overlap is increased by 1 (another person tags the sample). This scheme allows you to minimize overlap, thereby reducing the cost of time and financial resources. How did the authors take this into account? Response: Please refer to the proposed attacker probability model in Section 3.1, which detects any errors or unwanted network traffic.
- Battle aggregation methods have several typical problems:
4.1. There is an insufficient number of influences on the quality of the final markup, of which there are only 2: the first is the choice of a threshold for the metric, the second is the choice of the confidence value in the second modification. As the threshold and confidence value increase, the quality of the final marking will increase (and the share of aggregated results will rapidly fall). Response: We used a determined QoS in Section 3.3.1.
4.2. Huge costs of financial and time resources: additional marking takes time, and for markings that are finally rejected, you have to pay, since there is no knowledge of which of the markers made a mistake. Response: Please refer to the proposed attacker probability model in Section 3.1, which detects any errors or unwanted network traffic; the lower time complexity of our proposed work also helps keep the cost low. Thank you.
4.3. Poor quality markings. Response: We have used the information above to clarify any markings, based on the state of the art of the SDA we modelled. Thank you.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors propose secure data aggregation using authentication and authorization (SDAAA) protocol to detect malicious attacks, particularly cyberattacks such as Sybil and sinkhole, to extend the network performance. These attacks are more complex to be addressed through existing cryptographic protocols. The SDAAA protocol comprises a node authorization algorithm that permits legitimate nodes to communicate within the network and the authors claim that the SDAAA protocol improves the Quality of Service (QoS) parameters. The protocol is tested in an intelligent healthcare WSN patient monitoring application scenario and verified using an OMNET++ simulator.
I recommend major revision for below reasons:
1. In abstract, the author should include some statistics to highlight the improvement in their protocol when compared to other protocols.
2. Latency parameter is not studied.
3. The trade-off between the parameters are to be explored more.
4. Optimization is missing.
5. Why the authors have not mentioned the number of simulation runs?
6. Confidence interval is missing to prove the validity.
7. Type of randomness in malicious node selection to be highlighted.
8. Overall, work is good but above points to be considered and the paper should be improved.
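Points 5 and 6 ask for the number of simulation runs and a confidence interval. A minimal sketch of how a confidence interval over repeated runs is commonly reported is given below. It assumes independent runs and uses a normal approximation from the Python standard library; for a small number of runs a Student t quantile would give a wider interval. The function name and the sample values are hypothetical and are not taken from the manuscript.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, level=0.95):
    """Return (low, high) for the mean of samples, normal approximation."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two simulation runs are needed")
    centre = mean(samples)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    half_width = z * stdev(samples) / sqrt(n)  # z times the standard error
    return centre - half_width, centre + half_width


# Hypothetical metric values from five independent runs (not real results).
runs = [10.0, 12.0, 11.0, 13.0, 9.0]
low, high = mean_confidence_interval(runs)
```

Reporting the interval together with the number of runs lets the reader judge whether differences between protocols exceed run-to-run variation.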
Comments on the Quality of English Language
Minor editing of English language required
Author Response
Reviewer 2
Comments and Suggestions for Authors
The authors propose secure data aggregation using authentication and authorization (SDAAA) protocol to detect malicious attacks, particularly cyberattacks such as Sybil and sinkhole, to extend the network performance. These attacks are more complex to be addressed through existing cryptographic protocols. The SDAAA protocol comprises a node authorization algorithm that permits legitimate nodes to communicate within the network and the authors claim that the SDAAA protocol improves the Quality of Service (QoS) parameters. The protocol is tested in an intelligent healthcare WSN patient monitoring application scenario and verified using an OMNET++ simulator.
I recommend major revision for below reasons:
- In abstract, the author should include some statistics to highlight the improvement in their protocol when compared to other protocols. Response: We have included statistics in the abstract to highlight the improvement of our protocol. Thank you.
- Latency parameter is not studied. Response: The latency parameter will be studied in detail in future work continuing this research, because the scope of the present work limits the simulations that can be done now. Thank you.
- The trade-off between the parameters are to be explored more. Response: We have added a few lines on the trade-off between energy efficiency and the QoS metrics in the introduction and result analysis sections. Thank you.
- Optimization is missing. Response: We have introduced optimization in the attacker model from our previous work on detecting malicious intentions or cyber-attacks in the network, using probability analysis in Section 3.1. Thank you.
- Why the authors have not mentioned the number of simulations runs? Response: The simulation runs lasted 36 minutes, as already stated in the simulation parameters table (Table 2) in Section 4. Thank you.
- Confidence interval is missing to prove the validity. Response: We have included a confidence interval in the analysis of the results and refer to its use in the conclusion section. Thank you.
- Type of randomness in malicious node selection to be highlighted. Response: We have discussed randomness in malicious node selection in the realization of the throughput in Section 4, and it applies to all simulation runs. Thank you.
- Overall, work is good but above points to be considered and the paper should be improved.
Response: The overall work has been improved. Thank you.
Reviewer 3 Report
Comments and Suggestions for Authors
- The paper lacks discussion on the challenges and risks associated with privacy in Wireless Sensor Networks (WSN), particularly regarding data aggregation.
- The review of state of the art in Section I is unclear and poorly presented.
- The organization of the paper is unclear. For instance, the paper contains a section on the state of the art in Section I and related work in Section II. Additionally, there are repetitive concepts across different sections, and important aspects such as challenges, attacker models, and system architecture are not clearly identified. The proposed protocol lacks a clear presentation.
- In the chapter on design challenges, reference is made to schemes and protocols that have not been introduced in the related work section. Conversely, the challenges of the presented related work are not detailed.
- The work does not include an analysis of the privacy and security of the proposed scheme.
Comments on the Quality of English Language
- The writing style of the paper lacks clarity, and the sentences are in general too long.
Author Response
Reviewer 3
Comments and Suggestions for Authors
- The paper lacks discussion on the challenges and risks associated with privacy in Wireless Sensor Networks (WSN), particularly regarding data aggregation. Response: We have introduced a privacy preservation section, including the design objectives, in Section 2. Thank you.
- The review of state of the art in Section I is unclear and poorly presented. Response: We have moved the state-of-the-art material out of Section 1, as required, and added it to the related work in Section 2 for clarity. Thank you.
- The organization of the paper is unclear. For instance, the paper contains a section on the state of the art in Section I and related work in Section II. Additionally, there are repetitive concepts across different sections, and important aspects such as challenges, attacker models, and system architecture are not clearly identified. The proposed protocol lacks a clear presentation. Response: As stated above, we have moved the state-of-the-art material out of Section 1, as required, and added it as a separate part of the related work in Section 2 for clarity. We have also clarified the design challenges of the contending protocols and others, and introduced an attacker model in Section 2.1. We also thank the other reviewers, who described the work as good overall. Thanks to your recommendations, the work is now improved. Thank you.
- In the chapter on design challenges, reference is made to schemes and protocols that have not been introduced in the related work section. Conversely, the challenges of the presented related work are not detailed. Response: We have rectified this by moving the state of the art from Section 1 to the related work section, except for the contending protocol, which is our reference point and needs to remain in Section 1. Thank you.
- The work does not include an analysis of the privacy and security of the proposed scheme. Response: A privacy section has been included. Thank you.
Comments on the Quality of English Language
- The writing style of the paper lacks clarity, and the sentences are in general too long. Response: Most sentences have been shortened, and the overall English has been improved. Thank you.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I formulated the following comments to the previous version of the article:
1. Often, marking large data sets does not require special expertise, and it is customary to mark them on crowdsourcing platforms. The low threshold for entering such platforms as a performer is the reason for the low quality of the resulting markup. Therefore, an approach based on “overlap” is popular in crowdsourcing - each data sample is marked by several markers. This method allows you to get several versions of the required markup and a more confident result after averaging the results obtained from different markups. In order to bring several markups into one, aggregation methods are used. For example, for the classification task, most crowdsourcing platforms implement a majority opinion function. However, for tasks with more complex markings, such as detection and segmentation, everything is not so obvious, and something like a “majority opinion” function must be implemented independently. This is how it is done in reality. The prefix “Secure” does not exempt authors from using all of the above, it just needs to be done in a closed environment by people who have the appropriate clearance. The authors do aggregation differently. In this case, you need to justify the viability of your campaign.
2. There are three typical marker errors, which then affect the quality of aggregation:
2.1 Marking with boxes: incorrect location, incorrect mark, missing box, invalid marking;
2.2. Marking with time intervals: the same errors as in boxes, except for the last one - often two events can be marked with one interval;
2.3. Marking with segmentation masks: the same mistakes as in boxes, except for the first one - the most common mistake is a mask that is too rough.
What indicators do the authors use to detect such errors?
3. Before talking about aggregation, we must talk about choosing the size of the overlap. The standard value for many problems is an overlap of 3, since two markings are not enough - in the case where they do not match, the confidence in the final marking is 50%. Such confidence is equivalent to random guessing and is insufficient to obtain high-quality markings. Also, dynamic overlap is often used on crowdsourcing platforms: in the case when the aggregation of n labels does not reach a pre-selected confidence threshold, the overlap is increased by 1 (another person tags the sample). This scheme allows you to minimize overlap, thereby reducing the cost of time and financial resources. How did the authors take this into account?
4. Battle aggregation methods have several typical problems:
4.1. There is an insufficient number of influences on the quality of the final markup, of which there are only 2: the first is the choice of a threshold for the metric, the second is the choice of the confidence value in the second modification. As the threshold and confidence value increase, the quality of the final marking will increase (and the share of aggregated results will rapidly fall);
4.2. Huge costs of financial and time resources: additional marking takes time, and for markings that are finally rejected, you have to pay, since there is no knowledge of which of the markers made a mistake;
4.3. Poor quality markings.
How do the authors take this into account?
The authors responded to all my comments. I found their answers quite convincing. I support the publication of the current version of the article. I wish the authors creative success.
Author Response
REVIEWER
Some specific things:
- The introduction does not include a clear description of the challenges and requirements of data aggregation in WSN. Response: We have added information to the introduction about the challenges of constrained sensor-node memory in WSNs, which leads to redundant data and requires secure data aggregation to solve the problem of insecure data transmission in the application. Thank you.
- The role and position of wireless sensor nodes (AN, MSN, DAN, EMNs) is not clear, and figure 1 does not help in this. Response: We have added explanations of the AN, MSN, DAN, and EMNs to clarify their function and position in Figure 1. We have also improved Figure 1 with clearer information. Thank you.
- The attacker model defined in the process does not explain the role of the attacker or its attack mechanisms. Response: We have added information to the attacker model to clarify and explain the attack mechanism in the network. Thank you.
- Why the effect of QoS and accuracy leads to weak security control mechanisms must be justified. Response: QoS provides the capacity to handle and allocate efficient data delivery with high throughput for private healthcare data flows in the network traffic. Without QoS provision, the data accuracy expected from the security control mechanism in the network cannot be realized. Thank you.
- Also, the sentence "Since data aggregation enhances security" is not justified. Response: Data aggregation enhances security because the aggregator node keeps a single copy of all similar data in the network. This reduces redundant data and improves the energy efficiency of the sensor nodes, which extends the lifetime of the network. Thank you.
- It is not explained what is a negotiated sensor node. Response: "Negotiated sensor nodes" has been replaced with the correct term to clarify the meaning of the sentence. Thank you.
- The trust model of the system is not clear (who provides the authentication and authorization certificates, how the validation of the certificates works). Response: We have clarified the sentence by adding the role of the BS, which, in conjunction with the aggregator node, provides authentication and authorization for detecting attacks or errors in the network. Thank you.
- The paper should explain what does it mean a throughput of x%. Response: We have added the correct metrics and the corresponding percentages for the proposed protocol, compared with the contending protocols, to clarify how well the proposed protocol outperforms the other approaches. Thank you.
- Formulas are not fully explained. Response: The formulas are now explained in full detail, together with the added model. Thank you.
Thank you for your constructive feedback.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have carried out all suitable corrections suggested by the reviewers and they have improved the paper well. Hence the paper shall be accepted in present form.
Comments on the Quality of English Language
Minor editing of English language required
Author Response
REVIEWER
Some specific things:
- The introduction does not include a clear description of the challenges and requirements of data aggregation in WSN. Response: We have added information to the introduction about the challenges of constrained sensor-node memory in WSNs, which leads to redundant data and requires secure data aggregation to solve the problem of insecure data transmission in the application. Thank you.
- The role and position of wireless sensor nodes (AN, MSN, DAN, EMNs) is not clear, and figure 1 does not help in this. Response: We have added explanations of the AN, MSN, DAN, and EMNs to clarify their function and position in Figure 1. We have also improved Figure 1 with clearer information. Thank you.
- The attacker model defined in the process does not explain the role of the attacker or its attack mechanisms. Response: We have added information to the attacker model to clarify and explain the attack mechanism in the network. Thank you.
- Why the effect of QoS and accuracy leads to weak security control mechanisms must be justified. Response: QoS provides the capacity to handle and allocate efficient data delivery with high throughput for private healthcare data flows in the network traffic. Without QoS provision, the data accuracy expected from the security control mechanism in the network cannot be realized. Thank you.
- Also, the sentence "Since data aggregation enhances security" is not justified. Response: Data aggregation enhances security because the aggregator node keeps a single copy of all similar data in the network. This reduces redundant data and improves the energy efficiency of the sensor nodes, which extends the lifetime of the network. Thank you.
- It is not explained what is a negotiated sensor node. Response: "Negotiated sensor nodes" has been replaced with the correct term to clarify the meaning of the sentence. Thank you.
- The trust model of the system is not clear (who provides the authentication and authorization certificates, how the validation of the certificates works). Response: We have clarified the sentence by adding the role of the BS, which, in conjunction with the aggregator node, provides authentication and authorization for detecting attacks or errors in the network. Thank you.
- The paper should explain what does it mean a throughput of x%. Response: We have added the correct metrics and the corresponding percentages for the proposed protocol, compared with the contending protocols, to clarify how well the proposed protocol outperforms the other approaches. Thank you.
- Formulas are not fully explained. Response: The formulas are now explained in full detail, together with the added model. Thank you.
Thank you for your constructive feedback.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have made some improvements to the article; however, the manuscript still suffers from more general issues of organization and clarity in writing. It remains challenging to evaluate its contributions as it continues to intertwine numerous concepts and technologies without providing a clear description of the context (such as usage scenarios, types of attackers, etc.). The quality of the paper is not sufficient for consideration in the journal.
Some specific things:
- The introduction does not include a clear description of the challenges and requirements of data aggregation in WSN.
- The role and position of wireless sensor nodes (AN, MSN, DAN, EMNs) is not clear, and figure 1 does not help in this.
- The attacker model defined in the process does not explain the role of the attacker or its attack mechanisms.
- Why the effect of QoS and accuracy leads to weak security control mechanisms must be justified.
- Also, the sentence "Since data aggregation enhances security" is not justified.
- It is not explained what is a negotiated sensor node.
- The trust model of the system is not clear (who provides the authentication and authorization certificates, how the validation of the certificates works).
- The paper should explain what does it mean a throughput of x%.
- Formulas are not fully explained.
Comments on the Quality of English Language
Redaction must be improved
Author Response
REVIEWER 3
Some specific things:
- The introduction does not include a clear description of the challenges and requirements of data aggregation in WSN. Response: We have added information to the introduction about the challenges of constrained sensor-node memory in WSNs, which leads to redundant data and requires secure data aggregation to solve the problem of insecure data transmission in the application. Thank you.
- The role and position of wireless sensor nodes (AN, MSN, DAN, EMNs) is not clear, and figure 1 does not help in this. Response: We have added explanations of the AN, MSN, DAN, and EMNs to clarify their function and position in Figure 1. We have also improved Figure 1 with clearer information. Thank you.
- The attacker model defined in the process does not explain the role of the attacker or its attack mechanisms. Response: We have added information to the attacker model to clarify and explain the attack mechanism in the network. Thank you.
- Why the effect of QoS and accuracy leads to weak security control mechanisms must be justified. Response: QoS provides the capacity to handle and allocate efficient data delivery with high throughput for private healthcare data flows in the network traffic. Without QoS provision, the data accuracy expected from the security control mechanism in the network cannot be realized. Thank you.
- Also, the sentence "Since data aggregation enhances security" is not justified. Response: Data aggregation enhances security because the aggregator node keeps a single copy of all similar data in the network. This reduces redundant data and improves the energy efficiency of the sensor nodes, which extends the lifetime of the network. Thank you.
- It is not explained what is a negotiated sensor node. Response: "Negotiated sensor nodes" has been replaced with the correct term to clarify the meaning of the sentence. Thank you.
- The trust model of the system is not clear (who provides the authentication and authorization certificates, how the validation of the certificates works). Response: We have clarified the sentence by adding the role of the BS, which, in conjunction with the aggregator node, provides authentication and authorization for detecting attacks or errors in the network. Thank you.
- The paper should explain what does it mean a throughput of x%. Response: We have added the correct metrics and the corresponding percentages for the proposed protocol, compared with the contending protocols, to clarify how well the proposed protocol outperforms the other approaches. Thank you.
- Formulas are not fully explained. Response: The formulas are now explained in full detail, together with the added model. Thank you.
The English language has been improved.
Thank you for your constructive feedback.
Round 3
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have corrected the specific issues identified in the manuscript.
The objectives of the paper and the challenges of the literature are also clearer.
Some small issues are still present in the paper:
In the abstract and results (lines 900+)
"has a throughput of 444KB/Sec representing 98%,"
The percentage has to be explained; the reader does not know over which values this percentage is calculated.
The sentence should be "has a throughput of 444KB/Sec representing 98% of the maximum channel capacity", for example.
The same with the energy efficiency, effected network and time complexity.
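The point about percentages can be made concrete with a short sketch: a percentage only has meaning once the reference value is stated. The helper name and the channel capacity of 453 KB/s below are assumptions chosen purely for illustration; the manuscript's actual reference value is not given in this report.

```python
def percent_of_reference(value, reference):
    """Express value as a percentage of an explicitly stated reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return 100.0 * value / reference


# 444 KB/s is the throughput quoted above; 453 KB/s is an assumed
# (hypothetical) maximum channel capacity used only for this example.
share = percent_of_reference(444.0, 453.0)
```

The same pattern applies to energy efficiency, affected network share, and time complexity: each reported percentage should name the quantity it is measured against.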
419: Cybil -> Sybil
524: the BS authorities the legitimate -> the BS authorizes the legitimate
Author Response
In the abstract and results (lines 900+)
"has a throughput of 444KB/Sec representing 98%,"
ANS: Authors have rectified and improved the meaning of the sentence
The percentage has to be explained; the reader does not know over which values this percentage is calculated.
ANS: Authors have rectified and improved the meaning of the sentence
The sentence should be "has a throughput of 444KB/Sec representing 98% of the maximum channel capacity", for example.
ANS: Authors have rectified and improved the meaning of the sentence
The same with energy efficiency, effected network and time complexity.
ANS: Authors have rectified and improved the meaning of the sentence
Thank you for your feedback.