Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper unravels the multidisciplinary considerations required for a responsible use of face recognition tools applied to the classification of genetic diseases. The authors emphasize the privacy, ethical, cybersecurity and forensic challenges. I understand the importance of the topic as well as the multidisciplinary nature of it because of my previous work in face recognition and responsible AI. I think all the relevant information is in the paper; however, my concern is that it is not well organized or well written.

For example, the paper starts with: "Cybersecurity is a paramount concern when employing Artificial Intelligence (AI) systems", which misleads the reader into thinking the paper is about cybersecurity.

Artificial Intelligence acronym is defined twice in the introduction. The goal is paraphrased twice. There are repeated sentences.

It first says: "Rare disease affects fewer than five in every 10,000 people,"

and afterwards "Cumulatively rare diseases affect between 6-10% of the population [4,5], responding to one individual per 10-17 people."

So which one is it?

This paragraph starts with "The number of smartphone mobile network subscriptions worldwide reached almost 6.6 billion in 2022 and is forecast to exceed 7.8 billion by 2028". This seems disconnected from previous paragraphs.

The concept and methods section is difficult to follow with the format based on bulleted lists.

Discussion starts with "Current advances in generative AI are transforming all relevant fields..." I do not see the relation between generative AI and this paper, since the type of AI used for the classification of diseases would probably be a discriminative method rather than a generative one. I find the examples presented throughout the paper, together with the analysis, valuable, but the paper needs better structure and needs to be written and reviewed more carefully by the authors.

I received a version with chunks of red font indicating last minute changes, so maybe this was not the final version to upload.

I would suggest a major revision not about the content which is good but about the structure and writing which does not let the content shine.

Comments on the Quality of English Language

English is ok, but overall writing needs improvements

Author Response

Reviewer 1

Comments and Suggestions for Authors

Dear reviewer, thank you for this remark. We have rewritten the Introduction to make it clearer and less misleading. Thank you.

Artificial Intelligence acronym is defined twice in the introduction. The goal is paraphrased twice. There are repeated sentences.

Thank you for noticing. We have removed the duplication and reformulated the goals of the paper. The changes are shown as tracked changes.

It first says: "Rare disease affects fewer than five in every 10,000 people,"

and afterwards "Cumulatively rare diseases affect between 6-10% of the population [4,5], responding to one individual per 10-17 people."

So which one is it?

Thank you for this observation. Unfortunately, the official definition of rare diseases varies by region, which we have not described in more detail (lines 78-85). Generally, a rare disease is defined as a condition that affects a small percentage of the population. Here are the definitions from different sources:

European Union (EU): A rare disease is defined as one that affects fewer than 1 in 2,000 people. This definition is established under the European Commission’s Regulation on Orphan Medicinal Products.
United States (US): According to the Rare Diseases Act of 2002, a rare disease is defined as one that affects fewer than 200,000 people in the US, which translates to approximately 1 in 1,600 people, considering the US population.
World Health Organization (WHO): Although the WHO does not have a universally agreed-upon definition, it often aligns with regional definitions and highlights that rare diseases are characterized by their low prevalence in the general population.

We have removed the errors and duplications and clarified the definitions.
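As a quick check of the figures quoted above, the following minimal Python sketch converts each prevalence into a "1 in N" rate. It suggests that the two statements the reviewer contrasted are not contradictory: the first is a per-disease threshold, the second a cumulative share across all rare diseases. The function name `one_in` and the US population of roughly 330 million are assumptions made for illustration; neither appears in the original text.

```python
from fractions import Fraction


def one_in(prevalence: Fraction) -> float:
    """Return N such that the given prevalence equals '1 in N'."""
    return float(1 / prevalence)


# Per-disease threshold quoted by the reviewer: fewer than 5 in 10,000.
per_disease = Fraction(5, 10_000)

# Cumulative range quoted by the reviewer: 6-10% of the population.
cumulative_low = Fraction(6, 100)
cumulative_high = Fraction(10, 100)

# ASSUMPTION: a US population of about 330 million (not stated in the text).
us_population = 330_000_000
us_threshold = Fraction(200_000, us_population)

print(one_in(per_disease))               # 2000.0 (matches the EU "1 in 2,000")
print(one_in(cumulative_high))           # 10.0
print(round(one_in(cumulative_low), 1))  # 16.7
print(one_in(us_threshold))              # 1650.0 (close to the quoted "1 in 1,600")
```

Under these assumptions, 5 in 10,000 is the same threshold as 1 in 2,000, while 6-10% corresponds to one person in roughly 10 to 17.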

Thank you for noticing. A paragraph introducing the Face2Gene app had previously been removed, resulting in this discontinuity. We have corrected it (lines 289-294).

The concept and methods section is difficult to follow with the format based on bulleted lists.

We have added some text to improve the readability of this section. Together with the figures, it should now be comprehensible even to readers from non-medical fields.

Dear Reviewer, thank you for pointing out this error. The word "generative" has been removed and the discussion section has also been reorganised. At the end of the chapter, we have added two more specific examples that may help readers understand the ethical and legal issues surrounding AI-driven facial image analysis, and we have outlined potential future research directions that we consider.

I would suggest a major revision not about the content which is good but about the structure and writing which does not let the content shine.

Dear reviewer, thank you for all your suggestions and remarks. We have made a major revision of the paper and reorganized it as suggested. All changes are shown as tracked changes in the attached document.

authors

Comments on the Quality of English Language

English is ok, but overall writing needs improvements

Submission Date 03 May 2024 Date of this review 03 Jun 2024 17:10:52

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The article explores the transformative potential of AI in detecting rare genetic diseases using large-scale facial image databases. While the article presents a compelling case for the benefits of such technology, several weaknesses can be noted:

1) Introduction and Objectives: The introduction effectively outlines the paper's aims but could benefit from a clearer articulation of the specific contributions of the review.

2) Concept and Methods: The concept and methodology section is comprehensive. However, it could be strengthened by providing more detailed explanations of the AI algorithms used and their potential limitations.

3) Ethical and Legal Considerations: The paper thoroughly discusses ethical and legal issues but should include more examples of real-world applications and potential challenges in different jurisdictions.

4) Forensic Applications: The potential use of AI in forensics is well-covered, but the paper should explore more deeply the specific ethical dilemmas and practical challenges in implementing these technologies in forensic contexts.

5) Cybersecurity: The discussion on cybersecurity is crucial but could be expanded to include more specific strategies for mitigating risks associated with handling sensitive data.

6) Recommendations and Future Research: The recommendations are well-founded, but the paper could benefit from a more detailed roadmap for future research, including specific areas where more data is needed and potential methods for addressing current biases in AI algorithms.

In conclusion, while the article provides a thorough examination of the potential and challenges of AI-driven facial image analysis for rare disease detection, it could benefit from more concrete strategies to mitigate bias, enhance privacy and security, and navigate the complex legal landscape. Additionally, a clearer roadmap for future research would strengthen its contributions to the field.

Comments on the Quality of English Language

Extensive editing of the English language is required.

Author Response

Reviewer 2

Comments and Suggestions for Authors

Submission Date 03 May 2024 Date of this review 26 May 2024 00:34:53

Responses point by point:

1) Introduction and Objectives: The introduction effectively outlines the paper's aims but could benefit from a clearer articulation of the specific contributions of the review.

Dear reviewer, thank you for your input. We appreciate your recognition of the aims outlined in the introduction. We understand the importance of clearly articulating the specific contributions of our review and have made the necessary revisions to enhance clarity in this section. We have revised the text of the Introduction and objectives (lines 46-68 and lines 130-148).

2) Concept and Methods: The concept and methodology section is comprehensive. However, it could be strengthened by providing more detailed explanations of the AI algorithms used and their potential limitations.

Thank you for your valuable feedback on the concept and methodology section of our paper. We appreciate your suggestion to provide more detailed explanations of the AI algorithm used and its potential limitations, and we have incorporated this into our revisions to enhance the clarity and depth of the methodology section. We mostly refer to the Face2Gene app; its algorithm is well described in published work and is cited in this paper as references [15] and [45], among others. As the analytical algorithm is not the subject of this work, we prefer not to shift the focus to it, since there are many other such algorithms. The focus is on the risks and limitations, as well as the advantages, of employing such AI agents. Your insights are instrumental in improving the quality of this paper. Changes were made throughout the Methodological section and are shown as tracked changes (lines 150 onwards). We have also added a new paragraph to the Discussion chapter, discussing the suggested AI agent for the facial analysis step (Face2Gene), with the following wording: “While the Face2Gene app represents a significant advancement in AI-driven facial image analysis for the early detection of rare diseases, it also has several limitations. Legally, using the app requires navigating complex regulations around patient data privacy and consent, particularly across different jurisdictions. Ethically, there are concerns about potential biases in the AI algorithms that could lead to misdiagnosis or unequal access to accurate diagnosis for different populations. Forensically, the integration of the app into clinical practice must ensure rigorous standards to avoid misuse or over-reliance on AI without adequate human oversight. Finally, from a cybersecurity perspective, protecting the sensitive health data used and generated by the app is paramount to prevent unauthorised access and data breaches that could compromise patient confidentiality and trust.
Similar considerations apply to any rapidly evolving AI agents used in other steps of the concept presented, including data anonymisation.” Lines 648-659.

Dear reviewer, thank you for your suggestion. We have added two real-world applications, with their potential challenges in different jurisdictions, to enrich the discussion of the ethical and legal issues surrounding AI-driven facial image analysis for rare disease detection. We have placed them at the end of the Discussion chapter (lines 660-692), where they better illustrate the practical implications and challenges of deploying AI-driven facial image analysis for rare disease detection across various legal and ethical landscapes. The two added examples are the following:

Application in the United States: Integration with National Health Databases
In the United States, the integration of AI-driven facial recognition tools like Face2Gene with national health databases poses significant legal and ethical challenges. One potential application is using these tools for early screening of genetic disorders in newborns. While this could revolutionize early diagnosis and treatment, it raises concerns about compliance with the Health Insurance Portability and Accountability Act (HIPAA) and other state-specific privacy laws. These regulations require stringent measures to protect patient data and ensure consent. Additionally, the risk of data breaches and misuse of sensitive genetic information by unauthorized parties necessitates robust cybersecurity protocols. The ethical dilemma of potential biases in AI algorithms, leading to misdiagnoses or unequal treatment across different racial and ethnic groups, also demands careful consideration and mitigation strategies.

Application in the European Union: Cross-Border Data Sharing and GDPR Compliance
Within the European Union, AI-driven facial image analysis for rare disease detection could be applied through collaborative cross-border health initiatives aimed at improving diagnostic accuracy and sharing genetic research data. However, this application faces substantial legal hurdles due to the General Data Protection Regulation (GDPR). GDPR imposes strict requirements on data privacy and cross-border data transfers, necessitating comprehensive data protection impact assessments and ensuring explicit informed consent from individuals. Challenges include harmonizing diverse national regulations within the EU member states and addressing potential conflicts with GDPR's principles of data minimization and purpose limitation. Moreover, ethical concerns about the potential for AI to perpetuate biases and the need for transparent, explainable AI models are crucial to maintaining public trust and achieving equitable healthcare outcomes across different regions.

Thank you for your remarks regarding the forensic applications. We appreciate your suggestion to delve more deeply into the specific ethical dilemmas and practical challenges associated with implementing these technologies in forensic contexts. We have now incorporated additional content to address these concerns. Specifically, we have elaborated on the ethical dilemmas related to privacy and consent, bias in criminal identification, due process, and the potential for misuse of AI technology. Additionally, we have discussed practical challenges such as data quality and integrity, interoperability with existing systems, legal admissibility of AI evidence, technical expertise and training requirements, and the necessity for ethical oversight and governance. We have elaborated this throughout the whole paper and added two separate paragraphs:

“The use of AI-driven facial image analysis in forensic applications presents several specific ethical dilemmas. Privacy and consent are major concerns, as individuals are often analyzed without their knowledge or consent. Bias in criminal identification can lead to unequal treatment of different demographic groups, resulting in wrongful accusations or convictions, particularly affecting minority communities. Ensuring due process and a fair trial is challenging with opaque AI algorithms, as defendants must be able to challenge the evidence against them. Additionally, there is a risk of misuse of AI technology for unauthorized surveillance or political targeting, raising significant ethical and civil liberties concerns.” Lines 276-286

and lines 337-350
“Practical Challenges Specific to Forensic Applications

Implementing AI in the forensic contexts of the described workflows involves several practical challenges. Data quality and integrity are critical, as forensic investigations often deal with low-quality or partial data. Ensuring AI systems can accurately process such data while maintaining its integrity is essential. Interoperability with existing law enforcement systems requires standardization and compatibility across various platforms and jurisdictions. Legal admissibility of AI-derived evidence in court necessitates demonstrating the reliability and validity of the AI methods used. Providing adequate training and ongoing support to law enforcement personnel for effective use of AI tools is also crucial. Lastly, establishing robust ethical oversight and governance mechanisms to monitor AI systems and address any emerging ethical or legal issues is necessary for responsible implementation in forensic work.”

We believe these enhancements provide a comprehensive view of the complexities involved in the forensic application of AI technologies and hope they meet your expectations.

5) Cybersecurity: The discussion on cybersecurity is crucial but could be expanded to include more specific strategies for mitigating risks associated with handling sensitive data.

Thank you for this suggestion. We have slightly elaborated the sections addressing the cybersecurity aspects of the paper, although we believe they were already quite extensive. We agree that an expansion focused on specific strategies for mitigating the risks associated with handling sensitive data would be useful for readers, so we have included the following new subchapter 2.8: “Cybersecurity Aspects and Specific Strategies for Mitigating Risks

The cybersecurity aspect of implementing AI-driven facial image analysis for rare disease detection and forensic applications is critical due to the sensitive nature of the data involved. Protecting facial images and genetic information from unauthorized access and data breaches is paramount. Specific strategies to mitigate these risks include implementing robust encryption methods for data storage and transmission, ensuring that sensitive data is encrypted both at rest and in transit.

One specific strategy is the use of homomorphic encryption, which allows data to be processed and analyzed without the need to decrypt it. This technique ensures that sensitive information remains secure even while being computed, significantly reducing the risk of exposure.

In addition, the use of multi-factor authentication (MFA) to access databases can add an extra layer of security and reduce the risk of unauthorized access. Regular cybersecurity audits and vulnerability assessments should be conducted to identify and address potential security gaps. Advanced threat detection systems, such as intrusion detection systems (IDS) and intrusion prevention systems (IPS), can help identify and mitigate potential cyber-attacks in real time.

Implementing differential privacy techniques can protect individual data by adding noise to the data set, making it more difficult to extract personal information. Ensuring strict access controls so that only authorized personnel have access to sensitive data is essential. Educating and training staff on cybersecurity best practices can further strengthen an organization’s security posture.

Integrating these strategies, particularly the use of homomorphic encryption, can significantly mitigate the risks associated with handling sensitive data in AI-driven applications, ensuring the confidentiality, integrity and availability of the data.”
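To make the differential privacy strategy named in the quoted subchapter more concrete, here is a minimal Python sketch of one common variant, the Laplace mechanism, applied to a counting query. It is an illustration under stated assumptions, not the authors' implementation: the record layout, the `flagged` field, the function names and the choice of epsilon are all hypothetical, and the noise is added to a query result rather than to the stored data set itself.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-differential privacy (Laplace mechanism).

    Adding or removing one record changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# HYPOTHETICAL data: 1,000 records, every fourth one flagged (true count 250).
records = [{"id": i, "flagged": i % 4 == 0} for i in range(1000)]
rng = random.Random(0)  # fixed seed only to make the sketch repeatable

noisy = private_count(records, lambda r: r["flagged"], epsilon=0.5, rng=rng)
print(f"noisy count: {noisy:.1f} (true count is 250)")
```

A smaller epsilon gives stronger privacy but noisier answers, so the value would need to be chosen against the accuracy required by the analysis.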

Thank you for this remark. We have discussed the future research pathways briefly and as specifically as possible. We added the following paragraph to the end of the Discussion chapter:

“Future research should prioritise the following areas to improve AI-driven facial image analysis for rare disease detection and forensic applications:

Diverse data collection: Collect comprehensive datasets from underrepresented populations, including different ethnicities, ages and genders, to reduce bias in AI models.
Bias detection and mitigation: Develop and implement fairness-aware algorithms and bias correction techniques. Use transfer learning to improve model generalisability and reduce bias.
Explainable AI (XAI): Focus on building models that provide transparent and interpretable results to build trust and facilitate bias identification.
Robust validation frameworks: Establish rigorous validation protocols that test AI models across diverse demographic groups before deployment.
Multidisciplinary collaboration: Encourage collaboration between geneticists, data scientists, ethicists and legal experts to ensure comprehensive evaluation and ethical deployment.
Standardised protocols: Develop standardised data collection, model training and validation protocols across jurisdictions to improve reliability and acceptability.”
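The "robust validation frameworks" item in the list above can be illustrated with a minimal Python sketch that reports accuracy separately for each demographic group and the largest gap between groups. This is only one simple check among many possible fairness metrics; the function names, the group labels "A" and "B" and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group label to its classification accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# HYPOTHETICAL toy data: true labels, model predictions and group membership.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)                    # {'A': 0.75, 'B': 0.5}
print(max_accuracy_gap(per_group))  # 0.25
```

A validation protocol could, for example, block deployment when the gap exceeds an agreed threshold, though the appropriate metric and threshold would depend on the clinical or forensic setting.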

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The paper was improved significantly to the extent that it is easily readable; however, some minor issues remain. The two figures repeat what is written in the text, so please choose either text or images instead of both. It is still not clear why the bullet list format for some of the subsections was chosen, and it seems disconnected from the paper content, see for example the outline subsection. Many of the points there are not mentioned anymore in the paper. I struggle with this format, but if the authors are used to publishing their research in this format in other journals and they are confident enough of its usefulness, then they can choose to leave it like that.

About personal privacy when it says: "The intended studies cannot be performed on the NID data itself but on databases specifically created and derived from it." I would say that the challenge could be that preserving privacy means limiting the possibility of even having access to the NID database in the first place. In the past due to privacy we could not even access a certain database to use it for generating synthetic data out of it for analysis. What privacy means in practice is a complicated topic and can challenge the possibility of analysis in unforeseen ways.

Author Response

Responses from authors:

Dear Reviewer, thank you for your positive feedback on our revised version. We appreciate your further comments. You are correct that the two figures repeat what is written in the text; they were created at the request of the Editor after our prior submission. Following your remark, we have rephrased the whole of section 2.3 with better formatting. It no longer contains the same text as the figure, although we prefer to keep the bulleted list format.

As the bulleted sections seemed disconnected from the paper content, we have added a paragraph at the beginning that links the conceptual system model to the broader discussions of the paper, emphasizing its relevance and importance. In a similar way, we have revised section 2.4 so that the text is not duplicated in the figure, and we have provided an expanded version that gives more detail on each step, ensuring clarity and a better understanding of the processes involved. These concepts (2.3-Conceptual System Model and 2.4-Workflow-Based System Model) are not detailed further in the paper, as the paper would otherwise be too extensive and unreadable.

Thank you for your insightful comment regarding the challenges of preserving personal privacy while accessing National ID (NID) databases. You commented on the statement “The intended studies cannot be performed on the NID data itself but on databases specifically created and derived from it.” We fully acknowledge the complexities involved in balancing privacy protection with the need for data access.

Our statement aims to highlight the necessity of creating separate, controlled datasets to enhance privacy safeguards. However, we agree that even gaining initial access to the NID database poses significant challenges due to stringent privacy regulations.

We have revised the beginning of Section 2.7, where we have also emphasized your valid remark. Lines 691-699 were added:

“The issue of personal privacy is critical when accessing National ID (NID) databases for research. While our approach emphasizes creating derived databases for enhanced privacy, it could be considered that preserving privacy means limiting the possibility of even having access to the NID database in the first place. Privacy regulations often restrict such access, complicating the generation of synthetic data and comprehensive analysis. In practice, preserving privacy requires strict legal and ethical guidelines, advanced anonymization techniques, and limited, monitored access to data. These measures can make analysis difficult but are necessary to balance privacy with research needs.”

Thank you once again for your valuable feedback. We believe that addressing these points will significantly strengthen the rigor and relevance of our paper.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

I have no additional remarks on the revised version.

The authors have addressed my remarks.

Comments on the Quality of English Language

Minor editing of the English language is required.

Author Response

Responses from authors:

Dear Reviewer,

Thank you for your positive feedback on our revised version. We appreciate your acknowledgment that we have addressed your remarks. We will ensure that minor edits to the English language are made to further improve the clarity and quality of the manuscript.

Kind regards, authors

Author Response File: Author Response.pdf
