Article
Peer-Review Record

AI, Cultural Heritage, and Bias: Some Key Queries That Arise from the Use of GenAI

Heritage 2024, 7(11), 6125-6136; https://doi.org/10.3390/heritage7110287
by Anna Foka 1,* and Gabriele Griffin 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 28 August 2024 / Revised: 14 October 2024 / Accepted: 27 October 2024 / Published: 29 October 2024
(This article belongs to the Special Issue AI and the Future of Cultural Heritage)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Overall, the manuscript provides a solid foundation; nevertheless, I have highlighted some instances where more clarification and development could strengthen the argument and improve the clarity of the discussion. Below, I outline my observations. 

Introduction:

The introduction effectively presents the issue of AI training and the risks associated with implicit biases introduced through the data used to train these systems. However, while the paper touches upon significant concerns such as gender and LGBTQ+ representation, I find that the discussion remains somewhat generic. It would be beneficial to provide more concrete examples of when and how these biases manifest, as well as the extent of their impact. Moreover, the ethical dilemma of applying modern inclusivity standards to historical contexts is introduced but not sufficiently addressed. For instance, incorporating women and black individuals into depictions of the agora of Periclean Athens, where social norms and demographics were notably different, may create a more inclusive narrative but also risks distorting historical accuracy. I encourage the authors to investigate this ethical dilemma further, perhaps proposing ways to balance inclusivity with historical fidelity.

Bias Representation:

The manuscript effectively defines the concept of bias and the associated problems, but in Section 3 the examples provided primarily focus on morphological and iconographic biases. I believe these sections should either explore ethical biases more fully, particularly those related to historical representation, or confine their focus to the current debate on form and iconography.

Annotations in Section 3.2:

Section 3.2 discusses an interesting idea concerning bias mitigation via annotations. I would advise the authors to provide specific instances, whether theoretical or practical, of how annotations could be utilized to correct AI flaws. This could include details about how different forms of input (e.g., 2D images such as photographs or paintings, or 3D data from photogrammetric or laser-scanning surveys) necessitate different annotation procedures. In my opinion, either expanding this part with more specific examples or removing the annotation topic entirely would improve the paper's coherence.
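By way of illustration only, here is a minimal sketch in Python of why input modality matters for annotation procedures; the class and field names are hypothetical and are not drawn from the manuscript under review.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Image2DAnnotation:
    """Annotation for a 2D photograph or painting of a heritage object."""
    image_id: str
    bounding_box: Tuple[int, int, int, int]  # x, y, width, height in pixels
    label: str                               # e.g. "marble kouros, Archaic period"
    curatorial_note: str = ""                # free-text context that can counter biased labels

@dataclass
class PointCloud3DAnnotation:
    """Annotation for a segment of a photogrammetric or laser-scanning survey."""
    scan_id: str
    point_indices: List[int]                 # indices of the points belonging to the segment
    label: str
    capture_method: str = "photogrammetry"   # or "laser scanning"
    georeference: Tuple[float, float] = (0.0, 0.0)  # approximate lat/lon of the survey

# A 2D annotation localises a region in pixel space, while a 3D annotation must
# reference points in a reconstructed space and record how the data were captured,
# which is why the two require different annotation procedures and quality checks.
example_2d = Image2DAnnotation("img_001", (120, 80, 300, 560), "kouros statue")
example_3d = PointCloud3DAnnotation("scan_017", [10, 11, 12, 42], "kouros torso")
print(example_2d, example_3d, sep="\n")
```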

Conclusions:

The conclusions raise an important issue regarding communication between datasets, which is indeed a challenge in AI applications. However, this problem is not fully developed within the paper. If the authors intend to present this as a potential solution or focal point, I would recommend expanding this section to offer a more in-depth exploration. If not, it may be preferable to omit this point in favor of maintaining focus on the primary topics discussed earlier.

Comments on the Quality of English Language

The overall quality of the English in the article is generally clear, but there are specific areas that could benefit from improvement for greater precision and readability. For example, in the sentence “Biases can mislead machine learning models into making incorrect assumptions,” the phrase "making incorrect assumptions" could be revised to "drawing inaccurate conclusions" for greater clarity. Similarly, “when AI is educated with flawed data, it risks absorbing unintentional distortions” could be simplified to "AI trained on flawed data may unintentionally incorporate biases."

There are also instances where sentences are unnecessarily complex. For example, the sentence “The article attempts to address the issues of bias, but it does not fully resolve the ethical dilemma of balancing historical accuracy with modern inclusivity standards” could be broken into two separate sentences for better flow: “The article raises the issue of bias. However, it does not fully resolve the ethical dilemma of balancing historical accuracy with modern inclusivity standards.”

Author Response

Reviewer 1

Introduction:

The introduction effectively presents the issue of AI training and the risks associated with implicit biases introduced through the data used to train these systems. However, while the paper touches upon significant concerns such as gender and LGBTQ+ representation, I find that the discussion remains somewhat generic. It would be beneficial to provide more concrete examples of when and how these biases manifest, as well as the extent of their impact.

We have a very limited word count and so had to keep this discussion brief; we also think we do more than merely 'touch upon' these concerns.

 

Moreover, the ethical dilemma of applying modern inclusivity standards to historical contexts is introduced but not sufficiently addressed. For instance, incorporating women and black individuals into depictions of the agora of Periclean Athens, where social norms and demographics were notably different, may create a more inclusive narrative but also risks distorting historical accuracy. I encourage the authors to investigate this ethical dilemma further, perhaps proposing ways to balance inclusivity with historical fidelity.

We have included a brief comment on this in the actual text, but we would also like to note the following: the social demography of classical Athens was quite different from what the concept of citizenship and citizen rights would suggest. Women and marginalized groups may not have taken centre stage in the agora, but the agora was a space where both trade and rituals took place, and we now know how diverse Greek and Roman antiquity was (see Lefkowitz, M. R., & Rogers, G. M. (Eds.). (2014). Black Athena Revisited. UNC Press Books). The stereotypical image we have of the Athenian collective (not just the citizens) also has to do with the fact that what survives from this period is essentially the accounts of a few literate citizen men, with women's and foreigners' voices coming through them (Lape, S. (2010). Race and Citizen Identity in the Classical Athenian Democracy. Cambridge University Press; Kennedy, R. F. (2014). Immigrant Women in Athens: Gender, Ethnicity, and Citizenship in the Classical City. Routledge; McCoskey, D. E. (2021). Race: Antiquity and its Legacy. Bloomsbury Publishing). However, we simply do not have the word space to go into all of this in the text itself.

 

Bias Representation:

The manuscript effectively defines the concept of bias and the associated problems, but in Section 3 the examples provided primarily focus on morphological and iconographic biases. I believe these sections should either explore ethical biases more fully, particularly those related to historical representation, or confine their focus to the current debate on form and iconography.

We have tried to address bias to the extent that the word count allowed. We have a larger piece on historical representation forthcoming in another journal, and several pieces cited here are concerned with the same issues.

 

 

Annotations in Section 3.2:

Section 3.2 discusses an interesting idea concerning bias mitigation via annotations. I would advise the authors to provide specific instances, whether theoretical or practical, of how annotations could be utilized to correct AI flaws. This could include details about how different forms of input (e.g., 2D images such as photographs or paintings, or 3D data from photogrammetric or laser-scanning surveys) necessitate different annotation procedures. In my opinion, either expanding this part with more specific examples or removing the annotation topic entirely would improve the paper's coherence.

We have added about three paragraphs at the end of Section 3.2 to deal with this.

 

Conclusions:

The conclusions raise an important issue regarding communication between datasets, which is indeed a challenge in AI applications. However, this problem is not fully developed within the paper. If the authors intend to present this as a potential solution or focal point, I would recommend expanding this section to offer a more in-depth exploration. If not, it may be preferable to omit this point in favor of maintaining focus on the primary topics discussed earlier.

Again, we have added some text here to respond to this.

 

Comments on the Quality of English Language

The overall quality of the English in the article is generally clear, but there are specific areas that could benefit from improvement for greater precision and readability. For example, in the sentence “Biases can mislead machine learning models into making incorrect assumptions,” the phrase "making incorrect assumptions" could be revised to "drawing inaccurate conclusions" for greater clarity. Similarly, “when AI is educated with flawed data, it risks absorbing unintentional distortions” could be simplified to "AI trained on flawed data may unintentionally incorporate biases."

There are also instances where sentences are unnecessarily complex. For example, the sentence “The article attempts to address the issues of bias, but it does not fully resolve the ethical dilemma of balancing historical accuracy with modern inclusivity standards” could be broken into two separate sentences for better flow: “The article raises the issue of bias. However, it does not fully resolve the ethical dilemma of balancing historical accuracy with modern inclusivity standards.”

We have made all necessary corrections and worked on the language.

 

Reviewer 2 Report

Comments and Suggestions for Authors

The paper is well structured and the argument flows very clearly.

This paper gives a first description of the problem, but more information should be added to explain how this type of problem could be approached in the future; adding more details of the architecture could also be really useful.

The results support the conclusions drawn, but the experiments and calculations should be described in more detail: a note specifying how the model has been trained would be interesting (probably not all the information is available), because readers might find it interesting to have more information on how ChatGPT-4.0 and DALL·E are trained. Some information on how the software has been implemented, and a block diagram, could also be useful.

Previous studies have been adequately cited.

The article is really interesting and could offer good insights for future work on these specific topics, especially regarding how to detect and eventually mitigate biases in trained models.

Author Response

Reviewer 2

The paper is well structured and the argument flows very clearly.

This paper gives a first description of the problem, but more information should be added to explain how this type of problem could be approached in the future; adding more details of the architecture could also be really useful.

The last three paragraphs of Section 3.2 provide more concrete examples.

 

The results support the conclusions drawn, but the experiments and calculations should be described in more detail: a note specifying how the model has been trained would be interesting (probably not all the information is available), because readers might find it interesting to have more information on how ChatGPT-4.0 and DALL·E are trained.

This has now been added under Section 3.2.

Reviewer 3 Report

Comments and Suggestions for Authors

The research pursued within the manuscript is not sufficient in its present form. Much of the text is introduction/state of the art. The experimental part is insufficient. Deeper and more structured research must be done for the manuscript to be published. For example, a comparison among diverse AI tools could be developed.

In addition, the references in the text do not respect the journal format, and some of them are out of date. The same applies to the figures, whose size is not suitable for publication.

Comments on the Quality of English Language

Minor editing of the English language is required, including essay (e.g. 'Our article') and UK/US English misspellings (e.g. 'digitalization').

Author Response

Reviewer 3

The research pursued within the manuscript is not sufficient in its present form. Much of the text is introduction/state of the art. The experimental part is insufficient. Deeper and more structured research must be done for the manuscript to be published. For example, a comparison among diverse AI tools could be developed.

We have added some more material on AI tools but are limited by word space here.

 

In addition, the references in the text do not respect the journal format, and some of them are out of date. The same applies to the figures, whose size is not suitable for publication.

We have re-done all the references and hope they now conform.

 

Comments on the Quality of English Language

Minor editing of the English language is required, including essay (e.g. 'Our article') and UK/US English misspellings (e.g. 'digitalization').

We have worked on the language. We don't understand the reference to 'essay'; we use the word 'article' because this is what we think we have written.

Reviewer 4 Report

Comments and Suggestions for Authors

The authors discuss how bias is present in cultural heritage collections (CHCs) and their digital versions and how AI can exacerbate this bias. The manuscript examines the challenges and potential of utilising GenAI for cultural heritage, emphasising the need for human expertise to ensure accurate representations. It underscores a lack of well-annotated datasets with structured metadata in cultural heritage communities and recommends integrating bias mitigation techniques throughout the process to address these issues.

(i) This manuscript has potential. The authors should fully discuss how AI can amplify bias and hinder effective AI implementation due to a lack of well-annotated datasets and structured metadata in CHCs. They should also cover the contribution of insufficient humanities expertise in generative AI platforms, which can lead to biased interpretations and classifications. Additionally, they should address limitations to the effectiveness of AI applications due to a lack of interconnectivity and interoperability among digitised collections. For the manuscript to contribute to the discourse of generative AI in heritage management, its quality must be improved. Therefore, this review recommends incorporating a few references and expanding on some areas.

(ii) The authors should provide a more comprehensive and robust analysis of the challenges and potential solutions for using AI in cultural heritage collections. This will strengthen the theoretical framework and offer practical insights and recommendations for future research and policy development.

(iii) The section on bias-perpetuated algorithms can be expanded by reading O’Neil, C. (2017). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Penguin Random House. Although the article (Gilliland-Swetland, A.J. (2002). Digital preservation and metadata: history, theory, practice. The Journal of Academic Librarianship, 28, 165-166) is somewhat dated, it could provide a broader context. The discussion of human-in-the-loop (HITL) approaches in AI, which involve human oversight to mitigate bias, using Monarch, R. (M.) (2021). Human-in-the-Loop Machine Learning: Active Learning and Annotation for Human-Centered AI. Manning, as a starting point, can enhance the theoretical foundation of this section. The authors can also discuss case studies where HITL has been successfully implemented in cultural heritage projects to add practical insights to the section (see the illustrative sketch following this list).

(iv) The ethical considerations in AI are well discussed, but we recommend referencing either Müller, V. C. (2020). Ethics of Artificial Intelligence and Robotics. In Edward N. Zalta (Ed.), Stanford Encyclopedia of Philosophy (1-70), for a comprehensive overview of ethical issues in AI, or Colley, S. (2015). Ethics and Digital Heritage. In: Ireland, T., Schofield, J. (Eds.) The Ethics of Cultural Heritage. Ethical Archaeologies: The Politics of Social Justice, vol. 4. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1649-8_2, to strengthen the ethical framework of the discussion.

(v) The manuscript discusses the lack of interconnectivity and interoperability in CHCs. The authors can provide practical solutions for enhancing interconnectivity by consulting Jones, E., & Seikel, M. (2016). Linked Data for Cultural Heritage. Routledge.

 

(vi) The authors should discuss the implications of their work for future directions and policy-making. This literature can be useful: Verdegem, P. (2021). AI for Everyone? Critical Perspectives. University of Westminster Press; and Joy, C. L. (2016). The Politics of Heritage Management in Mali: From UNESCO to Djenné. Taylor & Francis, which provides practical guidelines for policy-makers.
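To make the human-in-the-loop suggestion in point (iii) concrete, here is a minimal, hypothetical sketch in Python of a HITL labelling pass that routes a model's least confident predictions to a human annotator. The function names, threshold, and stand-in data are illustrative assumptions, not taken from Monarch's book or from the manuscript.

```python
from typing import Callable, Dict, List, Tuple

def human_in_the_loop_pass(
    model_predict: Callable[[str], Tuple[str, float]],  # returns (label, confidence)
    ask_expert: Callable[[str], str],                    # a human supplies the correct label
    items: List[str],
    confidence_threshold: float = 0.8,
) -> Dict[str, str]:
    """One pass of a simple HITL loop: confident machine labels are kept,
    uncertain items are routed to a human expert for review."""
    reviewed: Dict[str, str] = {}
    for item in items:
        label, confidence = model_predict(item)
        if confidence < confidence_threshold:
            # Human oversight here is where biased or simply wrong labels get corrected.
            label = ask_expert(item)
        reviewed[item] = label
    return reviewed

# Purely illustrative usage with stand-in functions:
if __name__ == "__main__":
    fake_model = lambda item: ("kouros", 0.55 if "fragment" in item else 0.95)
    fake_expert = lambda item: "kore"  # the expert corrects the low-confidence case
    print(human_in_the_loop_pass(fake_model, fake_expert, ["statue_01", "fragment_02"]))
```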

Author Response

Reviewer 4

(i) This manuscript has potential. The authors should fully discuss how AI can amplify bias and hinder effective AI implementation due to a lack of well-annotated datasets and structured metadata in CHCs.

We cite several sources that write extensively on this (e.g. Kizhner 2022) and have also strengthened this throughout the text.

They should also cover the contribution of insufficient humanities expertise in generative AI platforms, which can lead to biased interpretations and classifications. Done.

Additionally, they should address limitations to the effectiveness of AI applications due to a lack of interconnectivity and interoperability among digitised collections. Done.

(ii)                    The authors should provide a more comprehensive and robust analysis of the challenges and potential solutions for using AI in cultural heritage collections. This will strengthen the theoretical framework and offer practical insights and recommendations for future research and policy development. Done.

(iii) The section on bias-perpetuated algorithms can be expanded by reading O’Neil, C. (2017). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Penguin Random House. Although the article (Gilliland-Swetland, A.J. (2002). Digital preservation and metadata: history, theory, practice. The Journal of Academic Librarianship, 28, 165-166) is somewhat dated, it could provide a broader context.

We have briefly referenced the first text but not the second, for space reasons.

 

The discussion of human-in-the-loop (HITL) approaches in AI, which involve human oversight to mitigate bias, using Monarch, R. (M.) (2021). Human-in-the-Loop Machine Learning: Active Learning and Annotation for Human-Centered AI. Manning, as a starting point, can enhance the theoretical foundation of this section. The authors can also discuss case studies where HITL has been successfully implemented in cultural heritage projects to add practical insights to the section.

We have added text on this and referenced Monarch.

 

(iv) The ethical considerations in AI are well discussed, but we recommend referencing either Müller, V. C. (2020). Ethics of Artificial Intelligence and Robotics. In Edward N. Zalta (Ed.), Stanford Encyclopedia of Philosophy (1-70), for a comprehensive overview of ethical issues in AI, or Colley, S. (2015). Ethics and Digital Heritage. In: Ireland, T., Schofield, J. (Eds.) The Ethics of Cultural Heritage. Ethical Archaeologies: The Politics of Social Justice, vol. 4. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1649-8_2, to strengthen the ethical framework of the discussion.

We have referenced Müller now.

(v) The manuscript discusses the lack of interconnectivity and interoperability in CHCs. The authors can provide practical solutions for enhancing interconnectivity by consulting Jones, E., & Seikel, M. (2016). Linked Data for Cultural Heritage. Routledge.

Now referenced.

(vi) The authors should discuss the implications of their work for future directions and policy-making. This literature can be useful: Verdegem, P. (2021). AI for Everyone? Critical Perspectives. University of Westminster Press; and Joy, C. L. (2016). The Politics of Heritage Management in Mali: From UNESCO to Djenné. Taylor & Francis, which provides practical guidelines for policy-makers.

These are great suggestions, but we are seriously short of space, so we did not pursue them.

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

1. The experiment pursued is still not sufficient. In the Materials and Methods section the authors say: “In this article we draw on two kinds of source material to answer the research question: existing literature on bias mitigation in CHCs, and an experiment we conducted with image generation using a GenAI platform”. This experiment is not enough to build up the basis of the article; practically all the information comes from the literature. Besides using AI to create kouros images, the authors should include other cultural heritage objects and periods (the results may vary). Otherwise, they should state instead that the literature is the main source material and that a test (rather than an experiment) has been done to demonstrate some of the problems stated within the text. Therefore, this document may fit better as a Review rather than an Article (Types of Publications).

2. The paragraph added between lines 300-330 about annotation strategies can be adapted to a table format to ease reading.

3. The conclusions read more like the content of an introduction than a conclusion section. In addition, they are not related to the objectives: “this article aims to discuss the challenges that automation brings as well as provide solutions from beyond the cultural heritage sector”. The authors should clearly state the challenges found and indicate precisely the solutions proposed. The conclusions in their current form are vague and unspecific.

4. Regarding the references: in the text, reference numbers should be placed in square brackets [ ] and placed before the punctuation; for example [1], [1–3] or [1,3]. For embedded citations in the text with pagination, use both parentheses and brackets to indicate the reference number and page numbers; for example [5] (p. 10) or [6] (pp. 101–105).

5. Revise all the references; some do not follow the journal format, i.e.: (see [33]) LINE 119; (see e.g. [32]) LINE 139; (see [38] for practical solutions for enhancing interconnectivity) LINE 181.

6. The size of Figures 1 and 2 is not correct; they cannot occupy an entire sheet.

Author Response

Dear Editor,

We have made revisions to the text as requested and hope it now meets your requirements. Our responses to the issues raised are given below each comment.

1. The experiment pursued is still not sufficient. In the Materials and Methods section the authors say: “In this article we draw on two kinds of source material to answer the research question: existing literature on bias mitigation in CHCs, and an experiment we conducted with image generation using a GenAI platform”. This experiment is not enough to build up the basis of the article; practically all the information comes from the literature. Besides using AI to create kouros images, the authors should include other cultural heritage objects and periods (the results may vary). Otherwise, they should state instead that the literature is the main source material and that a test (rather than an experiment) has been done to demonstrate some of the problems stated within the text. Therefore, this document may fit better as a Review rather than an Article (Types of Publications).

We have amended our text accordingly and added a further experiment.

 

2. The paragraph added between lines 300-330 about annotation strategies can be adapted to a table format to ease reading.

We cannot see how one would turn that material into a table and have hence not done so. The text is fully understandable.

 

3. The conclusions read more like the content of an introduction than a conclusion section. In addition, they are not related to the objectives: “this article aims to discuss the challenges that automation brings as well as provide solutions from beyond the cultural heritage sector”. The authors should clearly state the challenges found and indicate precisely the solutions proposed. The conclusions in their current form are vague and unspecific.

We have augmented the conclusion.

 

4. Regarding the references: in the text, reference numbers should be placed in square brackets [ ] and placed before the punctuation; for example [1], [1–3] or [1,3]. For embedded citations in the text with pagination, use both parentheses and brackets to indicate the reference number and page numbers; for example [5] (p. 10) or [6] (pp. 101–105).

We have revised the references as requested above.

 

5. Revise all the references; some do not follow the journal format, i.e.: (see [33]) LINE 119; (see e.g. [32]) LINE 139; (see [38] for practical solutions for enhancing interconnectivity) LINE 181.

We do not understand why the references that also include a comment (e.g. 'see ...') are wrong, and it is therefore not clear to us how these should be revised. This is a matter for the copy-editing stage.

 

6. The size of Figures 1 and 2 is not correct; they cannot occupy an entire sheet.

We have not been advised what the correct size is, but we have re-sized the figures. Again, this is a matter for the copy-editing stage.

 

 

We look forward to hearing from you.

 

 

Kind regards,

Anna Foka, Gabriele Griffin
