Children of AI: A Protocol for Managing the Born-Digital Ephemera Spawned by Generative AI Language Models

Abstract: The recent public release of the generative AI language model ChatGPT has captured the public imagination and has resulted in rapid uptake and widespread experimentation by the general public and academia alike. The number of academic publications focusing on the capabilities as well as the practical and ethical implications of generative AI has been growing exponentially. One of the concerns with this unprecedented growth in scholarship related to generative AI, in particular ChatGPT, is that, in most cases, the raw data, that is, the text of the original ‘conversations,’ have not been made available to the audience of the papers and thus cannot be drawn on to assess the veracity of the arguments made and the conclusions drawn therefrom. This paper provides a protocol for the documentation and archiving of these raw data.


Introduction
In recent months there has been widespread public attention regarding the use of artificial intelligence (AI) in various fields. The public releases of the image generator DALL-E and of the generative AI language model ChatGPT (Chat Generative Pre-trained Transformer) in 2022 caught the public's imagination. Since then, a free-ranging debate has emerged regarding the present and potential future abilities of generative AI, the dangers it may pose, and the ethics of its usage. ChatGPT is a type of deep learning model that uses transformer architecture to generate coherent and contextually relevant human-like responses based on the input it receives [1].
Since the initial release of the underlying GPT model in 2018, it has undergone several revisions, mainly focusing on increased capabilities in providing longer segments of coherent text and the contextual answering of questions, including the incorporation of human preferences and feedback. GPT-2, released in stages during 2019, had 1.5 billion parameters, while GPT-3 (June 2020) had 175 billion parameters. ChatGPT, based on GPT-3.5, was released to the general public to encourage experimentation [2], with a training-data cutoff of September 2021.

The Problem
A primary mandate of academic publishing is ethical academic conduct. To ensure the integrity, transparency, and reproducibility of published research, the Committee on Publication Ethics (COPE) has issued a set of guidelines on the deposition and management of data and datasets that were used to explain and substantiate findings that are reported in academic publications [33]. In brief, this entails the deposition of research data in a common readable format in a curated, public, institutional or governmental data depository; the publication of the data and their collection methodology as a stand-alone data publication; or the appendage of the data as supplementary material to the article hosted on the publisher's servers. This mandate is followed by and large, with some disciplines being more compliant than others (e.g., medical research).
At this point in time, academic research into the abilities and limitations of ChatGPT and similar generative AI language models is expanding at a rapid rate across most disciplines. The question arises as to how the research data associated with the publications are being managed. While much of the following discussion focusses on ChatGPT, it also applies to the output of other generative AI language models.
The nature of ChatGPT and similar generative AI tools means that each response to a given task will be different. While responses may be structurally and conceptually similar [34], they are not identical. Thus, it is not possible to recreate an identical or near identical response. Furthermore, all conversations with ChatGPT, for example, are deleted after 30 days to maintain server space [35]. Irrespective of this, all conversations with generative AI language models are virtual artefacts (sensu [36]), which will eventually disappear due to server upgrades or data warehouse restructuring. Consequently, the original conversations are equivalent to, and should be treated like, an experiment's raw data, and thus should be archived.
At present, academic papers that have been written in relation to ChatGPT, for example, have taken five different approaches:
1. articles in which the conversation itself makes up the core of the paper;
2. articles that quote extracts of the conversation in the body of the paper and provide the full text of the conversation as a supplementary document [13] or an appendix [37];
3. articles that quote extracts of the conversation in the text but do not provide access to the full text [15,22,23,25,29,31,38,39];
4. articles that discuss specific conversation(s) without quoting them and do not provide access to the full text [30]; and
5. articles that discuss ChatGPT but do not refer to specific conversations, instead discussing the topic at a more abstract level [26,37].
Setting aside the first group, where the conversation makes up the core of the paper, and the second group, where the full text is supplied as an appendix or a supplementary file, the other three groups do not allow readers to understand the full context of the conversation or to independently assess the validity of the authors' interpretation of the interaction.

Functional Considerations
While conversations with generative AI language models such as ChatGPT have great similarity with formal interviews in anthropological, ethnographical and sociological settings [40], they differ in a key aspect related to real-world interactions. Human-to-human interviews are conducted in a linear fashion, with one question following on from, and building on, an answer. While a human respondent could, in theory, be asked to re-answer a question, this normally occurs with transitioning phrases [41] and a concomitant second-guessing by the respondent, who tries to take cues from the interviewer as to why the previous answer was insufficient (else the question would not have been asked again in exactly the same way). ChatGPT, on the other hand, allows the human participant to request the generative AI language model to answer the question again, generating a new response to the question. Thus any archiving of a conversation with ChatGPT needs to archive all "regenerations" of a response if they were conducted, while the paper needs to identify which instance of regeneration was being used or cited.
ChatGPT as a service is not static: it changes both through continual upgrades to functionality and server performance and through its ability to 'learn.' The latter is reinforced through user feedback, which is solicited both when a user tasks ChatGPT to regenerate a response, with ChatGPT delivering the second version (Figure 1), and by OpenAI staff monitoring selected conversations ("conversations that may be reviewed by our AI trainers to improve our systems"). Thus it can be posited that a ChatGPT-like model is time-dependent, and therefore the date and time of the conversation should also be recorded, akin to the practice of stating the access date of web pages in standard referencing.
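The two requirements raised in this section, keeping every regeneration and recording when the exchange took place, can be illustrated with a small Python sketch. The class and field names (`Exchange`, `cited_index`, `asked_at`) are hypothetical and not part of any existing tool, and the numbering assumes that "1/n" denotes the first response offered, followed by its regenerations.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Current date and time in UTC (GMT), ISO 8601, to the second."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


@dataclass
class Exchange:
    """One prompt together with every response generated for it (hypothetical structure)."""
    prompt: str
    # All responses in the order received; assumed: first response, then regenerations.
    responses: list[str] = field(default_factory=list)
    # Index of the response that the paper actually quotes or analyses.
    cited_index: int = 0
    # Recorded automatically when the record is created.
    asked_at: str = field(default_factory=utc_timestamp)

    def label(self, index: int) -> str:
        """Label in the style 'Regeneration 2/3' for the response at the given index."""
        return f"Regeneration {index + 1}/{len(self.responses)}"

    def labelled_responses(self) -> list[tuple[str, str]]:
        """Every response paired with its label, so none is lost when archiving."""
        return [(self.label(i), text) for i, text in enumerate(self.responses)]

    def cited_label(self) -> str:
        """Label of the instance that is cited in the publication."""
        return self.label(self.cited_index)
```

A record built this way keeps all versions side by side, while `cited_label()` states which of them the paper draws on.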

Ethical Considerations
Standard ethnographic research practice mandates that interviews are based on a mutual understanding of trust in which the conversation is confidential and any interpretation of the conversation is carried out with expressed and informed consent, even though power dynamics and their changing nature need to be considered [42]. In the current understanding, works created by generative AI do not accrue copyright for the AI system [43] as they fail to meet the human authorship requirement. They can, however, generate copyright for the human actor in the interaction if the latter has substantive guiding involvement [44]. In the same vein, an 'interview' with a generative AI language model differs from sociological or ethnographic interviews because generative AI is, at least at this point in time, not a sentient entity and thus cannot provide informed consent. It follows that, at least at this point of legal understanding, 'conversations' with ChatGPT can be archived.
In traditional ethnographic research practice, notebooks and interview transcripts were commonly deemed personal data, 'owned' by the respective researcher. In recent years, research ethics requirements intended to militate against falsified research findings have led to the mandate to archive and make accessible the original or 'raw' research data. In the space of qualitative research, this posed the ethical conundrum of allowing access while, at the same time, maintaining the confidentiality of information provided in the interviews [45][46][47][48]. This can be overcome by anonymizing or de-identifying the respondents, although contextual information in the interviews may, in some instances, allow for a re-identification of the informants [49][50][51].
Conversations with generative AI language models do not fundamentally differ from the transcripts of ethnographic interviews with informants, with the generative AI representing the interviewee. Where privacy issues are involved, for example, where genuine patient records might be used in the assessment of the capabilities of generative AI language models, standard and well-established depersonalization and identity substitution protocols can and should be followed.
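As a minimal illustration of identity substitution applied to a transcript before archiving, the following Python sketch replaces a researcher-supplied list of names with stable placeholders. The function name `substitute_identities` and the placeholder format are assumptions made for this example; it does not implement any particular published de-identification protocol and does not address re-identification through contextual information.

```python
import re


def substitute_identities(
    text: str, names: list[str], prefix: str = "PERSON"
) -> tuple[str, dict[str, str]]:
    """Replace each listed name with a stable placeholder such as [PERSON_1].

    Returns the substituted text and the name-to-placeholder mapping, which
    should be stored separately from the archived transcript.
    """
    mapping = {name: f"[{prefix}_{i}]" for i, name in enumerate(names, start=1)}
    # Replace longer names first so that "Jane Doe" is handled before "Jane".
    for name in sorted(mapping, key=len, reverse=True):
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, mapping[name], text)
    return text, mapping
```

The same name always maps to the same placeholder, so references remain consistent across prompts and responses within one conversation.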
It is understood that some research may rely on specific prompt instructions that create specific non-standard outcomes (e.g., prompts that invert the ethical valence [52]), which authors may consider 'proprietary' for the purpose of ancillary research. In standard scientific research, however, it is expected that a paper has a methodology section that is public and sets out the research conditions in a way that the experiment can be replicated. Research into and with generative AI language models is no different in this regard.
Consequently, all interactions with generative AI language models that are being analyzed and used in a research paper represent the original research data that need to be treated in the same way as interview data in qualitative research. Just because the data were created by a generative AI language model instead of a human does not make these data any different, and they need to be curated and managed in the same manner.
It is spurious to argue that this would cause an undue (i.e., time-consuming) burden on the researcher. Data management protocols have been established to improve transparency of research and research findings, and to reduce academic misconduct. Treating 'interview' data with generative AI language models with a different standard to interview data with human participants would reopen the door to potential misrepresentation of data and possible academic misconduct.
The required data curation would, at the bare minimum, entail their retention and curation in line with the data management policies of the academic institution the author(s) is/are affiliated with and the standards of the academic disciplines they are part of. To do so effectively and comprehensively, a uniform minimum protocol for data collection and documentation is suggested. It is accepted that this protocol is limited to the text-based output of generative AI models.

Protocol
We propose the following six-step process for ChatGPT (and other generative AI language model) data collection and archiving:
(Step 1) record the metadata, comprising version and version date, which in the case of ChatGPT can be found at the bottom of the interface (Figure 2), as well as the date and time at which the conversation started, using GMT as the standard.
(Step 2) conduct the conversation as required.
(Step 3) add the end time of the conversation to the metadata entry.
(Step 4) copy the text of the conversation into a text editor or word processor and save the file(s), making sure that all iterations of task generation (if any) are captured and identified as such (e.g., Regeneration 1/3, 2/3, etc.).
(Step 5) generate a complete data document that contains the metadata and text of each conversation.
(Step 6) submit the data document to an approved public or institutional data repository or append it to the publication as a supplementary file.
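The steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed implementation: the function names, the header layout, and the example version string are hypothetical, and the transcript is assumed to have been copied manually from the interface as described in Step 4.

```python
from datetime import datetime, timezone


def gmt_now() -> str:
    """Current date and time, expressed in GMT (UTC)."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S GMT")


def build_data_document(
    model_version: str,
    version_date: str,
    started: str,
    ended: str,
    transcript: str,
) -> str:
    """Step 5: combine the metadata and the conversation text into one document."""
    header = [
        f"Model version: {model_version}",
        f"Version date: {version_date}",
        f"Conversation start (GMT): {started}",
        f"Conversation end (GMT): {ended}",
    ]
    return "\n".join(header) + "\n\n" + transcript.strip() + "\n"


if __name__ == "__main__":
    started = gmt_now()  # Step 1: start time, recorded with the version metadata
    # Steps 2 and 4: the conversation is conducted in the interface and its text,
    # including labelled regenerations, is pasted in here (placeholder text only).
    transcript = (
        "Prompt: example question\n"
        "Regeneration 1/2: example answer\n"
        "Regeneration 2/2: example answer\n"
    )
    ended = gmt_now()  # Step 3: end time
    document = build_data_document(
        "EXAMPLE-MODEL",         # hypothetical version label
        "EXAMPLE-VERSION-DATE",  # hypothetical version date from the interface footer
        started,
        ended,
        transcript,
    )
    print(document)  # Step 6 (deposit or supplementary file) happens outside the script
```

A plain-text document of this kind can be deposited as it stands or converted to whatever format the chosen repository requires.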

Conclusions
Over the past decade, the development of generative artificial intelligence systems has accelerated dramatically, resulting in the recent public release of the generative AI language model ChatGPT. ChatGPT has captured the public's imagination, with widespread experimentation by academia and the general public alike. Numerous academic disciplines have experimented with the capabilities of ChatGPT in relation to their research directions, examining its ability to provide accurate responses. The number of publications on the capabilities of ChatGPT and the practical and ethical implications of the use and abuse of generative AI has been growing exponentially.
This unprecedented growth in scholarship related to generative AI, in particular ChatGPT, occurs in a largely unregulated space, wherein, in most cases, the raw data, that is, the text of the original 'conversations,' are not being made available to the audience of the papers. In consequence, they cannot be drawn on to assess the veracity of the arguments made in the publications and the conclusions drawn therefrom. This paper has provided a protocol for the documentation and archiving of these raw data.

Figure 1. Request for user feedback by ChatGPT following the provision of a regenerated response.

Figure 2. Version date (as shown in the footer of the interaction window).