Data
  • Data Descriptor
  • Open Access

11 October 2024

Data Descriptor for “Understanding and Perception of Automated Text Generation among the Public: Two Surveys with Representative Samples in Germany”

1 Knowledge Construction Lab, Leibniz-Institut für Wissensmedien, 72076 Tübingen, Germany
2 Department of Psychology, Eberhard Karls University, 72076 Tübingen, Germany
* Author to whom correspondence should be addressed.

Abstract

With the release of ChatGPT, text-generating AI became accessible to the general public virtually overnight, and automated text generation (ATG) became the focus of public debate. Previously, however, little attention had been paid to this area of AI, resulting in a gap in the research on people’s attitudes toward and perceptions of this technology. Therefore, two representative surveys among the German population were conducted before (March 2022) and after (July 2023) the release of ChatGPT to investigate people’s attitudes, concepts, and knowledge regarding ATG in detail. This data descriptor describes the structure of the two data sets, the measures collected, and potential analysis approaches beyond the existing research paper. Other researchers are encouraged to take up these data sets and explore them further as suggested or as they deem appropriate.
Dataset License: CC-BY Attribution 4.0 International.

1. Summary

Recent developments in the availability of automated text generation (ATG) technology have had a major impact on many areas of society, ranging from education to science, economics, journalism, and media. The adequate use and adoption of large language models (LLMs) such as GPT, Claude, or Llama can depend on what people believe their capabilities to be. Moreover, people’s attitudes influence the acceptance of AI for text generation, which has been integrated into many areas. It is, therefore, of great importance to understand what people know about these developments and how they perceive and evaluate them. The data descriptor presented here comprises two sets of survey data collected at two distinct time points (March 2022 and July 2023). The data were collected from two different German samples to capture the current attitudes, concepts, and experiences related to the technology at each of these moments. Accordingly, the first survey captures these measures before the release of ChatGPT, whereas the second survey was conducted after its launch in November 2022. The second survey is an exact replication of the first, with the addition of variables to account for people’s experiences with this new technology, i.e., a knowledge test, experience with ChatGPT, attitudes toward ChatGPT, and lay attitudes toward ATG and ChatGPT.
These data sets mark two pivotal moments in capturing specific attitudes and concepts toward ATG, i.e., before the public focus on large language models (LLMs) and shortly after the release of the first openly accessible LLM-based chatbot. As such, they provide a valuable benchmark for future studies on people’s attitudes and perceptions, offering two snapshots at the earliest possible points in time. Moreover, as the surveys were carried out on large representative samples, we invite researchers to analyze these data sets in more detail and to look at potential subgroups, an analysis that we did not conduct in the associated publication [1].
Both surveys were distributed online via Mingle (Bilendi, Cologne, Germany) to collect data from German samples that were representative with respect to age, gender, and educational level (see Table 1).
Table 1. Composition of the final samples regarding gender, age, and educational level.
This project was funded by the Leibniz-Institut für Wissensmedien, STB Data Science, in Germany and is part of the project “AI for science communication: Acceptance and laypeople comprehension”. The main results are published in the journal Behavioral Sciences (https://doi.org/10.3390/bs14050353, accessed on 03 October 2024).

2. Data Description

On the Open Science Framework (OSF) platform, two separate primary data files in RDS format are stored, one for each survey. An R analysis script is also stored there, in which these two data files are merged as one of the first steps. However, the files can be analyzed separately, especially since the second sample consists of different participants.
In the data files, each row is a participant, and each column is a variable. For the variable names and their corresponding column names in the data sets, see Table 2. As most scales consisted of multiple items that are not yet aggregated or further processed in the data set for reasons of transparency, the columns in the data set are provided with the suffix “_[Item number]” and represent the single items of one measure. The codebook attached to the OSF project contains a detailed description of all variables, as well as all items and their choice options.
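The item-level layout described above means that scale scores have to be computed by the analyst. The following is a minimal Python sketch of that step, not the authors' R script: it groups columns by the part of the name before the “_[Item number]” suffix and averages them per participant. The scale names `ATT` and `TRUST` and all values are hypothetical placeholders, and the sketch assumes that every item of a scale is coded in the same direction; the codebook should be checked for reverse-coded items and for measures, such as the knowledge test, that should not simply be averaged.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical rows: one dict per participant, one key per item column.
# "ATT" and "TRUST" are placeholder scale names, not names from the codebook.
rows = [
    {"ATT_1": 4, "ATT_2": 5, "ATT_3": 3, "TRUST_1": 2, "TRUST_2": 4},
    {"ATT_1": 2, "ATT_2": 1, "ATT_3": 3, "TRUST_1": 5, "TRUST_2": 5},
]

# Column names are assumed to follow the "<scale>_<item number>" convention.
ITEM_PATTERN = re.compile(r"^(?P<scale>.+)_(?P<item>\d+)$")


def scale_means(row):
    """Group item columns by scale prefix and average them for one participant."""
    items = defaultdict(list)
    for column, value in row.items():
        match = ITEM_PATTERN.match(column)
        # Skip columns without an item suffix and items that were not answered.
        if match and value is not None:
            items[match.group("scale")].append(value)
    return {scale: mean(values) for scale, values in items.items()}


scores = [scale_means(row) for row in rows]
```

Each entry of `scores` then holds one mean per scale for the corresponding participant.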
Table 2. Overview of all variables used in the surveys.
We recommend analyzing the data using R, for example in RStudio. However, it is also possible to import the data set into other statistical analysis programs. To analyze the data with our provided script, R version 4.3.1 (16 June 2023, ucrt) or newer must be installed. The necessary packages are listed at the beginning of the analysis script and need to be installed in advance.

3. Methods

3.1. Survey Design

The surveys were developed to capture people’s experience, attitudes, concepts, knowledge, and preferences toward ATG and ChatGPT, and to examine whether there were changes between these two time points.
In each survey, all participants were presented with the full set of questions, with one exception: in the beginning, participants were asked about their experience with ATG. If they indicated that they had previously read an AI-written text, they were directed to an open text entry question, where they could specify the type(s) of text they had read. Because only these participants saw the question, it has missing values for all other respondents. The same procedure was carried out in Survey 2. In the second survey, however, we additionally asked specifically about experience with ChatGPT, which again produced a subset of participants with ChatGPT experience who were later forwarded to the attitudes toward ChatGPT scale. Therefore, individuals who indicated no experience with ChatGPT have missing values on this measure. For each scale, the items were presented in random order to avoid order effects.
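Because these missing values result from survey routing rather than from nonresponse, analyses of the conditionally displayed measures should be restricted to the participants who actually saw them. A minimal Python sketch of that subsetting step follows; the field names `chatgpt_experience` and `att_chatgpt` and all values are hypothetical, and `None` stands in for a question that was never shown.

```python
from statistics import mean

# Hypothetical records; None marks a measure that was never displayed.
# "chatgpt_experience" and "att_chatgpt" are placeholder names.
participants = [
    {"id": 1, "chatgpt_experience": True, "att_chatgpt": [4, 5, 3]},
    {"id": 2, "chatgpt_experience": False, "att_chatgpt": None},
    {"id": 3, "chatgpt_experience": True, "att_chatgpt": [2, 2, 3]},
]


def split_by_routing(records, field):
    """Separate participants who saw a conditional measure from those routed past it."""
    shown = [r for r in records if r[field] is not None]
    skipped = [r for r in records if r[field] is None]
    return shown, skipped


shown, skipped = split_by_routing(participants, "att_chatgpt")

# Scale means are computed only for the subset that answered the scale.
scale_scores = {r["id"]: mean(r["att_chatgpt"]) for r in shown}
```

Reporting the size of `skipped` alongside any result makes explicit that the estimate refers to participants with ChatGPT experience only.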

3.2. Survey Platform

The data collection and questionnaire setup were conducted using Qualtrics (Qualtrics LLC; CEO: Ryan Smith; Address: 2250 N. University Pkwy, 48-C, Provo, UT 84604, USA). Qualtrics was chosen for its robust features, including customizable survey design and secure data storage.

3.3. Participant Recruitment

The panel provider Bilendi and respondi was commissioned to recruit the samples. It is ISO certified (ISO 20252:2019) and processes personal data in accordance with the European General Data Protection Regulation (GDPR; German: DSGVO). It sent out invitations via its panel Mingle. The provider was also responsible for quota management. For the recruitment quotas and the composition of the final samples, see Table 1. According to Mingle, participant compensation follows a points system in which points can be collected depending on the length and type of the study. The participants themselves decide whether to convert the points into cash, vouchers, or donations. For participating in Survey 1, respondents received the equivalent of EUR 1.30 for a mean duration of 11.77 min. In Survey 2, they received EUR 1.93 for a mean duration of 15.52 min (see Note 1).

3.4. Data Filtering

In Survey 1, of the 1111 participants who completed the questionnaire, 83 did not give their consent to data use, resulting in a final sample size of N = 1028. In Survey 2, of the 1116 participants who filled in the questionnaire completely, 51 withdrew their data at the end of the survey. Another 101 persons were excluded due to an incorrect answer to the attention check implemented in the knowledge test (item “KNOW_16”). Therefore, the final sample in Survey 2 consisted of N = 1013 respondents.
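The exclusion steps described above can be thought of as a sequence of filters, each of which removes some respondents and should be logged. The following Python sketch illustrates this logic on invented toy records; the field names `consent` and `attention_check_correct` are hypothetical and do not correspond to columns in the published files, which already contain only the final samples.

```python
# Hypothetical completed responses; field names and values are placeholders.
responses = [
    {"id": 1, "consent": True, "attention_check_correct": True},
    {"id": 2, "consent": False, "attention_check_correct": True},
    {"id": 3, "consent": True, "attention_check_correct": False},
    {"id": 4, "consent": True, "attention_check_correct": True},
    {"id": 5, "consent": False, "attention_check_correct": False},
]


def apply_exclusions(records, criteria):
    """Apply exclusion criteria in order and log how many records each one removes."""
    log = []
    remaining = list(records)
    for label, keep in criteria:
        kept = [r for r in remaining if keep(r)]
        log.append((label, len(remaining) - len(kept)))
        remaining = kept
    return remaining, log


# The order matters: each criterion only sees records that passed the previous one.
criteria = [
    ("no consent to data use", lambda r: r["consent"]),
    ("failed attention check", lambda r: r["attention_check_correct"]),
]

final_sample, exclusion_log = apply_exclusions(responses, criteria)
```

Keeping the log next to the final sample makes it straightforward to report how many respondents each step removed.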

3.5. Data Curation and Storage

The original data from both surveys are archived on the local servers of the Leibniz-Institut für Wissensmedien for at least ten years after publication. The prepared and fully anonymized primary data are publicly available at OSF. The data stored at Qualtrics will be deleted by 31 March 2025, at the latest. Qualtrics complies with the European guidelines on data storage. Bilendi and respondi had no access to the survey data during data collection and therefore cannot link the participant data to the survey data. In turn, we as authors had no access to Bilendi and respondi’s participant data at any point in time.
Apart from the conditionally displayed questions described in Section 3.1, there are no missing values in either data set, as the data were only analyzed if participants fully completed the surveys and gave their written informed consent to the use of their data for research purposes on the last page. Furthermore, participants in Survey 2 were excluded from the analysis if they incorrectly answered the attention check item implemented in the knowledge test.

3.6. Measures and Procedure

Table 2 shows all measures used in the surveys in the order they were presented. The original Qualtrics questionnaire can be found under the material here: https://osf.io/sn75h/ (accessed on 03 October 2024). The original data set also contains variables for the exact age, education, and occupation. Note that participants entered their occupations in an open text entry, which is why the variable requires further processing. Please see [] or the R script for the correct answers to the knowledge test.

4. Limitations

The present data set contains two snapshots of different samples at two points in time and is, therefore, unsuitable for analyzing intrapersonal changes. Since the surveys were conducted with samples from the German population, any comparisons or inferences regarding other countries should be approached with caution. Consequently, the scale translations need to be validated when replicating these studies in different languages.
Moreover, since the first survey was conducted before the public awareness caused by the launch of ChatGPT, the concepts and scales can still be interpreted as too general. The concepts of functionality, data retrieval, responsibility, and human control are not tailored to specific technologies but capture different aspects that apply more or less to various systems. Furthermore, there is high variability in the use of the answer option “perhaps” for the concepts and the answer option “don’t know” in the knowledge test, for which this data set cannot provide explanations. Future research could, for example, focus on the most frequently used AI systems and adapt the questions accordingly. Open answer options could offer further insights into the sources of uncertainties or gaps in knowledge.

5. Potential Applications of the Data Set

This open-access data set can be seen as an important starting point for subsequent research. It not only provides a baseline measure of attitudes toward ATG before public attention was drawn to this technology, but also offers a broad overview of various constructs that may become relevant when ATG and LLMs gain wider adoption in the general population. Future research may also delve deeper into specific misconceptions, risks, and opportunities that the population perceives. The following research questions might help scientists in this research field use this data set for exploratory purposes. Potential areas of further research include socio-demographic aspects, which can provide further information about differences between population groups and thus enrich our knowledge about a potential AI divide:
  • RQ1: Do attitudes, preferences, or knowledge differ regarding age, gender, or educational background?
Moreover, further investigation of the relationship between personal attitudes and behavioral intentions can provide information about factors that determine actual usage:
  • RQ2: How well do lay attitudes toward ChatGPT, attitudes toward using the technology, and performance expectancy predict behavioral intentions?
  • RQ3: How well do the TAM/UTAUT subscales predict behavioral intention and ChatGPT use in Survey 2?
To our knowledge, no existing scale captures specific attitudes toward ATG technology, and no existing measures are adequate to differentiate between users and non-users or between different levels of experience. Therefore, our self-developed measurements can serve as a basis for scale development and validation:
  • RQ4: How valid is the newly developed scale in measuring lay attitudes toward ChatGPT? Can it be adapted to LLMs in general?
  • RQ5: How valid is the newly developed scale in measuring attitudes toward ChatGPT?
  • RQ6: Do participants with and without ChatGPT experience differ in their ratings?
  • RQ7: What do the open answers on ChatGPT usage tell us?
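As one possible first step toward RQ4 and RQ5, the internal consistency of a multi-item scale can be estimated with Cronbach's alpha, alpha = k / (k − 1) × (1 − sum of item variances / variance of the sum score), where k is the number of items. The Python sketch below implements this standard formula on invented responses; it is an illustration rather than the authors' analysis, and internal consistency is only one aspect of scale validation.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of participants, each a list of k item responses."""
    k = len(item_scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([p[i] for p in item_scores]) for i in range(k)]
    # Sample variance of the participants' sum scores.
    total_var = variance([sum(p) for p in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five participants to a three-item scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 5, 5],
    [1, 2, 1],
]

alpha = cronbach_alpha(responses)
```

The function assumes complete responses and items coded in the same direction, so reverse-coded items would need to be recoded beforehand.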

Author Contributions

Both A.L.H. and J.K. conceptualized this study and edited and revised the manuscript. A.L.H. drafted the manuscript, conducted the studies, and monitored the recruitment of participants. A.L.H. analyzed and interpreted the data and is responsible for data curation. J.K. supervised the studies and was responsible for funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Leibniz-Institut für Wissensmedien, STB Data Science.

Institutional Review Board Statement

This study was conducted in accordance with the guidelines of the Local Ethics Committee of the Leibniz-Institut für Wissensmedien, which approved the study design and methods (Approval number: LEK 2023/022, approved on 17 May 2023).

Data Availability Statement

The data, analysis script, and material are openly available in OSF at https://osf.io/sn75h/ (accessed on 03 October 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Note

1
Mean duration was calculated by excluding outliers detected via the interquartile range (IQR) method. Following this, we defined outliers as observations that fell below Q1 (first quartile) − 1.5 × IQR or above Q3 (third quartile) + 1.5 × IQR. Thus, for Survey 1, the data from 75 respondents whose duration was equal to or longer than 28.27 min were excluded from the calculation of the average duration. For Survey 2, the data from 98 participants whose duration was equal to or longer than 40.52 min were not included in the average duration calculation. Note that extreme outliers regarding duration of survey completion can be caused by missing the submission of the last survey page or breaks during the survey.
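The outlier rule in this note can be sketched in a few lines of Python. The durations below are invented for illustration, and the quartile method is an assumption: statistical packages compute quartiles in slightly different ways, so the exact cut-offs reported above may not be reproduced by this particular choice.

```python
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return the lower and upper outlier fences Q1 - k*IQR and Q3 + k*IQR."""
    # method="inclusive" interpolates linearly between the observed values;
    # other quartile definitions would give slightly different fences.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical completion times in minutes, with one very long duration.
durations_min = [5, 8, 9, 10, 11, 12, 13, 14, 15, 60]

lower, upper = iqr_bounds(durations_min)

# Keep only durations inside the fences, then average them.
kept = [d for d in durations_min if lower <= d <= upper]
mean_duration = mean(kept)
```

In this toy example only the 60-minute duration lies outside the fences and is dropped before the mean is computed.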

References

  1. Lermann Henestrosa, A.; Kimmerle, J. Understanding and Perception of Automated Text Generation among the Public: Two Surveys with Representative Samples in Germany. Behav. Sci. 2024, 14, 353.
  2. Schepman, A.; Rodway, P. Initial Validation of the General Attitudes towards Artificial Intelligence Scale. Comput. Hum. Behav. Rep. 2020, 1, 100014.
  3. Sundar, S.S.; Kim, J. Machine Heuristic: When We Trust Computers More than Humans with Our Personal Information. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, Scotland, 4–9 May 2019; pp. 1–9.
  4. Said, N.; Potinteu, A.E.; Brich, I.R.; Buder, J.; Schumm, H.; Huff, M. An Artificial Intelligence Perspective: How Knowledge and Confidence Shape Risk and Opportunity Perception. Comput. Hum. Behav. 2023, 149, 107855.
  5. Venkatesh, V.; Morris, M.G.; Davis, G.B.; Davis, F.D. User Acceptance of Information Technology: Toward a Unified View. MIS Q. 2003, 27, 425–478.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
