Data Descriptor for “Understanding and Perception of Automated Text Generation among the Public: Two Surveys with Representative Samples in Germany”

Lermann Henestrosa, Angelica; Kimmerle, Joachim

doi:10.3390/data9100116


by Angelica Lermann Henestrosa ¹,* and Joachim Kimmerle ¹,²

¹ Knowledge Construction Lab, Leibniz-Institut für Wissensmedien, 72076 Tübingen, Germany
² Department of Psychology, Eberhard Karls University, 72076 Tübingen, Germany
* Author to whom correspondence should be addressed.

Data 2024, 9(10), 116; https://doi.org/10.3390/data9100116

Submission received: 11 September 2024 / Revised: 3 October 2024 / Accepted: 9 October 2024 / Published: 11 October 2024


Abstract

With the release of ChatGPT, text-generating AI became accessible to the general public virtually overnight, and automated text generation (ATG) became the focus of public debate. Previously, however, little attention had been paid to this area of AI, resulting in a gap in the research on people’s attitudes and perceptions of this technology. Therefore, two representative surveys among the German population were conducted before (March 2022) and after (July 2023) the release of ChatGPT to investigate people’s attitudes, concepts, and knowledge on ATG in detail. This data descriptor depicts the structure of the two datasets, the measures collected, and potential analysis approaches beyond the existing research paper. Other researchers are encouraged to take up these data sets and explore them further as suggested or as they deem appropriate.

Dataset: https://osf.io/sn75h/.

Dataset License: CC-BY Attribution 4.0 International.

Keywords: automated text generation; public attitudes toward AI; ChatGPT impact; automated journalism; conceptions on AI

1. Summary

Recent developments in the availability of automated text generation (ATG) technology have a major impact on many areas of society, ranging from education to science, economics, journalism, and media. The adequate usage and adoption of large language models (LLMs) like GPT, Claude, or Llama can depend on what people believe their capabilities are. Moreover, people’s attitudes influence the acceptance of AI for text generation, which has been integrated into many areas. It is, therefore, of great importance to understand what people know about these developments and how they perceive and evaluate them. The data descriptor presented here comprises two sets of survey data collected at two distinct time points (March 2022 and July 2023). The data were collected from two different German samples to capture the current attitudes, concepts, and experiences related to the technology at each of these moments. Accordingly, the first survey captures these measures before the release of ChatGPT in November 2022, while the second survey was conducted after its launch. The second survey is an exact replication of the first, with the addition of variables to account for people’s experiences with this new technology, i.e., a knowledge test, experience with ChatGPT, attitudes toward ChatGPT, and lay attitudes toward ATG and ChatGPT.

These data sets mark two pivotal moments in capturing specific attitudes and concepts toward ATG, i.e., before the public focus on large language models (LLMs) and shortly after the release of the first publicly accessible LLM-based chatbot. As such, they provide a valuable benchmark for future studies on people’s attitudes and perceptions, offering two snapshots at the earliest possible points in time. Moreover, as the surveys were carried out on large representative samples, we invite researchers to analyze these data sets in more detail and to look at potential subgroups, an analysis that we did not conduct in the associated publication [1].

Both surveys were distributed online via Mingle (Bilendi, Cologne, Germany) to collect data from German samples that were representative in terms of age, gender, and educational level (see Table 1).

This project was funded by the Leibniz-Institut für Wissensmedien, STB Data Science, in Germany and is part of the project “AI for science communication: Acceptance and laypeople comprehension”. The main results are published in the journal Behavioral Sciences (https://doi.org/10.3390/bs14050353, accessed on 03 October 2024).

2. Data Description

On the Open Science Framework (OSF) platform, two separate primary data files in RDS format are stored, one for each survey. Moreover, an R analysis script is stored, in which these two data files are merged as one of the first steps. However, the files can also be analyzed separately, especially since the second sample consists of different participants.
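The merging step can be pictured as stacking the rows of both surveys and tagging each row with the survey it came from. The following is only a minimal sketch in Python on toy rows, not the authors' R script: the values are invented, and the `SURVEY` indicator column and the `stack_surveys` helper are hypothetical names introduced for illustration. Columns that exist in only one survey (such as `USEGPT`) are filled with a missing value for the other survey.

```python
# Toy stand-ins for the two survey files (values are invented for illustration).
survey1 = [
    {"SELFSKILL": 2, "HEARD": 1},
    {"SELFSKILL": 3, "HEARD": 2},
]
survey2 = [
    {"SELFSKILL": 4, "HEARD": 1, "USEGPT": "drafting e-mails"},
]


def stack_surveys(first, second):
    """Stack two lists of participant rows and tag each row with its survey."""
    # Union of all column names that occur in either survey.
    columns = sorted({key for row in first + second for key in row})
    stacked = []
    for survey_id, rows in ((1, first), (2, second)):
        for row in rows:
            # Columns that exist in only one survey become None (missing).
            merged = {col: row.get(col) for col in columns}
            merged["SURVEY"] = survey_id  # hypothetical indicator column
            stacked.append(merged)
    return stacked


combined = stack_surveys(survey1, survey2)
print(len(combined), combined[0]["USEGPT"], combined[2]["SURVEY"])
```

Keeping an explicit survey indicator makes it easy to split the stacked data again, which matters here because the two samples consist of different participants.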

In the data files, each row is a participant, and each column is a variable. For the variable names and their corresponding column names in the data sets, see Table 2. As most scales consisted of multiple items that are not yet aggregated or further processed in the data set for reasons of transparency, the columns in the data set are provided with the suffix “_[Item number]” and represent the single items of one measure. The codebook attached to the OSF project contains a detailed description of all variables, as well as all items and their choice options.
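Because the items are stored unaggregated, scale scores have to be computed from the item columns. The sketch below illustrates one way to form a scale mean in Python. It assumes that item columns follow the pattern `PREFIX_ITEMNUMBER` (for example `TAM_7`), that responses are coded 1 to 5, and that reverse coding maps a value x to 6 − x; the choice of item 9 as a reversed item is purely hypothetical, since the actual reverse-coded items are documented in the R script.

```python
from statistics import mean


def scale_mean(row, prefix, item_numbers, reversed_items=(), scale_max=5):
    """Mean of the items prefix_<n>, reverse-coding the selected item numbers."""
    values = []
    for number in item_numbers:
        value = row[f"{prefix}_{number}"]
        if number in reversed_items:
            # On a 1..scale_max scale, reverse coding swaps 1<->5, 2<->4, etc.
            value = scale_max + 1 - value
        values.append(value)
    return mean(values)


# Invented responses of one participant to items TAM_7 .. TAM_10.
participant = {"TAM_7": 4, "TAM_8": 2, "TAM_9": 1, "TAM_10": 5}

raw = scale_mean(participant, "TAM", range(7, 11))
recoded = scale_mean(participant, "TAM", range(7, 11), reversed_items={9})
print(raw, recoded)
```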

We recommend analyzing the data using R or RStudio. However, it is also possible to import the data set into other statistical analysis programs. To analyze the data with our provided script, R version 4.3.1 (16 June 2023, ucrt) or newer is required, optionally run through RStudio. The necessary packages are listed at the beginning of the analysis script and need to be installed in advance.

3. Methods

3.1. Survey Design

The surveys were developed to capture people’s experience, attitudes, concepts, knowledge, and preferences toward ATG and ChatGPT, and to examine whether there were changes between these two time points.

In each survey, all participants were presented with the full set of questions, with one exception: at the beginning, participants were asked about their experience with ATG. If they indicated that they had previously read an AI-written text, they were directed to an open text entry question, where they could specify the type(s) of text they had read. This question created missing values for a proportion of respondents. The same procedure was carried out in Survey 2. However, in the second survey, we had the chance to ask specifically about their experience with ChatGPT, which again generated a subset of participants with ChatGPT experience who were later forwarded to the attitudes toward ChatGPT scale. Therefore, individuals who indicated no experience with ChatGPT have missing values on this measure. For each scale, the items were presented in random order to avoid order effects.

3.2. Survey Platform

The data collection and questionnaire setup were conducted using Qualtrics (Qualtrics LLC; CEO: Ryan Smith; Address: 2250 N. University Pkwy, 48-C, Provo, UT 84604, USA). Qualtrics was chosen for its robust features, including customizable survey design and secure data storage.

3.3. Participant Recruitment

The panel provider Bilendi and respondi was commissioned with the recruitment of the samples. It is ISO certified (ISO 20252:2019) and processes personal data in accordance with the European General Data Protection Regulation (GDPR; German: DSGVO). It sent out invitations via its panel Mingle. The provider was also responsible for quota management. For the quotas according to which participants were recruited and the composition of the final samples, see Table 1. According to Mingle, participant compensation follows a points system in which points can be collected depending on the length and type of the study. The participants themselves decide whether to convert the points into cash, vouchers, or donations. For participating in Survey 1, respondents received the equivalent of EUR 1.30 for a mean duration of 11.77 min. In Survey 2, they received EUR 1.93 for a mean duration of 15.52 min (see Note 1).

3.4. Data Filtering

In Survey 1, of the 1111 participants who completed the questionnaire, 83 did not give their consent to data use. Thus, the final sample size was N = 1028. In Survey 2, of the 1116 participants who filled in the questionnaire completely, 51 withdrew their data at the end of the survey. Another 101 persons were excluded due to an incorrect answer to the attention check implemented in the knowledge test (item “KNOW_16”). Therefore, the final sample in Survey 2 consisted of N = 1013 respondents.
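The exclusion logic described above (consent first, then the attention check in Survey 2) can be sketched as a simple row filter. This is a hedged Python illustration on invented rows, not the authors' procedure: the `consent` field, the `filter_sample` helper, and the expected answer passed for `KNOW_16` are hypothetical placeholders, and the published data are already filtered.

```python
def filter_sample(rows, attention_item=None, expected_answer=None):
    """Keep rows with consent and, if requested, a passed attention check."""
    kept = []
    for row in rows:
        if not row["consent"]:  # hypothetical consent flag
            continue
        if attention_item is not None and row.get(attention_item) != expected_answer:
            continue
        kept.append(row)
    return kept


# Invented rows; "Wrong" as the expected answer is a placeholder only.
toy_survey2 = [
    {"id": 1, "consent": True, "KNOW_16": "Wrong"},
    {"id": 2, "consent": False, "KNOW_16": "Wrong"},
    {"id": 3, "consent": True, "KNOW_16": "Right"},
    {"id": 4, "consent": True, "KNOW_16": "Don't know"},
]

consented = filter_sample(toy_survey2)
final_sample = filter_sample(toy_survey2, "KNOW_16", expected_answer="Wrong")
print([row["id"] for row in consented], [row["id"] for row in final_sample])
```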

3.5. Data Curation and Storage

The original data from both surveys are archived on the local servers of the Leibniz-Institut für Wissensmedien for at least ten years after publication. The prepared and fully anonymized primary data are publicly available at OSF. The data stored at Qualtrics will be deleted by 31 March 2025, at the latest. Qualtrics complies with the European guidelines on data storage. Bilendi and respondi had no access to the survey data during data collection and can therefore not connect the participant data to the survey data. In turn, we as authors had no access to Bilendi and respondi’s participant data at any point in time.

Apart from the conditional questions described in Section 3.1, there are no missing values in either data set, as the data were only analyzed if participants fully completed the surveys and gave their written informed consent to using their data for research purposes on the last page. Furthermore, participants in Survey 2 were excluded from the analysis if they incorrectly answered the attention check item implemented in the knowledge test.

3.6. Measures and Procedure

Table 2 shows all measures used in the surveys in the order they were presented. The original Qualtrics questionnaire can be found under the material here: https://osf.io/sn75h/ (accessed on 03 October 2024). The original data set also contains variables for the exact age, education, and occupation. Note that participants entered their occupations in an open text entry, which is why the variable requires further processing. Please see [1] or the R script for the correct answers to the knowledge test.
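Table 2 describes the knowledge measure as a sum score of correctly answered items. A minimal Python sketch of such scoring is given below. The answer key shown is invented for illustration and is not the real key, which is available in [1] or the R script; `knowledge_score` is a hypothetical helper name. In this sketch, an answer of “Don’t know” never matches the key and therefore contributes no point.

```python
def knowledge_score(row, answer_key):
    """Count the items whose response matches the answer key."""
    return sum(1 for item, correct in answer_key.items() if row.get(item) == correct)


# Invented key for three items only; NOT the actual answer key.
toy_key = {"KNOW_1": "Right", "KNOW_2": "Wrong", "KNOW_3": "Right"}

# Invented responses of one participant.
participant = {"KNOW_1": "Right", "KNOW_2": "Don't know", "KNOW_3": "Right"}

score = knowledge_score(participant, toy_key)
print(score)
```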

4. Limitations

The present data set contains two snapshots of different samples at two points in time and is, therefore, unsuitable for analyzing intrapersonal changes. Since the surveys were conducted with samples from the German population, any comparisons or inferences regarding other countries should be approached with caution. Consequently, the scale translations need to be validated when replicating these studies in different languages.

Moreover, since the first survey was conducted before the public awareness caused by the launch of ChatGPT, the concepts and scales can still be interpreted as too general. The concepts of functionality, data retrieval, responsibility, and human control are not tailored to specific technologies but capture different aspects that apply more or less to various systems. Furthermore, there is certainly high variability in the use of the answer option “perhaps” for the concepts and the answer option “don’t know” in the knowledge test, which this data set cannot explain. Future research could, for example, focus on the most frequently used AI systems and adapt the questions accordingly. Open answer options could offer further insights into the sources of uncertainties or gaps in knowledge.

5. Potential Applications of the Data Set

This open-access data set can be seen as an important starting point for subsequent research. It not only provides a baseline measure of attitudes toward ATG before public attention was drawn to this technology, but also offers a broad overview of various constructs that may become relevant when ATG and LLMs gain wider adoption in the general population. Future research may also delve deeper into specific misconceptions, risks, and opportunities that the population perceives. The following research questions might help scientists in this research field use this data set for exploratory purposes. Potential areas of further research include socio-demographic aspects, which can provide further information about differences between population groups and thus enrich our knowledge about a potential AI divide:

RQ1: Do attitudes, preferences, or knowledge differ regarding age, gender, or educational background?

Moreover, further investigation of the relationship between personal attitudes and behavioral intentions can provide information about factors that determine actual usage:

RQ2: How well do lay attitudes toward ChatGPT, attitudes toward using the technology, and performance expectancy predict behavioral intentions?
RQ3: How well do the TAM/UTAUT subscales predict behavioral intention and ChatGPT use in Survey 2?

To our knowledge, no scale captures specific attitudes toward ATG technology or measures adequate to differentiate between users and non-users or even different levels of experience. Therefore, our self-developed measurements can serve as a basis for scale development and validation:

RQ4: How valid is the newly developed scale in measuring lay attitudes toward ChatGPT? Can it be adapted to LLMs in general?
RQ5: How valid is the newly developed scale in measuring attitudes toward ChatGPT?
RQ6: Do participants with and without ChatGPT experience differ in their ratings?
RQ7: What do the open answers on ChatGPT usage tell us?

Author Contributions

Both A.L.H. and J.K. conceptualized this study, edited, and revised the manuscript. A.L.H. drafted the manuscript, conducted the studies, and monitored the recruitment of participants. A.L.H. analyzed and interpreted the data and is responsible for data curation. J.K. supervised the studies and was responsible for funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Leibniz-Institut für Wissensmedien, STB Data Science.

Institutional Review Board Statement

This study was conducted in accordance with the guidelines of the Local Ethics Committee of the Leibniz-Institut für Wissensmedien, which approved the study design and methods (Approval number: LEK 2023/022, approved on 17 May 2023).

Informed Consent Statement

Written informed consent was obtained from all participants involved in this study. Participants were invited to complete the online surveys via the online market research platform Mingle in March 2022 (Study 1) and in June 2023 (Study 2).

Data Availability Statement

The data, analysis script, and material are openly available in OSF at https://osf.io/sn75h/ (accessed on 03 October 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Note

1. Mean duration was calculated by excluding outliers detected via the interquartile range (IQR) method. Following this method, we defined outliers as observations that fell below Q1 (first quartile) − 1.5 × IQR or above Q3 (third quartile) + 1.5 × IQR. Thus, for Survey 1, the data from 75 respondents whose duration was equal to or longer than 28.27 min were excluded from the calculation of the average duration. For Survey 2, the data from 98 participants whose duration was equal to or longer than 40.52 min were not included in the average duration calculation. Note that extreme outliers regarding duration of survey completion can be caused by missing the submission of the last survey page or breaks during the survey.
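The IQR rule in this note can be sketched in a few lines of Python. The durations below are invented, and the quartile definition is an assumption: the sketch uses the "inclusive" method of `statistics.quantiles`, which may differ slightly from the quartile definition used in the authors' R script, so the resulting thresholds are illustrative only.

```python
from statistics import mean, quantiles


def iqr_bounds(values):
    """Return (lower, upper) fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def mean_without_outliers(values):
    """Mean of the values inside the fences and the number of excluded values."""
    lower, upper = iqr_bounds(values)
    kept = [v for v in values if lower <= v <= upper]
    return mean(kept), len(values) - len(kept)


# Invented completion times in minutes; the last one mimics a long break.
durations = [8.0, 9.5, 10.0, 11.0, 12.0, 12.5, 14.0, 95.0]

lower, upper = iqr_bounds(durations)
average, n_excluded = mean_without_outliers(durations)
print(lower, upper, average, n_excluded)
```

Only the upper fence matters in practice for completion times, since very long durations arise from breaks or an unsubmitted last page, as noted above.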

References

1. Lermann Henestrosa, A.; Kimmerle, J. Understanding and Perception of Automated Text Generation among the Public: Two Surveys with Representative Samples in Germany. Behav. Sci. 2024, 14, 353.
2. Schepman, A.; Rodway, P. Initial Validation of the General Attitudes towards Artificial Intelligence Scale. Comput. Hum. Behav. Rep. 2020, 1, 100014.
3. Sundar, S.S.; Kim, J. Machine Heuristic: When We Trust Computers More than Humans with Our Personal Information. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, Scotland, 4–9 May 2019; pp. 1–9.
4. Said, N.; Potinteu, A.E.; Brich, I.R.; Buder, J.; Schumm, H.; Huff, M. An Artificial Intelligence Perspective: How Knowledge and Confidence Shape Risk and Opportunity Perception. Comput. Hum. Behav. 2022, 149, 107855.
5. Venkatesh, V.; Morris, M.G.; Davis, G.B.; Davis, F.D. User Acceptance of Information Technology: Toward a Unified View. MIS Q. 2003, 27, 425–478.

Table 1. Composition of the final samples regarding gender, age, and educational level.

Quota	Category	Survey 1: Target Quota (%)	Survey 1: Final Composition (%)	Survey 2: Target Quota (%)	Survey 2: Final Composition (%)
Gender	Female	49.82	51.26	49.54	50.94
Gender	Male	50.18	48.54	50.46	48.86
Gender	Diverse	0.91	0.19	-	0.20
Age	18–29 years	18.91	17.90	19.76	18.56
Age	30–39 years	18.00	17.70	19.40	18.56
Age	40–49 years	16.91	17.02	17.97	18.56
Age	50–59 years	22.36	22.57	23.74	24.48
Age	60–69 years	23.64	24.81	19.12	19.84
Education *	Low	34.00	33.66	27.00	26.65
Education *	Middle	32.00	31.52	33.00	34.45
Education *	High	34.00	34.82	41.00	38.89

* Educational levels were built as follows: no or elementary/primary school degree = low, secondary school degree = middle, high school degree = high.

Table 2. Overview of all variables used in the surveys.

Variable	Number of Items	Scale Formation	Cronbach Alpha Survey 1/2	Scale Type	Example Item/Choice Option	Answer Options	Source	Column Name
Self-assessed knowledge	1	-	-	Ordinal	How would you rate your level of knowledge regarding AI?	No—Limited—Some—Good—Extensive	Self-developed	SELFSKILL
General attitudes °	20	mean	0.91/0.92	Likert	AI is exciting.	Strongly disagree—Rather disagree—neutral—Rather agree—Strongly agree	Adopted from [2]	GENATT
Machine heuristic	4	mean	0.75/0.79	Likert	When an AI performs a task, it is—accurate.	Strongly disagree—Rather disagree—Partly/partly—Rather agree—Strongly agree	Adopted from [3]	MACHEU
Media use ¹	4/5	-	-	Ordinal	Which of the following voice-based applications do you use and how often do you use them?—chatbots	Never—Seldom—Occasionally—Often—Constant	Self-developed	USE_1:_6
Media use for scientific information retrieval ²	11/14	-	-	Ordinal	How often do you use the following media to obtain information on scientific topics?—blogs		Self-developed	MEDIA_USE_1:_14
Heard about ATG	1	-	-	Ordinal	Have you ever heard that AIs can write texts?		Self-developed	HEARD
Read AI text	1	-	-	Ordinal	Have you ever read texts written by an AI?		Self-developed	READ
Type of AI text read	1	-	-	Text entry	What kind(s) of text(s) have you already read? Please describe them briefly.		-	AITXT
Knowledge *	15	Sum score of correctly answered items	-	Nominal	AI language models (e.g., ChatGPT) calculate for their answers which word is most likely to come next.	Right—Wrong—Don’t know	Partly self-developed, partly adapted from [4]	KNOW_1:_15
Function of ATG ^†	5/6	proportions	-	Likert	The AI uses existing words and texts and reassembles them.	Not at all—Rather unlikely—Perhaps—Rather likely—For certain	Self-developed	FUNC_1:_6
Sources of ATG ^†	6/7		-	Likert	The AI generates the content itself, without human intervention.			SOURC_1:_7
Human control ^†	4/5		-	Likert	The human sees the end product and edits the text if necessary.			CONTR_1:_5
Responsible	9		-	Likert	The programmer			RESP_1:_9
Performance expectancy	3	mean	0.78/0.81	Likert	AI-written texts would make me more productive.	Strongly disagree—Rather disagree—Partly/partly—Rather agree—Strongly agree	Adapted from [5]	TAM_1:_3
Effort expectancy	3		0.71/0.71	Likert	I think AI-written texts are clear and understandable.			TAM_4:_6
Attitude toward using the technology °	4		0.78/0.79	Likert	I would like to read AI-generated texts.			TAM_7:_10
Anxiety	3		0.71/0.68	Likert	I have reservations about reading AI-written texts.			TAM_11:_13
Behavioral intention °	3		0.64/0.67	Likert	I intend to read AI-written texts in the future.			TAM_14:_16
Permission to write like human °	4		0.63/0.62	Likert	AIs should be allowed to write about the same topics as humans.		Self-developed	TAM_17:_20
Intentions to read AI texts	18	proportions	-	Likert	Please indicate how likely you would be to read AI-written texts on the following topics.—Politics	Not at all—Rather unlikely—Perhaps—Rather likely—For certain		INTEN_1:_18
Comparison human vs. AI	18	proportions	-	Likert	If you had the choice, who would you prefer to be informed about the following topics?—Politics	Rather from a human—No preference—Rather from an AI		COMP_1:_18
ChatGPT use *	1	-	-	Text entry	You stated at the beginning that you have used ChatGPT before. Please describe what you have used or are using ChatGPT for.	-	Self-developed	USEGPT
Attitudes toward ChatGPT *^,°	16	mean	-/0.89	Likert	I am satisfied with the answers from ChatGPT.	Strongly disagree—Rather disagree—Partly/partly—Rather agree—Strongly agree	Self-developed	ATTGPT_1:_16
Lay attitudes toward ATG and ChatGPT *^,°	9	mean	-/0.82	Likert	I am optimistic about the impact of automated text generation (e.g., ChatGPT) on society.		Self-developed	LAYGPT_1:_9
Pro and contra arguments	1	-	-	Text entry	You now have the opportunity to freely express your thoughts on the topic. For example: What advantages and disadvantages do you see in automated text creation using AI?	-	Self-developed	PROCON

* Only measured in Survey 2. ° Contains reverse-coded items; see R script. ¹ In Survey 2, the option “ChatGPT” was added. ² In Survey 2, the options “ChatGPT”, “Books”, and “Podcasts” were added. ^† In Survey 2, an additional item was added to these scales, which captured the technology behind ChatGPT.
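Table 2 reports Cronbach’s alpha for the multi-item scales. For readers who want to recompute it from the item columns, the standard formula is alpha = k / (k − 1) × (1 − sum of item variances / variance of the total score), where k is the number of items. The following Python sketch applies this formula to invented responses; it is not taken from the authors' R script, `cronbach_alpha` is a hypothetical helper name, and reverse-coded items would need to be recoded before the calculation.

```python
from statistics import variance


def cronbach_alpha(item_rows):
    """Cronbach's alpha for a list of participants, each a list of item scores."""
    k = len(item_rows[0])
    # Transpose rows (participants) into columns (items).
    item_columns = list(zip(*item_rows))
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in item_rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented responses: three participants, two items.
consistent = [[1, 1], [2, 2], [3, 3]]   # items agree perfectly
noisier = [[1, 2], [2, 1], [3, 3]]      # items agree only partly

alpha_consistent = cronbach_alpha(consistent)
alpha_noisier = cronbach_alpha(noisier)
print(alpha_consistent, alpha_noisier)
```

The function divides by the variance of the total score, so it is undefined when all participants have the same total; real scale data with more than a handful of respondents rarely hit that case.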

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
