Communication in Human–AI Co-Creation: Perceptual Analysis of Paintings Generated by Text-to-Image System

Abstract: In recent years, art creation using artificial intelligence (AI) has started to become a mainstream phenomenon. One of the latest applications of AI is to generate visual artwork from natural language descriptions, where anyone can interact with it to create thousands of artistic images with minimal effort. This provokes the questions: what is the essence of artistic creation, and who can create art in this era? Considering that, in this study, the theoretical communication framework was adopted to investigate the difference in the interaction with the text-to-image system between artists and nonartists. In this experiment, ten artists and ten nonartists were invited to co-create with Midjourney. Their actions and reflections were recorded, and two sets of generated images were collected for the visual question-answering task, with a painting created by the artist as a reference sample. A total of forty-two subjects with artistic backgrounds participated in the evaluation experiment. The results indicated differences between the two groups in their creation actions and their attitude toward AI, while the technology blurred the difference in the perception of the results caused by the creator's artistic experience. In addition, attention should be paid to communication on the effectiveness level for a better perception of the artistic value.


Introduction
In the last decade, the growing implementation of artificial intelligence (AI) technology in the field of art has triggered a fierce discussion on AI art. Since the generative adversarial network (GAN) portrait painting titled "Edmond de Belamy" was constructed in 2018, AI art has already entered the public's vision. One of the latest applications of AI is the generation of images based on natural language descriptions, which enhances the efficiency and effect of the transformation from creativity to visuality to a great extent. In the past, whether in traditional or digital painting creation, the author needed to be skilled in using tools and to have rich technical experience to accurately map the brain's imagination to the visual layer. However, in co-creation with text-to-image AI generators, both artists and nonartists can input the text description to produce many high-quality images. During traditional painting creation, artists and nonartists in a painting task indicated quantitative and qualitative differences in some studies, such as artists spending more time on planning their painting, having more control over their creative processes, having more specific skills, and having more efficiency than nonartists [1,2]. Whether such differences still exist in the new human-AI interaction mode and what new changes arise are worth discussing.
A series of text-to-image AI systems, such as Disco Diffusion [3], Midjourney [4], Stable Diffusion [5], OpenAI's DALL-E 2 [6], and Google's Imagen [7], is making a big splash. The generation mechanism is to use a language-vision model to understand the "prompt" input by users, and then the generator is guided to produce high-quality images. These systems are capable of synthesizing images with any style and content based on a prompt. Besides, users can control the system to iterate more variations. With the rise of AI art, many artists have also started to use AI to assist in creation. According to the Colorado State Fair competition's website [8], the art piece "Théâtre D'opéra Spatial," which was generated by Midjourney, won first place in the digital art category. As generators that use natural language text to create various styles of creative images take shape, the question that arises immediately is: what is the essence of artistic creation, and what is the critical capability of artists? Though everyone thought art was one thing robots could never do, maybe we will face the challenges of emerging AI technology.
This research aimed to analyze and understand how text-to-image technology affects art creation and appreciation. The main discussion focused on the differences in activities and results between artists and nonartists from the perspective of art communication. As Figure 1 shows, this study was divided into three sections. In Section 1, a literature review was made to explore the research framework of the generation mechanism of visual art collaboration with AI. In Section 2, nine experts with artistic and aesthetic backgrounds were invited to select a suitable AI system and painting samples according to their art appreciation. In Section 3, the data were collected from the creators of the samples and from the subjects participating in the questionnaire for analysis and discussion. Finally, the conclusions of this study were given.
Figure 1. The procedures for this study: the horizontal line divides three sessions, and the arrows indicate the direction of functions and processes. "DD" stands for "Disco Diffusion", and "SD" stands for "Stable Diffusion".
... of the LAION-5B database. Similar to Google's Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts [20]. Furthermore, it has a better balance between speed and quality and can generate images within seconds [5]. The main novelty of DALL-E 2 seems to be an extra layer of indirection with the prior network, which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best-performing variant [6].
With the emergence of such open-source implementations, the use of advanced text-to-image synthesis for generating images is becoming more widespread, which represents a relevant trend in the AI Art community [21].

Communication between Artists and Audiences
Artistic creation is a process for artists to explore and express ideas and concepts. A great painting has much more below the surface than is first seen on the surface. Therefore, it must access the mind as well as the senses [22]. Similar to how humans do not really know how they breathe, artists do not truly know how they create: while they may rely on a set of fundamental principles, such as how to arrange elements, light, colors, and other components, most of their creative decisions happen intuitively [23]. The experimental result of Eindhoven and Vinacke demonstrated that artists have more control over their creative activities and produce better results than nonartists in the creative process of painting [1]. Kay also found that nonartists, semiprofessional artists, and professional artists differed on certain process-related variables [2].
The interplay between the internal (cognitive) representation and the external (physical) representation is a fascinating problem in cognitive psychology, art, science, and philosophy [24]. The various painting attributes, such as colors, shapes, and boundaries, are selectively redistributed to the brain for processing. For example, color may be experienced as warm or cold or as cheerful or somber [25]. Audiences can also perceive the painter's actions by observing the brushstroke of the painting [26]. Apart from that, from a psychological viewpoint, Kozbelt examined various experiments on artists' perception and depiction skills and showed evidence suggesting possible perceptual differences between artists and nonartists [27,28]. Aesthetic appreciation is an active process influenced by several objective features: external and subjective factors that engage both bottom-up and top-down processes [29]. In the series of studies on experimental aesthetics by Lyu et al. [30-32], the perception of artistic style was affected by individual attributes such as knowledge background and gender. Thus, the perception of art is a complex interaction process between the top and bottom levels, which is affected by various subjective and objective factors.
According to communication theory, the process of artist expression is called encoding, and the way the artwork is perceived by the audience is regarded as decoding [33,34]. Jakobson proposed six constitutive factors with six functions in communication: the addresser, addressee, context, message, contact, and code [34]. For example, an artist (addresser) sends a message to an audience (addressee) through his/her painting. The artist's work, as the message with a story (context), plays a role in the connection between himself/herself and the audience (contact). Finally, his/her message must be based on a shared meaning system (code) by which his/her work is structured [22]. There are three levels of problems, namely technical, semantic, and effectiveness levels, that were identified in the study on the communication of paintings [31,35]. Among them, the technical level focuses on letting the addressee receive a message through visual attraction, and the semantic level requires that the addressee is allowed to understand the message's meaning without misinterpreting it. The effectiveness level concerns the effect of the audience's feelings. During the creative process of AI art, the artists choose AI algorithms according to their intentions for creating the artwork, and audience acceptance is a critical defining step in deciding whether it is "art" [36]. Studying the process of art perception can help build a bridge between artists and the audience [37,38].
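As a reading aid, the six factors attributed to Jakobson above can be written down as a small record and filled in with the painting example from the text. This is only an illustrative sketch: the class name `CommunicationAct` and the wording of the field values are assumptions made here, not part of the cited framework.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CommunicationAct:
    """One communication act described by the six constitutive factors."""
    addresser: str
    addressee: str
    context: str
    message: str
    contact: str
    code: str


# The painting example from the paragraph above, paraphrased (wording is illustrative).
painting_example = CommunicationAct(
    addresser="artist",
    addressee="audience",
    context="the story the work refers to",
    message="the painting",
    contact="the connection the work creates between artist and audience",
    code="the shared meaning system that structures the work",
)

# Names of the six factors, in declaration order.
FACTORS = [f.name for f in fields(CommunicationAct)]
```

Each field corresponds to one factor, so an AI-mediated act could be described with the same record by changing only the values.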

Artworks Generated by Human-AI Co-Creation
Artworks are increasingly being created by machines through algorithms with little or no input from humans. At the Christie's auction in 2018, the portrait "Edmond de Belamy", generated by generative adversarial networks (GAN), was auctioned for $432,500, which indicates that AI has begun to enter our field of vision at a rapid speed [39]. Recent works have addressed a variety of tasks such as classification, object detection, similarity retrieval, multimodal representations, and computational aesthetics, among others [21]. Neural style transfer, through which AI technology first intervened in the field of art, has been widely used on platforms such as Prisma, Deep Dream Generator, and other art content production platforms. In 2022, text-to-image AI art generators became much more popular and were applied to creating conceptual scenes, creative designs, and fictional illustrations. It can be seen that the processes in various kinds of art creation are changing. Meanwhile, new jobs, such as selling prompts, have also been emerging [40].
With the explosion of AI-related technologies and their continuous application in the field of art, there is a growing body of research initiatives and creative applications arising at the intersection of AI and art. Artistic creation is embedded with cultural, historical, and institutional frameworks that directly interact with the artist's own creative process [23]. Lacking human consciousness, AI does not understand what it is doing and is merely a suite of statistical models calculating favorable odds through enormous variations. Considering that, AI cannot create art, but it can create patterns that an audience will likely perceive as art [41]. The human artist, as the author, is always the mastermind behind the work, and the computer is a tool [42]. However, AI technology is not like traditional tools. Its randomness changes the way humans control it. As a sparking trigger of inspiration, artists collaborate with AI agencies to augment the artistic process [41].
As for text-based generative art, it is also argued that creativity does not lie in the final artifact but rather in the interaction with the AI and the practices that may arise from the human-AI interaction [43]. It is not hard to imagine a future where text prompts could be generated by language models, thereby completely dehumanizing the creative artistic process and severely distorting the human perception of the meaning behind an image [44]. Most studies reported that visual artworks can be recognized to some extent by humans, especially by experts of a specific art field [45,46], but other experimental results showed that individuals are unable to accurately identify AI-generated artwork [32,47]. Based on our previous research, the deep learning model, trained by large amounts of data on paintings, can simulate human painting skills on the technical level. In contrast, people prefer paintings connecting the semantic and emotional levels [31].

Research Framework
Based on the literature review, in this study, the research framework of communication in the AI painting generated by the text-to-image system was constructed, as shown in Figure 2. In the process of communication between the artist (Addresser) and the audience (Addressee), there is the artist model and the audience model, which construct the complex processing from creation to perception. Different from the traditional coding process, artists translated their intention and emotion into prompts instead of representing them by directly using form. However, existing paintings were taken as the data for training the AI model, which means that the creation path was changed by adding the interaction between human and AI. On the side of perceiving artworks generated by the AI model, there were still three stages: visual experience, meaning experience, and emotional experience. Ideally, audiences could still contact the artist by receiving the message through decoding and feeling poetic in a referential context.
Figure 2. The communication research framework of AI paintings generated by the text-to-image system: The left part is the artist encoding model, and the right is the audience decoding model. The AI generator in the middle is regarded as the communication interface between the artist and the audience.
As the AI generator replaces the represented action of humans, what role do professional art knowledge and experience play in this human-computer interaction process? In the age of AI, what is the critical capability of artists? Instead of fearing replacement, it is more important to explore the irreplaceable value of human beings. Therefore, this experiment was designed to discuss process coding and visual perception by comparing the differences in the human-AI interaction between artists and nonartists. The theme "sweet home" was used as the creative theme of the painting, and artists and nonartists were invited to map their inner feeling in visual form by inputting descriptive prompts. The AI paintings generated by interacting with the text-to-image system served as the experimental stimuli. In addition to the analysis of the observations on creative action and open coding from the creator's self-report, the evaluation items of perception for the stimulus were designed from the visual attributes (technical level), the semantic matching (semantic level), and the emotional experience (effectiveness level). Based on the framework of communication, the study was meant to explore the essence of artistic creation and artists' unique capabilities by comparing the difference between the two groups in the interaction with the text-to-image system and in the perception of generations.

Stimuli
In the text-to-image system selection stage, an artist and a nonartist were invited to interact with four public text-to-image systems, namely Disco Diffusion, Midjourney, Stable Diffusion, and DALL·E 2. The theme sweet home was chosen as the theme of creation because a person's home is unique and full of individual imagination and interpretation. They were asked to co-create an oil painting with AI by inputting a prompt to describe the theme. It was suggested that the structure of the prompt should start with "an oil painting of" and should refer to the cases in the community to establish the experience of the relationship between text description and visual generation. In order to eliminate the interference of artistic style, artists' names and art schools were prohibited. Based on prompt 1 and prompt 2 provided by the artist and the nonartist, respectively, a comparison was made, which is shown in Table 1. Then, nine art and/or aesthetic background experts were encouraged to select which method was more suitable for generating oil paintings of a sweet home. As a result, they all agreed that the attributes of the generated samples by Midjourney were more similar to those of oil paintings, and the concord of color could express the feeling of a sweet home on the effectiveness level [25]. Additionally, its degree of matching with text descriptions was much higher than that of the other two systems. Among them, Disco Diffusion confused the structure of elements and the canvas layout, while Stable Diffusion had an adequate understanding close to Midjourney but missed the artistic oil-painting style. Beyond that, DALL·E 2 better understood the feeding text, whereas its unity of tone was slightly weaker than Midjourney. Therefore, Midjourney was picked as the AI tool to collaborate with two group creators to generate paintings as experimental samples. An oil painting of a room full of toys by the fireplace.
Result 1 the theme. It was suggested that the structure of the prompt should start with "an oil painting of" and should refer to the cases in the community to establish the experience of the relationship between text description and visual generation. In order to eliminate the interference of artistic style, artists' names and art schools were prohibited. Based on prompt 1 and prompt 2 provided by the artist and the nonartist, respectively, a comparison was made, which is shown in Table 1. Then, nine art and/or aesthetic background experts were encouraged to select which method was more suitable for generating oil paintings of a sweet home. As a result, they all agreed that the attributes of the generated samples by Midjourney were more similar to those of oil paintings, and the concord of color could express the feeling of a sweet home on the effectiveness level [25]. Additionally, its degree of matching with text descriptions was much higher than that of the other two systems. Among them, Disco Diffusion confused the structure of elements and the canvas layout, while Stable Diffusion had an adequate understanding close to Midjourney but missed the artistic oil-painting style. Beyond that, DALL·E 2 better understood the feeding text, whereas its unity of tone was slightly weaker than Midjourney. Therefore, Midjourney was picked as the AI tool to collaborate with two group creators to generate paintings as experimental samples. An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.

Result 2
In the experimental sample-generation phase, a total of ten artists and ten nonartists participated in theme painting creation by interacting with Midjourney, whose information is displayed in Table 2. In selecting creators, the following criteria were used to distinguish artists from nonartists: An artist should have painting experience and should have to derive some income from their pictures. A nonartist was any subject who had not engaged in this type of creative activity. Before the experiment, they had never used similar tools to assist in painting and had only heard of the power of AI. the theme. It was suggested that the structure of the prompt should start with "an oil painting of" and should refer to the cases in the community to establish the experience of the relationship between text description and visual generation. In order to eliminate the interference of artistic style, artists' names and art schools were prohibited. Based on prompt 1 and prompt 2 provided by the artist and the nonartist, respectively, a comparison was made, which is shown in Table 1. Then, nine art and/or aesthetic background experts were encouraged to select which method was more suitable for generating oil paintings of a sweet home. As a result, they all agreed that the attributes of the generated samples by Midjourney were more similar to those of oil paintings, and the concord of color could express the feeling of a sweet home on the effectiveness level [25]. Additionally, its degree of matching with text descriptions was much higher than that of the other two systems. Among them, Disco Diffusion confused the structure of elements and the canvas layout, while Stable Diffusion had an adequate understanding close to Midjourney but missed the artistic oil-painting style. Beyond that, DALL·E 2 better understood the feeding text, whereas its unity of tone was slightly weaker than Midjourney. 
Therefore, Midjourney was picked as the AI tool to collaborate with two group creators to generate paintings as experimental samples. An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.

Result 2
In the experimental sample-generation phase, a total of ten artists and ten nonartists participated in theme painting creation by interacting with Midjourney, whose information is displayed in Table 2. In selecting creators, the following criteria were used to distinguish artists from nonartists: An artist should have painting experience and should have to derive some income from their pictures. A nonartist was any subject who had not engaged in this type of creative activity. Before the experiment, they had never used similar tools to assist in painting and had only heard of the power of AI. the theme. It was suggested that the structure of the prompt should start with "an oil painting of" and should refer to the cases in the community to establish the experience of the relationship between text description and visual generation. In order to eliminate the interference of artistic style, artists' names and art schools were prohibited. Based on prompt 1 and prompt 2 provided by the artist and the nonartist, respectively, a comparison was made, which is shown in Table 1. Then, nine art and/or aesthetic background experts were encouraged to select which method was more suitable for generating oil paintings of a sweet home. As a result, they all agreed that the attributes of the generated samples by Midjourney were more similar to those of oil paintings, and the concord of color could express the feeling of a sweet home on the effectiveness level [25]. Additionally, its degree of matching with text descriptions was much higher than that of the other two systems. Among them, Disco Diffusion confused the structure of elements and the canvas layout, while Stable Diffusion had an adequate understanding close to Midjourney but missed the artistic oil-painting style. Beyond that, DALL·E 2 better understood the feeding text, whereas its unity of tone was slightly weaker than Midjourney. 
Therefore, Midjourney was picked as the AI tool to collaborate with two group creators to generate paintings as experimental samples. An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.

Result 2
In the experimental sample-generation phase, a total of ten artists and ten nonartists participated in theme painting creation by interacting with Midjourney, whose information is displayed in Table 2. In selecting creators, the following criteria were used to distinguish artists from nonartists: An artist should have painting experience and should have to derive some income from their pictures. A nonartist was any subject who had not engaged in this type of creative activity. Before the experiment, they had never used similar tools to assist in painting and had only heard of the power of AI. the theme. It was suggested that the structure of the prompt should start with "an oil painting of" and should refer to the cases in the community to establish the experience of the relationship between text description and visual generation. In order to eliminate the interference of artistic style, artists' names and art schools were prohibited. Based on prompt 1 and prompt 2 provided by the artist and the nonartist, respectively, a comparison was made, which is shown in Table 1. Then, nine art and/or aesthetic background experts were encouraged to select which method was more suitable for generating oil paintings of a sweet home. As a result, they all agreed that the attributes of the generated samples by Midjourney were more similar to those of oil paintings, and the concord of color could express the feeling of a sweet home on the effectiveness level [25]. Additionally, its degree of matching with text descriptions was much higher than that of the other two systems. Among them, Disco Diffusion confused the structure of elements and the canvas layout, while Stable Diffusion had an adequate understanding close to Midjourney but missed the artistic oil-painting style. Beyond that, DALL·E 2 better understood the feeding text, whereas its unity of tone was slightly weaker than Midjourney. 
Therefore, Midjourney was picked as the AI tool to collaborate with two group creators to generate paintings as experimental samples. An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.

Result 2
In the experimental sample-generation phase, a total of ten artists and ten nonartists participated in theme painting creation by interacting with Midjourney, whose information is displayed in Table 2. In selecting creators, the following criteria were used to distinguish artists from nonartists: An artist should have painting experience and should have to derive some income from their pictures. A nonartist was any subject who had not engaged in this type of creative activity. Before the experiment, they had never used similar tools to assist in painting and had only heard of the power of AI.

Prompt 2
An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.
Result 2 the theme. It was suggested that the structure of the prompt should start with "an oil painting of" and should refer to the cases in the community to establish the experience of the relationship between text description and visual generation. In order to eliminate the interference of artistic style, artists' names and art schools were prohibited. Based on prompt 1 and prompt 2 provided by the artist and the nonartist, respectively, a comparison was made, which is shown in Table 1. Then, nine art and/or aesthetic background experts were encouraged to select which method was more suitable for generating oil paintings of a sweet home. As a result, they all agreed that the attributes of the generated samples by Midjourney were more similar to those of oil paintings, and the concord of color could express the feeling of a sweet home on the effectiveness level [25]. Additionally, its degree of matching with text descriptions was much higher than that of the other two systems. Among them, Disco Diffusion confused the structure of elements and the canvas layout, while Stable Diffusion had an adequate understanding close to Midjourney but missed the artistic oil-painting style. Beyond that, DALL·E 2 better understood the feeding text, whereas its unity of tone was slightly weaker than Midjourney. Therefore, Midjourney was picked as the AI tool to collaborate with two group creators to generate paintings as experimental samples. An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.

In the experimental sample-generation phase, a total of ten artists and ten nonartists participated in theme painting creation by interacting with Midjourney; their information is displayed in Table 2. The following criteria were used to distinguish artists from nonartists: an artist should have painting experience and should derive some income from their pictures, whereas a nonartist was any subject who had not engaged in this type of creative activity. Before the experiment, they had never used similar tools to assist in painting and had only heard of the power of AI.
Table 2. The information of the artists and nonartists: the ages and years of painting experience of the artists and the ages of the nonartists are listed. The label "AP01" represents the artist who created the P01 painting, while "NH01" is the nonartist who created the H01 painting.
They were asked to write a prompt describing a visual form that could express their imagination of a sweet home. The basic commands in Midjourney were to use the V1, V2, V3, or V4 button to create variations of a chosen image and then to click the U1, U2, U3, or U4 button to add details to the chosen image. To avoid interference due to unfamiliarity with the tools, the researcher observed and supported the whole process but did not influence the participants' writing and selection. Individual differences were so great as to suggest that each person attained their final product in their own way. Finally, the nine experts mentioned above selected six samples from each group by excluding samples that were similar. The twelve paintings (P01-P06 by the artists and H01-H06 by the nonartists) are listed in Table 3. In addition, a painting created in the 1980s by the artist Yong Wang on the topic of a sweet home was chosen as the thirteenth stimulus, functioning as the reference sample. This painting recorded his poor kitchen environment at a time when his wife was busy cooking for the whole family. The limited living environment and his wife's busyness form an artistic conflict, highlighting that inner sweetness is the critical value of a home. Furthermore, the stimuli for this experiment were classified into three types according to the research purpose.

Type: Midjourney + Artist Paintings
P01. Prompt: An oil painting of a room full of toys by the fireplace.
P02. Prompt: An oil painting of love harbor full of laughter and warmth.
P03. Prompt: An oil painting of parents happily walking in the park hand in hand, and an active dog is chasing me.
P04. Prompt: A warm tone oil painting of a little pink bear holding a honey jar to enjoy the cool under the shade of the big tree in front of the yellow wooden house, and beautiful flowers and grass, and gurgling streams beside the wooden house on a bright summer day.
P05. Prompt: A warm tone oil painting of mother toasting bread for her daughter in a Europe style room.
P06. Prompt: An oil painting of a Samoyed dog with a space helmet and a space suit floating in outer space.

Type: Midjourney + Nonartist Paintings
H01. Prompt: An oil painting of one family, balloons, toys and food in amusement park.
H02. Prompt: An oil painting of a family playing in the yard of a house, also including trees, sun, birds.
H03. Prompt: An oil painting of kids playing, cat napping, and parents cooking while chatting.
H04. Prompt: An oil painting of a two-and-a-half floor house with red roofs and gray walls, surrounding with a beautiful garden full of plants and flowers, and a crystal-clear stream flowing through the garden.
H05. Prompt: An oil painting of a family having dinner and a fish in the center of the table.
H06. Prompt: An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.

Type: Artist Painting
Description: Smoke curls up from the kitchen, roosters look for food, and the simple open-air kitchen emits the smell of cooking.

Experiment Procedures
During the creative process, the observer recorded the time each creator spent, the number of adjustments made to the prompt, and the number of times the U button was clicked. After the creators submitted the paintings co-created with Midjourney, each had a one-on-one interview to self-report their experience and to reflect on the interaction process and results. Then, the recordings were coded to discuss the differences in the process of human-AI interaction.
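As a minimal sketch of how the three observed measures might be organized for a later group comparison, the Python snippet below defines a per-creator record and averages each measure by group. The class name, field names, and all numbers are hypothetical illustrations; they are not the study's data or tooling.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CreationSession:
    """One creator's observed session (hypothetical record layout)."""
    creator_id: str       # e.g. "AP01" for an artist or "NH01" for a nonartist
    group: str            # "artist" or "nonartist"
    minutes_spent: float  # time spent on the creation task
    prompt_edits: int     # number of adjustments made to the prompt
    u_clicks: int         # number of U-button clicks


def group_means(sessions, group):
    """Average each observed measure over the sessions of one group."""
    rows = [s for s in sessions if s.group == group]
    if not rows:
        raise ValueError(f"no sessions recorded for group {group!r}")
    return {
        "minutes_spent": mean(s.minutes_spent for s in rows),
        "prompt_edits": mean(s.prompt_edits for s in rows),
        "u_clicks": mean(s.u_clicks for s in rows),
    }


# Made-up values for illustration only (not the measurements of this study).
sessions = [
    CreationSession("AP01", "artist", 10.0, 3, 2),
    CreationSession("AP02", "artist", 20.0, 5, 4),
    CreationSession("NH01", "nonartist", 5.0, 1, 1),
    CreationSession("NH02", "nonartist", 15.0, 3, 1),
]

for name in ("artist", "nonartist"):
    print(name, group_means(sessions, name))
```

Keeping one record per creator keeps the observation log separate from the interview material, which is coded qualitatively rather than averaged.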

For the perceptual evaluation of the stimuli, forty-two participants with artistic backgrounds were recruited for the questionnaire survey. A PDF file containing the thirteen samples and a QR code linking to the online questionnaire was emailed to them. Two requirements were highlighted: each slide should be viewed on a computer screen of at least 14 inches so that details were visible, and the online questionnaire should be filled in on a mobile phone after scanning the QR code. Each slide displayed one painting, in random order, together with its prompt for rating; afterwards, all of the paintings were displayed together for ranking. Finally, 42 valid responses were received for statistical analysis.

Questionnaire Participants
A total of 42 subjects (15 males and 27 females) participated in the experiment. About 47% were 20~30 years old; 17% were 31~40 years old; 14% were 41~50 years old; 17% were 51~60 years old; and 5% were over 61 years old, indicating a relatively even distribution across age groups apart from the youngest group, which was the largest. All of them had experience in painting or art research, which supports the reliability of the questionnaire data.
The participants were asked to rate the degree to which each painting showed each attribute and to rank the paintings according to their subjective aesthetic experience. The procedure is described below in detail.

Questionnaire Design
The questionnaire comprised two parts. Part one was a rating test in which the participants provided subjective ratings of the thirteen paintings on nine visual attributes, as described in Table 4. The evaluation attributes belonged to three levels: the technical level (f1-f3), the semantic level (f4-f6), and the effectiveness level (f7-f9). The items explored the perceived degree of each attribute in the paintings, and subjects scored their responses on a 5-point Likert scale from 1 ("Very low") to 5 ("Very high"). In part two (the ranking test), the subjects were asked about their most preferred painting and attribute (see Table 5).
Table 4. Part one: questionnaire for subjective ratings of paintings on the nine attributes.

Please subjectively rate each painting according to visual attributes, with a maximum of 5 points and a minimum of 1 point.
Table columns: Attributes; 1; 2; 3; 4; 5.
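To make the three-level grouping of the nine attributes concrete, the following Python sketch averages one participant's ratings of one painting within each level. The attribute codes f1-f9 and the 1-5 scale come from the questionnaire design; the function name, the dictionary layout, and the example response are assumptions for illustration, and the paper does not state that level averages were computed this way.

```python
from statistics import mean

# Attribute codes grouped by level, as described in the questionnaire design.
LEVELS = {
    "technical": ("f1", "f2", "f3"),
    "semantic": ("f4", "f5", "f6"),
    "effectiveness": ("f7", "f8", "f9"),
}


def level_means(ratings):
    """Average one participant's 1-5 ratings of one painting within each level."""
    for code, score in ratings.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{code} rating {score} is outside the 1-5 scale")
    return {
        level: mean(ratings[code] for code in codes)
        for level, codes in LEVELS.items()
    }


# Hypothetical response for a single painting (not study data).
example = {
    "f1": 4, "f2": 5, "f3": 3,
    "f4": 2, "f5": 3, "f6": 4,
    "f7": 5, "f8": 5, "f9": 5,
}
print(level_means(example))
```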

Part two (ranking test) questions:
Which one is the most professional?
Which one is the sweetest?
Which one is the most creative?
Which ones are the creations of artists?
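Answers to questions of this kind can be tallied by painting type. The Python sketch below computes the percentage of votes per type and a Pearson chi-square statistic against the counts expected if each of the thirteen paintings were equally likely to be chosen. The type keys, function names, the equal-likelihood null hypothesis, and the vote counts are all assumptions for illustration; the sketch also assumes one choice per participant, which may not hold for the last question.

```python
from collections import Counter

# Number of stimuli of each type: twelve generated paintings plus one reference painting.
STIMULI_PER_TYPE = {"artist_ai": 6, "nonartist_ai": 6, "artist_painting": 1}


def vote_percentages(votes):
    """Share of votes (in percent) received by each painting type."""
    counts = Counter(votes)
    total = sum(counts.values())
    return {t: 100.0 * counts.get(t, 0) / total for t in STIMULI_PER_TYPE}


def expected_if_uniform(total_votes):
    """Expected votes per type if every one of the 13 paintings were equally likely."""
    n_stimuli = sum(STIMULI_PER_TYPE.values())
    return {t: total_votes * k / n_stimuli for t, k in STIMULI_PER_TYPE.items()}


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)


# Hypothetical votes on one question from 42 participants (not study data).
votes = ["artist_ai"] * 18 + ["nonartist_ai"] * 16 + ["artist_painting"] * 8
observed = Counter(votes)
print(vote_percentages(votes))
print(chi_square_statistic(observed, expected_if_uniform(len(votes))))
```

The statistic would then be compared with a chi-square distribution with two degrees of freedom (three types minus one) to obtain a p-value, a step omitted here.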


Coding of Reflections in Human-AI Co-Creation
According to the results of the analysis of variance in Table 6, the two groups of creators differed significantly in their behavior when co-creating with Midjourney, specifically in the time spent, the number of modified prompts, and the number of U-button clicks. The average time spent by artists was 22 min, which was significantly higher than the 14 min spent by nonartists. In addition, artists modified their prompts more than 6 times and clicked the U buttons about 4 times on average for repeated attempts, far more often than nonartists. Clear differences therefore remained between the two groups in co-creation with AI. Reflections obtained from the unstructured interviews with the two groups were coded with grounded theory methods in three steps: (a) initial open coding, (b) intermediate coding, and (c) advanced coding [48]. First, the essence of the interview recordings was synthesized during the initial coding step. Next, new codes focused on similarities and differences were formulated, and selective codes were developed. Finally, the codes were integrated into six core categories, as can be seen in Table 7.
All of the creators in this experiment used Midjourney to generate paintings for the first time. The coding results showed that creators with artistic backgrounds paid more attention to core categories such as visual performance, semantic matching, subject control in the interaction mode, and creative stimulation in the creation experience, whereas the nonartists focused on semantic matching and culture cognition. In the category of technological ethics, opinions differed.

Descriptive Statistics and ANOVA Analysis of Rating Data
The purpose of this study was to find out whether there are any differences in the perception of paintings co-created with AI between creators with and without artistic backgrounds. According to the results of the analysis of variance in Table 8, after the subjects viewed the three types of paintings, no significant difference was found on the technical level (i.e., "Color harmony", "Element accuracy", and "Layout coordination"), the semantic level (i.e., "Tone matching", "Content matching", and "Scene matching"), or in two attributes of the effectiveness level (i.e., "Creativity" and "Preference"), which demonstrated that the perceived painting technique, semantic matching, artistic creativity, and preference were similar among the three types of paintings. It is worth noting that there was a significant difference in the attribute "Sweetness" (p < 0.001). The scores of the AI generations by artists (3.43 points) and nonartists (3.45 points) were significantly higher than that of the painting by the artist Yong Wang (2.68 points), which related to how subjects communicate with paintings.

Table 7. The codes used to code the creators' reflections on co-creation with AI. Participant labels identify the source of each item of feedback (AP: artists; NH: nonartists), and • marks feedback from both groups.

Core Category | Selective Coding | Open Coding
Visual performance | Artistic style | Paintings generated by AI can be identified because of high standardization (AP05, AP09-10).
Semantic matching | Element accuracy | Did not generate elements accurately based on the prompt (NH04).
Semantic matching | Expression characters | In some results, the animal state was a little decadent, which did not meet the prompt (AP06).
Interaction mode | Subject control | Unlike traditional brushes and paints, they can help you realize that what you think is what you get, and it is completely under your control (AP01, AP05-07). More iterations can make the results closer to inner thoughts (AP03-05).
Interaction mode | Prompt grammar rules | • Prompting rules are related to the final generated effect to a great extent (AP02-04, AP06, NH02-04).
Creation experience | Creation assistance | There are still differences in using language to express emotions instead of painting, even though all the elements described are generated (AP07-09).
Creation experience | Creative generation | Compared with the result of matching with prompts, some unexpected surprise is preferred (AP02, AP04). It is like Pandora's Box. If it is not a surprise, it may be a shock (AP01, AP06).
Culture cognition | Cross cultural differences | The originally generated image is full of Indian style home decorations with cultural differences (NH03).
Technological ethics | Work displacement | AI cannot generate my unique styles and cannot replace senior painters (AP10). A little confused about own core competitiveness (AP06-07). • Maybe some work related to painting will be impacted by AI (AP06, NH05).
Technological ethics | Copyright issues | Due to the mixture and collage of painting styles, the ownership of copyright is a complex issue (AP01, AP03-07).

MDPREF Analysis of Rating Data in Attribute Vectors
The cognitive space was constructed through a multidimensional preference analysis (MDPREF), which expresses the relationship between the stimuli and their attributes. A matrix was created from the raw data containing the mean scores of the nine attributes for each of the thirteen paintings, as shown in Table 9. The matrix allowed SPSS statistics software to compute MDS and generate a two-dimensional (2D) spatial plot demonstrating the relationship between the paintings and the attributes. Kruskal's stress was 0.14589, which was less than 0.2, and the determination coefficient (RSQ) was 0.92544, which was close to 1.0, revealing that the spatial relationships between the thirteen paintings and nine attributes could be appropriately represented in 2D. Moreover, the stress index indicated that the 2D plot and the original data exhibited a satisfactory fit, while the RSQ denoted that the 2D plot could explain 90.92% of the variance [49]. The cognitive matrix is shown in Figure 3.

Table 9. Average rating scores on the nine perceptual attributes: the highest score of each attribute was marked in red, and the lowest score was marked in blue.
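To make the two fit indices concrete, the following sketch shows how a stress value and an RSQ value can be computed from a fitted configuration. It assumes Kruskal's Stress-1 normalization and takes RSQ as the squared Pearson correlation between fitted distances and disparities; the exact formulas used by the SPSS procedure may differ, and the function names and numbers are hypothetical.

```python
import math


def kruskal_stress1(distances, disparities):
    """Stress-1: sqrt(sum((d - d_hat)^2) / sum(d^2)) over all stimulus pairs."""
    numerator = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)


def rsq(distances, disparities):
    """Squared Pearson correlation between fitted distances and disparities."""
    n = len(distances)
    mean_d = sum(distances) / n
    mean_h = sum(disparities) / n
    sxy = sum((d - mean_d) * (h - mean_h) for d, h in zip(distances, disparities))
    sxx = sum((d - mean_d) ** 2 for d in distances)
    syy = sum((h - mean_h) ** 2 for h in disparities)
    return sxy ** 2 / (sxx * syy)


# Hypothetical pairwise values (NOT taken from the study).
fitted = [1.0, 2.0, 3.0]
target = [1.0, 2.0, 4.0]
print(round(kruskal_stress1(fitted, target), 4), round(rsq(fitted, target), 4))
```

Lower stress and an RSQ closer to 1.0 both indicate that the low-dimensional plot reproduces the original relationships well, which is how the two thresholds in the text are read.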
Appl. Sci. 2022, 12, 11312

Table 2. The information of the artists and nonartists: the age and painting years of the artists and the age of the nonartists were listed. The label "AP01" represents the artist that created the P01 painting, while "NH01" is the nonartist who created the H01 painting.

They were asked to write a prompt describing a visual form that could express their imagination of a sweet home. The basal commands in Midjourney were to use the V1, V2, V3, or V4 buttons to create variations of their chosen image and then to click the U1, U2, U3, or U4 buttons to add details to the chosen image. To avoid interference due to unfamiliarity with the tools, the researcher observed and supported the whole process but did not affect the participants' writing and selection. Individual differences were so great as to suggest that each person attained their final product in their own way. Finally, the nine experts mentioned above filtered through six samples from each group by excluding samples that were similar. Twelve paintings (P01-P06 by the artists, and H01-H06 by the nonartists) are listed in Table 3. In addition, a painting created in the 1980s by artist Yong Wang on the topic of a sweet home also was chosen as the thirteenth stimulus, functioning as the reference sample. This painting recorded his poor kitchen environment at a time when his wife was busy cooking for the whole family. The limited living environment and his wife's busyness form an artistic conflict, highlighting that the inner sweetness is the critical value of a home. Furthermore, the stimuli for this experiment were classified into three types according to the research purpose.

Table 3. The thirteen paintings and prompts: there are three groups including Midjourney + artist paintings (P01-P06), Midjourney + nonartist paintings (H01-H06), and artist painting.

Type | No. | Prompt
Midjourney + artist paintings | P01 | An oil painting of a room full of toys by the fireplace.
Midjourney + artist paintings | P02 | An oil painting of love harbor full of laughter and warmth.
Midjourney + artist paintings | P03 | An oil painting of parents happily walking in the park hand in hand, and an active dog is chasing me.
Midjourney + artist paintings | P04 | A warm tone oil painting of a little pink bear holding a honey jar to enjoy the cool under the shade of the big tree in front of the yellow wooden house, and beautiful flowers and grass, and gurgling streams beside the wooden house on a bright summer day.
Midjourney + artist paintings | P05 | A warm tone oil painting of mother toasting bread for her daughter in a Europe style room.
Midjourney + artist paintings | P06 | An oil painting of a Samoyed dog with a space helmet and a space suit floating in outer space.
Midjourney + nonartist paintings | H01 | An oil painting of one family, balloons, toys and food in amusement park.
Midjourney + nonartist paintings | H02 | An oil painting of a family playing in the yard of a house, also including trees, sun, birds.
Midjourney + nonartist paintings | H03 | An oil painting of kids playing, cat napping, and parents cooking while chatting.
Midjourney + nonartist paintings | H04 | An oil painting of a two-and-a-half floor house with red roofs and gray walls, surrounding with a beautiful garden full of plants and flowers, and a crystal-clear stream flowing through the garden.
Midjourney + nonartist paintings | H05 | An oil painting of a family having dinner and a fish in the center of the table.
Midjourney + nonartist paintings | H06 | An oil painting of a father reading a newspaper in front of the computer, a mother cooking in the kitchen, a little son sitting on the sofa watching the cartoon named tom and jerry, and a big daughter just bringing a golden retriever into the room.
Artist painting | Artist | Painting description: Smoke curls up from the kitchen, roosters look for food, and the simple open-air kitchen emits the smell of cooking.

Experiment Procedures
During the creative process, the observer recorded the time spent by each creator, the number of adjustments to the prompt, and the number of times the U button was clicked. After the creators submitted the paintings co-created with Midjourney, each took part in a one-on-one interview to self-report their experience and reflect on the interaction process and results. Then, the recordings were coded to discuss the differences in the process of human-AI interaction.
According to the distribution of visual vectors in Figure 3, the nine visual attributes can be grouped into four categories: category I included "Element accuracy (f2)", "Content matching (f5)", and "Scene matching (f6)"; category II included "Layout coordination (f3)" and "Tone matching (f4)"; category III included "Color harmony (f1)" and "Preference (f9)"; and in category IV, "Sweetness (f7)" and "Creativity (f8)" each stood apart. The vector of attribute f7 (Sweetness) intersected with category I at nearly 90°. Based on the MDPREF analysis, the attribute vectors of semantic matching were therefore unrelated to sweetness and creativity.
The thirteen paintings were presented in the cognitive space of preferences as point coordinates. Paintings located close together received similar ratings, while paintings located far apart differed in their attributes. Each painting could be projected onto every attribute vector. According to the distribution of paintings in Figure 3, most of the images co-created by AI and creators with artistic backgrounds (P01-P05) projected onto the positive pole of most attribute vectors, whereas P06 was far away from the others of the same type and was perceived more negatively. Furthermore, the paintings co-created by AI and nonartists were located in three clusters. H03 and H06 scored higher on high-level attributes, and H04 was better on low-level attributes. In contrast, H01, H02, and H05 clustered together and projected onto the negative pole of all the attribute vectors. As for the reference sample, the painting by the artist performed better on semantic matching.
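The geometric reading of the plot can be sketched as follows: the angle between two attribute vectors indicates how related the attributes are (near 90° means essentially unrelated), and the signed projection of a painting's point onto an attribute vector indicates whether it falls on the positive or negative pole. The coordinates and function names below are hypothetical and are not read from Figure 3.

```python
import math


def angle_degrees(u, v):
    """Angle between two attribute vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def projection_length(point, attribute_vector):
    """Signed length of a painting's projection onto an attribute vector."""
    dot = sum(p * a for p, a in zip(point, attribute_vector))
    norm = math.sqrt(sum(a * a for a in attribute_vector))
    return dot / norm


# Hypothetical 2D coordinates (NOT the study's values).
sweetness_vec = (0.1, 1.0)
scene_matching_vec = (1.0, -0.05)
painting = (0.6, 0.8)

print(angle_degrees(sweetness_vec, scene_matching_vec))
print(projection_length(painting, sweetness_vec))
```

A positive projection corresponds to a painting lying toward the positive pole of that attribute, and a negative one to the opposite pole.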


Analysis of Subjective Ranking
To further determine whether there were perceptual differences among the three types of paintings, subjects were invited to choose the paintings they considered to be the most professional, the sweetest, and the most creative. Finally, they picked the works that they thought were most like human paintings. Figure 4 shows the proportion of subjects selecting each painting as the most professional, sweetest, and most creative. For the professional aspect, the top three paintings were H06 (26%), H04 (24%), and H03 (17%); for the sweet aspect, the order was P03 (21%), H06 (19%), and P04 (17%); and for the creativity aspect, the top three were P04 (33%), H03 (29%), and P06 (26%).
A Chi-square test was conducted to analyze the differences in the subjective ranking of the professional, sweet, and creative aspects and in the selection of the human painting according to age, gender, and education. Only gender showed a significant difference, and only in the selection of which painting was made by a human. Since some cells contained fewer than five subjects, the exact probability method was adopted, giving χ² = 18.891, p < 0.05. Among the subjects choosing P03 and P04, the proportion of females was clearly higher than their 64.29% share of the sample, while males more often chose H04 and H06, at a rate higher than their 35.71% share of the sample. Table 10 shows the top three paintings that the subjects thought were most like those created by humans. The order was H04 (21%), P03 (13%), and Artist (13%). In follow-up interviews with the participants, the clues that affected their judgment included various details, such as the strokes and texture in H04 and P03, as well as a structure and tonal style in the artist's painting similar to those found in textbooks.
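The gender comparison can be illustrated with a short sketch of the Pearson Chi-square statistic on a gender-by-choice contingency table. It also reports the smallest expected cell count, since expected counts below five are the usual reason for switching to an exact method. The counts are hypothetical (only the row totals of 27 females and 15 males follow the sample), and the sketch does not compute the exact-test p-value itself.

```python
def chi_square(table):
    """Return (chi2, degrees_of_freedom, smallest_expected_count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of gender and choice.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof, min_expected


# Hypothetical counts: rows = female, male; columns = three candidate paintings.
counts = [
    [9, 8, 10],   # 27 female subjects
    [2, 3, 10],   # 15 male subjects
]
chi2, dof, min_expected = chi_square(counts)
needs_exact_method = min_expected < 5
print(round(chi2, 3), dof, needs_exact_method)
```

When `needs_exact_method` is true, the asymptotic Chi-square p-value is unreliable and an exact procedure, as used in the study, is the safer choice.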


Differences of Coding in Co-Creation with AI
According to the action observation data, the artists retained behavioral characteristics in the creative process that differed from those of nonartists [1,2], such as exerting more control over tools and repeating actions more often. These differences persisted even in the interaction with AI. However, the interview data showed that artists were not satisfied with how well they could control the AI, and they even felt a little out of control. The artists' attitude towards the technology was related to their experience. The artists with more painting experience (AP05, AP09-10) claimed that they could identify paintings generated by AI because of certain similarities and firmly believed that they would not be replaced. In contrast, the creators with relatively little painting experience had contradictory attitudes toward AI. On the one hand, they affirmed the professionalism of AI paintings in terms of color and brush strokes, and they felt that the system could generate some surprises even though it did not follow instructions closely. On the other hand, they considered the possibility of competition and felt some confusion about their core abilities. Additionally, some artists were pleasantly surprised by accidental results and felt in control of their creativity even though their paintings differed from the descriptive text, as with sample P04, while others (AP01, AP05-07) felt a loss of control with the AI compared with traditional tools. An analysis of the prompts showed that more artists used metaphors instead of direct descriptions of real-life scenes and iterated repeatedly to find the vision that they wanted. For example, AP03 imagined home as a harbor of love, and the creator of P06 compared herself to a Samoyed dog and stated that floating in endless space was the sweetest destination. Metaphor, as a basic mechanism of art, was thus still widely used in the coding process between artists and artificial intelligence.
Generally, in the process of interaction with AI, artists retained their original habits of creation. However, unlike with traditional tools, the loss of control may bring surprise or fright [23,36]. Moreover, owing to their different experiences and skills, they had different attitudes toward AI.
As for most nonartists, their creative process was simple and direct, and they were generally excited about the series of excellent results. They preferred to depict certain people in a scene based on their memories or hopes. H06, for instance, recreated the author's childhood memory of watching the cartoon Tom and Jerry, and H02 depicted the author's expectation of a grandson's arrival in the future. AI as an interface helped people without painting skills to visualize their imagination (NH01, NH04-08, NH09). Cultural consistency also mattered to them. For example, NH06 generated an Indian-style painting, but as a Chinese man, he found it difficult to resonate with it or feel any sweetness. Instead of focusing on artistic techniques and creativity, the nonartists were more focused on semantic matching and cultural consistency.
To sum up, artists and nonartists differed in their actions as well as in their attitudes and concerns, which were influenced by personal knowledge. Ultimately, the text-to-image system has introduced a new human-AI interaction mode as an interface that transforms internal imagination into visual form. Due to the randomness and variation of AI generation, artists gradually lose the confidence they previously had in their ability to control their tools.

Differences in Decoding in Communication with Creators
Except for the perception of sweetness, most attributes showed no significant difference in scores, which indicated that co-creation with the text-to-image system reduced the role of painting ability in artworks. The assistance of AI not only made the perception of human-AI co-creations by creators with and without artistic backgrounds converge, but also blurred the difference between AI generation and human painting. It is worth noting that there were significant differences in the perception of sweetness and that the score of the artist's painting was much lower than those of the AI generations. It seems that, because the audience lacked Yong Wang's life experience in the countryside in the 1980s, they could not decode the effectiveness level and thus could not feel the sweetness of the painting.
Combining the rating scores and the distribution of the thirteen paintings in the perceptual matrix, most samples created through the collaboration of Midjourney and the creators with artistic backgrounds (P01-P05) projected onto the positive direction of most attribute vectors, whereas the generations by creators without artistic backgrounds were divided into two extremes. Additionally, the result indicated that, owing to art expertise, the communication between the artist and the audience could be more stable, unless the coding relied on such strongly personal thinking or experience that it was too difficult to understand and could not resonate with audiences, such as the space dog in P06 and the outdoor cooking in the artist Yong Wang's painting. As for the nine attributes, the perceived accuracy of element shaping in a painting was closely correlated with content and scene matching with the prompts, which demonstrated the process from shape to meaning. In addition, the perception of color harmony, which grouped with sweetness and preference, did not relate to semantic matching. Color could express feeling on the effectiveness level [25] even when the paintings failed in structure and significance. This was also the reason for selecting Midjourney instead of other systems. Thus, color perception was an important channel for feeling the degree of sweetness and affecting preference. Apart from that, semantic matching did not seem to be closely related to high-level perception. As the prompt of the artist sample was derived from the painting description, its scores on the semantic-level attributes were naturally higher, but its perception at the high level was still lower. In contrast, although P04 failed in semantic matching, its special combination could still impress the audience with its sweetness and creativity. Furthermore, the audience's decoding was an active process influenced by several subjective features [29,35].
Subjects usually relied on their own cognitive systems to decode the meaning of a painting, so a high degree of semantic matching between the text and the generated result did not in itself raise their perception of high-level features. The degree to which the result fit the prompt did, however, affect the artists' perception of how controllable the AI was.
The ranking result demonstrated that more subjects considered the AI productions more professional than the painting by the artist Yong Wang, and the samples created by nonartists even obtained the most votes. AI technology was able to imitate artistic presentation techniques very well, although it relied only on feature statistics without knowing the image's intention [31]. P03, as the sweetest painting, showed the artist's skill in transmitting emotion through visual information. Additionally, creativity remained a strength of the group with artistic backgrounds. Although the rating scores of P04 and P06 on the nine attributes were not high, their unique representation, different from ordinary thinking, improved the perception of creativity. However, to enable the audience to decode and communicate with artists successfully, it is not enough to rely solely on creativity; links in culture, experience, and other aspects are also required [41]. As for the gender differences in the selection of artists' paintings, although there is evidence of gender differences in style perception [31], given the small sample size of this experiment, it is more appropriate to examine this in future research with a larger sample.
AI algorithms have simulated excellent visual patterns, similar to the traces of drawings by humans. Through interaction with technology such as text-to-image systems, nonartists can express their creativity beyond the limitations of their drawing skills. Artists must face the narrowing gap in technical skills between themselves and people without artistic backgrounds. Therefore, more attention should be paid to high-level communication with the audience.

Conclusions
Understanding how humans collaborate with AI and perceive the generated results is complex and necessary in the age of machine learning. From the perspective of art communication, this study explored the differences between artists and nonartists in coding during co-creation with a text-to-image system and in how their results were decoded in perception. The overall conclusion of the present research falls into two parts. Firstly, the actions and reflections of the creators supported the view that the action characteristics of artists still differed from those of nonartists and that their attitudes and concerns were related to their knowledge. Secondly, AI blurred the differences in painting techniques enhanced through professional training, whereas stable performance in art action was strictly tied to experience in creation. Additionally, the evidence on the perception of human-AI co-creation suggested that it is necessary to pay attention to emotional communication, beyond formal features and semantic matching, in the interaction with AI technology.
This study had several limitations. Firstly, the painting samples in this study were all displayed on a digital screen, which differs from the experience of viewing an offline exhibition. However, with the development of the metaverse concept and the significant impact of COVID-19, virtual reality space will be a new trend for showing paintings in the future. Secondly, since the age distribution of the subjects was not balanced, the results are more applicable to adults aged 20 to 30. In the future, the research team will balance the age distribution and cover various professional backgrounds to further understand the differences in the perception of AI art between different subjects. Thirdly, as only 42 subjects took part in the evaluation experiment, a more general conclusion could be obtained if the number of subjects were increased.