Sustainability
  • Article
  • Open Access

15 April 2023

Artificial Intelligence-Empowered Art Education: A Cycle-Consistency Network-Based Model for Creating the Fusion Works of Tibetan Painting Styles

1 School of Art, Southwest Minzu University, Chengdu 610041, China
2 School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Sustainable Education Technologies in Big Data and Artificial Intelligence Era

Abstract

The integration of Tibetan Thangka and other ethnic painting styles is an important topic in Chinese ethnic art, whose purpose is to explore, supplement, and continue Chinese traditional culture. Restricted by Buddhism and the economy, traditional Thangka suffers from a single style, and drawing a Thangka is time-consuming and labor-intensive. In response to these problems, we propose a Tibetan painting style fusion (TPSF) model based on neural networks that can automatically and quickly integrate the painting styles of the two ethnic groups. First, we set up Thangka and Chinese painting datasets as the experimental data. Second, we use the training data to train the generators and the discriminators. Then, the TPSF model maps the style of the input images to that of the target images, fusing the Tibetan and Chinese painting styles. Finally, to demonstrate the advancement of the proposed method, we add four comparison models to our experiments. At the same time, the Frechet Inception Distance (FID) metric and the questionnaire method were used to evaluate the quality and the visual appeal of the generated images, respectively. The experimental results show that the fused images have excellent quality and great visual appeal.

1. Introduction

Thangka is a popular research topic in Chinese ethnic painting, and it usually represents the typical painting style of Tibet and other Tibetan-related areas. In the complex Tibetan Buddhist culture, Thangka stands out in the field of Chinese painting with its long history of development [1,2]. For centuries, other ethnic groups have had close contacts with Tibetans, and Thangkas have constantly merged and drawn on the painting styles of other ethnic groups.
“Thangka” is also called Tangga, which means canvas, and mainly refers to religious scroll paintings mounted and hung in colorful satin. Thangka was introduced from India along with Buddhism in the seventh century. During the Han, Tang, Song and Yuan dynasties, the communication between the Tubo and Han people became closer, which also contributed to the fusion of earlier Thangka and Tubo flag painting [3]. Nourished by the Tibetan cultural background for thousands of years, Thangka presents a unique style that has been widely inherited and developed, which is why it is also known as the “Encyclopedia of Tibetan Culture” [4]. However, with the development of society and the economy, learners have increasingly diverse needs for Thangka styles, and the single style of traditional Thangka can no longer meet the public's aesthetic needs. Meanwhile, it usually takes learners a long time to master the Thangka style, and compared with the painting forms of other ethnic groups, Thangka is rarely selected as teaching content in the classroom. As a result, many learners find it difficult to access Thangka, which is extremely detrimental to its inheritance and development.
Although most Thangka creators still use traditional forms, more people have become interested in artificial intelligence painting over the past few decades [5]. The trend of using artificial intelligence technology to generate images, fuse image styles, and help people learn to paint is irreversible [6]. In recent years, deep neural networks have continuously entered the public’s field of vision and are widely used in image feature recognition [7], image style fusion [8], image generation [9], etc. Early non-photorealistic rendering [10] and texture migration [11] are the main traditional style transfer methods. Mainstream style transfer models include Generative Adversarial Networks (GAN) [12], Cycle-consistent Generative Adversarial Networks (CycleGAN) [13], Star Generative Adversarial Networks (StarGAN) [14], Star Generative Adversarial Networks v2 (StarGAN V2) [15], Style Generative Adversarial Networks (StyleGAN) [16], Anime Generative Adversarial Networks [17], Conditional Generative Adversarial Networks [18], Cari Generative Adversarial Networks [19], Adversarial consistency loss-Generative Adversarial Networks [20], and other style transfer models.
At present, a large number of scholars use artificial intelligence technology to assist learners in painting creation. For example, Deep Dream Generator assists learners with style transfer, DALL-E 2 helps learners generate images from text descriptions, Nvidia Canvas converts abstract strokes into images with realistic photographic effects, AI Gahaku converts real portraits into abstract painting styles, and DeOldify can colorize black-and-white videos and photos. However, few scholars have studied the fusion of Thangka styles. To fill this gap, we propose a Thangka Painting Style Fusion (TPSF) model based on CycleGAN. The TPSF model can automatically and efficiently integrate the lines, colors, and other forms and content of Tibetan and Chinese paintings, effectively addressing the problem of a single Thangka style. At the same time, the TPSF model is easy to operate, which helps learners understand the style of Thangka and eases the difficulty of teaching it.
How the model is constructed determines the characteristics of the model, and an analysis showed that the TPSF model has the following characteristics. First, the model is stable. As a highly robust network model, the quality of the paintings generated by the TPSF model is stable. Second, the training process of the model is unsupervised learning. The TPSF model does not require a labeling process for the samples, and it can directly be trained and modeled on the data. These characteristics of the TPSF model can facilitate the fusion of Tibetan and Chinese painting styles.
The contributions of this paper are as follows.
  • The TPSF model is proposed to solve problems regarding the limited content and the similar styles of Thangka. Additionally, a digital approach to the fusion of Tibetan-Chinese painting styles is provided.
  • We propose that the use of a TPSF model in art learning empowers art education and provides learners with a new model of interactive learning.
  • Comparison experiments were performed on real data sets. Firstly, the converged objective function proved the feasibility of the model. Secondly, the TPSF model outperformed the other four comparison models according to the Frechet Inception Distance (FID) metric, which proves how advanced it is. Finally, the questionnaire method was used to evaluate the visual appeal of the generated images.
Section 2 of this article reviews the related work. Section 3 describes the framework design, the process by which learners use the model, and the objective function of the TPSF model. Section 4 reports four groups of comparative experiments in which the FID metric and the questionnaire method confirm the advancement of the TPSF model and the attractive visual effects of the fusion works. Section 5 concludes the paper.

3. TPSF Model Design

The TPSF model consists of dual generators and dual discriminators. We use a cycle-consistency loss, similar to that used by Zhou et al. [55] and Godar et al. [54], to drive the dual generators G and F in the TPSF model. Each TPSF generator is composed of three parts: a style encoder, residual blocks, and a style decoder. The style encoder mainly uses convolution, instance normalization (IN), and leaky rectified linear units (Leaky ReLU), together with image augmentation methods, to extract features from the input image. Residual blocks between the encoder and decoder further enhance the feature representation. In the decoder part, transpose convolution, IN, and ReLU activation are used to restore the spatial resolution. The feature map is then padded with ReflectionPad2d and convolved again to return to its original size, which preserves the edge information of objects. In the objective-function part, the cycle-consistency loss constrains the images produced by the generators so that they retain the characteristics of the original image domain. The expected goal of our experiment is to realize a spatial mapping between the real Tibetan painting image domain X and the real Chinese painting image domain Y, and thereby fuse the styles of the two ethnic groups.
We assume that two image domains are provided in the model: the real Thangka style image domain X = {x_1, x_2, ..., x_N}, x_i ∈ R, and the real Chinese painting style image domain Y = {y_1, y_2, ..., y_N}, y_i ∈ R, where R represents the real dataset. We construct two ethnic-style-fusion generators, G for X and F for Y, together with two ethnic-style-fusion discriminators, D_X and D_Y.

3.1. TPSF Generators

The specific operation is shown in Figure 1, and specific information about the TPSF generators is shown in Table 1.
Figure 1. TPSF generators structure diagram: epoch_real_X denotes the real input Thangka image set, and epoch_real_Y denotes the real input Chinese painting image set.
Table 1. The information table of TPSF generators.
The following is the detailed design of the TPSF generators’ operation.
  • The input is two real three-channel image sets of 256 × 256 pixels, named epoch_real_X and epoch_real_Y.
  • These image sets enter the encoder for downsampling. The first layer uses 64 convolution kernels of size 7 × 7, with a sliding step of one and a fill size of three, followed by instance normalization and finally ReLU activation.
  • The second and third layers use 128 and 256 convolutional kernels of size 3 × 3, respectively, each with a sliding step of two and a padding size of one, followed by instance normalization and finally ReLU activation.
  • The last layers use nine residual blocks. Each residual block contains 256 convolutional kernels of size 3 × 3 with a sliding step of one, followed by instance normalization and finally ReLU activation.
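The generator layout described above can be sketched in PyTorch. This is a minimal, hypothetical reconstruction from the bullets (class names are ours, and the upsampling details and the final Tanh output are assumptions, not the authors' released code):

```python
import torch
from torch import nn

class ResidualBlock(nn.Module):
    """One of the nine residual blocks: 256 kernels of size 3x3, stride 1, IN + ReLU."""
    def __init__(self, ch=256):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(ch, ch, 3),
            nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1), nn.Conv2d(ch, ch, 3),
            nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.block(x)  # skip connection around the block

class TPSFGenerator(nn.Module):
    """Encoder (downsampling) -> 9 residual blocks -> decoder (upsampling)."""
    def __init__(self, n_blocks=9):
        super().__init__()
        layers = [
            # layer 1: 64 kernels of 7x7, stride 1, reflection pad 3
            nn.ReflectionPad2d(3), nn.Conv2d(3, 64, 7),
            nn.InstanceNorm2d(64), nn.ReLU(True),
            # layers 2-3: 128 and 256 kernels of 3x3, stride 2, pad 1
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1),
            nn.InstanceNorm2d(256), nn.ReLU(True),
        ]
        layers += [ResidualBlock(256) for _ in range(n_blocks)]
        layers += [
            # decoder: transpose convolutions restore the spatial resolution
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(64), nn.ReLU(True),
            # final convolution returns to a 3-channel image of the input size
            nn.ReflectionPad2d(3), nn.Conv2d(64, 3, 7), nn.Tanh(),
        ]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
```

Because the network is fully convolutional, it preserves the spatial size of any input whose sides are divisible by four, including the 256 × 256 images used in the paper.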

3.2. TPSF Discriminators

The specific operation of the TPSF discriminators is shown in Figure 2, and the specific information is shown in Table 2.
Figure 2. The structure diagram of the TPSF discriminator: epoch_real_X denotes the real input Thangka image set, and epoch_real_Y denotes the real input Chinese painting image set.
Table 2. The information table of the TPSF discriminators.
  • The input is two real three-channel image sets, each 256 × 256 pixels, named epoch_real_X and epoch_real_Y. The outputs are the fake target data X′ = F(y) and Y′ = G(x).
  • The first layer has 64 convolutional kernels of size 4 × 4, with a sliding step of two and a fill size of one, followed by instance normalization and, finally, ReLU activation.
  • The second to fifth layers all use 4 × 4 convolutional kernels, numbering 128, 256, and 512 in turn, with a sliding step size of two and padding of one, and the result is subjected to average pooling.
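The discriminator layout above can likewise be sketched in PyTorch. This is a hypothetical PatchGAN-style reconstruction: the LeakyReLU slope, the per-patch score map, and reading the garbled kernel count as 512 are our assumptions:

```python
import torch
from torch import nn

class TPSFDiscriminator(nn.Module):
    """4x4 convolutions with stride 2 (64 -> 128 -> 256 -> 512 kernels),
    ending in a per-patch score map that is average-pooled to one score."""
    def __init__(self):
        super().__init__()
        def block(cin, cout, norm=True):
            layers = [nn.Conv2d(cin, cout, 4, stride=2, padding=1)]
            if norm:
                layers.append(nn.InstanceNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers
        self.model = nn.Sequential(
            *block(3, 64, norm=False),        # layer 1: 64 kernels, no norm
            *block(64, 128),                  # layer 2
            *block(128, 256),                 # layer 3
            *block(256, 512),                 # layer 4
            nn.Conv2d(512, 1, 4, padding=1),  # per-patch real/fake score map
        )

    def forward(self, x):
        # average-pool the patch scores into a single score per image
        return self.model(x).mean(dim=(1, 2, 3))
```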

3.3. Interactive Learning Process of TPSF Model

The process of learners using the TPSF model to create fusion works is shown in Figure 3. Before the experiment, the learners chose an ethnic painting style that they liked, and we collected such ethnic paintings to create an experimental dataset. Then, we input the selected ethnic painting style dataset and Thangka style dataset into the TPSF model for training. In the experiment, the Thangka and target ethnic painting styles were fused by the dual generator and discriminator of the TPSF model. The fusion works generated automatically, quickly and efficiently through the TPSF model are rich in content and novel in style. After the experiment, the learners selected their favorite images from the fusion works produced by the TPSF model and created their own unique collection of Thangka style fusion works. The datasets collected from the learners can be shared with other users, which provides more possibilities for more scholars to create Thangka style fusions. Additionally, learners can efficiently generate fusion works with the TPSF model and thereby avoid the influence of low aesthetic experience and creative ability on their works. At the same time, the model can also enable learners to participate in the creation of Thangka painting styles and arouse learners’ interest in learning about art.
Figure 3. Flow chart of TPSF model interactive learning.

3.4. Objective Function of TPSF Model

The TPSF objective consists of three main parts: two adversarial losses, one cycle-consistency loss, and an identity loss. The adversarial loss couples a generator with a discriminator network; in it, the least-squares loss replaces the original negative log-likelihood loss to ensure the robustness of the objective function. Its role is to map the real images of one style domain onto the other so that the distribution of the generated images moves closer to that of the target images. The adversarial loss is expressed as
$$\mathcal{L}_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))], \quad (1)$$
where the style fusion generator G tries to generate an image set G(x) similar to the target painting style image set Y. By contrast, the style fusion discriminator D_Y aims to distinguish the fake images G(x) from the real images y. G aims to minimize this objective, whereas its opponent D_Y tries to maximize it; with the least-squares loss this is written as Equation (2). A similar adversarial loss is applied to the mapping function F: Y → X and its discriminator D_X, written as Equation (3).
$$\mathcal{L}_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[(D_Y(y) - 1)^2] + \mathbb{E}_{x \sim p_{data}(x)}[D_Y(G(x))^2], \quad (2)$$
$$\mathcal{L}_{GAN}(F, D_X, Y, X) = \mathbb{E}_{x \sim p_{data}(x)}[(D_X(x) - 1)^2] + \mathbb{E}_{y \sim p_{data}(y)}[D_X(F(y))^2]. \quad (3)$$
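The least-squares adversarial objectives reduce to a few lines of code. A minimal NumPy sketch (the function names are ours), with arrays of discriminator scores standing in for the outputs of D_X and D_Y:

```python
import numpy as np

def lsgan_loss_d(scores_real, scores_fake):
    # Discriminator side: push D(real) -> 1 and D(fake) -> 0,
    # i.e. E[(D(y) - 1)^2] + E[D(G(x))^2]
    return float(np.mean((scores_real - 1.0) ** 2) + np.mean(scores_fake ** 2))

def lsgan_loss_g(scores_fake):
    # Generator side: push D(fake) -> 1 so the fusion image passes as real
    return float(np.mean((scores_fake - 1.0) ** 2))
```

A perfectly fooled discriminator (real scored 1, fake scored 0) yields zero discriminator loss, while a generator whose fakes are all scored 1 yields zero generator loss.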
Identity loss is used to ensure the continuity of the style fusion image and is denoted as Equation (4). When an image that already belongs to a generator's target domain passes through that generator, the identity loss forces the output to stay as close as possible to the input, which prevents generators G and F from changing the hue of the input image.
$$\mathcal{L}_{identity}(G, F) = \mathbb{E}_{y \sim p_{data}(y)}[\|G(y) - y\|_1] + \mathbb{E}_{x \sim p_{data}(x)}[\|F(x) - x\|_1]. \quad (4)$$
In order to train the adversarial network on unpaired data, and thereby achieve style transfer between the Thangka and Chinese painting image sets, the cycle-consistency loss is necessary; it prevents the generators G and F from contradicting each other while increasing the realism of the mapped images. The adversarial loss alone only forces the generated image set G(x) to conform to the distribution of the target domain Y; it admits many possible mappings and does not force G(x) to retain the content of the input image x. Forward cycle consistency requires x → G(x) → F(G(x)) ≈ x; similarly, for the input image domain Y, G and F should satisfy backward cycle consistency, i.e., y → F(y) → G(F(y)) ≈ y. For each image element x_i of X, the cycle of image fusion should bring the stylistic features of x_i back to the input domain X. Thus, the cycle-consistency loss is expressed as
$$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}[\|F(G(x)) - x\|_1] + \mathbb{E}_{y \sim p_{data}(y)}[\|G(F(y)) - y\|_1]. \quad (5)$$
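The identity and cycle-consistency terms above are both L1 expectations and can be sketched directly. A minimal NumPy illustration (helper names are ours), with arrays standing in for images and for the generator outputs F(G(x)), G(F(y)), G(y), and F(x):

```python
import numpy as np

def l1(a, b):
    # mean absolute error: the ||.||_1 expectation in the losses
    return float(np.mean(np.abs(a - b)))

def cycle_consistency_loss(x, f_g_x, y, g_f_y):
    # ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1 : both cycles must return home
    return l1(f_g_x, x) + l1(g_f_y, y)

def identity_loss(y, g_y, x, f_x):
    # ||G(y) - y||_1 + ||F(x) - x||_1 : generators leave in-domain images alone
    return l1(g_y, y) + l1(f_x, x)
```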
The full objective is as follows, where λ controls the relative importance of the two objectives:
$$\mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{GAN}(G, D_Y, X, Y) + \mathcal{L}_{GAN}(F, D_X, Y, X) + \lambda \mathcal{L}_{cyc}(G, F), \quad (6)$$
and the TPSF model aims to solve
$$G^*, F^* = \arg\min_{G, F} \max_{D_X, D_Y} \mathcal{L}(G, F, D_X, D_Y). \quad (7)$$
In terms of training details, the least-squares loss [56] is used to make the model more robust. For example, in Equation (2), the discriminator D_Y minimizes E_{y∼p_data(y)}[(D_Y(y) − 1)²] + E_{x∼p_data(x)}[D_Y(G(x))²], which means that D_Y(y) should be as close as possible to 1 and D_Y(G(x)) as close as possible to 0. The generator G, by contrast, tries to generate an image G(x) that resembles the images of domain Y: G tries to push D_Y(G(x)) toward 1 so that the fake image is scored as real, while D_Y, whose goal is to tell the generated image G(x) apart from the real samples y, tries to keep D_Y(G(x)) close to 0 and D_Y(y) close to 1.

4. Experiment and Results

4.1. Setup of Experiment

The computer configuration used for the TPSF model includes an AMD Ryzen 7 5800X processor, a Windows 10 ×64 operating system, and two Titan Xp graphics cards. The TPSF model was written mainly in Python, and the framework was implemented with PyTorch. The least-squares loss was applied instead of the original maximum-likelihood loss, the weight λ was set to 10.0, the batch size was set to 1, and the learning rate of the Adam optimizer was set to 0.001 after repeated iterations.
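Putting the objective and these training details together, one alternating update of the minimax objective might look as follows in PyTorch. This is a hedged sketch: `tpsf_train_step` and its argument layout are our own names, not the paper's released code, while λ = 10.0 and the learning rate 0.001 follow the setup above (the identity term is omitted for brevity):

```python
import torch
from torch import nn

def tpsf_train_step(G, F, D_X, D_Y, real_x, real_y, opt_G, opt_D, lam=10.0):
    """One alternating generator/discriminator update with least-squares
    adversarial losses and the cycle-consistency term weighted by lam."""
    mse, l1 = nn.MSELoss(), nn.L1Loss()

    # --- generator step: fool both discriminators and close both cycles ---
    fake_y, fake_x = G(real_x), F(real_y)
    adv = mse(D_Y(fake_y), torch.ones_like(D_Y(fake_y))) + \
          mse(D_X(fake_x), torch.ones_like(D_X(fake_x)))
    cyc = l1(F(fake_y), real_x) + l1(G(fake_x), real_y)
    loss_G = adv + lam * cyc
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()

    # --- discriminator step: push real scores to 1 and fake scores to 0 ---
    fy, fx = fake_y.detach(), fake_x.detach()
    loss_D = mse(D_Y(real_y), torch.ones_like(D_Y(real_y))) + \
             mse(D_Y(fy), torch.zeros_like(D_Y(fy))) + \
             mse(D_X(real_x), torch.ones_like(D_X(real_x))) + \
             mse(D_X(fx), torch.zeros_like(D_X(fx)))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()
    return loss_G.item(), loss_D.item()
```

In use, `opt_G` would be an Adam optimizer (lr = 0.001) over the parameters of both G and F, and `opt_D` an Adam optimizer over both discriminators, called once per image pair since the batch size is 1.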

4.2. Results and Analysis

We collected a Tibetan and Chinese painting style dataset and established a Tibetan painting style fusion model. The entire dataset contained 400 works in total, which consisted of 200 Tibetan Thangkas and 200 Han Chinese paintings. Additionally, we applied 10-fold cross-validation. Thus, the dataset was divided into ten parts; nine of them were used as training data and one was used as test data.
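The 10-fold split described above can be sketched in plain Python; `kfold_splits` is a hypothetical helper, not the authors' code:

```python
import random

def kfold_splits(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation:
    each fold serves once as the test part, the other k-1 as training."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)        # deterministic shuffle
    folds = [idx[i::k] for i in range(k)]   # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

For the 400-image dataset, each of the ten rounds trains on 360 images and tests on the remaining 40.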
Figure 4 shows the training losses of the TPSF model, which are composed of the GAN loss, the cycle-consistency loss, and the identity loss. According to the loss function graph, the cycle-consistency and identity losses fluctuated considerably, but both showed an overall downward trend, indicating that the training of the Tibetan painting style fusion reached the expected goal. Although the generator's GAN loss rose during training, this reflects the reconstructed images growing closer to the originals and the generated images becoming more realistic, which makes the discriminators harder to satisfy.
Figure 4. Three kinds of losses of the TPSF model: two adversarial losses, a cycle-consistency loss, and an identity loss.
To objectively evaluate the experimental results, four sets of comparative models and the Frechet Inception Distance (FID) metric were used in the comparative experiments, and the experimental results were visualized.
FID is a reliable and thorough metric for evaluating the fidelity and variety of generated images, and it correlates well with human visual judgment. The smaller the FID score, the closer the distribution of the generated data is to that of the real data, and the better the image generation. Thus, calculating the FID score between the target images and the fused images allows one to assess the quality of the TPSF model.
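The FID computation that this evaluation relies on can be sketched with NumPy and SciPy; `fid_score` is a hypothetical helper that operates on pre-extracted Inception features rather than raw images:

```python
import numpy as np
from scipy import linalg

def fid_score(feats_gen, feats_real):
    """Frechet distance between Gaussians fitted to two feature sets
    (rows = Inception-style activations of generated / real images)."""
    mu_g, mu_r = feats_gen.mean(axis=0), feats_real.mean(axis=0)
    cov_g = np.cov(feats_gen, rowvar=False)
    cov_r = np.cov(feats_real, rowvar=False)
    # matrix square root of the covariance product
    covmean = linalg.sqrtm(cov_g @ cov_r)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts from rounding
        covmean = covmean.real
    diff = mu_g - mu_r
    return float(diff @ diff + np.trace(cov_g + cov_r - 2.0 * covmean))
```

Identical feature distributions score (numerically) zero, and the score grows as the generated distribution drifts from the real one, which matches the "smaller is better" reading used in the comparison.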
The average FID of the 10 training results of the TPSF model is shown in Figure 5. The model has a cyclic, two-way network structure: the G(X) score for the input Thangka image set was 236.82, and the G(Y) score for the input Chinese painting image set was 166.06. The four comparison models were all one-way networks; under the FID score, the Thangka image set G(X) score of StarGAN V2 was 337.56, that of StyleGAN was 321.89, that of GAN was 368.03, and that of StarGAN was 371.45. Arranging the average FID score of each model in descending order, we found that the FID score of the TPSF model was the smallest, which shows that the fusion works produced by this model were of higher quality.
Figure 5. Average value of FID: sort in descending order according to the FID score of the model.
To demonstrate the objectivity of the experiment, 50 professional reviewers (professors and students from relevant disciplines) and 100 general reviewers were invited to evaluate 10 randomly selected fused images in four aspects: color appeal, visual appeal, helpfulness for study, and ease of operation. Details are shown in Figure 6. The questionnaire showed that most of the reviewers found the images generated by the model visually appealing and felt that the model could help them learn the Thangka. Of the 50 professional reviewers, 81% found the randomly selected fusion works very attractive, and 74% found the TPSF model easy to operate and effective in helping them learn the Thangka. Of the 100 general reviewers, 84% found the TPSF model effective in helping them learn the Thangka, and 86% found it easy to operate.
Figure 6. The questionnaire for fusion images. The numbers in the figure are the proportion of votes that were considered the best in each evaluation of the fusion work.
As shown in Figure 7, by comparing the real, fake, and idt images, we found that the fused works produced by the TPSF model had the style characteristics of both Tibetan and Chinese paintings and a strong visual appeal. In addition, further experimental results are shown in Figure 8.
Figure 7. Some example results of the TPSF model for qualitative evaluation. Group one shows three sets of real Tibetan painting experimental data and three sets of real Han painting test data. Group two shows the corresponding ethnic painting style fusion data for group one. Group three shows the identity (idt) data generated from the corresponding images in group one. The style images of group one are in the public domain.
Figure 8. Some Tibetan-Han style fusion examples generated by the TPSF model.

5. Conclusions

We propose a Tibetan painting style fusion model based on CycleGAN. First, we collected a Tibetan Thangka and Chinese painting dataset to use as the experimental training and test data. We used the training dataset to train the two generators and two discriminators. Meanwhile, we added a cycle-consistency loss to the model so that the output works had Tibetan and Chinese painting style characteristics, rich picture content, and realistic picture effects. To more accurately verify the effectiveness of the fusion of the Tibetan painting styles, four comparison models were added to the experiment and used for experimental evaluation. The TPSF model outperformed the other models and could quickly generate visually appealing fused compositions. The ability of the TPSF model to automatically and quickly integrate the painting styles of the two ethnic groups can help more learners participate in the creation of ethnic painting and stimulate a strong interest in its study.

Author Contributions

Methodology, H.W.; Software, L.W.; Formal analysis, H.W.; Investigation, Y.C. and L.W.; Data curation, Y.C.; Writing—original draft, Y.C.; Writing—review & editing, Y.C., L.W., X.L. and H.W.; Visualization, Y.C.; Supervision, X.L.; Funding acquisition, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under grant number 62276216.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

https://github.com/90ii/TPSF-data.git (accessed on 28 January 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Béguin, G.; Colinart, S. Les Peintures du Bouddhisme Tibétain; Réunion des Musées Nationaux: Paris, France, 1995; p. 258. [Google Scholar]
  2. Jackson, D.; Jackson, J. Tibetan Thangka Painting: Methods and Materials; Serindia Publications: London, UK, 1984; p. 10. [Google Scholar]
  3. Elgar, J. Tibetan thang kas: An overview. Pap. Conserv. 2006, 30, 99–114. [Google Scholar] [CrossRef]
  4. Beer, R. The Encyclopedia of Tibetan Symbols and Motifs; Serindia Publications: London, UK, 2004; p. 373. [Google Scholar]
  5. Cetinic, E.; She, J. Understanding and creating art with AI: Review and outlook. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2022, 18, 1–22. [Google Scholar] [CrossRef]
  6. Hao, K. China has started a grand experiment in AI education. It could reshape how the world learns. MIT Technol. Rev. 2019, 123, 1. [Google Scholar]
  7. Song, J.; Li, P.; Fang, Q.; Xia, H.; Guo, R. Data Augmentation by an Additional Self-Supervised CycleGAN-Based for Shadowed Pavement Detection. Sustainability 2022, 14, 14304. [Google Scholar] [CrossRef]
  8. Ramesh, A.; Dhariwal, P.; Nichol, A.; Chu, C.; Chen, M. Hierarchical text-conditional image generation with clip latents. arXiv 2022, arXiv:2204.06125. [Google Scholar]
  9. Gregor, K.; Danihelka, I.; Graves, A.; Rezende, D.; Wierstra, D. Draw: A recurrent neural network for image generation. In Proceedings of the International Conference on Machine Learning, PMLR, Lille, France, 6–11 July 2015; pp. 1462–1471. [Google Scholar]
  10. Hertzmann, A. Non-photorealistic rendering and the science of art. In Proceedings of the 8th International Symposium on Non-Photorealistic Animation and Rendering, Annecy, France, 7–10 June 2010; pp. 147–157. [Google Scholar]
  11. Park, J.; Kim, D.H.; Kim, H.N.; Wang, C.J.; Kwak, M.K.; Hur, E.; Suh, K.Y.; An, S.S.; Levchenko, A. Directed migration of cancer cells guided by the graded texture of the underlying matrix. Nat. Mater. 2016, 15, 792–801. [Google Scholar] [CrossRef]
  12. AlAmir, M.; AlGhamdi, M. The Role of generative adversarial network in medical image analysis: An in-depth survey. ACM Comput. Surv. 2022, 55, 1–36. [Google Scholar] [CrossRef]
  13. Mo, Y.; Li, C.; Zheng, Y.; Wu, X. DCA-CycleGAN: Unsupervised single image dehazing using Dark Channel Attention optimized CycleGAN. J. Vis. Commun. Image Represent. 2022, 82, 103431. [Google Scholar] [CrossRef]
  14. Choi, Y.; Choi, M.; Kim, M.; Ha, J.W.; Kim, S.; Choo, J. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8789–8797. [Google Scholar]
  15. Liu, Y.; Sangineto, E.; Chen, Y.; Bao, L.; Zhang, H.; Sebe, N.; Lepri, B.; Wang, W.; De Nadai, M. Smoothing the disentangled latent style space for unsupervised image-to-image translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 10785–10794. [Google Scholar]
  16. Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of stylegan. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
  17. Chen, J.; Liu, G.; Chen, X. AnimeGAN: A novel lightweight GAN for photo animation. In Proceedings of the International Symposium on Intelligence Computation and Applications, Guangzhou, China, 16–17 November 2019; Springer: New York, NY, USA, 2019; pp. 242–256. [Google Scholar]
  18. Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784. [Google Scholar]
  19. Cao, K.; Liao, J.; Yuan, L. Carigans: Unpaired photo-to-caricature translation. arXiv 2018, arXiv:1811.00222. [Google Scholar] [CrossRef]
  20. Zhao, Y.; Wu, R.; Dong, H. Unpaired image-to-image translation using adversarial consistency loss. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: New York, NY, USA, 2020; pp. 800–815. [Google Scholar]
  21. Timms, M.J. Letting artificial intelligence in education out of the box: Educational cobots and smart classrooms. Int. J. Artif. Intell. Educ. 2016, 26, 701–712. [Google Scholar] [CrossRef]
  22. Cairns, L.; Malloch, M. Computers in education: The impact on schools and classrooms. Life in Schools and Classrooms: Past, Present and Future; Springer: Berlin, Germany, 2017; pp. 603–617. [Google Scholar]
  23. Hwang, G.J.; Xie, H.; Wah, B.W.; Gašević, D. Vision, challenges, roles and research issues of Artificial Intelligence in Education. In Computers and Education: Artificial Intelligence; Elsevier: Amsterdam, The Netherlands, 2020; Volume 1, p. 100001. [Google Scholar]
  24. Al Darayseh, A. Acceptance of artificial intelligence in teaching science: Science teachers’ perspective. Comput. Educ. Artif. Intell. 2023, 4, 100132. [Google Scholar] [CrossRef]
  25. Chen, X.; Xie, H.; Li, Z.; Zhang, D.; Cheng, G.; Wang, F.L.; Dai, H.N.; Li, Q. Leveraging deep learning for automatic literature screening in intelligent bibliometrics. Int. J. Mach. Learn. Cybern. 2022, 14, 1483–1525. [Google Scholar] [CrossRef]
  26. Chiu, M.C.; Hwang, G.J.; Hsia, L.H.; Shyu, F.M. Artificial intelligence-supported art education: A deep learning-based system for promoting university students’ artwork appreciation and painting outcomes. Interact. Learn. Environ. 2022, 1–19. [Google Scholar] [CrossRef]
  27. Lin, H.C.; Hwang, G.J.; Chou, K.R.; Tsai, C.K. Fostering complex professional skills with interactive simulation technology: A virtual reality-based flipped learning approach. Br. J. Educ. Technol. 2023, 54, 622–641. [Google Scholar] [CrossRef]
  28. Zhu, D.; Deng, S.; Wang, W.; Cheng, G.; Wei, M.; Wang, F.L.; Xie, H. HDRD-Net: High-resolution detail-recovering image deraining network. Multimed. Tools Appl. 2022, 81, 42889–42906. [Google Scholar] [CrossRef]
  29. Ma, Y.; Liu, Y.; Xie, Q.; Xiong, S.; Bai, L.; Hu, A. A Tibetan Thangka data set and relative tasks. Image Vis. Comput. 2021, 108, 104125. [Google Scholar] [CrossRef]
  30. Zhang, J.; Zhang, K.; Peng, R.; Yu, J. Parametric modeling and generation of mandala thangka patterns. J. Comput. Lang. 2020, 58, 100968. [Google Scholar] [CrossRef]
  31. Qian, J.; Wang, W. Main feature extraction and expression for religious portrait Thangka image. In Proceedings of the 2008 the 9th International Conference for Young Computer Scientists, Hunan, China, 18–21 November 2008; pp. 803–807. [Google Scholar]
  32. Liu, H.; Wang, W.; Xie, H. Thangka image inpainting using adjacent information of broken area. In Proceedings of the International MultiConference of Engineers and Computer Scientists, Hong Kong, China, 19–21 March 2008; Volume 1. [Google Scholar]
  33. Hu, W.; Ye, Y.; Zeng, F.; Meng, J. A new method of Thangka image inpainting quality assessment. J. Vis. Commun. Image Represent. 2019, 59, 292–299. [Google Scholar] [CrossRef]
  34. Gatys, L.A.; Ecker, A.S.; Bethge, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2414–2423. [Google Scholar]
  35. Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: New York, NY, USA, 2016; pp. 694–711. [Google Scholar]
  36. Risser, E.; Wilmot, P.; Barnes, C. Stable and controllable neural texture synthesis and style transfer using histogram losses. arXiv 2017, arXiv:1701.08893. [Google Scholar]
  37. Li, S.; Xu, X.; Nie, L.; Chua, T.S. Laplacian-steered neural style transfer. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1716–1724. [Google Scholar]
  38. Li, Y.; Wang, N.; Liu, J.; Hou, X. Demystifying neural style transfer. arXiv 2017, arXiv:1701.01036. [Google Scholar]
  39. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. arXiv 2014, arXiv:1406.2661. [Google Scholar]
  40. Ratliff, L.J.; Burden, S.A.; Sastry, S.S. Characterization and computation of local Nash equilibria in continuous games. In Proceedings of the 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2–4 October 2013; pp. 917–924. [Google Scholar]
  41. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
  42. Li, S.Z. Markov random field models in computer vision. In Proceedings of the European Conference on Computer Vision, Stockholm, Sweden, 2–6 May 1994; Springer: New York, NY, USA, 1994; pp. 361–370. [Google Scholar]
  43. Castillo, L.; Seo, J.; Hangan, H.; Gunnar Johansson, T. Smooth and rough turbulent boundary layers at high Reynolds number. Exp. Fluids 2004, 36, 759–774. [Google Scholar] [CrossRef]
  44. Champandard, A.J. Semantic style transfer and turning two-bit doodles into fine artworks. arXiv 2016, arXiv:1603.01768. [Google Scholar]
  45. Chen, Y.L.; Hsu, C.T. Towards Deep Style Transfer: A Content-Aware Perspective. In Proceedings of the BMVC, York, UK, 19–22 September 2016; pp. 8.1–8.11. [Google Scholar]
  46. Lu, X.; Zheng, X.; Yuan, Y. Remote sensing scene classification by unsupervised representation learning. IEEE Trans. Geosci. Remote Sens. 2017, 55, 5148–5157. [Google Scholar] [CrossRef]
  47. Mechrez, R.; Talmi, I.; Zelnik-Manor, L. The contextual loss for image transformation with non-aligned data. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 768–783. [Google Scholar]
  48. Liu, J.; Zha, Z.J.; Chen, D.; Hong, R.; Wang, M. Adaptive transfer network for cross-domain person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7202–7211. [Google Scholar]
  49. Chen, J.; Li, S.; Liu, D.; Lu, W. Indoor camera pose estimation via style-transfer 3D models. Comput.-Aided Civ. Infrastruct. Eng. 2022, 37, 335–353. [Google Scholar] [CrossRef]
  50. Zach, C.; Klopschitz, M.; Pollefeys, M. Disambiguating visual relations using loop constraints. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 1426–1433. [Google Scholar]
  51. Huang, Q.X.; Guibas, L. Consistent shape maps via semidefinite programming. In Proceedings of the Computer Graphics Forum, Guangzhou, China, 16–18 November 2013; Wiley Online Library: Hoboken, NJ, USA, 2013; Volume 32, pp. 177–186. [Google Scholar]
  52. Wang, F.; Huang, Q.; Guibas, L.J. Image co-segmentation via consistent functional maps. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 849–856. [Google Scholar]
  53. Zhou, T.; Jae Lee, Y.; Yu, S.X.; Efros, A.A. Flowweb: Joint image set alignment by weaving consistent, pixel-wise correspondences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1191–1200. [Google Scholar]
  54. Godard, C.; Mac Aodha, O.; Brostow, G.J. Unsupervised monocular depth estimation with left-right consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 270–279. [Google Scholar]
  55. Zhou, T.; Krahenbuhl, P.; Aubry, M.; Huang, Q.; Efros, A.A. Learning dense correspondence via 3d-guided cycle consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 117–126. [Google Scholar]
  56. Mao, X.; Li, Q.; Xie, H.; Lau, R.Y.; Wang, Z.; Paul Smolley, S. Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2794–2802. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
