Frequency-Auxiliary One-Shot Domain Adaptation of Generative Adversarial Networks
Abstract
1. Introduction
- We analyze the limitations of previous methods, which lack explicit modeling for the domain adaptation task, and introduce a frequency-domain perspective into generative domain adaptation.
- We propose a novel method called Frequency-Auxiliary GAN (FAGAN), which mainly contains two submodules: the low-frequency fusion module (LFF) and the high-frequency guide module (HFG), designed to improve the diversity and the fidelity of the generated images, respectively (a wavelet-decomposition sketch follows this list).
- Extensive experiments demonstrate that our method outperforms state-of-the-art methods on several benchmark datasets in both qualitative and quantitative comparisons.
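Both modules operate on wavelet sub-bands (DWT/IDWT; see the Abbreviations). The following minimal NumPy sketch shows the single-level 2D Haar decomposition and its inverse that such frequency-auxiliary methods build on; the function names and band labels are illustrative assumptions, not the paper's code.

```python
# Minimal sketch: single-level 2D Haar DWT/IDWT on one image channel.
# Band naming (LL = low-frequency, LH/HL/HH = high-frequency details)
# follows common wavelet convention; not the authors' implementation.
import numpy as np

def haar_dwt2(x):
    """Split an H x W array (H, W even) into one low-frequency band (LL)
    and three high-frequency detail bands (LH, HL, HH)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a + b - c - d) / 2.0  # detail across rows
    hl = (a - b + c - d) / 2.0  # detail across columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse transform; exactly reconstructs the input of haar_dwt2."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

img = np.random.rand(256, 256).astype(np.float32)
assert np.allclose(haar_idwt2(*haar_dwt2(img)), img, atol=1e-5)
```

Intuitively, an LFF-style module would fuse the LL band (overall layout and color, which drives diversity), while an HFG-style module would constrain the LH/HL/HH bands (edges and texture, which drive fidelity).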
2. Related Work
2.1. Domain Adaptation of GANs
2.2. Frequency Information Used in GANs
3. Our Method
3.1. Low-Frequency Fusion Module
3.2. High-Frequency Guide Module
Algorithm 1 Construction of HFG
Input: the sampled generated images of the target model; the reference image.
Output: the proposed loss, which is used to update the parameters of the target model.
Stage 1: Local Selection
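Read together with the ablation on the number of tokens selected by the HFG module (Section 4.4.2), one plausible construction of this loss is sketched below in PyTorch: take the high-frequency residual of the generated and reference images, split it into patch tokens, locally select the k most energetic reference tokens, and penalize the mismatch of the generated tokens at those positions. The patch size, the energy-based top-k selection, and the L1 penalty are all assumptions for illustration; the actual module may operate on ViT-style feature tokens rather than raw pixel patches.

```python
# Hedged sketch of an HFG-style loss (not the authors' code): local selection
# of high-frequency tokens, then an L1 match at the selected positions.
import torch
import torch.nn.functional as F

def high_freq(x):
    """High-frequency residual of a (B, C, H, W) batch: the image minus its
    2x2 average-pooled (low-pass) approximation, upsampled back."""
    low = F.avg_pool2d(x, 2)
    return x - F.interpolate(low, scale_factor=2, mode="nearest")

def hfg_loss(generated, reference, patch=16, k=32):
    """Stage 1 (local selection): pick the k reference tokens with the most
    high-frequency energy. Then L1-match the generated tokens at those indices."""
    reference = reference.expand_as(generated)  # allow a single reference image
    hf_gen, hf_ref = high_freq(generated), high_freq(reference)
    # Non-overlapping patch tokens: (B, num_tokens, C * patch * patch).
    tok_gen = F.unfold(hf_gen, kernel_size=patch, stride=patch).transpose(1, 2)
    tok_ref = F.unfold(hf_ref, kernel_size=patch, stride=patch).transpose(1, 2)
    energy = tok_ref.pow(2).mean(dim=-1)              # (B, num_tokens)
    idx = energy.topk(k, dim=1).indices               # (B, k)
    idx = idx.unsqueeze(-1).expand(-1, -1, tok_ref.size(-1))
    return F.l1_loss(tok_gen.gather(1, idx), tok_ref.gather(1, idx))
```

In practice, such a term would be weighted and combined with the other objectives in the overall training loss (Section 3.3).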
3.3. Overall Training Loss
4. Experiments
4.1. Experimental Settings
4.1.1. Implementation Details
4.1.2. Datasets
4.1.3. Evaluation Metrics
4.2. Qualitative Results
4.3. Quantitative Results
4.4. Ablation Study
4.4.1. The Number of Layers Employing LFF-Module
4.4.2. The Number of Tokens Selected by HFG-Module
4.4.3. Effect of Two Frequency Modules
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| LFF | low-frequency fusion module |
| HFG | high-frequency guide module |
| FAGAN | frequency-auxiliary GAN |
| DWT | discrete wavelet transformation |
| IDWT | inverse discrete wavelet transformation |
References
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27. [Google Scholar] [CrossRef]
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. arXiv 2017, arXiv:1710.10196. [Google Scholar]
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410. [Google Scholar]
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
- Luo, W.; Yang, S.; Wang, H.; Long, B.; Zhang, W. Context-consistent semantic image editing with style-preserved modulation. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2022; pp. 561–578. [Google Scholar]
- Li, N.; Plummer, B.A. Supervised attribute information removal and reconstruction for image manipulation. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2022; pp. 457–473. [Google Scholar]
- Wang, T.; Zhang, Y.; Fan, Y.; Wang, J.; Chen, Q. High-fidelity GAN inversion for image attribute editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11379–11388. [Google Scholar]
- Tian, C.; Zhang, X.; Lin, J.C.W.; Zuo, W.; Zhang, Y.; Lin, C.W. Generative adversarial networks for image super-resolution: A survey. arXiv 2022, arXiv:2204.13620. [Google Scholar]
- Li, B.; Li, X.; Zhu, H.; Jin, Y.; Feng, R.; Zhang, Z.; Chen, Z. SeD: Semantic-Aware Discriminator for Image Super-Resolution. arXiv 2024, arXiv:2402.19387. [Google Scholar]
- Yang, T.; Ren, P.; Xie, X.; Zhang, L. GAN prior embedded network for blind face restoration in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 672–681. [Google Scholar]
- Wang, Y.; Holynski, A.; Zhang, X.; Zhang, X. Sunstage: Portrait reconstruction and relighting using the sun as a light stage. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 20792–20802. [Google Scholar]
- Koley, S.; Bhunia, A.K.; Sain, A.; Chowdhury, P.N.; Xiang, T.; Song, Y.Z. Picture that sketch: Photorealistic image generation from abstract sketches. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 6850–6861. [Google Scholar]
- Careil, M.; Verbeek, J.; Lathuilière, S. Few-shot semantic image synthesis with class affinity transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 23611–23620. [Google Scholar]
- Karras, T.; Aittala, M.; Hellsten, J.; Laine, S.; Lehtinen, J.; Aila, T. Training generative adversarial networks with limited data. Adv. Neural Inf. Process. Syst. 2020, 33, 12104–12114. [Google Scholar]
- Yang, C.; Shen, Y.; Xu, Y.; Zhou, B. Data-efficient instance generation from instance discrimination. Adv. Neural Inf. Process. Syst. 2021, 34, 9378–9390. [Google Scholar]
- Tseng, H.Y.; Jiang, L.; Liu, C.; Yang, M.H.; Yang, W. Regularizing generative adversarial networks under limited data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 7921–7931. [Google Scholar]
- Li, T.; Li, Z.; Rockwell, H.; Farimani, A.; Lee, T.S. Prototype memory and attention mechanisms for few shot image generation. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Ojha, U.; Li, Y.; Lu, J.; Efros, A.A.; Jae Lee, Y.; Shechtman, E.; Zhang, R. Few-shot Image Generation via Cross-domain Correspondence. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar] [CrossRef]
- Robb, E.; Chu, W.S.; Kumar, A.; Huang, J.B. Few-shot Adaptation of Generative Adversarial Networks. arXiv 2021, arXiv:2010.11943. [Google Scholar]
- Wang, Y.; Wu, C.; Herranz, L.; van de Weijer, J.; Gonzalez-Garcia, A.; Raducanu, B. Transferring GANs: Generating images from limited data. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018. [Google Scholar]
- Xiao, J.; Li, L.; Wang, C.; Zha, Z.J.; Huang, Q. Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Zhao, Y.; Ding, H.; Huang, H.; Cheung, N.M. A Closer Look at Few-shot Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Zhu, P.; Abdal, R.; Femiani, J.; Wonka, P. Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks. arXiv 2021, arXiv:2110.08398. [Google Scholar]
- Gal, R.; Patashnik, O.; Maron, H.; Chechik, G.; Cohen-Or, D. StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators. ACM Trans. Graph. 2022, 41, 1–13. [Google Scholar] [CrossRef]
- Kwon, G.; Ye, J. One-Shot Adaptation of GAN in Just One CLIP. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 12179–12191. [Google Scholar] [CrossRef]
- Kim, S.; Kang, K.; Kim, G.; Baek, S.H.; Cho, S. DynaGAN: Dynamic Few-Shot Adaptation of GANs to Multiple Domains. In SIGGRAPH Asia 2022 Conference Papers; ACM: New York, NY, USA, 2022. [Google Scholar]
- Zhang, Y.; Yao, M.; Wei, Y.; Ji, Z.; Bai, J.; Zuo, W. Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks. Adv. Neural Inf. Process. Syst. 2022, 35, 37297–37308. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021. [Google Scholar]
- Zhu, P.; Abdal, R.; Qin, Y.; Wonka, P. Improved StyleGAN Embedding: Where are the Good Latents? arXiv 2020, arXiv:2012.09036. [Google Scholar]
- Mo, S.; Cho, M.; Shin, J. Freeze the Discriminator: A Simple Baseline for Fine-Tuning GANs. arXiv 2020, arXiv:2002.10964. [Google Scholar]
- Zhao, M.; Yang, C.; Carin, L. On Leveraging Pretrained GANs for Generation with Limited Data. In Proceedings of the 37th International Conference on Machine Learning, Online, 13–18 July 2020. [Google Scholar]
- Noguchi, A.; Harada, T. Image Generation From Small Datasets via Batch Statistics Adaptation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar] [CrossRef]
- Hou, X.; Liu, B.; Zhang, S.; Shi, L.; Jiang, Z.; You, H. Dynamic Weighted Semantic Correspondence for Few-Shot Image Generative Adaptation. In Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal, 10–14 October 2022. [Google Scholar] [CrossRef]
- Tov, O.; Alaluf, Y.; Nitzan, Y.; Patashnik, O.; Cohen-Or, D. Designing an encoder for StyleGAN image manipulation. ACM Trans. Graph. 2021, 40, 1–14. [Google Scholar] [CrossRef]
- Daubechies, I. The wavelet transform, time-frequency localization and signal analysis. IEEE Trans. Inf. Theory 1990, 36, 961–1005. [Google Scholar] [CrossRef]
- Gao, Y.; Wei, F.; Bao, J.; Gu, S.; Chen, D.; Wen, F.; Lian, Z. High-Fidelity and Arbitrary Face Editing. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar] [CrossRef]
- Jiang, L.; Dai, B.; Wu, W.; Loy, C.C. Focal Frequency Loss for Image Reconstruction and Synthesis. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar] [CrossRef]
- Yu, Y.; Zhan, F.; Lu, S.; Pan, J.; Ma, F.; Xie, X.; Miao, C. WaveFill: A Wavelet-based Generation Network for Image Inpainting. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar] [CrossRef]
- Yoo, J.; Uh, Y.; Chun, S.; Kang, B.; Ha, J.W. Photorealistic Style Transfer via Wavelet Transforms. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar] [CrossRef]
- Yang, M.; Wang, Z.; Chi, Z.; Zhang, Y. FreGAN: Exploiting Frequency Components for Training GANs under Limited Data. Adv. Neural Inf. Process. Syst. 2022, 35, 33387–33399. [Google Scholar]
- Yang, M.; Wang, Z.; Chi, Z.; Feng, W. WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2022. [Google Scholar]
- Bhardwaj, J.; Nayak, A. Haar wavelet transform—Based optimal Bayesian method for medical image fusion. Med. Biol. Eng. Comput. 2020, 58, 2397–2411. [Google Scholar] [CrossRef]
- Gu, Z.; Li, W.; Huo, J.; Wang, L.; Gao, Y. LoFGAN: Fusing Local Representations for Few-Shot Image Generation. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar] [CrossRef]
- Choi, Y.; Uh, Y.; Yoo, J.; Ha, J.W. StarGAN v2: Diverse Image Synthesis for Multiple Domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Liu, M.; Li, Q.; Qin, Z.; Zhang, G.; Wan, P.; Zheng, W. BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation. Adv. Neural Inf. Process. Syst. 2021, 34, 29710–29722. [Google Scholar]
- Yaniv, J.; Newman, Y.; Shamir, A. The face of art. ACM Trans. Graph. 2019, 38, 1–15. [Google Scholar] [CrossRef]
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
- Bińkowski, M.; Sutherland, D.; Arbel, M.; Gretton, A. Demystifying MMD GANs. arXiv 2018, arXiv:1801.01401. [Google Scholar]
| Methods | FFHQ: Amedeo | FFHQ: Fernand | FFHQ: Raphael | AFHQ-Cat: Fox | AFHQ-Cat: Lion | AFHQ-Cat: Tiger |
|---|---|---|---|---|---|---|
| FSA [18] | 173.66 ± 19.43 | 227.19 ± 20.33 | 177.43 ± 29.29 | - | - | - |
| NADA [24] | 187.42 ± 21.33 | 258.35 ± 19.17 | 185.37 ± 30.14 | 84.28 ± 14.33 | 60.55 ± 17.44 | 16.74 ± 2.57 |
| MTG [23] | 207.17 ± 33.18 | 306.03 ± 51.29 | 184.35 ± 27.41 | 67.64 ± 17.77 | 63.28 ± 18.05 | 21.25 ± 8.32 |
| DynaGAN [26] | 229.19 ± 27.32 | 317.97 ± 47.98 | 186.03 ± 14.20 | 93.04 ± 20.01 | 86.35 ± 17.33 | 25.14 ± 9.79 |
| DiFa [27] | 178.72 ± 18.95 | 255.17 ± 16.61 | 170.76 ± 9.15 | 69.92 ± 15.55 | 42.01 ± 11.28 | 16.26 ± 2.88 |
| FAGAN (Ours) | 170.68 ± 17.99 | 281.74 ± 30.43 | 166.36 ± 14.70 | 65.39 ± 10.66 | 35.20 ± 10.98 | 21.41 ± 4.33 |
| Methods | FFHQ: Amedeo | FFHQ: Fernand | FFHQ: Raphael | AFHQ-Cat: Fox | AFHQ-Cat: Lion | AFHQ-Cat: Tiger |
|---|---|---|---|---|---|---|
| FSA [18] | 178.04 ± 4.97 | 190.26 ± 13.88 | 164.64 ± 60.19 | - | - | - |
| NADA [24] | 131.77 ± 23.57 | 177.47 ± 30.48 | 148.32 ± 45.89 | 77.31 ± 35.24 | 54.37 ± 18.02 | 14.22 ± 2.44 |
| MTG [23] | 138.55 ± 30.12 | 205.01 ± 28.11 | 128.51 ± 37.26 | 69.97 ± 25.99 | 69.55 ± 21.74 | 21.50 ± 7.78 |
| DynaGAN [26] | 159.31 ± 31.77 | 219.81 ± 23.89 | 112.08 ± 27.39 | 82.73 ± 37.88 | 77.14 ± 22.19 | 22.81 ± 8.10 |
| DiFa [27] | 120.78 ± 23.91 | 167.22 ± 33.15 | 110.13 ± 17.77 | 51.01 ± 31.24 | 36.73 ± 20.45 | 13.29 ± 1.97 |
| FAGAN (Ours) | 116.79 ± 21.58 | 189.67 ± 25.82 | 103.33 ± 23.15 | 47.91 ± 24.43 | 34.62 ± 16.80 | 20.90 ± 9.31 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).