An Adversarial Single-Domain Generalization Network for Fault Diagnosis of Wind Turbine Gearboxes

In deep learning-based fault diagnosis of the wind turbine gearbox, a commonly faced challenge is the domain shift caused by differing operational conditions. Traditional domain adaptation methods aim to learn transferable features from the source domain and apply them to the target data. However, such methods still require access to target domain data during the training process, which limits their applicability in real-time fault diagnosis. To address this issue, we introduce an adversarial single-domain generalization network (ASDGN). It relies solely on data from a single length of data acquisition in wind turbine fault diagnosis. This novel approach introduces a more flexible and efficient solution to the field of real-time fault diagnosis for wind turbines.


Introduction
The gearbox is a crucial component of the wind turbine (WT) drivetrain system, operating under fluctuating load conditions for extended periods. It is characterized by a high failure rate, extended downtime during failures, and substantial consequences. Gearbox fault diagnosis is a focal point in the maintenance and operation of wind turbines [1,2]. Among various methods, fault diagnosis approaches based on deep learning (DL) have garnered significant attention due to advantages such as not requiring specialized knowledge and achieving high diagnostic accuracy [3,4]. Zhang et al. proposed a semi-supervised learning model based on a multi-associative layer network and applied it to the fault diagnosis of planetary gearboxes [5]. This method demonstrated significant performance advantages in situations with limited labeled data. However, DL relies on a substantial amount of labeled data for training and assumes that test data and training data originate from the same distribution. In practice, wind turbine gearbox operations are often subject to changing conditions influenced by factors like wind and weather. The data collected frequently deviate from the distribution of training data, significantly diminishing the diagnostic accuracy of DL models under real operational conditions [6][7][8]. Domain adaptation (DA) is the primary method for addressing data distribution shift and can be employed to achieve high-precision fault diagnosis for WT gearboxes under varying operating conditions [9].
In recent years, DA has found extensive applications in domain shift fault diagnosis tasks [10,11]. Its primary objective is to reduce the distribution distance in high-order subspaces between the source domain and the target domain, extracting domain-invariant features to mitigate the effects caused by domain shift, as illustrated in Figure 1a. However, there are still limitations when applying DA methods to practical WT fault diagnosis tasks. Firstly, DA methods necessitate prior access to target domain data, which remains challenging for WTs. Additionally, these methods can only adapt to specific target domains, lacking the required generalization capacity for unknown operating conditions. Ensuring robustness in WT fault diagnosis tasks with variable operating conditions proves to be a challenging endeavor [12].
In order to eliminate the dependence on target domain data, fault diagnosis methods based on domain generalization (DG) have emerged as a new research focus. DG methods involve learning from multiple source domains, extracting domain-invariant features, and consequently constructing fault diagnosis models capable of generalizing to unknown target domains, as illustrated in Figure 1b [13][14][15].
Compared to DA, DG exhibits the capability to generalize to unknown domains. However, DG demands a larger volume of data, necessitating a substantial number of samples from multiple source domains to achieve optimal results. In the context of WTs, collecting comprehensive fault samples from various operating conditions is a time-consuming and labor-intensive process. Therefore, this study explores a single-source domain generalization (SDG) fault diagnosis method, which alleviates the constraints associated with multiple source domains. This approach achieves generalization to unknown domains with training data sourced from a single domain, as depicted in Figure 1c.
In line with the real-time fault diagnosis needs of actual WTs, this study proposes a novel adversarial single-domain generalization network (ASDGN). The ASDGN primarily consists of a task module and a domain augmentation module. The domain augmentation module is capable of generating augmented data that diverges from the source domain distribution. These augmented data constitute several expanded domains, each exhibiting a shift from the source data. Subsequently, the task module extracts invariant fault features from these data to enhance its generalization capabilities. To ensure that the domain augmentation module can generate data with sufficient distributional disparities, a variational autoencoder (VAE) is employed to amplify the feature distribution difference between the augmented data and the source data. Concurrently, the Wasserstein distance is used as a metric to constrain the distributions of the two, guaranteeing intra-class compactness and semantic consistency. Through the iterative minimax game between the two, augmented domains are generated, enabling the model to achieve generalization on unknown domains.
The main contributions of this paper can be summarized in the following three points:
(1) We investigated the generalization issue of WT gearbox fault diagnosis under a single-source domain. Most existing studies related to WT gearbox fault diagnosis are conducted under the assumption of multi-source domain data. This paper studies the problem using only single-source domain data, which is more in line with actual operational conditions.
(2) We proposed an adversarial single-domain generalization model (ASDGN), which is applied to the fault diagnosis of WT gearboxes. This model exhibits commendable generalization performance in unknown domains, enabling it to adapt to the changing conditions in wind turbine scenarios.
(3) We conducted experimental validation and comparative experiments using the wind power gearbox dataset and the Southeast University gearbox dataset. The results demonstrate the efficacy and superiority of the proposed model in single-domain generalization problems.
The rest of the paper is structured as follows. Section 2 introduces related work. Section 3 elaborates on the proposed method in detail. Section 4 discusses the experiments and analyzes the results. Section 5 concludes the paper.

Related Work
Fault diagnosis methods based on DL, such as convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) networks, have garnered significant attention due to their exceptional feature extraction and fault classification capabilities. They are able to directly distill fault information from raw input data within an end-to-end framework, leading to effective fault diagnosis. Qiu et al. utilized a variety of feature extraction methods to construct a feature space comprising 52 features, which was then subjected to dimensionality reduction. This reduced feature space was employed for fault diagnosis using deep convolutional neural networks [16]. Durbhaka et al. proposed an LSTM network model. This model, optimized through multiple swarm intelligence models, more effectively captures temporal information, thus achieving high-precision gearbox fault classification [17]. Wang et al. employed kernel functions to enhance the capability of feature learning and designed a novel autoencoder known as the kernel autoencoder (KAE). Subsequently, they constructed a DL model using one KAE and multiple traditional autoencoders to extract deeper-level features [18].
In practical engineering, due to the influence of varying working conditions, there exists domain shift between the source and target domains. DA methods can assist the model in adapting to changes across different data domains, thereby enhancing the accuracy of fault diagnosis. DA methods can generally be categorized into two types: distance metric-based domain adaptation methods and adversarial domain adaptation methods. The former primarily aligns the features of the source and target domains using methods like Maximum Mean Discrepancy (MMD). The latter introduces a domain discriminator to distinguish whether the features extracted by the feature extractor are from the source or target domain. Through adversarial training, the domain discriminator and feature extractor compete against each other, compelling the feature extractor to produce domain-invariant features, thereby enhancing the model's generalization performance in the target domain. Guo et al. introduced a novel intelligent fault diagnosis model, termed deep convolutional transfer learning network (DCTLN). This model adopts the minimization of MMD for domain alignment, achieving the objective of cross-domain fault diagnosis [19]. Wan et al. introduced the use of the MMD as a metric criterion. They employed both marginal distribution alignment and conditional distribution alignment techniques in both the source and target domains. Additionally, they dynamically adjusted the relative importance between these alignment approaches [20]. Chen et al. incorporated a conditional domain discriminator in their model to differentiate whether data originated from the source domain or the target domain. Through adversarial learning between the feature extractor and the conditional domain discriminator, this approach guided the feature extractor to extract domain-invariant features [21]. An et al. introduced a domain adaptation network based on contrastive learning (DACL). It achieves domain-level MMD and conditional distribution alignment separately for the source and target domains. Additionally, they introduced an adaptive factor to dynamically adjust the relative importance of these alignment methods [22].
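As a rough illustration of the distance metric-based alignment described above, the following sketch computes a biased estimate of the squared MMD between two small sets of feature vectors using a Gaussian kernel. The function names, the kernel bandwidth `gamma`, and the toy feature values are assumptions made for illustration and are not taken from the cited methods.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (gamma is an assumed bandwidth)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Toy features: the "target" set is a shifted copy of the source set.
source_feats = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
shifted_feats = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]

mmd_same = mmd_squared(source_feats, source_feats)      # identical sets
mmd_shifted = mmd_squared(source_feats, shifted_feats)  # shifted sets
```

In a distance metric-based DA method, a quantity of this kind would be added to the training loss so that minimizing it pulls the source and target feature distributions together.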
While DA methods have demonstrated excellent performance in fault diagnosis, they are still constrained by the availability of target domain data, making it challenging to adapt to online fault diagnosis tasks. DG fault diagnosis methods that do not require access to target domain data are gaining increasing attention as a research hotspot. [25]. Additionally, domain generalization methods also encompass regularization-based DG, meta-learning-based DG, and representation learning-based DG [26][27][28][29][30][31]. These approaches have garnered extensive attention across various domains, including computer vision, natural language processing, and intelligent fault diagnosis.
However, the previously mentioned DG methods typically necessitate labeled data from multiple source domains, which still poses challenges in the context of WT fault diagnosis. This paper's primary focus lies in investigating the issue of fault diagnosis based on single-source domain generalization, with the aim of breaking free from the stringent reliance on data from multiple source domains and better catering to the requirements of WT fault diagnosis.

Problem Definition
This study aims to delve into the fault diagnosis techniques of WT gearboxes under various unknown operational conditions. The core objective is to develop a training model based on data from a single operational condition, ensuring its superior generalization performance across multiple conditions.
Detailed problem description: In the source domain $S$, there are $j$ sub-domains. Each sub-domain contains $m$ labeled data points, with each source domain following its distinct joint distribution $P^j(x_s^j, y_s^j)$. Table 1 presents a comparison between this study's problem and past research settings and scope of application.

Table 1. Comparison of the research problem addressed in this study with previous research in terms of problem settings and the scope of application.
Columns: Method | Training Conditions (Target Domain Data, Target Domain Labels) | Target Conditions (Same Distribution between the Source and Target Domains, Different Distribution between the Source and Target Domains, Multi-Target Domains)
Note: This table lists the training conditions and applicability scope of different methods. In training conditions, "✓" indicates that the data is required, and "×" means the data is not needed. In target conditions, "✓" signifies that the model is applicable to that condition, and "×" suggests it is not suitable for that condition.

For the task model, our goal is for the feature extractor $F$ to extract domain-invariant feature vectors $Z$ from the source domain $S$, which can then be accurately classified by the classifier $C$. Thus, we employ adversarial training techniques to generate several augmented domains $S^+$ with substantial domain shift [32,33]. Subsequently, we learn domain-agnostic features from the source domain and these augmented domains to train a domain-generalized model, whose objective is given by Equation (1), where $\omega$ represents the model parameters for the feature extractor and classifier, and $D = \{S, S_1^+, \cdots, S_K^+\}$ denotes the set comprising the source domain and all augmented domains. In Equation (1), $L_{classifier}$ can be computed using the cross-entropy function:

$$L_{classifier} = -\sum_{i=1}^{K} y_i \log \hat{y}_i \quad (2)$$
where $\hat{y}$ represents the softmax output of the model, with $\hat{y}_i$ denoting the $i$th dimension of $\hat{y}$; $y$ stands for the one-hot representation of the true label, and $y_i$ indicates the $i$th dimension of $y$. $K$ signifies the number of classes.
The domain expansion module generates several expanded domains from the source domain, each having a distribution distinct from the source, to simulate unknown domains. In this process, we employ a variational autoencoder (VAE) to amplify domain shift [34]. To prevent the expanded domains from diverging, a semantic consistency constraint is also introduced. This task is accomplished by the feature extractor $F$. The overall loss function for expanded domain generation is given by Equation (3), in which $L_{classifier}$ represents the classification loss and can be expressed as Equation (2), $L_{contract}$ signifies the semantic consistency constraint, whose implementation is detailed in Section 3.3, and $L_{expand}$ induces domain shift, whose implementation is detailed in Section 3.4. The parameters $\alpha$ and $\beta$ are hyperparameters that balance the training. Through adversarial training between the VAE and $F$, the objective is to generate expanded domains with the maximum domain divergence from the source domain, thereby enhancing the model's generalization capabilities in unknown domains. Given this objective, expanded data $x^+$ is generated in the expanded domain $S^+$ using gradient ascent, as given by Equation (4), where $\mu$ represents the learning rate. Ideal adversarial samples are generated through multiple iterations. The configuration and detailed implementation of each sub-module will be discussed subsequently.
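The interplay between the expanding and contracting terms during gradient ascent can be illustrated with a deliberately small sketch. It is not the ASDGN implementation: it assumes an objective of the form $-\alpha L_{contract} + \beta L_{expand}$ (the classification term is omitted), an identity feature extractor, and a toy frozen "reconstructor" that only reproduces the first coordinate of its input. All function names and numeric values are hypothetical.

```python
def expand_grad(x):
    """Gradient of L_expand = 0.5 * ||x - V(x)||^2 for a toy frozen
    reconstructor V(x) = (x[0], 0) that only reproduces the first coordinate."""
    return [0.0, x[1]]

def contract_grad(x, x_src):
    """Gradient of L_contract = 0.5 * ||F(x) - F(x_src)||^2 with F assumed
    to be the identity map (a stand-in for the feature extractor)."""
    return [xi - si for xi, si in zip(x, x_src)]

def augment(x_src, alpha=1.0, beta=0.5, mu=0.1, steps=200):
    """Gradient ascent on -alpha * L_contract + beta * L_expand w.r.t. the input."""
    x_aug = list(x_src)
    for _ in range(steps):
        g_exp = expand_grad(x_aug)
        g_con = contract_grad(x_aug, x_src)
        # Ascent step: move x_aug in the direction that increases the objective.
        x_aug = [x + mu * (beta * ge - alpha * gc)
                 for x, ge, gc in zip(x_aug, g_exp, g_con)]
    return x_aug

x_source = [1.0, 0.5]
x_plus = augment(x_source)  # drifts along the poorly reconstructed coordinate
```

In this toy setting the augmented sample moves away from the source only along the coordinate the reconstructor fails to reproduce, and the contracting term stops the drift at a finite distance, which mirrors the balance that $\alpha$ and $\beta$ are described as controlling.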

Feature Extractor
Figure 3 depicts the architectural configuration of the feature extractor $F$ [35][36][37][38][39]. This module primarily comprises convolutional layers and fully connected layers, ultimately producing a 1024 × 1 feature vector $Z$. During the generation of domain-augmented data, we constrain the domain shift through $L_{contract}$, which measures the Wasserstein distance between $z$ and $z^+$, thereby ensuring semantic consistency. Here, $z, z^+ \in Z$, with $z = F(x)$ and $z^+ = F(x^+)$; $\|\cdot\|$ represents the Wasserstein distance between the source domain and the augmented domain, and a further term ensures that the data originates from the same fault type. $L_{contract}$ constrains the conditional distribution difference between the source and augmented domains, preventing the generation of ineffective samples in the augmented domain. However, this also limits the domain shift distance. Therefore, $L_{expand}$ is introduced to amplify the domain shift, and its specific definition will be elaborated in Section 3.4.
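One form of this constraint that is consistent with the description above is the transport cost commonly used in the adversarial data augmentation literature. It is given here as an assumption rather than as the expression recovered from the source, and the label symbol $y^+$ for the augmented sample is introduced only for illustration:

```latex
L_{contract} = \frac{1}{2}\,\lVert z - z^{+} \rVert_2^{2}
             + \infty \cdot \mathbf{1}\{\, y \neq y^{+} \,\}
```

Under this assumption, the first term penalizes feature-space distance between the source and augmented samples, while the indicator term makes any augmented sample whose label differs from the source label infinitely costly, which keeps $x^+$ within the same fault type.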

Variational Autoencoder
The VAE consists of an encoder and a decoder [40][41][42]. The encoder maps the input data $x$ into a latent space $z$, with the posterior distribution of $z$ represented as $q(z|x)$. Unlike the autoencoder, the VAE maps the distribution parameters of $z$ onto a standard distribution matching the input data, generally a Gaussian distribution. The decoder then samples from the reconstructed $z$ to generate new samples $x$. The prior probability is $p(z)p(x|z)$. The objective of the VAE is to make these two distributions as close as possible, with their closeness measured by the KL divergence between them. As the KL divergence is always non-negative, the VAE model can be optimized using the objective in Equation (7). The VAE is pre-trained on the source domain data, and its parameters are then frozen. Subsequently, it is used to define $L_{expand}$, which maximizes the domain distance between the generated data and the original data. As illustrated in Figure 4, during this process, $L_{expand}$ aims to expand the augmented domain, while $L_{contract}$ attempts to contract it. Through adversarial training of the two, we acquire a generalized dataset to train a robust generalization model.
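For the common case in which the encoder outputs the mean and log-variance of a diagonal Gaussian and the latent prior is a standard normal, the KL term of the VAE objective has a closed form. The sketch below evaluates that closed form; the diagonal-Gaussian parameterization and the function name are assumptions, since the source does not specify how the encoder output is parameterized.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a diagonal Gaussian,
    summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# A posterior that already equals the prior has zero divergence.
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Shifting the mean of a unit-variance posterior by 1 costs 0.5 nats.
kl_shifted = kl_to_standard_normal([1.0], [0.0])
```

Because this term is never negative, minimizing it alongside the reconstruction error keeps the latent distribution close to the prior, which is the property the pre-training step relies on.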

Model Optimization
In the pre-training of the VAE, Adam is employed to optimize the objective in Equation (7), updating the VAE parameters $\theta$ with an initial learning rate $\nu$. Subsequently, based on Equations (3) and (4), augmented data is generated in the expanded domain. We employ SGD as the optimizer for adversarial training, updating the augmented data with an initial learning rate $\mu$ for domain expansion, where $x_t^+$ represents the parameters of the augmented data.
Lastly, Adam is used to optimize the parameters $\omega$ of the task model, with the optimization target given by Equation (2) and an initial learning rate $\rho$. The training process of the ASDGN is summarized in Algorithm 1.

Experiments
To validate the effectiveness and superiority of the proposed method, experiments were conducted on two datasets and compared with other state-of-the-art fault diagnosis models. Additionally, the roles of each module were verified through ablation studies.
2. Dataset 2 [43]: The gear fault data is collected from the gearbox fault simulation test rig, as shown in Figure 6a. It includes two speed-load conditions, 20-0 and 30-2. Under each condition, there are five types of gear fault states: health, chipped, miss, root, and surface, as illustrated in Figure 6b. The data sampling frequency is 5120 Hz. The two conditions in this dataset differ significantly, which makes it an effective test of the generalization performance of the model.

Comparison Experiment
To evaluate the performance of the ASDGN, we selected several outstanding fault diagnosis models for comparison.
CNN (baseline): the CNN applies the model trained on source domain data directly to the target domain data.
DT-DDG [44]: DT-DDG integrates adversarial training with dynamic weighting and batch spectral penalization regularization to extract domain-invariant and discriminative features from multiple source domains. It is an effective domain generalization (DG) method.
L2A-OT [45]: L2A-OT is a DG method based on domain augmentation. It enhances the model's generalizability by establishing a pseudo source domain through divergence maximization.
AMInet [46]: AMInet is an adversarial domain generalization network guided by mutual information (MI). It achieves domain generalization through adversarial training by maximizing the lower bound of MI and minimizing the upper bound of MI.
RTDGN [47]: RTDGN is another novel DG method. It enhances the network's generalization capabilities by mining domain-invariant information through two phases, task-agnostic domain adaptation and task-related domain generalization.
To ensure fairness in the experiments, the input data and hyperparameter settings for all comparison methods were set according to the requirements of the original papers. Additionally, adjustments were made to the experimental settings to ensure satisfactory results for our specific scenario. All experiments were conducted on an Nvidia 3070Ti GPU, and PyTorch was used as the programming platform. The selection of hyperparameters can be guided by referencing the hyperparameter configurations of baseline models and similar tasks. Fine-tuning adjustments are then made to ensure the model converges to a satisfactory state. In the method proposed in this paper, the number of epochs for pre-training is E1 = 25, for adversarial training is E2 = 20, and for task model training is E3 = 200. For the comparison methods, the number of training epochs for the task model is also set to 200. The batch size is uniformly set to 32. To eliminate the impact of randomness, the results are presented as the average of 20 trials [48].

Experimental Results
The comparative experimental results on Dataset 1 are shown in Figure 7 and Table 3. Based on the experimental results, the following conclusions can be drawn. The CNN, which does not incorporate a domain generalization structure, has the lowest average accuracy of only 78.37%. This is due to the difference in distribution between the test and training data, which reduces the model's generalization performance. Our proposed model demonstrated a notable improvement over the baseline, achieving an accuracy boost of 13.58%. This underscores the model's effectiveness in addressing domain generalization challenges. In the four domain generalization tasks, the model proposed in this paper consistently achieves high accuracy, indicating its robustness under different data distributions. Among the other domain generalization models, L2A-OT and AMInet improve accuracy over the baseline by 6.98% and 10.45%, respectively. However, their performances are slightly inferior to the model proposed in this paper, suggesting that the VAE-based domain augmentation method is marginally superior in exploring unknown domains compared to the maximum divergence-based domain augmentation method and the MI-based domain generalization method.
The results of the aforementioned methods on Dataset 2 are shown in Table 4 and Figure 8. The accuracy of the CNN on this dataset is only 66.17%. This is because the two conditions in Dataset 2 differ in both load and speed, resulting in a greater distribution difference and thereby increasing the generalization difficulty. Nevertheless, the ASDGN model proposed in this study still maintains an average accuracy of 90.04%, demonstrating its ability to maintain good generalization performance even in complex tasks.
Taking task G2 in Dataset 2 as an example, Figure 9a and Figure 9b, respectively, illustrate the iteration patterns of the CNN and ASDGN during training. As the iterations progress, the accuracy of the CNN on the training set rises incrementally. However, on the test set, after an initial rise, the accuracy starts to decline and stabilizes after 50 epochs. This may be attributed to the model's inadequacy in extracting domain-invariant features, leading to overfitting. In contrast, although the training of the ASDGN slows slightly due to the introduction of the expanded domain, it stabilizes after 100 epochs. Notably, its accuracy on the test set closely matches that on the training set, indicating that the ASDGN is more effective in extracting domain-invariant features, thereby enhancing its generalization capabilities in unknown domains.
In addition, we compared the average training times of the different models, as detailed in Table 5. The ASDGN incurs a longer training time due to the necessity of domain expansion. However, considering its robust generalization performance across multiple target domains, the additional training time is acceptable.
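As a quick consistency check on the Dataset 2 figures quoted above, the gain of the proposed model over the CNN baseline can be computed directly. The sketch assumes that improvements are reported as absolute percentage-point differences between average accuracies; the function and variable names are illustrative.

```python
def accuracy_gain(model_acc: float, baseline_acc: float) -> float:
    """Absolute gain in percentage points, rounded to two decimals."""
    return round(model_acc - baseline_acc, 2)


cnn_acc = 66.17       # average CNN accuracy on Dataset 2 (%), from the text
proposed_acc = 90.04  # average accuracy of the proposed model on Dataset 2 (%)

gain = accuracy_gain(proposed_acc, cnn_acc)
print(f"gain over baseline: {gain} percentage points")
```

Under this assumption the gain is 23.87 percentage points, which agrees with the improvement on the gear dataset reported in the Conclusions.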

Visualization Analysis
To intuitively observe the efficacy of the ASDGN in fault diagnosis, we employed the t-SNE method to visualize the feature layer. Using B3 as a reference, the feature visualization results are shown in Figure 10, with the corresponding confusion matrices presented in Figure 11.
As observed from Figure 10f, the features extracted by the ASDGN demonstrate commendable clustering and distinct inter-class boundaries. In contrast, Figure 10a reveals that the CNN has ambiguous inter-class distances, with some features even showing signs of confusion, making accurate classification challenging. This underscores the robust generalization capability of the ASDGN across multiple unknown domains.
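A confusion matrix such as those in Figure 11 can be built by counting (true label, predicted label) pairs. The sketch below is illustrative only: the label lists are toy values, not data from the experiments, and the function names are assumptions.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Return an n_classes x n_classes matrix of counts.

    Rows index the true class and columns the predicted class.
    """
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Toy labels for three classes (illustrative values only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
recall = per_class_recall(cm)
```

Off-diagonal entries show which fault classes are confused with each other, which is the information that overall accuracy alone hides.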

In the ablation experiments (Figures 12 and 13), the accuracy of the ASDGN without $L_{\mathrm{expand}}$ drops significantly. This is because, although the expanded domain increases the number of samples, it does not cover the unknown domain, resulting in inadequate generalization capability. The ASDGN without $L_{\mathrm{contract}}$ has slightly lower accuracy than the standard ASDGN and shows poorer stability during iterations. This is attributed to the absence of the constraint imposed by $L_{\mathrm{contract}}$, which leads to the inclusion of some faulty samples among the generated samples. Additionally, it is worth noting that since the ASDGN pre-trains the feature extractor and classifier during the domain expansion phase, its initial accuracy when training the task model is higher than that of a CNN model trained from scratch.

Conclusions
This paper focused on the fault diagnosis of wind turbine gearboxes, considering the variability of operating conditions and the challenges in data acquisition. A single-source domain generalization model based on adversarial training was proposed. This model can be trained using labeled data from a single operating condition and can generalize across multiple unknown domains. The model employs adversarial training between the VAE and the feature extractor to generate a series of expanded domains. From these domains, generalized fault features are extracted, enhancing the diagnostic accuracy in unknown domains. Extensive experiments on two datasets demonstrated the model's efficacy in improving fault diagnosis accuracy in unseen domains. Compared to the CNN, the proposed method boosted accuracy by 14.83% and 23.87% on the bearing and gear datasets, respectively. The comparative experiments show that, relative to domain generalization methods widely employed in other scenarios, our model exhibits superior accuracy and robustness. Moreover, the ablation studies clarify the influence and significance of the two modules in adversarial training. As the training process does not require access to target domain data and the model exhibits strong generalization across multiple unknown domains, it holds great potential for real-time fault diagnosis in wind turbine systems. In future work, we aim to employ more intricate neural network architectures as feature extractors, integrating components such as self-attention mechanisms and recurrent neural networks to adapt to increasingly complex input scenarios. Furthermore, we will explore the incorporation of meta-learning techniques during the training phase to augment the model's generalization capabilities.

Figure 1. Schematics of DA, DG, and SDG. The distinct domains signify data originating from different operational conditions. (a) DA: extracting transferable features from both the source and target domains to facilitate knowledge transfer. (b) DG: extracting domain-invariant features from multiple source domains, aiming for direct application to an unseen target domain. (c) SDG: harnessing domain-invariant features solely from a single source domain, eliminating the multi-source domain constraints inherent in DG.

Figure 2 illustrates the overall structure of the ASDGN proposed in this study, which primarily consists of the task module and domain expansion module. The task model comprises a feature extractor F, defined by $Z = F(X)$, which encodes the model input X into the feature vector Z. The classifier C, defined by $Y = C(Z)$, predicts the label Y from the feature vector Z. The domain expansion model, in addition to the feature extractor F and classifier C, also includes a variational autoencoder E, represented as $X_R = E(X)$, used for data reconstruction. Let $x^+ \in S^+$ and $x \in S$. Here, $S$ and $S^+$ denote the source domain and the expanded domain, respectively.
For the task model, our goal is for the feature extractor F […] feature vectors Z from the source domain S, which can then be […] classifier C. Thus, we employ adversarial training techniques […] augmented domains $S^+$, with substantial domain […]

Figure 2. Overall architecture of the ASDGN model.

Figure 3. Structure diagram of the feature extractor F.

Figure 4. Schematic illustration of the effects of $L_{\mathrm{expand}}$ and $L_{\mathrm{contract}}$ in data generation.

Algorithm 1. ASDGN

# Pre-train VAE
Input: source dataset $S = \{(x_i, y_i)\}_{i=1}^{n}$; VAE model $E(\theta; X)$; pre-training epochs E1
for i = 1 to E1 do:
    Randomly sample from S
    Forward propagation and calculation of Equation (7)
    Backward propagation to update $\theta$ by Equation (9)
end
Return: pre-trained VAE model

# Domain augmentation
Input: source dataset $S = \{(x_i, y_i)\}_{i=1}^{n}$; pre-trained VAE model $E(\theta; X)$; pre-trained feature extractor F; classifier C; number of augmentation domains K; adversarial training epochs E2
for k = 1 to K do:
    for j = 1 to E2 do:
        Randomly sample m data $X = \{x_i\}_{i=1}^{m}$ from S; $X^+$ = X.clone()
        Forward propagation and calculation of Equations (2), (5) and (8)
        Calculation of Equation (3)
        Backward propagation to update $X^+$ by Equation (9)
    end
    Create $S_k^+ = \{x_i^+\}_{i=1}^{n}$
end
Return: $D = \{S, S_1^+, \cdots, S_K^+\}$

# Task model training
Input: dataset $D = \{S, S_1^+, \cdots, S_K^+\}$; feature extractor F; classifier C; task model training epochs E3
for i = 1 to E3 do:
    Randomly sample data from S
    Forward propagation and calculation of Equation (2)
    Backward propagation to update $\omega$ by Equation (11)
end
Return: task model
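The three stages of Algorithm 1 can be sketched as a plain control-flow skeleton. This is a hedged outline rather than the authors' implementation: the step functions passed in (`pretrain_step`, `augment_step`, `task_step`) are hypothetical placeholders for the loss computations and updates of the referenced equations, which are not reproduced here; `K = 3` is an arbitrary illustrative value; and visiting every domain of D in each task-training epoch is an assumption. The epoch counts E1 = 25, E2 = 20, and E3 = 200 follow the experimental settings.

```python
from typing import Callable, List, Sequence


def run_asdgn_schedule(
    source: Sequence[float],
    pretrain_step: Callable[[Sequence[float]], None],
    augment_step: Callable[[List[float]], List[float]],
    task_step: Callable[[Sequence[float]], None],
    K: int = 3,     # number of expanded domains (illustrative value)
    E1: int = 25,   # VAE pre-training epochs
    E2: int = 20,   # adversarial training epochs per expanded domain
    E3: int = 200,  # task-model training epochs
) -> List[List[float]]:
    """Run the three-stage schedule and return D = [S, S_1+, ..., S_K+]."""
    # Stage 1: pre-train the VAE on the source domain.
    for _ in range(E1):
        pretrain_step(source)

    # Stage 2: build K expanded domains, each starting from a copy of S.
    domains = [list(source)]
    for _ in range(K):
        x_plus = list(source)  # plays the role of X+ = X.clone()
        for _ in range(E2):
            x_plus = augment_step(x_plus)  # placeholder adversarial update
        domains.append(x_plus)

    # Stage 3: train the task model (feature extractor and classifier).
    # Assumption: every domain in D is visited once per epoch.
    for _ in range(E3):
        for domain in domains:
            task_step(domain)
    return domains


# Toy usage with counting placeholders instead of real training steps.
calls = {"pretrain": 0, "task": 0}


def count_pretrain(_batch):
    calls["pretrain"] += 1


def count_task(_batch):
    calls["task"] += 1


def shift(x):
    return [v + 0.1 for v in x]  # toy "expansion": shift every sample


domains = run_asdgn_schedule([0.0, 1.0], count_pretrain, shift, count_task)
```

Passing the steps in as callables keeps the schedule separate from the model code, so the same skeleton could wrap a real PyTorch training step without changing the loop structure.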

Figure 5. Wind power gearbox fault testbench structure diagram.

2. Dataset 2 [43]: The gear fault data are collected from the gearbox fault simulation test rig, as shown in Figure 6a. The dataset includes two speed-load conditions, 20-0 and 30-2. Under each condition, there are five gear states: health, chipped, miss, root, and surface, as illustrated in Figure 6b. The data sampling frequency is 5120 Hz. The two conditions in this dataset differ significantly, which makes the dataset an effective test of the model's generalization performance.

Figure 6. (a) Structure diagram of the gearbox fault simulation testbench; (b) the five fault states. From left to right, they are surface, miss, chipped, health, and root.

Each health condition contains 1000 training samples and 300 test samples. Each sample contains 1024 sampling points. The samples are converted into time-frequency images through continuous wavelet transformation. A total of six fault diagnosis tasks were set up to evaluate the generalization ability and robustness of the ASDGN in real-time fault diagnosis. The task settings are shown in Table 2.
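The sample construction described above can be illustrated with a short segmentation routine. It is a sketch under stated assumptions: non-overlapping windows are assumed (the text does not state the stride), the input signal is synthetic, and the continuous wavelet transform step is omitted. Only the window length (1024 points) and the sampling frequency of Dataset 2 (5120 Hz) are taken from the text.

```python
import math

SAMPLE_LEN = 1024  # points per sample, from the text
FS = 5120          # sampling frequency in Hz, from the Dataset 2 description


def segment(signal, length=SAMPLE_LEN, stride=None):
    """Split a 1-D signal into fixed-length windows.

    stride defaults to `length` (non-overlapping windows), which is an
    assumption; incomplete trailing windows are dropped.
    """
    stride = length if stride is None else stride
    last_start = len(signal) - length
    return [signal[i:i + length] for i in range(0, last_start + 1, stride)]


# Synthetic 2-second test signal: a 50 Hz sine wave (illustrative only).
signal = [math.sin(2 * math.pi * 50 * n / FS) for n in range(2 * FS)]
samples = segment(signal)
duration = SAMPLE_LEN / FS  # seconds of signal covered by one sample
```

With these values each sample spans 0.2 s of signal. A smaller stride would yield overlapping windows and more samples from the same recording, at the cost of correlated samples.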

Figure 7. Histogram of the experimental results for Dataset 1.

Figure 8. Histogram of the experimental results for Dataset 2.

Figure 9. Taking task G2 as an example, this figure contrasts the iterative curves of the baseline and proposed models. (a) Iterative convergence curve of the CNN; (b) iterative convergence curve of the ASDGN.

Figure 10. Taking B3 as an example, the figure shows the feature visualization of the feature layer for each method.

Figure 11. Taking B3 as an example, the figure shows the confusion matrix results for each method.

4.5. Ablation Experiments
To investigate the roles of $L_{\mathrm{contract}}$ and $L_{\mathrm{expand}}$ during the optimization of the domain expansion model, we conducted ablation experiments to compare the effects on model classification accuracy and the training process when either $L_{\mathrm{contract}}$ or $L_{\mathrm{expand}}$ is omitted. The experimental results are shown in Figures 12 and 13.

Figure 12. Taking G1 as an example, we compare the iterative processes of the CNN and the ASDGN without $L_{\mathrm{contract}}$ or $L_{\mathrm{expand}}$.

Figure 13. A comparison of the accuracy rates for the CNN and the ASDGN without $L_{\mathrm{contract}}$ or $L_{\mathrm{expand}}$ on Dataset 1.
Han et al. enhanced the clustering of similar faults across different domains and improved the model's domain robustness by minimizing the triplet loss [23]. Shi et al. combined multi-source augmentation with domain adversarial techniques to enhance the model's generalization ability by extracting domain-invariant features from multiple source domains [24]. Fan et al. introduced a deep mixed and domain generalization network (DMDGN) for intelligent fault diagnosis. This model adjusts inter-class and intra-class distances across multiple source domains through adversarial and divergence metric approaches. Experimental results demonstrate its strong generalization capabilities on unknown domains.

Table 1 .
E (x tl ,y tl )∼Q k [ f (x k t ̸ = y k t )] in the target domain.Table 1 offers a comparison between this study's problem and past research studies in terms of problem settings and scope of application.Comparison of the research problem addressed in this study with previous research in terms of problem settings and the scope of application.Note: This table lists the training conditions and applicability scope of different methods.In training conditions, "✓" indicates that the data is required, and "

Table 1 .
Comparison of the research problem addressed in this study with previous research in terms of problem settings and the scope of application.Note: This table lists the training conditions and applicability scope of different methods.In training conditions, "" indicates that the data is required, and "" means the data is not needed.In target conditions, "" signifies that the model is applicable to that condition, and "" suggests it is not suitable for that condition.

Table 1 .
Comparison of the research problem addressed in this study with previous research in terms of problem settings and the scope of application.Note: This table lists the training conditions and applicability scope of different methods.

Table 1 .
in the target domain.Table 1 offers a comparison between this study's problem and past research studies in terms of problem settings and scope of application.Comparison of the research problem addressed in this study with previous research in terms of problem settings and the scope of application.Note: This table lists the training conditions and applicability scope of different methods.

Table 1 .
in the target domain.Table 1 offers a comparison between this study's problem and past research studies in terms of problem settings and scope of application.Comparison of the research problem addressed in this study with previous research in terms of problem settings and the scope of application.Note: This table lists the training conditions and applicability scope of different methods.

Figure 8. Histogram of the experimental results for Dataset 2.

Table 5. The computation times of different models (taking Dataset 1 as an example).