1. Introduction
The piezoelectric actuator (PEA) serves as a typical precision positioning device that uses piezoelectric ceramics as the driving element and a compliant mechanism for motion guidance [
1]. It has been widely employed in micro/nanoscale positioning systems owing to its superior characteristics, such as high resolution, high output force, and rapid response [
2]. However, the inherent hysteretic nonlinearity and low damping vibration characteristics of its mechanical structure severely impair both the positioning accuracy and dynamic response speed of PEAs. In general, under open-loop control, the tracking error induced by the hysteretic nonlinearity of a PEA can reach up to 15% of its full-scale range; this error may even exceed 35% as the frequency of the input signal increases [
3,
4]. Therefore, the development of an accurate mathematical model and a practical control strategy for PEAs is essential to improve their overall performance.
In recent years, neural networks have emerged as a promising approach for hysteresis modeling. Their superior approximation capabilities enable them to effectively capture the intricate dynamic behaviors of nonlinear systems. Long short-term memory (LSTM) networks exhibit the ability to model hysteresis across a wide frequency range, as they leverage their long-term memory characteristics [
5]. A sequence-to-sequence LSTM (LSTMseq2seq) framework was developed by [
6] to model PEA systems, which effectively alleviates the common issues of gradient explosion and gradient vanishing associated with recurrent neural networks (RNNs). Additionally, an inversion model based on RNNs was proposed by [
7], denoted as RNNinv, for compensating nonlinearities in PEAs. Although high modeling accuracy is desirable for neural network-based frameworks, their generalization ability is crucial for the accurate modeling and control of PEAs under diverse operating conditions. Current neural network-based approaches typically rely on two key assumptions: (1) training and testing data are derived from the same feature space and follow an identical probability distribution; (2) sufficient data are available to train an effective model [
8]. However, these assumptions do not always hold in practical scenarios, creating an urgent need for methods to address challenges related to data distribution discrepancies.
Challenges associated with distribution discrepancies have spurred the development of transfer learning, a methodology that transfers knowledge from a well-trained domain (source domain) to another domain (target domain). Notably, transfer learning does not strictly require the data of the source and target domains to match an identical distribution [
9]. For instance, a fine-tuning deep transfer learning method based on LSTM networks was proposed by [
10] to address the problem of insufficient training data for measurements from new air quality monitoring sites. A domain-adversarial neural network (DANN) was employed to predict the remaining service life of aero engines [
11]. Most existing transfer learning methods only use a single-source domain; accordingly, multi-source transfer learning (MSTL) has been proposed to effectively leverage knowledge from multiple domains [
12,
13]. A set-based boosting technique enhanced the performance of each source task while assigning higher weights to tasks with stronger positive transferability [
14]. An ensemble learning and tri-transfer model was introduced by [
15] to develop a multi-source ensemble transfer learning (METL) approach for the initial diagnosis of Alzheimer’s disease. These studies demonstrate that multi-source transfer learning outperforms single-source transfer learning approaches. However, while using multi-source domains offers significant advantages, it also poses the challenge of identifying and selecting valuable knowledge from these domains. A transfer learning framework was designed as source-selection-free transfer learning (SSFTL), which utilizes tags from the delicious website to construct a semantic similarity relationship between the source and target domains via Laplace feature mapping, thereby enabling automatic source domain selection [
16]. In the context of transfer learning, several similarity metrics are available, including proxy A-distance (PAD) [
17,
18], maximum mean discrepancy (MMD) [
19], soft dynamic time warping (Soft-DTW) [
20], and CORAL [
21]. These metrics, often paired with classifiers, assess the alignment between the source and target domains by analyzing classifier error. In behavior recognition, the similarity between the source and target domains can be assessed by integrating the similarity of sensor data from body parts with the semantic correlations of the corresponding body parts [
22].
To the best of our knowledge, relatively few studies have focused on the development of a selective multi-source ensemble transfer learning (SMETL) algorithm and its application to PEAs. Existing research on transfer learning, ensemble learning, and their integration primarily emphasizes classification tasks and adversarial domain adaptation, while little attention has been paid to parameter sharing (including pre-training and fine-tuning) and multi-source ensemble learning methods specifically designed for PEAs. In this study, an SMETL framework is proposed, whose key technologies are summarized as follows:
(1) The potential of transfer learning lies in its ability to adapt across different domains. As demonstrated by methods such as MSTL and SSFTL, multi-source approaches offer promising ways to improve performance. However, despite the availability of multiple datasets, the relevance between these source datasets and the target dataset is often ambiguous. In this context, PAD is employed as a similarity metric between the source and target domains. By analyzing the correlation between PAD values and the evaluation metrics of single-source transfer learning models, it is demonstrated that PAD can effectively quantify the similarity between actuators.
(2) Since blind knowledge transfer from target-irrelevant source datasets often results in deteriorated displacement control performance, the SMETL framework innovatively adopts a greedy ensemble transfer learning strategy. Specifically, this strategy constructs the ensemble by first sorting all candidate transfer learning models in ascending order of the PAD value between their respective source domains and the target domain, then sequentially adding each candidate model and only retaining those that improve the ensemble’s performance on the target domain validation set. Evaluation and analysis results demonstrate that this strategy not only enhances performance compared to individual single-source transfer learning models but also effectively avoids negative transfer.
This paper is organized as follows:
Section 2 presents the research framework, model architecture, and methodologies used in this study.
Section 3 elaborates on the relationship between the performance of single-source transfer learning models and PAD values, and verifies the necessity of multi-source transfer learning.
Section 4 investigates the influences of source domain data volume, target domain data volume, and ensemble strategy on the multi-source transfer learning model; additionally, a comparison with representative transfer learning frameworks is conducted to validate the effectiveness of the proposed SMETL framework.
Section 5 concludes this paper and outlines potential directions for future research.
2. Methodology
As illustrated in
Figure 1, the SMETL framework consists of two core steps, which are detailed as follows:
Step 1: Source domain selection.
Given $N$ source domains (denoted as $\mathcal{D}_{S_1}, \mathcal{D}_{S_2}, \ldots, \mathcal{D}_{S_N}$) and one target domain (denoted as $\mathcal{D}_T$), the PAD algorithm is first employed to calculate the data distribution distance between each source domain and the target domain. Subsequently, the strong linear correlation between PAD values and transfer learning-based feedforward control performance is verified. For each source domain selected based on PAD screening, a pre-trained GRU-CNN model is constructed using the source domain's dataset. These pre-trained models are then fine-tuned on the target domain's data to adapt their parameters to the target domain's distribution characteristics.
Step 2: Multi-source greedy ensemble transfer learning.
After Step 1, a greedy ensemble transfer learning strategy is used to develop a hybrid model tailored to the target domain. The primary objective of this step is to enhance the transferability of knowledge from multi-source domains to the target domain, while simultaneously mitigating the risk of negative transfer that may arise from individual source domains with low similarity to the target. To validate the effectiveness of SMETL, the control performance of the proposed SMETL model is evaluated and compared with that of each individual single-source transfer learning model.
Figure 1.
Schematic diagram of selective multi-source ensemble transfer learning.
2.1. Similarity Between Domains
The higher the similarity between a source domain and the target domain, the more effectively knowledge from the source domain can be transferred to the target domain, thereby avoiding the risk of negative transfer [
23]. Given the availability of multi-source domains, identifying a reliable metric to quantify inter-domain similarity is essential for selecting source domains with high transfer potential. In this study, PAD is employed to calculate the distribution distance between each source domain and the target domain, enabling the quantification of similarity between these two domains. Specifically, a smaller PAD value indicates a smaller distribution discrepancy between the source and target domains, and thus a higher degree of inter-domain similarity.
Given a source domain $\mathcal{D}_S$ and a target domain $\mathcal{D}_T$, let a labeled source sample $S = \{\mathbf{x}_i\}_{i=1}^{n} \subset \mathcal{X}$ be drawn from $\mathcal{D}_S$ and a target sample $T = \{\mathbf{x}_i\}_{i=n+1}^{N} \subset \mathcal{X}$ be drawn from $\mathcal{D}_T$, where $\mathcal{X}$ denotes the input space and $\{0, 1\}$ represents the set of two possible labels. Specifically, all instances in the source sample $S$ are assigned the label 0, while all instances in the target sample $T$ are assigned the label 1. For a symmetric hypothesis class $\mathcal{H}$, the empirical $\mathcal{H}$-divergence between $S$ and $T$ is defined as
$$\hat{d}_{\mathcal{H}}(S, T) = 2\left(1 - \min_{\eta \in \mathcal{H}}\left[\frac{1}{n}\sum_{i=1}^{n} I\big[\eta(\mathbf{x}_i) = 0\big] + \frac{1}{N - n}\sum_{i=n+1}^{N} I\big[\eta(\mathbf{x}_i) = 1\big]\right]\right), \quad (1)$$
where $N$ denotes the total number of samples, and $I[a]$ represents the indicator function, which takes a value of 1 if the predicate $a$ holds true and 0 otherwise.
The risk $\epsilon$ of the classifier trained on this relabeled dataset approximates the "min" component of Equation (1). Given a classification error $\epsilon$ associated with the task of discriminating between source and target examples, the proxy A-distance $\hat{d}_A$ can be defined as
$$\hat{d}_A = 2(1 - 2\epsilon). \quad (2)$$
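For illustration, a minimal Python sketch of the PAD computation is given below. It labels source samples as 0 and target samples as 1, trains a simple domain classifier, and converts the resulting held-out classification error into a PAD value via Equation (2). The choice of scikit-learn's logistic regression as the hypothesis class and the flattened feature representation are illustrative assumptions, not the exact configuration used in this work.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def proxy_a_distance(source_data, target_data, seed=1024):
    """Estimate the proxy A-distance between two domains.

    source_data, target_data: arrays of shape (n_samples, n_features),
    e.g., flattened voltage/displacement windows from each actuator.
    """
    # Relabel the data: source samples -> 0, target samples -> 1
    X = np.vstack([source_data, target_data])
    y = np.hstack([np.zeros(len(source_data)), np.ones(len(target_data))])

    # Hold out part of the data to estimate the domain-discrimination error
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)

    clf = LogisticRegression(max_iter=1000)   # stand-in hypothesis class
    clf.fit(X_tr, y_tr)

    eps = 1.0 - clf.score(X_te, y_te)         # classification error epsilon
    return 2.0 * (1.0 - 2.0 * eps)            # Equation (2)
```

A PAD value close to 2 indicates that the two domains are easily separable (low similarity), whereas a value close to 0 indicates nearly indistinguishable distributions.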
2.2. GRU-CNN
As shown in
Figure 2, the GRU-CNN framework comprises an input layer, one gated recurrent unit (GRU) layer, two convolutional neural network (CNN) layers, three dense layers, and an output layer. The input layer accepts sequences in a three-dimensional format, typically denoted as batch size × sequence length × feature dimension. These sequences are then processed by the GRU layer, which captures temporal correlations within the sequences. Specifically, the GRU layer is designed to characterize the long-term dependence and nonlinear memory behaviors of hysteresis, while alleviating the issues of gradient vanishing or gradient explosion when handling long sequences. Subsequently, the multi-layer CNN employs sliding convolution kernels to extract local trend features from the entire output of the GRU layer. The CNN layer enhances the model's ability to capture local patterns, thereby contributing to improved control accuracy. The framework concludes with three dense layers: the first incorporates a rectified linear unit (ReLU) activation function to prevent vanishing gradients, and the third is used for data size reshaping. When the batch size and feature dimension are set to 1, given an input sequence $u = (u_1, u_2, \ldots, u_L)$ and an output sequence $y = (y_1, y_2, \ldots, y_L)$, the GRU-CNN model aims to establish a mapping from the input sequence to the output sequence. The mean square error (MSE) is selected as the objective function to minimize the error between the true output sequence $y$ and the predicted output sequence $\hat{y}$.
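A minimal PyTorch sketch of the GRU-CNN architecture described above is provided below. The layer widths, kernel sizes, and channel counts are placeholder values; the actual settings are obtained by Bayesian optimization as described in Section 3.2.

```python
import torch
import torch.nn as nn

class GRUCNN(nn.Module):
    """Sketch of the GRU-CNN inverse model: displacement sequence -> voltage sequence."""

    def __init__(self, hidden_size=64, cnn_channels=32, kernel_size=5):
        super().__init__()
        # GRU layer captures the long-term memory behavior of hysteresis
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
        # Two CNN layers extract local trend features from the GRU output
        self.cnn = nn.Sequential(
            nn.Conv1d(hidden_size, cnn_channels, kernel_size, padding=kernel_size // 2),
            nn.Conv1d(cnn_channels, cnn_channels, kernel_size, padding=kernel_size // 2),
        )
        # Three dense layers; the first uses ReLU, the last maps to one output per time step
        self.dense = nn.Sequential(
            nn.Linear(cnn_channels, 64), nn.ReLU(),
            nn.Linear(64, 32),
            nn.Linear(32, 1),
        )

    def forward(self, x):                   # x: (batch, seq_len, 1)
        h, _ = self.gru(x)                  # (batch, seq_len, hidden_size)
        h = self.cnn(h.transpose(1, 2))     # Conv1d expects (batch, channels, seq_len)
        h = h.transpose(1, 2)               # back to (batch, seq_len, cnn_channels)
        return self.dense(h)                # (batch, seq_len, 1)

# The training objective is the MSE between the true and predicted sequences
criterion = nn.MSELoss()
```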
For feedforward control, as depicted in
Figure 3, the controlled variable corresponds to the output displacement of the PEA, and the manipulated variable refers to the input voltage applied to the PEA. The reference displacement of the PEA is first input to the GRU-CNN inverse model. This inverse model computes the corresponding driving voltage required to achieve the reference displacement, thereby compensating for the hysteretic nonlinearity and low damping vibration characteristics of the PEA. Aligned with this control logic, the GRU-CNN model takes the PEA's measured displacement information as input and outputs the corresponding driving control voltage.
2.3. Ensemble Transfer Learning
Transfer learning schemes are designed to leverage pre-existing empirical knowledge to develop new models, which can be applied to either similar or entirely distinct PEAs. As documented in numerous studies, the advantages of transfer learning include faster convergence speed, enhanced generalization capability, improved control accuracy, and increased robustness, with the latter two advantages being particularly prominent in scenarios characterized by data scarcity [
24]. Parameter sharing [
25] is the most widely used model-based transfer learning approach. This method comprises two core steps: pre-training and fine-tuning. In the pre-training step, a source model is obtained by training a neural network on source domain data. In the fine-tuning step, only the final several layers are fine-tuned using a smaller volume of target domain data to generate the target model. By leveraging the pre-trained source model, the target model avoids the need for training from scratch, thereby reducing the computational cost and time required for model training.
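The pre-training/fine-tuning procedure can be sketched as follows, assuming the GRUCNN class above and a DataLoader over the (small) target-domain training set. In this sketch the GRU and CNN parameters are frozen and only the dense layers are adapted; as described in Section 3.2, the pre-trained parameters may alternatively serve only as initial values for full fine-tuning.

```python
import copy
import torch

def fine_tune(pretrained_model, target_loader, epochs=100, lr=5e-4):
    """Parameter-sharing transfer: start from the source model, adapt on target data."""
    model = copy.deepcopy(pretrained_model)

    # Freeze the GRU and CNN layers (illustrative choice); only dense layers are trained
    for p in model.gru.parameters():
        p.requires_grad = False
    for p in model.cnn.parameters():
        p.requires_grad = False

    optimizer = torch.optim.Adamax(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    criterion = torch.nn.MSELoss()

    for _ in range(epochs):
        for x, y in target_loader:          # small target-domain training set
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```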
Ensemble learning focuses on integrating multiple weak learners into a more powerful ensemble learner, rather than pursuing a single sophisticated model to achieve optimal performance [
26]. However, an excessively large number of models in an ensemble significantly increases computational overhead [
27]. Thus, minimizing this computational burden while ensuring predictive performance is crucial. An ensemble may include models with differing predictive performance levels, and integrating these models may fail to deliver the desired performance gains. Furthermore, an excessive number of high-performance models in an ensemble can lead to overfitting. To address these challenges, a greedy ensemble transfer learning strategy is proposed. This strategy constructs the ensemble by sequentially adding each candidate transfer learning model: a model is retained only if it improves the ensemble’s performance on the target domain validation set. Prior to initiating this process, all candidate models are sorted in ascending order of the PAD value between their corresponding source domains and the target domain. This ranking logic guarantees that the ensemble will not perform worse than the best individual transfer learning model on the validation set.
Algorithm 1 presents the pseudo-code for the proposed greedy ensemble transfer learning strategy in detail. Each GRU-CNN-based single-source transfer learning model processes the target domain input independently and generates individual predictive outputs. Given that the selected transfer learning models exhibit comparable performance on the target domain validation set, a simple arithmetic averaging method is adopted to compute the mean of all individual outputs. This averaged value is then used as the final output of the SMETL framework.
| Algorithm 1. Greedy Ensemble Transfer Learning |
| Input: Potential source domains $\mathcal{D}_{S_1}, \ldots, \mathcal{D}_{S_N}$ (sorted by their PAD value with respect to the target domain in ascending order) |
| Target domain $\mathcal{D}_T$, divided into the training set $\mathcal{D}_T^{train}$ and validation set $\mathcal{D}_T^{val}$ |
| GRU-CNN model architecture |
| Output: Ensemble prediction function $F$ of the selected transfer models |
| 1: $E \leftarrow \varnothing$ (set of selected transfer models) |
| 2: $e_{best} \leftarrow +\infty$ (best validation error so far) |
| 3: for $i = 1$ to $N$ do |
| 4:  pre-train a GRU-CNN model on $\mathcal{D}_{S_i}$ |
| 5:  fine-tune it on $\mathcal{D}_T^{train}$ to obtain the transfer model $f_i$ |
| 6:  $F_{cand} \leftarrow$ arithmetic mean of the outputs of the models in $E \cup \{f_i\}$ |
| 7:  $e_{cand} \leftarrow$ error of $F_{cand}$ on $\mathcal{D}_T^{val}$ |
| 8:  if $e_{cand} < e_{best}$ then |
| 9:   $E \leftarrow E \cup \{f_i\}$, $e_{best} \leftarrow e_{cand}$ |
| 10: return $F(x) = \frac{1}{|E|}\sum_{f \in E} f(x)$ |
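A compact Python rendering of Algorithm 1 is given below. The fine-tuned candidate models are assumed to be supplied in ascending order of their PAD values, and validation_error is an assumed helper that returns, e.g., the RMSE of a prediction function on the target-domain validation set.

```python
def greedy_ensemble(candidate_models, validation_error):
    """Greedy selection over single-source transfer models sorted by ascending PAD.

    candidate_models: list of fine-tuned GRU-CNN transfer models (callable on inputs).
    validation_error: callable mapping a prediction function to its error on the
                      target-domain validation set.
    Returns the ensemble prediction function (arithmetic mean of selected models).
    """
    selected = []
    best_err = float("inf")

    def make_predictor(models):
        # Simple arithmetic averaging of the individual model outputs
        return lambda x: sum(m(x) for m in models) / len(models)

    for model in candidate_models:          # ascending PAD order
        trial = selected + [model]
        err = validation_error(make_predictor(trial))
        if err < best_err:                  # retain the model only if it helps
            selected = trial
            best_err = err

    return make_predictor(selected)
```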
3. Experiments and Analysis
This section demonstrates the effectiveness of the proposed SMETL framework. First, a GRU-CNN is employed to model the mapping relationship between the input displacement and output driving voltage of PEAs. Next, single-source transfer learning models are constructed for each candidate source domain. The PAD algorithm is then used to calculate the data distribution distance between each source domain and the target domain. By analyzing the correlation between the control performance of these single-source transfer learning models and their corresponding PAD values, the rationality and accuracy of using PAD to quantify inter-actuator data similarity are verified. Finally, SMETL is implemented to integrate valid knowledge from the screened high-similarity source domains, yielding the final SMETL-based multi-source transfer learning model. Notably, SMETL exhibits flexibility in adapting to an arbitrary number of candidate source domains. For the case study in this work, three candidate source domains are selected to form the source domain set.
3.1. Data Acquisition
To address the integration of distributions across multiple source domains and validate the effectiveness of the proposed SMETL framework, three source domains and four target domains were utilized. As depicted in
Figure 4, two types of PEAs were employed in the source domains, differentiated by the length of their flexible hinges. Source domain #1, source domain #2, and target domain #1 were constructed using data from PEAs #1, #2, and #4, respectively. These three actuators share the same mechanical structure but originate from different production batches. Source domain #3 and target domain #2 were constructed using data from PEAs #3 and #5, respectively. These actuators also share an identical mechanical structure, which is distinct from that of PEAs #1, #2, and #4. For the PEA shown in
Figure 4c, the longitudinal voltage and displacement are designated as data for target domain #3, and the lateral voltage and displacement are designated as data for target domain #4.
Figure 5 presents the Gaussian kernel density estimates of all datasets corresponding to the PEAs. Even under the same operating conditions, the Gaussian kernel density estimates of actuators with the same mechanical structure but from different production batches exhibit discrepancies. This phenomenon is attributed to manufacturing factors such as assembly errors and preload variations. The preload value of the PEAs is determined based on the recommended range of 400–600 N. Notably, the kernel density estimates of all datasets exhibit roughly similar distributions. This similarity satisfies the core prerequisite for transfer learning.
Each source domain contains 15,000 samples, while each target domain contains 1000 samples, as shown in
Table 1. All data samples were split into 10 folds for cross-validation. The validation subsets were employed to tune the near-optimal hyperparameters of each transfer learning model. Furthermore, the generalization ability of the models was evaluated individually with an additional 1200 test samples per target domain. These test samples were excluded from both model training and fine-tuning, thereby ensuring an unbiased evaluation of model performance.
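The data split described above can be sketched as follows with scikit-learn's KFold; the array of target-domain sample indices is a stand-in for the actual dataset.

```python
import numpy as np
from sklearn.model_selection import KFold

target_samples = np.arange(1000)            # stand-in for the 1000 target-domain samples
kf = KFold(n_splits=10, shuffle=True, random_state=1024)

for train_idx, val_idx in kf.split(target_samples):
    train_set, val_set = target_samples[train_idx], target_samples[val_idx]
    # ... fine-tune on train_set, tune hyperparameters on val_set ...

# The additional 1200 test samples per target domain are held out entirely
# and used only for the final, unbiased evaluation.
```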
All samples were collected from the piezoelectric actuator test platform, as illustrated in
Figure 6. Each sample comprises a continuous voltage sequence and a corresponding displacement sequence, each spanning 1000 time steps. Each PEA was equipped with a laser displacement sensor LK-H020 (Keyence Corporation, Osaka, Japan), featuring a measurement range of ±3 mm and a measurement accuracy of 0.02 μm (±0.02% F.S.). The sensor was paired with a controller LK-G5001 (Keyence Corporation, Osaka, Japan) that provides analog output signals corresponding to displacement, with a voltage range of 0 V to 10 V. To drive the PEAs, a power amplifier module with a fixed gain of 15 was used. Input voltage curves were generated using a MATLAB (v2024a) program, with amplitudes ranging from 0 V to +10 V; accordingly, the power amplifier module’s output voltage ranged from 0 V to +150 V. All input and output signals were synchronously acquired using a 16-bit data acquisition card NI USB-6218 (National Instruments, Austin, TX, USA), with the sampling frequency set to 10 kHz.
Furthermore, all experimental samples underwent normalization to standardize the scales of different variables, using the min-max normalization method. Its mathematical expression is given by
$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$
where $x$ denotes the original variable, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of $x$, and $x_{\mathrm{norm}}$ denotes the normalized variable. The min-max method scales the original variable to the range [0, 1], making it particularly suitable for experimental scenarios where variables do not follow a normal distribution.
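As a small worked example, the min-max scaling of a measured sequence can be written as:

```python
import numpy as np

def min_max_normalize(x):
    """Scale a 1-D signal to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Example: a displacement sequence (values in micrometres, for illustration)
disp = np.array([0.0, 3.2, 7.5, 12.1, 9.4])
print(min_max_normalize(disp))   # [0.    0.264 0.620 1.    0.777] (approx.)
```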
3.2. Single-Source Transfer Learning
The effectiveness of transfer learning is critically dependent on the degree of similarity between the source and the target domains. Only when shared knowledge exists between them can transfer learning be performed effectively. Conversely, if the similarity is low, knowledge acquired from the source domain may have a detrimental impact on the target domain, resulting in negative transfer. Therefore, it is essential to ensure data similarity between the source and target domains and to identify transferable components via an appropriate approach.
All models were trained on a computer equipped with an Intel(R) Core(TM) i9-12900K CPU @ 3.19 GHz (Intel Corporation, Santa Clara, CA, USA) and an NVIDIA RTX 3080 Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA). The Adamax optimizer [
28] was employed to minimize the objective function during training, and all experiments were implemented using Python 3.6.13 with PyTorch 1.10.2 and CUDA 11.3.1. A fixed random seed of 1024 was adopted to ensure consistent initialization of model parameters across multiple training sessions. For each pre-trained model from the source domains, the key hyperparameters, including the learning rate, the number of neural units in the GRU and dense layers, the kernel size of the CNN layers, the batch size, and the dropout rate, were optimized using a Bayesian optimization algorithm, with the MSE on the validation set adopted as the optimization objective function. A dropout rate of 0.1 was incorporated to mitigate overfitting by randomly setting a subset of neuron outputs to zero during training. The number of training epochs was set to 100, the batch size was configured to 100, and an initial learning rate of 0.001 was adopted. Detailed hyperparameter settings for the GRU-CNN structure are provided in
Table 2.
Each pre-trained model was paired with the target domains to construct single-source transfer learning models. In this study, for target domains #1 and #2, the pre-trained model was fine-tuned while fully retaining the structure and parameters of its GRU and CNN layers to mitigate overfitting. The existing parameters of these layers served as initial values for training, whereas the parameters in the dense layers were randomly initialized. For target domains #3 and #4, all layer parameters of the pre-trained model were directly adopted as initialization parameters for the fine-tuning process. The learning rate was adjusted automatically based on the model's performance on the validation set: if the validation performance did not improve within 10 epochs, the learning rate was reduced by a factor of 0.5. For all fine-tuning processes, the initial learning rate was set to 5 × 10⁻⁴, with other parameters remaining unchanged, and an early stopping mechanism was employed to further prevent overfitting.
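A minimal sketch of this fine-tuning loop, using PyTorch's built-in ReduceLROnPlateau scheduler, is shown below. The training and evaluation routines are caller-supplied placeholders, and the early-stopping patience value is an assumption.

```python
import torch

def fine_tune_with_schedule(model, train_one_epoch, evaluate,
                            epochs=100, lr=5e-4, es_patience=20):
    """Fine-tuning loop with learning-rate scheduling and early stopping."""
    optimizer = torch.optim.Adamax(model.parameters(), lr=lr)
    # Halve the learning rate if the validation loss does not improve for 10 epochs
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=10)

    best_val, wait = float("inf"), 0
    for _ in range(epochs):
        train_one_epoch(model, optimizer)   # caller-supplied training step
        val_loss = evaluate(model)          # caller-supplied validation MSE
        scheduler.step(val_loss)            # scheduler monitors validation loss
        if val_loss < best_val:
            best_val, wait = val_loss, 0
        else:
            wait += 1
            if wait >= es_patience:         # early stopping
                break
    return model
```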
Table 3 presents the test results of single-source transfer learning models across the four target domains. Models trained exclusively on the respective target domain datasets are denoted as M.1, M.2, M.3, and M.4. For M.1–M.4, the training epochs were set to 100, with an initial learning rate of 0.001. Notably, all single-source transfer learning models outperformed the target-domain-only trained models. The relatively inferior displacement control performance of M.1 stems from the scarcity of training data, as only 1000 samples were available. M.2 exhibited a similar performance constraint, which underscores the critical role of data volume in training GRU-CNN models for PEA displacement control. Target domains #3 and #4 exhibited domain shift relative to #1 and #2, thereby degrading the performance of all single-source transfer learning models. Nevertheless, M.3 and M.4 still achieved improvements over the target-domain-only trained models. These results validate that transfer learning is an effective strategy for addressing the challenges associated with limited training data and mismatched data distributions for target PEAs.
Table 4 presents the PAD calculation results. A smaller PAD value indicates a higher similarity between the two domain distributions, whereas a larger value signifies a greater discrepancy between the domains. As evident from
Table 3 and
Table 4, the control performance of single-source transfer learning models gradually degrades as the similarity between the source and target domains decreases. For target domain #4, the PAD values corresponding to source domain #2 and source domain #3 are both 1.985. However, the single-source model built with source domain #2 and target domain #4 achieves an MAE of 0.245 and an RMSE of 0.361, outperforming the model constructed with source domain #3 and target domain #4. As noted in existing literature, similarity metrics only serve as references for source domain selection, and their results are not always accurate. Nevertheless, after excluding individual outliers, a strong linear correlation between PAD values and control performance is verified. Thus, it is reasonable to employ PAD to quantify the similarity between source and target domains for PEA control tasks.
3.3. Multi-Source Ensemble Learning
To verify the superior performance of multi-source transfer learning models compared with single-source transfer learning models, evaluation metrics were extracted from the results presented in
Table 5. For target domain #1, performance varies across different source combinations. Among all single-source transfer learning models, the model utilizing source #1 achieves the optimal performance, followed by that using source #2, while the model based on source #3 exhibits the lowest performance. For two-source models, the combination of source #1 and #2 achieves the best performance, outperforming the combinations of source #1 and #3 as well as source #2 and #3. Notably, the three-source model (source #1, #2, and #3) exhibits slightly degraded performance compared to the optimal two-source model, indicating that simply increasing the number of source domains does not always enhance performance. In summary, multi-source ensemble learning can effectively leverage data from compatible source domains to enhance the learning process for target PEA control tasks corresponding to target domain #1.
A similar yet distinct trend is observed across target domains #2, #3, and #4. For target domain #2, among all single-source models, source #3 yields the optimal performance, outperforming source #1 and source #2. For two-source models, the combination of source #1 and #3 achieves the best performance, while the three-source model exhibits marginally inferior performance. This phenomenon underscores that the similarity between the added source domains and the target domain, rather than merely the number of source domains, is the critical factor for model performance.
In multi-source transfer learning, exhaustive search is a conventional method for identifying optimal source domain combinations, as it evaluates all possible combinations of candidate source domains. For 3 candidate source domains, exhaustive search requires testing a total of 7 combinations. In contrast, the PAD-ranking-guided source domain selection strategy follows a similarity-based descending order, with source #1 being the most similar to the target domain, followed by source #2 and then source #3, and only requires testing a maximum of 3 combinations. Notably, for all target domains #1–#4, this PAD-guided strategy identified the same optimal source domain combinations as exhaustive search. Specifically, source #1 and #2 are optimal for target domains #1, #3, and #4, and source #1 and #3 are optimal for target domain #2. These findings validate that the PAD-ranking-guided strategy achieves an optimal trade-off between computational efficiency and model performance.