Article

A Method for Few-Shot Modulation Recognition Based on Reinforcement Metric Meta-Learning

Fan Zhou, Xiao Han, Jinyang Ren, Wei Wang, Yang Wang, Peiying Zhang and Shaolin Liao
1 School of Information Science and Engineering, Shenyang Ligong University (SYLU), No. 6 Nanping East Road, Shenyang 110159, China
2 National Key Laboratory of Electromagnetic Space Security, Xidian University, No. 2 South Taibai Road, Jiaxing 314000, China
3 Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), 66 Changjiang West Road, Qingdao 266580, China
4 Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, Jinan 250013, China
5 School of Electronics and Information Technology, Sun Yat-sen University, No. 135 Xingang Xi Road, Guangzhou 510275, China
6 Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL 60616, USA
7 Elmore School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA
* Authors to whom correspondence should be addressed.
Computers 2025, 14(9), 346; https://doi.org/10.3390/computers14090346
Submission received: 18 July 2025 / Revised: 19 August 2025 / Accepted: 21 August 2025 / Published: 22 August 2025
(This article belongs to the Special Issue Wireless Sensor Networks in IoT)

Abstract

In response to the problem where neural network models fail to fully learn signal sample features due to an insufficient number of signal samples, leading to a decrease in the model’s ability to recognize signal modulation methods, a few-shot signal modulation mode recognition method based on reinforcement metric meta-learning (RMML) is proposed. This approach, grounded in meta-learning techniques, employs transfer learning to build a feature extraction network that effectively extracts data features under few-shot conditions. Building on this, the metric network’s target loss function is optimized by jointly measuring the similarity among features of same-class samples and the differences between features of different classes, thereby improving the network’s ability to distinguish between features of different modulation methods. The experimental results demonstrate that this method exhibits a good performance in processing new classes of signals that have not been previously trained on. Under the 5-way 5-shot condition, when the signal-to-noise ratio (SNR) is 0 dB, this method achieves an average recognition accuracy of 91.8%, which is 2.8% higher than that of the best-performing baseline method, whereas when the SNR is 18 dB, the model’s average recognition accuracy significantly improves to 98.5%.

1. Introduction

With the rapid development of modern radio communication technology, automatic modulation recognition has become an indispensable component in both military and civil fields [1]. In the military field, modulation recognition is a key means of formulating effective military strategies and jamming enemy communications in modern warfare. In civil communications, modulation identification has a wide range of applications in spectrum management and resource allocation: it can optimize the efficiency of the communication system and improve spectrum utilization, allowing the communication network to operate efficiently and reliably [2]. Traditional modulation identification methods are mainly divided into two categories: those based on the likelihood ratio and those based on signal feature extraction. The likelihood ratio method is theoretically comprehensive but requires extensive a priori knowledge [3]. On the other hand, the signal feature extraction method is applicable to a variety of signal-to-noise ratios but relies on experience, and its identification stability and the range of recognizable modulation modes are limited [4].
In recent years, with the rapid development of deep learning, the use of neural networks for modulation identification has become a new trend [5,6,7,8,9,10,11,12]. Deep-learning-based methods have strong automatic learning capabilities and can extract deeper features. Compared with traditional methods, these methods offer a higher recognition accuracy. However, they require a large number of labeled signals for support. Acquiring these signals may be time-consuming. This requirement limits their application in real-world environments to a certain extent, especially in military confrontations. Therefore, accurately identifying the modulation of signals with few-shot learning has become a new research trend.
In order to address the aforementioned issues, many scholars have conducted in-depth research and proposed various solutions, which can be categorized into three main groups: generative adversarial network (GAN)-based methods, transfer learning, and meta-learning. For example, Patel et al. proposed a new data augmentation method based on conditional generative adversarial networks (CGANs) which uses a small amount of sample data to generate labeled data, thereby significantly improving the recognition accuracy of CNN-based modulation classification networks in few-shot learning [13]. Xie et al. enhanced the recognition performance of Support Vector Machines (SVMs) by constructing a deep convolutional generative adversarial network (LDCGAN) that included layer normalization [14]. To address the issue of confusing signals such as QPSK, BPSK, and QAM, Tang et al. used preprocessed constellation diagrams as the inputs and employed an Auxiliary Classifier Generative Adversarial Network (ACGAN) to identify the signal modulation modes, successfully improving the recognition accuracy for these confusing signals [15].
The above GAN-based methods significantly improve the modulation recognition accuracy in few-shot learning through data augmentation. However, their training process is complex, and the demand for computational resources is high, which poses limitations for practical applications. Transfer learning and meta-learning methods can effectively address this issue.
The training process of transfer learning is simple and can effectively utilize the knowledge from the source domain to enhance the feature extraction capability of the target domain. Bu et al. proposed an adversarial transfer learning architecture (ATLA) which aimed to reduce the difference in sampling rates, significantly improving the recognition accuracy of the target model while utilizing merely half of the training data [16]. Hu et al. employed a Domain Adaptation Network (DAN) for transfer learning, which maintained the recognition accuracy of the algorithm even with a significantly reduced number of samples in the target dataset [17]. Jing et al. proposed an adaptive focal loss function combined with transfer learning for radar signal intra-pulse modulation classification with small samples. By training a CNN in the source domain and transferring it to the target domain, the method significantly improves the recognition performance. The adaptive focal loss function focuses on difficult samples, achieving an average recognition rate of over 90% [18]. Xu et al. proposed a CNN-based transfer learning model for modulation format recognition [19]. The model uses the Hough transform of constellation diagrams to improve the recognition performance. This method effectively transfers knowledge from general tasks to specialized tasks, enhancing the overall recognition accuracy.
In the field of meta-learning, meta-learning can be classified into three main categories according to the different learning strategies employed: optimization-based meta-learning [20,21,22,23], model-based meta-learning [24,25,26,27], and metric-based meta-learning [28,29,30,31,32]. The metric-based meta-learning method has the ability to quickly adapt to new tasks compared with other methods. This method only requires a few labeled signals to achieve modulation mode recognition when facing new classes of signals, effectively solving the problem of recognizing modulation modes with few-shot learning. Therefore, many scholars have adopted metric-based meta-learning methods to address the few-shot modulation mode recognition problem. Pang Yi-Qiong et al. used a multi-task training strategy based on meta-learning to train the network through a large number of different tasks, enabling it to have cross-task signal recognition capabilities. This approach allows the network to quickly adapt with only a small number of samples when encountering new signal categories, achieving a recognition accuracy of up to 88.43% under the condition of only five labeled signal samples per category during the testing stage [33]. Zhang et al. proposed a novel attentional relational network model that fuses channel attention mechanisms with spatial attention mechanisms, aiming to learn and represent the features more efficiently [34]. Liu et al. introduced a scalable AMC scheme using the Meta-Transformer with few-shot learning and main-subtransformer-based encoders, enabling efficient and flexible modulation classification [35]. Hao et al. proposed a new meta-learning method, M-MFOR, for improving the efficiency and generalization of AMCs in the IoT. By combining multi-frequency ResNet and meta-task optimization, this method showed a better performance in environments with distributional bias [36].
Transfer learning offers a better feature extraction ability when facing the target task, but it cannot directly cope with recognition problems when only a few labeled samples are available [37]. Metric-based meta-learning effectively solves the recognition problem with only a few labeled samples, but the insufficient sample size leads to inadequate feature extraction by the network, which in turn affects the recognition accuracy [38]. Additionally, the existing metric functions do not sufficiently consider the inter-class and intra-class relationships among different modulation methods. To address the above problems, this paper proposes a few-shot modulation mode recognition method based on reinforcement metric meta-learning (RMML). This method integrates the advantages of transfer learning and metric learning, effectively improving the extraction of signal features by combining transfer learning strategies with metric learning techniques. Furthermore, an innovative optimization of the metric loss function significantly enhances the model’s ability to recognize different modulation modes, effectively addressing the poor modulation mode recognition performance under few-shot conditions.

2. Metric-Based Meta-Learning

In the radio environment, the signal r(t) that we collect is influenced by the transmission characteristics C(t) of the radio channel and additive white Gaussian noise n(t). Therefore, r(t) can be expressed as Equation (1):
r(t) = C(t)\,x(t) + n(t).
In this formula, x(t) denotes the transmitted modulated signal, C(t) denotes the transfer function of the radio channel, and n(t) denotes the additive white Gaussian noise. In few-shot modulation recognition, only a limited number of received signal samples are available for identifying the modulation type.
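
For illustration, a minimal NumPy sketch of this received-signal model is given below. The flat complex channel gain, the SNR-based noise scaling, and the QPSK example are assumptions made for the sketch; the paper does not specify a particular channel model.

```python
import numpy as np

def received_signal(x, snr_db, rng=None):
    """Simulate r(t) = C(t) x(t) + n(t) for a complex baseband signal x,
    assuming a flat complex channel gain and AWGN scaled to the target SNR."""
    rng = np.random.default_rng() if rng is None else rng
    c = (rng.normal() + 1j * rng.normal()) / np.sqrt(2)        # flat channel gain C(t)
    signal_power = np.mean(np.abs(c * x) ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))         # noise power from SNR in dB
    n = np.sqrt(noise_power / 2) * (rng.normal(size=x.shape) + 1j * rng.normal(size=x.shape))
    return c * x + n

# Example: a 128-symbol QPSK burst received at 0 dB SNR
symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.random.randint(0, 4, 128)))
r = received_signal(symbols, snr_db=0)
```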

2.1. The Theoretical Basis of Metric Meta-Learning

The metric-based meta-learner is composed of two main components: the encoder and the metric function. The encoder typically consists of Convolutional Neural Networks (CNNs), which are utilized to extract the signal features from the support set and the query set. The metric function is employed to assess the similarity between the encoded query and support samples, thereby further optimizing the meta-learner.
In the meta-training phase, a model suitable for specific tasks is developed by training on the meta-training set D_tr using an N-way K-shot strategy. Consequently, during the meta-testing phase, only a minimal number of labeled signals are required to rapidly adapt to new tasks, demonstrating strong generalization capabilities. It is important to note that the modulation methods used in the meta-training set do not overlap with those in the meta-test set: the categories present in the meta-test set never appear in the meta-training set.
The N-way K-shot strategy: In the meta-training process, we employ an episode-based training strategy. In each episode, N categories of modulations are randomly selected from the meta-training set, and K labeled signals are chosen from each category. The collected N × K signals form the support set, denoted as D_S. Additionally, Q unlabeled signals are selected from the remaining signals of each category to compose the query set D_Q. The metric loss function is optimized by measuring the distance d between the signals of the query set and the support set to determine the category in the support set to which each query signal belongs. The N-way K-shot training process is detailed in Algorithm 1; a code sketch of the episode construction follows the listing.
Algorithm 1 The training procedure for N-way K-shot.
Input: D_tr = {(x_i, y_i)}_{i=1}^{n}, where D_tr denotes the training set, x_i denotes a signal sample, and y_i denotes the sample label.
Output: θ, the weights of the model.
1: Each task is formed by randomly selecting N-class samples from D_tr to construct D_S and D_Q;
2: for iter = 0 to Iterations do
3:    Select N categories in the training data and K samples in each category to form D_S; D_S ⊂ D_tr;
4:    Select Q of the remaining samples from the N categories to form D_Q; D_Q ⊂ D_tr;
5:    C_k ← (1/K) Σ_{(x_i, y_i) ∈ D_S} f_φ(x_i);
6: end for
7: J(φ) ← 0;
8: for k ∈ {0, …, N − 1} do
9:    for (x, y) ∈ D_Q do
10:      J(φ) ← J(φ) + softmax(−d(f_φ(x), C_k));
11:   end for
12: end for
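
As a complement to Algorithm 1, the following Python sketch illustrates how one episode (support set D_S and query set D_Q) could be drawn. The dictionary layout of the dataset and the function name sample_episode are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def sample_episode(dataset, n_way=5, k_shot=5, q_query=15, rng=None):
    """Draw one N-way K-shot episode from a dict mapping class label -> array of signals."""
    rng = np.random.default_rng() if rng is None else rng
    classes = rng.choice(list(dataset.keys()), size=n_way, replace=False)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        samples = dataset[cls]
        idx = rng.choice(len(samples), size=k_shot + q_query, replace=False)
        support += [(samples[i], episode_label) for i in idx[:k_shot]]   # K labeled signals
        query += [(samples[i], episode_label) for i in idx[k_shot:]]     # Q query signals
    return support, query
```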

2.2. The Prototype Network

The prototype network is an efficient classification network within metric learning. This network achieves fast and accurate few-shot recognition through prototype representation of categories. Compared to other metric learning methods, it offers superior generalization capabilities and reduced computational costs.
In this paper, we will optimize the meta-learner using a modified prototype network. The workflow of the prototype network is illustrated in Figure 1, and its training adheres to the N-way K-shot strategy previously mentioned. Initially, an encoder f_φ is utilized to extract features from the signals in the support and query sets, producing the embeddings f_φ(x). Subsequently, the mean of the K embeddings for each class in the support set is calculated to obtain the prototype representation c_k of that class, as demonstrated in Equation (2).
c_k = \frac{1}{K} \sum_{i=1}^{K} f_\phi(x_i^k).
Thus, the essence of a class prototype is the average embedding of all samples for each class in the support set. When a new query point is introduced, the same embedding function used to create the class prototypes is employed to generate an embedding for this query point. The class to which the query point belongs is determined by comparing the distances between the query point’s embedding and each class prototype. The Euclidean distance can be utilized to measure these distances. After calculating the distances between all class prototypes and the query point embedding, a softmax function is applied to these distances to derive the probability of the query point belonging to each class, as illustrated in Equation (3).
P(Y = k \mid X) = \frac{\exp\left(-d\left(f_\phi(X), c_k\right)\right)}{\sum_{k'} \exp\left(-d\left(f_\phi(X), c_{k'}\right)\right)}.
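
A minimal PyTorch sketch of Equations (2) and (3) is shown below, assuming the encoder outputs one embedding vector per signal; the helper name prototype_logits and the tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def prototype_logits(support_emb, support_labels, query_emb, n_way):
    """Class prototypes (Eq. 2) and softmax over negative squared Euclidean
    distances (Eq. 3). support_emb: (N*K, D); support_labels: (N*K,); query_emb: (Q, D)."""
    prototypes = torch.stack(
        [support_emb[support_labels == k].mean(dim=0) for k in range(n_way)]
    )                                                 # (N, D) prototypes c_k
    dists = torch.cdist(query_emb, prototypes) ** 2   # (Q, N) squared Euclidean distances
    return F.softmax(-dists, dim=1)                   # per-query class probabilities
```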
The method, however, exhibits some shortcomings when applied to the field of modulation mode recognition. With very few labeled signal samples available, the feature extraction network may fail to adequately extract features. Furthermore, the metric function only considers the geometric distances between the query set samples and the support set, neglecting the intra-class and inter-class relationships. This oversight results in lower accuracy in modulation recognition. We have made improvements to this approach, which will be detailed in Section 3.

3. Method

In order to address the aforementioned issues, we propose a few-shot modulation recognition method based on reinforcement metric meta-learning (RMML). This method is structured into three phases—the pre-training phase, the meta-learning phase, and the meta-testing phase—as illustrated in Figure 2.
In the pre-training phase, the encoder, which is the MAFNet feature extraction network, is trained using traditional deep learning methods with a large number of sample signals from the training set. This process results in an initial network model, referred to as Model A.
In the meta-training phase, pre-trained Model A undergoes further optimization and training using the N-way K-shot strategy and a prototype network. Simultaneously, the metric loss function within the prototype network is enhanced to better accommodate the few-shot modulation mode recognition task. At this stage, we have successfully refined and optimized the existing network model, now referred to as Model B.
In the meta-testing phase, we use Model B to predict new signal classes in the meta-test set that never appeared in the pre-training and meta-training phases. We input the support set containing a small number of labeled signals from the meta-testing task and the unlabeled query set into Model B for feature extraction and then determine the modulation type of the signals in the query set via a metric function. We will detail the exact implementation in the following.

3.1. The MAFNet Feature Extraction Network

To fully extract features at different levels, our method employs the Multi-Style Attention Fusion Network (MAFNet) [39] as the encoder. As shown in Figure 3, this network structure comprises three main components: a deep convolutional module, a Spatial Attention Module (SAM), and a Channel Attention Module (CAM). The structures of the SAM and the CAM are shown in Figure 4. To ensure that the network can effectively extract signal features from various dimensions, it fuses low-, middle-, and high-level features through feature concatenation.
In the first feature fusion module, the SAM is applied to extract the low-level features, which mainly come from the shallow convolutional layers and focus on capturing the local spatial structures of the signal. For middle-level features, a dual attention mechanism that combines the SAM and CAM is used to jointly enhance and integrate both spatial and channel information from intermediate convolutional layers. High-level features are obtained from the deep convolutional layers through the CAM, emphasizing the interdependencies between channels at a higher abstraction level. These low-, middle-, and high-level features, extracted from different network modules, are subsequently fused through a learnable convolutional fusion operation and aggregated. It is noteworthy that this approach does not merely add or stack multi-level features; instead, it uses a convolutional network for feature fusion. This feature fusion module is learnable, enabling it to better capture and integrate multi-level features. The specific parameters and details of the network are presented in Table 1. The authors of [40] pointed out that replacing large convolutional kernels with multiple smaller ones can effectively reduce the number of parameters and computational cost without sacrificing accuracy; therefore, this paper employs multiple convolutional kernels with a size of 3.
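
To make the three-branch attention-and-fusion design concrete, a condensed PyTorch sketch of a MAFNet-style encoder is given below. It follows the spirit of Table 1 rather than reproducing it exactly: the SAM/CAM definitions, the reduced number of convolutional blocks, the 1x1 projections to 32 channels, and the pooling used to align branch lengths before concatenation are simplifications assumed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAM(nn.Module):
    """Spatial attention: weights each time step from channel-pooled statistics."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                                   # x: (B, C, T)
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class CAM(nn.Module):
    """Channel attention: re-weights channels from global avg- and max-pooled features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(),
                                 nn.Linear(channels // reduction, channels))

    def forward(self, x):                                   # x: (B, C, T)
        w = self.mlp(x.mean(dim=2)) + self.mlp(x.max(dim=2).values)
        return x * torch.sigmoid(w).unsqueeze(-1)

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv1d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm1d(c_out), nn.ReLU())

class MAFNetSketch(nn.Module):
    """Condensed encoder: low-, middle-, and high-level features are projected to
    32 channels each and fused by concatenation (cf. Table 1)."""
    def __init__(self):
        super().__init__()
        self.low = nn.Sequential(conv_block(2, 64), conv_block(64, 128),
                                 nn.MaxPool1d(3, 2, 1), SAM())
        self.mid = nn.Sequential(conv_block(128, 256), nn.MaxPool1d(3, 2, 1),
                                 SAM(), CAM(256))
        self.high = nn.Sequential(conv_block(256, 512), nn.MaxPool1d(3, 2, 1), CAM(512))
        self.proj = nn.ModuleList([nn.Conv1d(c, 32, 1) for c in (128, 256, 512)])

    def forward(self, x):                                   # x: (B, 2, T) I/Q samples
        f_low = self.low(x)
        f_mid = self.mid(f_low)
        f_high = self.high(f_mid)
        t = f_high.shape[-1]                                # align branch lengths before fusion
        fused = torch.cat([F.adaptive_avg_pool1d(p(f), t)
                           for p, f in zip(self.proj, (f_low, f_mid, f_high))], dim=1)
        return fused.flatten(1)                             # fused multi-level embedding

# Example: embed a batch of 4 signals, each with 128 I/Q samples
emb = MAFNetSketch()(torch.randn(4, 2, 128))
```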

3.2. The Optimization of Feature Extraction Networks

In metric-based meta-learning frameworks, the main problem in the feature extraction phase is that the scarcity of training samples makes it difficult for the network to achieve effective convergence, which hinders the comprehensive extraction of the signal features. To address this issue effectively, this study integrates the concepts of transfer learning and metric learning.
Transfer learning leverages pre-training on large-scale datasets to enable the models to learn common feature representations, which are then transferred to the target task. This transfer of knowledge helps the model learn and adapt more quickly to new tasks, especially when there is limited data available.
In this method, we can initially pre-train the MAFNet feature extraction network using 70% of the sample signals from the meta-training set to extract a generalized feature representation. This pre-trained network, already equipped with extensive feature extraction capabilities, can more accurately capture and extract information from the data, thereby quickly adapting to new tasks.
During the meta-learning process, a pre-trained MAFNet feature extraction network is used as the initial network. It is fine-tuned using an N-way K-shot training strategy, along with an improved prototype network, to adapt to specific meta-learning tasks better. This strategy effectively leverages the benefits of transfer learning, both accelerating the model convergence and significantly enhancing the model’s learning capabilities and generalization performance in few-shot scenarios.
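
As a concrete illustration of this two-stage strategy, the sketch below pre-trains the encoder with a temporary cross-entropy classification head on the meta-training modulations and then discards the head, so that only the encoder weights are carried into meta-training. The function signature, the linear head, and the data-loader format are assumptions; the Adam optimizer, the 0.01 learning rate, and the 30 epochs follow the experimental settings reported in Section 4.1.

```python
import torch
import torch.nn as nn

def pretrain_encoder(encoder, loader, embed_dim, n_classes, epochs=30, lr=0.01, device="cpu"):
    """Stage-1 supervised pre-training; the classification head is discarded afterwards."""
    encoder, head = encoder.to(device), nn.Linear(embed_dim, n_classes).to(device)
    optimizer = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=lr)
    criterion = nn.CrossEntropyLoss()
    encoder.train()
    for _ in range(epochs):
        for x, y in loader:                      # x: (B, 2, T) signals, y: class indices
            x, y = x.to(device), y.to(device)
            loss = criterion(head(encoder(x)), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return encoder                               # initial weights for the meta-learning stage
```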

3.3. Optimization of the Metric Function

The metric function in prototype networks typically utilizes the Euclidean distance. However, the Euclidean distance only accounts for the geometric distances between data points and overlooks the distribution characteristics of the samples, potentially affecting the accuracy of the metric results. Furthermore, when optimizing the loss function, the focus is solely on the distances between query points and the class prototypes in the support set. This approach neglects the relationships both among different classes within the support set and among similar samples within the same class, resulting in a reduced discriminative ability between different modulations, which adversely affects the recognition accuracy.
To address these issues, this paper introduces a novel joint loss function that comprehensively accounts for the relationships both between the query set and the support set and within the support set itself. This approach significantly enhances the discriminative ability between different modulations and the overall prediction accuracy of the model. Not only does this method consider the geometric distances between sample points but it also takes into account the distribution among the samples, effectively addressing the shortcomings of using the Euclidean distance as the metric function.
We propose a joint metric function which is composed of two components: the support set loss and the query set loss. First, we use an encoder to encode each signal, obtaining an embedding, and we characterize the signals by the mean and variance of these embeddings. For each class of signals, we calculate the mean and variance: the mean, denoted mean_i, serves as the prototype of class i, and the variance, denoted intra_var_i, represents the dispersion among all signals within class i. The variance is then calculated across all classes in the support set to obtain the inter-class variance inter_var, which reflects the relationships between different classes within the support set.
First, we normalize intra_var_i for each class in the support set by transforming it with the Sigmoid function. We then compute the mean of the normalized variances for each class, sum these means, and divide by the total number of classes to obtain the intra-class loss of the whole support set. For the inter-class loss, we apply the same Sigmoid normalization to inter_var and take its mean as the inter-class loss for the entire support set. Finally, in order to make the modulated signals of the same class closer to each other and those of different classes farther apart, we obtain the support set loss as the ratio of the intra-class loss to the inter-class loss, as shown in Equation (4).
loss_S = \frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{mean}\left(\mathrm{Sigmoid}\left(intra\_var_i\right)\right)}{\mathrm{mean}\left(\mathrm{Sigmoid}\left(inter\_var\right)\right)}.
For the query set loss, we utilize the Mean Squared Error (MSE) to measure the distance between a query point and its corresponding class prototype in the support set, aiming to increase the similarity between the query point and the class prototype. Specifically, as depicted in Equation (5), y represents the vector of query points derived from the encoder-processed query set, μ is the class prototype corresponding to that query point, and M is the total number of samples in the query set.
loss_Q = \frac{1}{M} \sum_{j=1}^{M} \left( \frac{1}{n} \sum_{i=1}^{n} \left(y_{ji} - \mu\right)^2 \right).
We define the joint loss as the weighted sum of the support set loss and the query set loss, as denoted in Equation (6). Extensive experimental validation has shown that the best results are achieved when the ratio factor for the support set loss is set to 10.
loss = 10 \times loss_S + loss_Q.
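
A possible PyTorch realization of the joint loss in Equations (4)-(6) is sketched below. The use of the per-dimension variance, the averaging order, and the weighting argument alpha (defaulting to 10, matching Equation (6)) are stated assumptions rather than the authors' exact implementation.

```python
import torch

def joint_loss(support_emb, support_labels, query_emb, query_labels, n_way, alpha=10.0):
    """Support-set intra/inter-variance loss (Eq. 4) plus query-set MSE loss (Eq. 5)."""
    protos, intra = [], []
    for k in range(n_way):
        class_emb = support_emb[support_labels == k]              # (K, D) embeddings of class k
        protos.append(class_emb.mean(dim=0))                      # class prototype mean_k
        intra.append(torch.sigmoid(class_emb.var(dim=0, unbiased=False)).mean())
    protos = torch.stack(protos)                                  # (N, D) prototypes
    inter = torch.sigmoid(protos.var(dim=0, unbiased=False)).mean()   # inter-class spread
    loss_s = (torch.stack(intra) / inter).mean()                  # Eq. (4): intra / inter ratio
    loss_q = ((query_emb - protos[query_labels]) ** 2).mean()     # Eq. (5): MSE to own prototype
    return alpha * loss_s + loss_q                                # Eq. (6): weighted joint loss
```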

3.4. Meta-Testing

In the meta-testing phase, we randomly draw a large number of N-way K-shot test tasks from the meta-test set. Unlike in the meta-training phase, the query set is no longer used to tune the model parameters at this stage; it is solely for evaluating the model’s performance. For each test task, the trained model is utilized to extract features from both the support and query sets. We then predict the modulation mode of the signals in the query set by calculating the Mean Squared Error (MSE) distance between the query points and the support set prototypes.
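A brief sketch of this prediction step is shown below, assuming the same prototype construction as in training; the helper name predict and the batched tensor inputs are illustrative assumptions.

```python
import torch

def predict(encoder, support_x, support_y, query_x, n_way):
    """Assign each query signal to the class prototype with the smallest MSE distance."""
    encoder.eval()
    with torch.no_grad():
        s_emb, q_emb = encoder(support_x), encoder(query_x)
        protos = torch.stack([s_emb[support_y == k].mean(dim=0) for k in range(n_way)])
        mse = ((q_emb.unsqueeze(1) - protos.unsqueeze(0)) ** 2).mean(dim=2)   # (Q, N)
        return mse.argmin(dim=1)                                              # predicted classes
```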

4. Experiments and Analysis

4.1. The Dataset and Experimental Details

In order to validate the effectiveness of RMML, we have selected the RadioML 2016.10a dataset [41] for simulation validation. This dataset contains 11 different modulation modes, with detailed information as shown in Table 2. The signal-to-noise ratio ranges from −20 to 18 dB, with intervals of 2 dB, and each modulation method includes 1000 samples. We divided all of the modulation modes into a meta-training set and a meta-test set, randomly selecting 6 modulations to form the meta-training set and the remaining 5 modulations to form the meta-test set. It is important to note that the modulation modes in the meta-training set and the meta-test set do not overlap.
In the pre-training phase, we optimized the MAFNet network parameters using Adam, with an initial learning rate set to 0.01, over 30 epochs of training. During the meta-training phase, we conducted training across 1000 episodes. In each episode, we randomly selected N-way K-shot recognition tasks from the meta-training set for training, with the specific settings of N and K detailed in Table 2. We used the pre-trained model as the initial model and continued optimization using the Adam optimizer, maintaining the initial learning rate at 0.01 and dynamically adjusting it based on the loss function. In the meta-testing phase, to eliminate the randomness of a single test experiment and ensure the reliability and stability of the evaluation results, we randomly selected 200 test tasks and calculated their average recognition accuracy and standard deviation (95% confidence) as the final evaluation metric for this method. Similar to the meta-training phase, in each test experiment we also randomly selected N-way K-shot tasks from the meta-test set for testing.
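For reference, a small Python snippet showing a class-disjoint 6/5 split of the RadioML 2016.10a modulation labels is given below; the random seed and the label spellings (which follow the public dataset files, e.g., QAM16 and QAM64) are chosen here purely for illustration.

```python
import random

MODULATIONS = ["8PSK", "AM-DSB", "AM-SSB", "BPSK", "CPFSK", "GFSK",
               "PAM4", "QAM16", "QAM64", "QPSK", "WBFM"]   # 11 classes in RadioML 2016.10a

def split_classes(seed=0):
    """Return disjoint meta-training (6 classes) and meta-test (5 classes) label sets."""
    rng = random.Random(seed)
    classes = MODULATIONS[:]
    rng.shuffle(classes)
    return sorted(classes[:6]), sorted(classes[6:])

meta_train_classes, meta_test_classes = split_classes()
```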
All experiments were performed on a workstation equipped with an NVIDIA RTX 3090 GPU (24 GB memory), an Intel Xeon W-2245 CPU, and 128 GB of RAM. The pre-training phase took approximately 1.5 h for 30 epochs, while in the 5-way 5-shot setting, meta-training over 1000 episodes required about 0.5 h. Meta-testing under each configuration was completed in less than 1 min.

4.2. Analyzing the Performance of RMML Across Various Sample Sizes

In order to assess the impact of varying sample sizes on the recognition performance of RMML under consistent conditions, we conducted experiments focusing on recognition tasks for 3- and 5-class modulated signals, namely 3/5-way K-shot tasks. We analyzed various aspects of the recognition performance when the number of samples (K) in each support set class was set to 1, 5, 10, and 15, respectively. Figure 5 and Figure 6 illustrate how the recognition accuracy varies with changes in the signal-to-noise ratio. At a signal-to-noise ratio of 10 dB, a 3-way 1-shot test task achieved a recognition accuracy of 83.2%, further demonstrating the viability of RMML in few-shot scenarios.
When the value of K is small, increasing the sample size significantly enhances the recognition accuracy. However, when K ≥ 5, further increases in the sample size do not significantly improve the recognition accuracy. This result suggests that RMML is more suitable for solving the problem of modulation recognition in extreme cases where only a small number of labeled signal samples are available.

4.3. The Ablation Experiment

4.3.1. The Improvement Ablation Experiment

To enhance the modulation recognition accuracy of the few-shot recognition model further when dealing with only a few labeled samples, this paper optimizes the metric function of the prototype network and introduces a migration pre-training module. To validate the effectiveness of these improvements, an ablation experiment was conducted using the dataset in Table 2, with the average of 200 test recognition accuracies chosen as the measure of the performance. The experimental results, presented in Figure 7, demonstrate that each modification made to the prototype network significantly boosts the model’s recognition capabilities.

4.3.2. Comparative Ablation Experiments

To demonstrate the performance advantage of RMML over other small-sample modulation recognition methods, this section includes comparative experiments with three approaches: a prototypical network, a relation network, and transfer learning. To ensure fairness in these comparisons, all experiments employed the experimental data shown in Table 2 and the same MAFNet feature extraction network structure. The variation in the average recognition accuracy of the modulation methods with the signal-to-noise ratio for a 5-way 5-shot test task is depicted in Figure 8.
As shown in the figure, the method used in this study achieved the highest recognition accuracy. The recognition performance of the relational network is superior compared to that of the other two methods. This method also employs a metric-based meta-learning strategy that utilizes a neural network instead of the traditional metric function. However, this modification leads to an increase in the number of parameters and the complexity of neural network training.
The approach based on transfer learning exhibits the lowest recognition accuracy, primarily because when there are only a very small number of samples in the target domain, the model’s insufficient generalization capability can lead to severe overfitting during the testing phase, significantly impairing the recognition performance. This observation further corroborates the point made in the introduction that transfer learning methods are unable to effectively address recognition challenges with only a few labeled samples. The recognition accuracy of the prototypical network is also relatively low due to two main factors, the shortcomings of its metric function and the limited number of samples, which results in insufficient feature extraction by the network, thereby reducing its recognition accuracy.

4.4. A Comparative Analysis of Different Few-Shot Modulation Recognition Methods

To further validate the performance advantages of RMML, this experiment compares it with four state-of-the-art small-sample modulation recognition methods. The selected methods for comparison include an Attention-Based Modulation Classification Relational Network (AMCRN) [42], Feature-Transformation-Based Few-Shot Modulation Recognition (FTFMR) [43], a Hybrid Prototypical Network (HPN) [17], and Model-Agnostic Meta-Learning (MAML) [25]. All of the methods utilized the dataset shown in Table 2 to ensure consistency in the experimental conditions. The experiments were conducted under 5-way 5-shot conditions, and the average accuracy of modulation mode recognition at various signal-to-noise ratios is depicted in Figure 9. The shaded region in the figure denotes the error obtained by repeatedly partitioning the training and test sets, with the upper and lower bounds corresponding to the maximum and minimum accuracies, respectively. The simulation results demonstrate that RMML achieves a higher recognition accuracy than the other comparative methods.
The AMCRN is a method that achieves efficient modulation mode classification by constructing a relational network, maintaining a high classification accuracy even at low signal-to-noise ratios. FTFMR combines transfer learning and MAML, achieving efficient modulation mode recognition under small-sample conditions by leveraging the existing knowledge of large-scale datasets. The HPN combines the advantages of prototype networks and other machine learning models, also achieving efficient modulation mode recognition under small-sample conditions. MAML is a model-independent meta-learning approach that enables quick adaptation to new tasks by sharing the model parameters across different tasks, making it suitable for small-sample learning scenarios.
The experimental results show that the average accuracy of modulation mode recognition with RMML is higher than that with the AMCRN, FTFMR, HPN, and MAML under various signal-to-noise ratio conditions. This further validates the performance advantage of RMML in small-sample modulation mode recognition and proves its feasibility for practical applications.
Figure 10 and Figure 11 display the F1-scores of the RMML, AMCRN, and FTFMR methods for 11 types of modulated signals in the meta-test set under SNR conditions of 0 dB and 10 dB, respectively. The figures clearly show that the RMML method proposed in this study achieves superior F1-scores in recognizing the 11 modulated signals compared to those of the other two methods under both 0 dB and 10 dB SNR conditions.
In the field of modulation mode identification, distinguishing between high-order modulated signals (e.g., 64QAM) and low-order signals (e.g., 16QAM) has been a continual challenge [8]. To investigate RMML’s ability to discriminate between such signals, AM-SSB, BPSK, WBFM, 16QAM, and 64QAM were selected as the meta-test set, while the remaining six modulation types were used as the meta-training set. The experiment was conducted under two different SNR levels (0 dB and 10 dB) in a 5-way 5-shot scenario. The experimental results are depicted in the confusion matrices shown in Figure 12 and Figure 13.
From the experimental results depicted in Figure 12 and Figure 13, it is evident that the method described in this paper demonstrates a superior performance in distinguishing between high-order and low-order modulated signals, without the anticipated significant signal confusion. This suggests that our method possesses notable advantages over the conventional modulation mode identification methods in addressing the challenge of differentiating between high-order and low-order modulated signals.
To further validate the generalization capability of RMML, we intentionally configured the meta-training set to include only digital modulation methods (digital modulation transmits information through discrete changes in amplitude, phase, or frequency), specifically 8PSK, BPSK, CPFSK, GFSK, 16QAM, and QPSK. The meta-test set, on the other hand, comprised a combination of analog modulations (continuous-wave schemes such as wide-band frequency modulation and amplitude modulation variants) and digital modulations, specifically WBFM, 64QAM, AM-DSB, PAM4, and AM-SSB. As clearly demonstrated by the confusion matrices shown in Figure 14 and Figure 15, this particular setup did not adversely affect the recognition performance of RMML.

5. Conclusions

This paper proposes RMML, a few-shot modulation recognition framework that integrates transfer learning with metric-based meta-learning, and explicitly characterizes intra-class compactness and inter-class separability through a joint loss function. The MAFNet encoder is first pre-trained via transfer learning and then fine-tuned with the proposed metric extension, which effectively alleviates the problem of insufficient feature extraction under few-shot conditions. On the RadioML2016.10a dataset within the SNR range of −18 dB to 18 dB, RMML achieves consistent advantages over the representative baseline methods. The simulation results show that under the 5-way 5-shot setting with an SNR = 10 dB, RMML attains an average accuracy of approximately 96.5%, surpassing the best comparative model by 5.13% and outperforming the existing baselines, thereby effectively addressing the challenges of signal modulation recognition in few-shot scenarios. In addition, the method demonstrates a strong capability to distinguish high-order from low-order modulations and also exhibits advantages in terms of its generalization performance.
The main contribution of RMML lies in effectively coupling transfer learning with a variance-sensitive metric that accounts for the intra-class and inter-class relationships while explicitly separating the corresponding loss terms. This design brings substantial improvements under few-shot constraints. Furthermore, during inference, RMML only requires prototype construction and distance computation, resulting in a low overhead. Although pre-training and episode-based meta-learning incur additional training costs, compared with more parameter-intensive metric models (such as relation networks), RMML offers a more favorable accuracy–efficiency trade-off. This makes it well suited to scenarios with scarce labeled data that require rapid adaptation and deployment.
As the present study is based on a single public dataset and simulated channels, future work will focus on validating the robustness of RMML against domain shifts using real over-the-air data and broader datasets. Extensions to more diverse datasets and real-world communication channels, as well as the exploration of adaptive weighting or alternative metrics, will optimize the accuracy–efficiency balance further. Overall, RMML provides a novel and practically valuable solution for few-shot modulation recognition while also pointing to future directions for enhancing the methodological rigor and engineering applicability.

Author Contributions

Validation, J.R.; Formal analysis, W.W.; Writing—original draft, F.Z.; Writing—review & editing, X.H.; Supervision, Y.W. and S.L.; Project administration, P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shenyang Science and Technology Program (No. 23-503-6-16), the General Project of Education Department of Liaoning Province in 2022 (No. LJKMZ20220612), the Central Government Leads Local Science and Technology Development Projects (2022020128-JH6/1001), and the National Natural Science Foundation of China (No. 61971291).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Nandi, A.K.; Azzouz, E.E. Algorithms for automatic modulation recognition of communication signals. IEEE Trans. Commun. 1998, 46, 431–436.
2. Kulin, M.; Kazaz, T.; Moerman, I.; De Poorter, E. End-to-end learning from spectrum data: A deep learning approach for wireless signal identification in spectrum monitoring applications. IEEE Access 2018, 6, 18484–18501.
3. Zheng, J.; Lv, Y. Likelihood-based automatic modulation classification in OFDM with index modulation. IEEE Trans. Veh. Technol. 2018, 67, 8192–8204.
4. Al Nuaimi, D.H.; Hashim, I.A.; Zainal Abidin, I.S.; Salman, L.B.; Mat Isa, N.A. Performance of feature-based techniques for automatic digital modulation recognition and classification—A review. Electronics 2019, 8, 1407.
5. Liu, Y.; Liu, Y. Modulation recognition with pre-denoising convolutional neural network. Electron. Lett. 2020, 56, 255–257.
6. Zhang, F.; Luo, C.; Xu, J.; Luo, Y.; Zheng, F.C. Deep learning-based automatic modulation recognition: Models, datasets, and challenges. Digit. Signal Process. 2022, 129, 103650.
7. Wang, Y.; Liu, M.; Yang, J.; Gui, G. Data-driven deep learning for automatic modulation recognition in cognitive radios. IEEE Trans. Veh. Technol. 2019, 68, 4074–4077.
8. Zhang, T.; Shuai, C.; Zhou, Y. Deep learning for robust automatic modulation recognition method for IoT applications. IEEE Access 2020, 8, 117689–117697.
9. Fei, S.; Zhang, C. Research on modulation mode recognition based on dual-channel hybrid network model. J. Shenyang Ligong Univ. 2023, 42, 34–39.
10. Qian, L.; Wu, H.; Zhang, T.; Yang, X. Research and implementation of modulation recognition based on cascaded feature fusion. IET Commun. 2023, 17, 1037–1047.
11. Liu, B.; Ge, R.; Zhu, Y.; Zhang, B.; Bao, Y. Universal and complementary representation learning for automatic modulation recognition. Electron. Lett. 2023, 59, e13004.
12. Zhou, R.; Liu, F.; Gravelle, C.W. Deep learning for modulation recognition: A survey with a demonstration. IEEE Access 2020, 8, 67366–67376.
13. Patel, M.; Wang, X.; Mao, S. Data augmentation with conditional GAN for automatic modulation classification. In Proceedings of the 2nd ACM Workshop on Wireless Security and Machine Learning, Miami, FL, USA, 15–17 May 2019; ACM: New York, NY, USA, 2019; pp. 1–4.
14. Xie, Z.; Tan, X.; Yuan, X. Few-shot signal modulation recognition algorithm based on support vector machine with generative adversarial data augmentation. J. Electron. Inf. Technol. 2023, 45, 2071–2080.
15. Lee, I.; Lee, W. UniqGAN: Unified generative adversarial networks for augmented modulation classification. IEEE Commun. Lett. 2021, 26, 355–358.
16. Bu, K.; He, Y.; Jing, X.; Han, J. Adversarial transfer learning for deep learning-based automatic modulation classification. IEEE Signal Process. Lett. 2020, 27, 880–884.
17. Hu, Y.; Li, C.; Wang, X.; Liu, L.; Xu, Y. Modulation recognition of optical and electromagnetic waves based on transfer learning. Optik 2023, 291, 171359.
18. Jing, Z.; Li, P.; Wu, B.; Yuan, S.; Chen, Y. An adaptive focal loss function based on transfer learning for few-shot radar signal intra-pulse modulation classification. Remote Sens. 2022, 14, 1950.
19. Mohamed, S.E.D.N.; Mortada, B.; Ali, A.M.; El-Shafai, W.; Khalaf, A.A.M.; Zahran, O.; Dessouky, M.I.; El-Rabaie, E.-S.M.; El-Samie, F.E.A. Modulation format recognition using CNN-based transfer learning models. Opt. Quantum Electron. 2023, 55, 343.
20. Nichol, A.; Achiam, J.; Schulman, J. On first-order meta-learning algorithms. arXiv 2018, arXiv:1803.02999.
21. Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning (ICML 2017), Sydney, Australia, 6–11 August 2017; PMLR: Cambridge, MA, USA, 2017; pp. 1126–1135.
22. Bian, W.; Chen, Y.; Ye, X.; Zhang, Q. An optimization-based meta-learning model for MRI reconstruction with diverse dataset. J. Imaging 2021, 7, 231.
23. Khabarlak, K. Faster optimization-based meta-learning adaptation phase. arXiv 2022, arXiv:2206.05930.
24. Santoro, A.; Bartunov, S.; Botvinick, M.; Wierstra, D.; Lillicrap, T. Meta-learning with memory-augmented neural networks. In Proceedings of the International Conference on Machine Learning (ICML 2016), New York, NY, USA, 19–24 June 2016; PMLR: Cambridge, MA, USA, 2016; pp. 1842–1850.
25. Li, W.; Wang, C.H.; Cheng, G.; Song, Q. Optimum-statistical collaboration towards general and efficient black-box optimization. arXiv 2021, arXiv:2106.09215.
26. Vuorio, R.; Sun, S.H.; Hu, H.; Lim, J.J. Multimodal model-agnostic meta-learning via task-aware modulation. Adv. Neural Inf. Process. Syst. 2019, 32, 1.
27. Huisman, M.; Van Rijn, J.N.; Plaat, A. A survey of deep meta-learning. Artif. Intell. Rev. 2021, 54, 4483–4541.
28. Chicco, D. Siamese neural networks: An overview. In Artificial Neural Networks; Springer: Cham, Switzerland, 2021; pp. 73–94.
29. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–23 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1199–1208.
30. Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 2017, 30, 4077–4087.
31. Chen, J.; Zhan, L.M.; Wu, X.M.; Chung, F.L. Variational metric scaling for metric-based meta-learning. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020), New York, NY, USA, 7–12 February 2020; AAAI Press: Palo Alto, CA, USA, 2020; pp. 3478–3485.
32. Huang, J.; Li, X.; Wu, B.; Wu, X.; Li, P. Few-shot radar emitter signal recognition based on attention-balanced prototypical network. Remote Sens. 2022, 14, 6101.
33. Pang, Y.; Xu, H.; Jiang, L.; Shi, X. Meta-learning-based few-shot modulation recognition algorithm. J. Air Force Eng. Univ. 2022, 23, 77–89.
34. Zhang, Z.; Li, Y.; Gao, M. Few-shot learning of signal modulation recognition based on attention relation network. In Proceedings of the 28th European Signal Processing Conference (EUSIPCO 2021), Dublin, Ireland, 23–27 August 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1372–1376.
35. Jang, J.; Pyo, J.; Yoon, Y.I.; Choi, J. Meta-transformer: A meta-learning framework for scalable automatic modulation classification. IEEE Access 2024, 12, 9267–9276.
36. Hao, X.; Feng, Z.; Yang, S.; Wang, M.; Jiao, L. Automatic modulation classification via meta-learning. IEEE Internet Things J. 2023, 10, 12276–12292.
37. Deng, W.; Xu, Q.; Li, S.; Wang, X.; Huang, Z. Cross-domain automatic modulation classification using multimodal information and transfer learning. Remote Sens. 2023, 15, 3886.
38. Hospedales, T.; Antoniou, A.; Micaelli, P.; Storkey, A. Meta-learning in neural networks: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 5149–5169.
39. Liang, Y.; Qin, G.; Sun, M.; Yan, J.; Jiang, H. MafNet: Multi-style attention fusion network for salient object detection. Neurocomputing 2021, 422, 22–33.
40. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 2818–2826.
41. O’Shea, T.J.; West, N. Radio machine learning dataset generation with GNU Radio. In Proceedings of the GNU Radio Conference, Boulder, CO, USA, 12–16 September 2016; Volume 1, pp. 1–5.
42. Zhou, Q.; Zhang, R.; Mu, J.; Zhang, H.; Zhang, F.; Jing, X. AMCRN: Few-shot learning for automatic modulation classification. IEEE Commun. Lett. 2021, 26, 542–546.
43. Xiao, W.; Zeng, Y.; Gong, Y. Few-shot modulation recognition with feature transformation and meta-learning. In Proceedings of the 24th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2023), Shanghai, China, 25–28 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 121–125.
Figure 1. The process flow diagram for the prototype network.
Figure 2. The pre-training, meta-training, and meta-testing processes of RMML for few-shot modulation recognition.
Figure 3. Feature extraction network MAFNet.
Figure 4. Channel Attention Module (CAM) and Spatial Attention Module (SAM).
Figure 5. Recognition performance of RMML in 3-way tasks.
Figure 6. Recognition performance of RMML in 5-way tasks.
Figure 7. Improvement in ablation experiments.
Figure 8. Comparative ablation experiments.
Figure 9. Comparison of the performance of RMML with four other recent modulation recognition methods.
Figure 10. F1-score comparison (SNR = 0).
Figure 11. F1-score comparison (SNR = 10).
Figure 12. Distinguishing higher-order modulated signals from lower-order modulated signals: modulation confusion matrix diagram (SNR = 0).
Figure 13. Distinguishing higher-order modulated signals from lower-order modulated signals: modulation confusion matrix diagram (SNR = 10).
Figure 14. Confusion matrix using purely digital modulated signals for training and analog modulated signals for testing (SNR = 0).
Figure 15. Confusion matrix using purely digital modulated signals for training and analog modulated signals for testing (SNR = 10).
Table 1. The details of each layer in MAFNet.

| Model | Input Channels | Output Channels | Kernel Size | Stride | Padding | Additional Notes |
|---|---|---|---|---|---|---|
| Custom_cnn | 2 | 64 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn | 64 | 64 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| MaxPool1d | 64 | 64 | 3 | 2 | 1 | |
| Custom_cnn | 64 | 128 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn | 128 | 128 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| MaxPool1d | 128 | 128 | 3 | 2 | 1 | |
| SAM | 128 | 128 | 7 | 1 | 3 | |
| Custom_cnn | 128 | 256 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn | 256 | 256 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| MaxPool1d | 256 | 256 | 3 | 2 | 1 | |
| SAM | 256 | 256 | 7 | 1 | 3 | |
| CAM | 256 | 256 | 1 | 1 | 0 | AdaptiveAvgPool1d + AdaptiveMaxPool1d |
| Custom_cnn | 256 | 512 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn | 512 | 512 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn | 512 | 512 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn | 512 | 512 | 3 | 1 | 1 | Conv1d + BatchNorm1d + ReLU |
| MaxPool1d | 512 | 512 | 3 | 2 | 1 | |
| CAM | 512 | 512 | 1 | 1 | 0 | AdaptiveAvgPool1d + AdaptiveMaxPool1d |
| Custom_cnn (X1) | 128 | 32 | 1 | 1 | 0 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn (X2) | 256 | 32 | 1 | 1 | 0 | Conv1d + BatchNorm1d + ReLU |
| Custom_cnn (X3) | 512 | 32 | 1 | 1 | 0 | Conv1d + BatchNorm1d + ReLU |
| Concatenation | | | | | | Concatenate outputs of X1, X2, X3 |
Table 2. Experimental dataset.

| Parameter | Assignment |
|---|---|
| Modulation types | 8PSK, AM-DSB, AM-SSB, BPSK, CPFSK, GFSK, PAM4, 16QAM, 64QAM, QPSK, WBFM (11 classes) |
| Train set | Randomly select 6 types of modulation from all modulation types (6 classes) |
| Test set | 5 types remaining besides the train set (5 classes) |
| SNR | −18∼18 dB |
| Support set | N-way K-shot (N = 3, 5; K = 1, 5, 10, 15) |
| Query set | 100 samples |