Ship Infrared Automatic Target Recognition Based on Bipartite Graph Recommendation: A Model-Matching Method

Abstract: Deep learning technology has greatly propelled the development of intelligent and information-driven research on ship infrared automatic target recognition (SIATR). In future scenarios, there will be various recognition models with different mechanisms to choose from. However, in complex and dynamic environments, ship infrared (IR) data exhibit a rich feature space distribution, resulting in performance variations among SIATR models and preventing the existence of a universally superior model for all recognition scenarios. In light of this, this study proposes a model-matching method for SIATR tasks based on bipartite graph theory. The method establishes evaluation criteria based on recognition accuracy and feature learning credibility, uncovering the underlying connections between the IR attributes of ships and the candidate models. The objective is to selectively recommend the optimal candidate model for a given sample, enhancing the overall recognition performance and applicability of the model. We separately conducted tests optimizing for accuracy and for credibility on high-fidelity simulation data, achieving an Accuracy of 95.86% and an EDMS (our credibility metric) of 0.7781; on each metric, our method improves by 1.06% and 0.0274, respectively, over the best of the six candidate models. We then created a recommendation system that balances the two objectives, yielding improvements of 0.43% (Accuracy) and 0.0071 (EDMS). Additionally, considering the relationship between model resources and performance, we achieved a 28.35% reduction in memory usage while realizing enhancements of 0.33% (Accuracy) and 0.0045 (EDMS).


Introduction
Ship targets are crucial combat units in modern maritime warfare, and recognizing them accurately and credibly is important for enhancing maritime situational awareness and gaining an advantage. Infrared (IR) imaging technology is advantageous due to its all-weather capability, long-range perception, and strong concealment. Together with visible light and synthetic aperture radar (SAR) imaging, it forms an important means of acquiring feature information about ship targets and is widely used in ship automatic target recognition (SATR) tasks [1][2][3]. In the marine environment, different types of ships exhibit varying thermal characteristics and IR radiation spectra within the IR band. This paper focuses on ship IR automatic target recognition (SIATR) technology, which uses sensors to capture IR images of ship targets. By combining image processing, recognition algorithms, and other techniques, SIATR automatically and accurately extracts shape, IR radiation, and other feature information from the targets. This extracted information is then compared and matched against a pre-established feature information database to determine the type and identity of the target [4]. In the military domain, the technology provides category-priority information for subsequent tasks such as target tracking, threat assessment, and target engagement, thereby delivering reliable target recognition and intelligence support for maritime operations. It also has significant applications in civilian sectors such as maritime surveillance and safety rescue.
In recent decades, SATR technology has witnessed rapid development. Traditional SATR research primarily relies on machine learning and pattern recognition algorithms. Specifically, the recognition system preprocesses the acquired images through steps such as image enhancement and target extraction. Through feature extraction and selection, the texture, shape, size, and other information of the targets are then transformed into numerical features. Finally, a machine learning classifier analyzes and evaluates these numerical features, enabling the automated recognition of different targets. For instance, in [5], the original IR images are first segmented to obtain the contour features of the ship body; the moment functions of the contour are then calculated to ensure translation, rotation, and scale invariance; finally, the extracted features are input into a backpropagation neural network for category prediction. Luo et al. [6] used a moment-based shape analysis method to extract features from optical images: they applied a complex angular radial transform to the shapes of binary images to generate feature vectors and then employed the k-nearest neighbors algorithm to make recognition decisions. Li et al. [7] first detected the salient features of the targets and segmented the useful regions; they then extracted IR features using the moment functions of the target boundaries and solid contours and fed them into a support vector machine to predict the category. However, traditional methods often rely on manual intervention and rule definition, which is time-consuming and labor-intensive and can lead to misjudgments and omissions. Moreover, these methods have limited adaptability and require major adjustments in different contexts, making it difficult to achieve good robustness on large-scale data and in complex environments [8].
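The classical pipeline described above (invariant moment features followed by a simple classifier) can be sketched in a few lines. This is purely an illustration under simplifying assumptions; the cited works use their own specific moment functions and classifiers:

```python
# Sketch of a classical SATR pipeline: normalized central moments of a binary
# target mask (translation- and scale-invariant) fed to a k-NN classifier.
import math

def central_moment(mask, p, q):
    """Central moment mu_pq of a binary mask given as a list of rows of 0/1."""
    m00 = sum(sum(row) for row in mask)
    xbar = sum(x * v for row in mask for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(mask) for v in row) / m00
    return sum((x - xbar) ** p * (y - ybar) ** q * v
               for y, row in enumerate(mask) for x, v in enumerate(row))

def shape_features(mask):
    """Scale-normalized moments eta_20, eta_02, eta_11 as a small feature vector."""
    m00 = sum(sum(row) for row in mask)
    return [central_moment(mask, p, q) / m00 ** ((p + q) / 2 + 1)
            for p, q in [(2, 0), (0, 2), (1, 1)]]

def knn_predict(train_feats, labels, feat, k=3):
    """Majority vote among the k nearest training features (Euclidean distance)."""
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(train_feats[i], feat))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)
```

Because the features use central moments normalized by target area, translating or uniformly rescaling the silhouette leaves the feature vector unchanged, which is the invariance property the contour-based methods above rely on.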
In recent years, deep learning methods have emerged as a prominent force, injecting vitality into image recognition tasks across numerous domains. Their remarkable capability lies in automatically and accurately learning complex and diverse feature representations from vast amounts of data [9]. Their end-to-end mechanism enables instant decision making and is expected to provide more robust and powerful technical support for SATR research [3]. Several researchers have begun exploring the application of deep learning techniques in SATR research and have achieved phased results. Initially, research in this field focused on constructing ship datasets while developing deep learning algorithms. For instance, reference [10] introduced the VAIS dataset, which comprises over 1000 ship images in both IR and visible light, covering six categories; the study then used a 16-layer convolutional neural network (CNN) model for validation on VAIS, demonstrating the advantages of deep learning over traditional methods in SATR tasks. Karabayir et al. [11] employed a CAD modeling approach to synthesize ship images encompassing multiple military and civilian targets as a means of augmenting dataset size, and concurrently validated the feasibility of this approach for training CNN model parameters. As intelligent SATR research has developed, some scholars have begun to improve the internal design of single models or to embed additional mechanisms, yielding better accuracy and generalization. We summarize the key techniques and results of some related studies in Table 1; the methods proposed there are generally superior to traditional machine learning methods and some CNN-based approaches:
• Huang et al. [14] (visible light): CNN + Swin Transformer. The parallel network structure extracts features more fully; it performs best among multiple image recognition models.
• Liu et al. [15] (visible light): CNN + feature fusion mechanism, with the sample size supplemented using simulation data. Compared to the CNN backbone alone, the feature fusion and sample-size supplementation effectively optimize overall recognition performance.
• Sharifzadeh et al. [16] (SAR): CNN + multilayer perceptron. Compared with using a CNN or a multilayer perceptron alone, the hybrid approach extracts features better and achieves higher accuracy.
• CNN + deep metric learning + gradually balanced sampling: the residual neural network embedded with the new mechanism performs more accurately on multiple public datasets.
Overall, the successful application of deep learning technologies in the SATR field not only signifies their current prominence but also suggests their potential to shape the future direction of SATR research [18,19]. Furthermore, it is anticipated that an array of models incorporating diverse mechanisms will emerge, offering researchers a broader spectrum of options for their specific recognition tasks.
However, ship targets in marine environments are influenced by factors such as environmental conditions, target posture, and target behavior [20], and thus exhibit diverse IR characteristics and a rich feature distribution space. This poses significant challenges to the performance generalization and usability of recognition models. In fact, the performance of SATR models depends on their mechanisms and structures, making it particularly difficult for a single model to achieve absolute performance advantages across all scenarios. The key focus of this paper is how to further enhance the overall performance of SATR tasks while keeping the quantity and performance of existing deep learning-based SATR models unchanged. One strategy is to use ensemble learning to fuse the features of multiple models and thereby enhance category inference. However, this approach significantly increases overall complexity, resulting in resource and computational burdens that typically surpass those of all base models combined [21,22], and it weakens model interpretability [23]. Another viable approach is model-adaptive recommendation. This method explores and captures the inherent connection between specific scene samples and model performance, constructing a prior knowledge system to adaptively recommend the most suitable model and thus obtain better recognition judgments with a lower-complexity structure.
Against this background, this paper focuses on IR-based SATR tasks and proposes a novel model-matching method for SIATR based on bipartite graph recommendation [24] (SIATR-BGR), aiming to enhance the performance and applicability of the model in complex environments. The SIATR-BGR method treats the IR characteristic attributes of samples and the candidate models as nodes of different categories in a bipartite graph. Through a reward-penalty combination, the method systematically explores the matching relationship between sample attributes and models, i.e., the edge weights of the bipartite graph. In designing and computing the edge weights, the method not only establishes the relationship between attributes and the recognition accuracy of candidate models but also quantitatively evaluates the credibility of model feature extraction. When recommending a model, the method calculates a recommendation score for each candidate model to reflect its degree of match with a specific sample, thus selecting the most suitable candidate model. This method offers a new approach to enhancing the overall performance of SIATR tasks while providing decision makers with clearer model selection criteria, aiding a better understanding of model selection behavior. It is worth noting that there is currently no large-scale publicly available IR dataset covering various complex scenarios for real ship targets. With the continuous maturation of simulation modeling technology, some studies have successfully utilized simulated data for SATR research [25–27]. In light of this, the validation in this study relies on highly realistic simulated IR data.
The contributions of this paper are as follows:
1. We propose a novel SIATR model recommendation method, distinguished by its ability to adaptively match the optimal model based on sample attributes, which enhances adaptability to complex scenarios and improves overall performance.
2. We create a measure of SIATR model feature learning credibility. Combined with traditional accuracy metrics, it provides a more comprehensive assessment of model usability.
3. During experimental validation, we analyze both the recognition accuracy and the feature learning credibility of the recommendation system, as well as the relationship between its resource consumption and performance.
In the remaining sections of the paper, Section 2 will introduce the materials and methods used in this study, Section 3 will present the experiments and results, and Section 4 will summarize the work of this paper.

Dataset
Deep learning recognition models require comprehensive data as their fundamental input; the quality and quantity of the data directly impact model performance [28]. Open datasets suitable for deep IR learning research on ships, such as VAIS and IRships [29], are relatively rare due to factors such as time and funding. Additionally, these datasets are limited by insufficient data volume, which makes it challenging to accurately reflect the changes in IR characteristics across scenarios. As image simulation technology continues to advance, using IR simulation modeling to synthesize data for neural network parameter training has emerged as an alternative. The ship IR data used in this paper are obtained from our independently developed, physically credible simulation modeling software for maritime scenarios. This software integrates highly credible meteorological model modules, allowing realistic simulation and analysis of various environmental factors, and can generate a large number of images and corresponding bounding boxes [30] in a short period.
For this study, we used a dataset consisting of 10,368 images with a resolution of 1024 × 768 pixels. The dataset encompasses three categories: container freighter (3456 images), cruise ship (3456 images), and warship (3456 images). To comprehensively reflect the differences in IR characteristics within real-world scenarios, we considered various factors that affect the generation and propagation of radiation, as well as diverse sensor capture elements, when constructing the dataset; the relevant information is summarized in Table 2. Figure 1 showcases a selection of typical image samples and the automatically generated bounding boxes in sliced format. Note: in the simulated environment, the number of samples for each class is equal.

Proposed Approach
IR images of ships can vary significantly with background environment and shooting conditions. This variation can result in differences in the accuracy and credibility of models built on different mechanisms. Consequently, there exists an inherent correlation between the attribute information of the images and the efficacy of the models. If one conceptualizes these entities as nodes of two categories (sample attributes and models), it becomes evident that nodes within the same category lack direct connections, while relationships between nodes of different categories can be represented as weighted edges. Thus, the relationship between attributes and models can be abstracted as matching between the nodes of a weighted bipartite graph. The SIATR-BGR method explores the potential impact of IR attributes on model performance from the perspectives of accuracy and credibility and uses a bipartite graph to represent the relationship. Inspired by the analysis of causal factors influencing model decisions discussed in references [31,32], SIATR-BGR follows the approach illustrated in Figure 2 to obtain masked images of the background region, and the output differences of the model before and after masking are then quantified. This is because the training of the SIATR model mainly involves learning the feature space distribution of the samples, including the shape of the target, the brightness of the imaging, and the characteristics of the background region. When making prediction decisions, excellent candidate models primarily rely on the shape and brightness of the target, with minimal interference from the background region.
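The background-masking check can be sketched as follows. This is a minimal illustration with hypothetical helper names; the paper's exact masking procedure (Figure 2) and model interface are not reproduced in this excerpt:

```python
# Mask all pixels outside the target bounding box, then measure how much the
# model's class scores shift. A small shift suggests the model's decision
# relies on the target rather than on background context.
import math

def mask_background(image, box, fill=0.0):
    """Replace all pixels outside the bounding box (x0, y0, x1, y1) with fill."""
    x0, y0, x1, y1 = box
    return [[v if x0 <= x < x1 and y0 <= y < y1 else fill
             for x, v in enumerate(row)]
            for y, row in enumerate(image)]

def output_shift(model, image, box):
    """Euclidean distance between class-score vectors before and after masking."""
    s_orig = model(image)
    s_mask = model(mask_background(image, box))
    return math.dist(s_orig, s_mask)
```

A model whose scores barely change after masking is, by this measure, learning the target region rather than the background.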

Preliminaries
In this section, we select and partition the sample IR attributes used to construct the SIATR-BGR framework and define the mathematical symbols that will be used. In the given simulated data, five types of attributes (time, motion state, air temperature, zenith angle, and distance) can be perceived in advance through various means. This paper focuses on these attributes, and their division is presented in Table 3. In our method, the following symbols represent the elements of the bipartite graph: G = (M, A, E, U, P), where M = {Model_j | j = 1, ..., J} and A = {Attribute_a^t | a = 1, ..., A; t = 1, ..., −1} are the vertex sets corresponding to the models and the sample attributes, respectively. In Attribute_a^t, a determines the broad category of the sample attribute and t determines the subclass after division; t = −1 denotes the last subclass of the current category. E ⊆ A × M is the edge set between A and M. Finally, U = {U_{j,a^t}} and P = {P_{j,a^t}} are the weight sets for the accuracy and credibility of model recognition, respectively.
Part a of Figure 4 illustrates the process of obtaining the weight matrix. Assuming there are n samples used for constructing the weight matrix, the specific steps are as follows:
• For the i-th input image, obtain the mask image using the method in Figure 2, and obtain the attribute partition interval of the image using Equation (1). Here, t_i determines the specific attribute subclass corresponding to the sample.
• Let f_{model_j}(·) be the prediction output of candidate model j; calculate the predicted scores for image i before and after masking using Equations (2) and (3), where l = 1, ..., L indexes the predicted categories. Record the predicted scores in vector form using Equations (4) and (5).
• Based on these, adopt a reward-penalty approach to obtain the weights U_{j,i} and P_{j,i} for the model with respect to the sample (introduced separately below).
• Construct the model-attribute weight matrix UP_i = (U_{j,a^t,i}, P_{j,a^t,i}) for image i. The weight of the true attribute of the image is set to U_{j,i} and P_{j,i}, and the weights of non-corresponding attributes are set to 0, assigned according to Equation (6).
• Combine the UP_i of the n images and compute the average using Equation (7) to obtain the final model-attribute weight matrix UP = (U_{j,a^t}, P_{j,a^t}).
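The construction loop above can be sketched as follows, under a simple dict-based data layout. The callbacks `attr_of` and `weight_fn` are hypothetical stand-ins for the attribute partition of Equation (1) and the reward-penalty functions introduced below:

```python
# Build the model-attribute weight matrix: accumulate per-image weights only
# on the image's true attribute subclasses, then average over samples, as in
# Equation (7). Non-corresponding attributes simply never accumulate weight.
def build_weight_matrix(samples, models, attr_of, weight_fn):
    """samples: iterable of (image, label) pairs.
    attr_of(image) -> list of the image's true attribute subclasses.
    weight_fn(model, image, label) -> (U, P) reward-penalty weights.
    Returns a dict mapping (model_index, attribute) -> averaged (U, P)."""
    sums, counts = {}, {}
    for image, label in samples:
        attrs = attr_of(image)
        for j, model in enumerate(models):
            u, p = weight_fn(model, image, label)
            for a in attrs:                      # true attributes only
                key = (j, a)
                su, sp = sums.get(key, (0.0, 0.0))
                sums[key] = (su + u, sp + p)
                counts[key] = counts.get(key, 0) + 1
    return {k: (su / counts[k], sp / counts[k]) for k, (su, sp) in sums.items()}
```

The resulting dict plays the role of the weight matrix UP, with absent keys standing for the zero entries of non-corresponding attributes.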
Part b of Figure 4 illustrates the calculation of the weights U_{j,i} and P_{j,i}, using a sample whose label is 0; for simplicity, only the results for three candidate models are shown. The calculation of U_{j,i} proceeds as follows:
• Obtain each candidate model's class prediction score matrix SCORE_i = (score^i_{j,l}) for the original image i; its specific form is shown in Equation (8).
• To simplify the data and focus on the categories of interest, extract from SCORE_i the sub-matrix SCORE_{i,l_i} = (score^i_{j,l_i}) under the true category, where l_i is the true category of image i.
• Normalize the elements of SCORE_{i,l_i} using the softmax function, and denote the updated elements as score^i_{j,softmax}.
• Use the accuracy-optimized reward-penalty function f_U(·) of Equation (10) to calculate the weights U_{j,i}, such that the weights of correctly predicting models are positive and those of incorrectly predicting models are negative; α denotes the accuracy-optimized penalty factor.
The credibility indicator P_{j,i} of SIATR-BGR is obtained by reshaping the pixel values in the background area outside the image annotation box and quantifying the credibility of feature learning by comparing the output values before and after masking. A smaller difference between the output values indicates that the model's decision is less affected by irrelevant environmental information, while a larger difference indicates the opposite. The specific steps for computing P_{j,i} are as follows:
• Obtain each candidate model's class prediction score matrix SCORE^mask_i for the masked image i; its specific form is shown in Equation (11).
• Calculate the Euclidean distance between SCORE^mask_i and SCORE_i to measure the similarity of the model outputs before and after masking, and denote the result as ED.
• Using Equation (9), normalize the element values of ED, and denote the result as score^{i,d}_{j,softmax}.
• When the original image is predicted correctly, an excellent candidate model's decision should be less sensitive to the background region, yielding a smaller score^{i,d}_{j,softmax} and, thus, a larger weight. Conversely, when the prediction is incorrect, negative weights should be assigned, and smaller values of score^{i,d}_{j,softmax} should incur greater penalties. To achieve this, an absolute value is introduced into Equation (13) to form the reward-penalty function f_P(·) for calculating the weight P_{j,i}, where β denotes the credibility penalty factor.

Model-Adaptive Recommendation
In this section, we use the prior knowledge constructed above to adaptively recommend the optimal model for recognizing an actual image based on its attribute information. Part a of Figure 5 demonstrates the adaptive recommendation process in the form of a bipartite graph; its computation is equivalent to matrix operations, and part b provides the corresponding schematic diagram. For each image i with known IR attributes, model recommendation involves two steps.
Extracting the bipartite subgraph: image i can be represented in the form of Equation (1) by its attribute information, based on which the corresponding bipartite subgraph, denoted as G_sub, is extracted from the original model-attribute bipartite graph G. This can also be understood as retaining all M nodes of G while pruning the attribute nodes not belonging to image i from A_i, along with their edges. In matrix terms, the weights corresponding to image i are extracted from the weight matrix UP = (U_{j,a^t}, P_{j,a^t}) to obtain the sub-weight matrix UP_sub = (U_{j,a^{t_i}}, P_{j,a^{t_i}}).
Calculating the model recommendation scores: this step computes the recommendation score score_model^i_j of each candidate model for image i in G_sub, which is essentially the weighted sum of the edge weights of each M node in G_sub. The recommendation scores of the models are then compared to determine the optimal model Model_{j*} under the prior knowledge, which is used for category prediction. The values of score_model^i_j and the subscript j* in Model_{j*} are determined via Formulas (14) and (15), respectively, where W_U and W_P are the weights used for accuracy and credibility optimization; they satisfy W_U + W_P = 1 to adjust the preference emphasis of the entire recommendation system. In summary, the computational process of the SIATR-BGR method is presented in Algorithm A1 (see Appendix A).
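The two recommendation steps can be sketched as follows, assuming a dict-based weight layout (a stand-in for the matrix operations of Formulas (14) and (15)): restricting the lookup to the image's attributes is the subgraph extraction, and the weighted sum plus argmax is the score computation.

```python
# Recommend a model for an image: restrict the weight matrix to the image's
# attribute subclasses (the bipartite subgraph), form the W_U/W_P-weighted
# sum of edge weights per model, and pick the highest-scoring model.
def recommend(weights, image_attrs, n_models, w_u=0.5, w_p=0.5):
    """weights: dict mapping (model_index, attribute) -> (U, P).
    w_u + w_p = 1 balances accuracy against credibility.
    Returns (best_model_index, its recommendation score)."""
    scores = []
    for j in range(n_models):
        s = 0.0
        for a in image_attrs:
            u, p = weights.get((j, a), (0.0, 0.0))  # absent edge -> zero weight
            s += w_u * u + w_p * p
        scores.append(s)
    best = max(range(n_models), key=lambda j: scores[j])
    return best, scores[best]
```

Setting (w_u, w_p) to (1, 0) or (0, 1) recovers the pure accuracy- and pure credibility-optimized recommenders evaluated in the experiments.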

Evaluation Metrics
This article quantitatively evaluates the predictive accuracy and credibility of the model. As in image recognition research generally, Accuracy (Acc), Precision (Prec), Recall (Rec), and F1-score (F1) are used to evaluate the model's category predictions. Acc is the ratio of correctly identified samples to the total number of samples; Prec is the proportion of samples predicted as positive that are truly positive; Rec is the ratio of actual positive samples correctly predicted as positive; and F1 is the harmonic mean of Prec and Rec, providing a comprehensive assessment of the model's performance. With tp and tn denoting the numbers of true positive and true negative samples, and fp and fn the numbers of false positive and false negative samples, the metrics are calculated as
Acc = (tp + tn) / (tp + tn + fp + fn), Prec = tp / (tp + fp), Rec = tp / (tp + fn), F1 = 2 · Prec · Rec / (Prec + Rec).
In Section 2.2.2, the paper compares the Euclidean distance of model output values before and after image masking to construct the knowledge matrix. Similarly, we evaluate the credibility of the output values using the Euclidean distance, taking Formula (13) as a reference to design the Euclidean distance mean of samples (EDMS) as the evaluation metric; a larger EDMS indicates better credibility. In its calculation, η_i = 1 when sample i is predicted correctly and η_i = −1 otherwise.
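The four classification metrics follow directly from the confusion counts defined above; a minimal sketch:

```python
# Acc, Prec, Rec, and F1 from the binary confusion counts tp, tn, fp, fn.
def classification_metrics(tp, tn, fp, fn):
    """Returns (Acc, Prec, Rec, F1). Assumes tp+fp > 0 and tp+fn > 0."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # fraction of correct predictions
    prec = tp / (tp + fp)                   # purity of positive predictions
    rec = tp / (tp + fn)                    # coverage of actual positives
    f1 = 2 * prec * rec / (prec + rec)      # harmonic mean of Prec and Rec
    return acc, prec, rec, f1
```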

Experimental Settings
The proposed SIATR-BGR method is validated using the ship IR simulation data from Section 2.1, with the data divided into training, validation, and test sets in a ratio of 7:1:2. We utilize six CNN models, ResNet18 [33], SqueezeNet1_1 [34], DenseNet121 [35], MobileNet_v3_large [36], MnasNet1_3 [37], and ShuffleNet_v2_x10 [38], as candidate models for constructing the SIATR-BGR framework (hereinafter, version numbers are omitted; for example, ResNet refers to ResNet18). Model training was performed on an 11 GB RTX 2080 Ti GPU in the PyTorch environment. Each CNN model was trained for 100 epochs with a batch size of 32 and a learning rate of 0.01; the momentum was set to 0.9, and parameter updates used the Stochastic Gradient Descent optimizer. We evaluated each CNN model on the test set; a summary of the results is shown in Table 4, where MobileNet outperformed the other models across all metrics. In the preceding sections, Formulas (10) and (13) used α and β as penalty factors to calculate the weights. We varied α and β within the range [0, 3] with a step size of 0.1 and employed a grid search on the validation set to determine their optimal values. Figure 6 shows the results of the penalty factor search: the best Acc, 95.91%, was achieved at α = 0.4 (with W_U = 1, W_P = 0), while the best EDMS, 0.7787, was achieved at β = 0.6 (with W_U = 0, W_P = 1).
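The penalty-factor search can be sketched as a one-dimensional grid search; `evaluate` is a hypothetical callback that would run the full SIATR-BGR pipeline on the validation set and return the metric being optimized (Acc or EDMS):

```python
# Grid search over a closed range [lo, hi] with a fixed step, as used for the
# penalty factors alpha and beta. Keeps the factor with the best metric value.
def grid_search(evaluate, lo=0.0, hi=3.0, step=0.1):
    """Return (best_factor, best_score) over the grid lo, lo+step, ..., hi."""
    best_factor, best_score = lo, float("-inf")
    n = int(round((hi - lo) / step))
    for k in range(n + 1):
        factor = round(lo + k * step, 10)   # avoid floating-point grid drift
        score = evaluate(factor)
        if score > best_score:
            best_factor, best_score = factor, score
    return best_factor, best_score
```

Searching α and β independently like this matches the per-factor curves shown in Figure 6.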

Results and Discussion
In this section, we first consider only recognition performance as the evaluation criterion and present an SIATR recommendation system that optimizes overall accuracy. Second, we consider only credibility and present a recommendation system that optimizes overall credibility. Finally, we design and construct a recommendation system that considers both recognition accuracy and credibility.


The Recommendation System Aims to Improve the Accuracy
To further validate the effectiveness of our method, we also compare our approach with the voting ensemble learning method [39] (Hard Voting and Soft Voting). Hard Voting determines the final predicted label through a majority vote, while Soft Voting performs a weighted average of the predicted probabilities of each model to determine the final label:
ŷ_i^hard = argmax_l Σ_j I(y_{j,i} = l), ŷ_i^soft = argmax_l (1/J) Σ_j f^l_{model_j}(image_i),
where y_{j,i} is the predicted label of model j for image i, I(·) denotes the indicator function, taking the value 1 when y_{j,i} equals l and 0 otherwise, and f^l_{model_j}(image_i) is the predicted score of model j for label l.
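The two voting schemes can be sketched as follows, with models represented as callables returning per-class score lists (a hypothetical interface for illustration):

```python
# Hard voting: majority vote over each model's argmax label.
# Soft voting: argmax over the per-class average of predicted scores.
def hard_vote(models, image, n_classes):
    counts = [0] * n_classes
    for m in models:
        scores = m(image)
        counts[scores.index(max(scores))] += 1
    return counts.index(max(counts))

def soft_vote(models, image, n_classes):
    avg = [0.0] * n_classes
    for m in models:
        for l, s in enumerate(m(image)):
            avg[l] += s / len(models)
    return avg.index(max(avg))
```

The two schemes can disagree: two barely-confident votes for one class can outvote a single highly confident vote under hard voting, while soft voting lets the confident model dominate.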
Table 5 shows a performance comparison between the SIATR-BGR method (α = 0.4, W_U = 1, W_P = 0) and the best candidate model, MobileNet, as well as the two voting algorithms. Our SIATR-BGR method achieves 95.86% across all four metrics on the test set. Taking Acc as an example, our method is 1.06%, 0.63%, and 0.58% higher than MobileNet, Soft Voting, and Hard Voting, respectively, demonstrating its superiority. Additionally, to further analyze the performance of our method relative to the other approaches in specific scenarios, we generated an Acc heatmap of the relationship between sample attributes and candidate models (Figure 7). The heatmap reveals performance differences among the candidate models, making it difficult for a single model to establish absolute superiority across all conditions; instead, a relative balance of advantages is maintained among them. For example, SqueezeNet exhibits the best performance under the "Far" condition (94.63%) but the worst under the "Zenith_low" condition (91.53%). Across the time conditions "Night", "Morning and Evening", and "Daytime", the best performances are achieved by MobileNet (88.95%), ResNet (98.39%), and SqueezeNet (98.10%), respectively, rather than a single model consistently ranking first. Compared to the candidate models, the SIATR-BGR method demonstrates excellent recognition performance across the various scenarios, which validates the original design intention of our recommendation system: to adaptively recommend the optimal model based on scenario-dependent performance differences, thereby ensuring overall superiority over consistently using a single model. Furthermore, compared to the two voting algorithms, the SIATR-BGR
method is only inferior to Soft Voting and Hard Voting in the "Far" and "Zenith_low" conditions, respectively.In most scenarios, our method achieves the highest Acc.By observing Figure 7, it is noticeable that there are performance variations among the different methods within the same major scenario.Specifically, during the "Morning and Evening" and "Daytime" attributes, the overall model performance is considerably better than during the "Nighttime" attribute.This is due to the fact that during nighttime conditions, the ship's hull emits low radiation toward the outside because of the lack of solar radiation and lower temperatures.This results in a less clear ship contour, which negatively affects the model's recognition.Additionally, Figure 8 provides the corresponding confusion matrix for our method, which shows a relatively balanced prediction for the three categories.

The Recommendation System Aims to Improve the Credibility
The SIATR-BGR method in this study (with parameters β = 0.6, W_U = 0, and W_P = 1) exhibits an EDMS of 0.7781 on the test set, a performance improvement of 0.0274 over MobileNet, which achieves the highest EDMS among the candidate models. Analogous to Figure 7, a heatmap depicting the EDMS between sample attributes and candidate models is presented in Figure 9. Only under the "Far" condition is the EDMS of our approach (0.7201) lower than that of MnasNet (0.7319); in the other scenarios, our method demonstrates superior recognition-confidence performance. Additionally, contrasting Figures 7 and 9 shows that models with better overall Acc under a given attribute also exhibit superior EDMS. For instance, the models under the "Near" condition outperform those under the "Far" condition in both figures. This suggests that the model's recognition accuracy and credibility are both significantly influenced by changes in the scene and follow a consistent pattern of variation.
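Each cell of the heatmaps in Figures 7 and 9 is a per-attribute average of a metric over the samples carrying that attribute. A minimal sketch of that aggregation (names are illustrative; `metric` would be a 0/1 correctness flag for Acc or a per-sample EDMS value):

```python
from collections import defaultdict

def attribute_metric_table(records):
    # records: iterable of (attribute, method, metric) tuples, where
    # metric is 1/0 correctness for Acc or a per-sample EDMS value.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for attr, method, value in records:
        totals[(attr, method)] += value
        counts[(attr, method)] += 1
    # Each heatmap cell is the mean metric for an (attribute, method) pair.
    return {key: totals[key] / counts[key] for key in counts}
```

Feeding in one record per test sample yields the per-condition values quoted above, e.g. SqueezeNet's Acc under the "Far" condition.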

The Recommendation System Aims to Improve the Accuracy and Credibility
In Sections 3.3.1 and 3.3.2, we observed that the SIATR-BGR method performed best when optimizing accuracy and credibility independently with the penalty factors α and β set to 0.4 and 0.6, respectively. Based on these findings, we retain α = 0.4 and β = 0.6 when developing a recommendation system that combines accuracy and credibility. In Figure 10, we present the changes in the validation set's Acc and EDMS for the SIATR-BGR method under different values of the weight W_U (W_U = 1 − W_P). As W_U gradually increases, Acc correspondingly improves, while EDMS exhibits the opposite trend. The shaded region W_U ∈ [0.45, 0.8] in Figure 10 marks the segment where both metrics outperform the MobileNet model. When W_U is between 0.7 and 0.8, Acc significantly surpasses its values in the W_U range of 0.45 to 0.65, slightly exceeding the performance of Hard Voting on the test set; however, EDMS decreases gradually as W_U grows from 0.45 to 0.7. Comparing values across the W_U range of 0.45 to 0.8, W_U = 0.7 achieves the better balance between Acc and EDMS, and the corresponding performance metrics of the SIATR-BGR method on the test set are presented in Table 6.

Next, we analyze the relationship between resource consumption and the performance of the SIATR-BGR method. Figure 11 provides statistical plots of the number of model recommendations and the model sizes. Subfigure (a) of Figure 11 shows the number of times each model is recommended for the accuracy-optimization and credibility-optimization tasks. The top three models in Table 4 account for a significant proportion of the total recommendations: SqueezeNet, MobileNet, and MnasNet, with higher accuracy, are primarily recommended for accuracy optimization, while ResNet, MobileNet, and MnasNet, with higher EDMS, are mainly recommended for credibility optimization. Subfigure (b) compares the parameter count and memory usage of the six models in the PyTorch environment. We observed that some models (DenseNet and ShuffleNet) are rarely recommended in practice, yet including them in the recommendation system increases its overall complexity and memory footprint. Especially for devices with limited memory, it is necessary to selectively reduce the resource consumption of the recommendation system.

In subfigure (a) of Figure 11, the top three recommended model combinations for the two tasks are referred to as the dominant models for accuracy (DMA) and the dominant models for credibility (DMC). Furthermore, Tables 7 and 8 provide a comparative analysis of the test-set performance of different candidate model combinations for the two tasks. It is evident that the overall performance of SIATR-BGR is predominantly determined by the DMA (DMC), and removing the other low-performance candidate models has a relatively minor impact on the recommendation system's performance. This occurs because, when recommending a model for a given sample, even if the model chosen by the original full-volume recommendation system is unavailable in the reduced-volume system, the candidate model the latter chooses can typically still recognize the sample correctly. To optimize both accuracy and credibility, and considering how rarely DenseNet and ShuffleNet are recommended across the two tasks, this study uses ResNet, SqueezeNet, MobileNet, and MnasNet to construct a resource-efficient SIATR-BGR recommendation system. When W_U and W_P are individually set to 1, the accuracy and credibility performances of this system align with DMA + ResNet (α = 0.4) in Table 7 and DMC + SqueezeNet (β = 0.7) in Table 8, respectively. Compared with the original SIATR-BGR model, the resource-efficient model reduces memory size by 28.35%, with the Acc and EDMS metrics decreasing only marginally, by 0.05% and 0.0027, respectively. Analogous to Table 6, Table 9 presents the overall performance of the resource-efficient SIATR-BGR recommendation system under appropriate W_U and W_P settings.
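The W_U/W_P trade-off described above can be sketched as a convex combination of two per-model recommendation scores (the scores and model names here are hypothetical stand-ins, not the paper's exact matrices):

```python
def blended_recommendation(u_scores, p_scores, w_u):
    # u_scores / p_scores: per-model recommendation scores favoring
    # accuracy (U) and credibility (P); blended with weights
    # W_U and W_P = 1 - W_U.
    w_p = 1.0 - w_u
    blended = {m: w_u * u_scores[m] + w_p * p_scores[m] for m in u_scores}
    # Recommend the model with the highest blended score.
    return max(blended, key=blended.get)
```

Scanning `w_u` over a grid and evaluating Acc and EDMS at each point reproduces the kind of trade-off curve shown in Figure 10; setting `w_u` to 1 or 0 recovers the pure accuracy- or credibility-optimized systems.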

Conclusions
This paper focuses on ship IR automatic target recognition (SIATR) technology, with a dedicated effort to enhance overall recognition performance and applicability under the constraint of a given number of CNN models with fixed performance levels. To achieve this objective, we propose an innovative adaptive model-recommendation method called SIATR-BGR. Guided by optimization goals centered on recognition accuracy and feature-learning credibility, the method selects six classes of CNN models as candidates. By establishing a bipartite graph mapping between sample attributes and models, it effectively reflects the relationships of influence through edge weight values. In the model-recommendation phase, the method extracts the corresponding subgraph based on a sample's IR attributes and calculates the priority order of the models from the knowledge priors. The proposed method is validated on high-fidelity simulation data. We first conduct separate analyses for the optimization of recognition accuracy and credibility; the results indicate that, compared to the six candidate models, our approach effectively enhances the respective performance metrics. We then analyze the accuracy and credibility results under different recommendation biases W_U and W_P, selecting values that better balance the two metrics. Finally, we explore the relationship between candidate-model resource consumption and the performance of the recommendation system, and propose a recommendation system that balances resource consumption, accuracy, and credibility. This method not only provides new avenues and insights for improving the performance of SIATR tasks but also offers valuable references for similar studies in the SAR and visible-light domains.
However, there are some limitations to our method in practical application. First, to enhance credibility, our method requires the bounding-box information of the target, which means it needs the support of accurate object detection algorithms in practice. Additionally, we have not yet tested the effectiveness of our method on real-world data. Going forward, we plan to further test and optimize our approach under appropriate conditions to promote its application in real-world scenarios.

Figure 1 .
Figure 1. Example images and bounding-box localization demonstration of the dataset for cruise ship (a), warship (b), and container freighter (c). Panels (b,c) illustrate the diversity of target imaging brightness variations and posture distribution, using the warship and container freighter as examples, respectively.

Figure 2 .
Figure 2. The generation process of masking the target background area.

The SIATR-BGR method consists of two main components, namely bipartite graph knowledge construction and model-adaptive recommendation, as shown in Figure 3. In the knowledge-construction phase, SIATR-BGR establishes a bipartite graph mapping between sample attributes and model selection by integrating the output values of the samples with their inherent attributes; the strength of these relationships is reflected by the edge weights of the bipartite graph. In the model-adaptive recommendation phase, SIATR-BGR uses the pre-established prior knowledge and the provided sample attributes to extract the relevant information, and then calculates the recommended model priority order, i.e., the targeted selection of the optimal recognition model.

Figure 3 .
Figure 3. The basic framework of the SIATR-BGR method.



Figure 4 .
Figure 4. The acquisition method of knowledge construction in the SIATR-BGR method. (a) The calculation process of the weight matrix UP; (b) three candidate model examples and the acquisition method of the weights U_{j,i} and P_{j,i} for a single sample, with a certain label set to 0.

Figure 5 .
Figure 5. Illustration of model-adaptive recommendation for the SIATR-BGR method. (a) Candidate model selection in the form of a bipartite graph, and (b) the corresponding matrix numerical computation method.
Model recommendation scores: this process involves computing the recommendation score score^i_{model_j} of each candidate model j for image i in the subgraph G_sub, which is essentially the weighted sum of the edge weights of each M node in G_sub. Subsequently, the recommendation scores of the various models are compared to determine the optimal model Model_{j*} under the prior knowledge, which is then used for category prediction. The values of score^i_{model_j} and the subscript j* in Model_{j*} are obtained via the matrix numerical computation illustrated in part (b) of Figure 5.
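A minimal sketch of this scoring step, under assumed data shapes (a dict-of-dicts stands in for the bipartite edge-weight matrix; the attribute and model names are illustrative):

```python
def recommend_model(edge_weights, sample_attrs):
    # edge_weights[model][attribute]: bipartite edge weight linking an
    # IR-attribute node to a candidate-model node.
    # sample_attrs: the attributes of image i, which select the subgraph.
    scores = {
        model: sum(weights.get(attr, 0.0) for attr in sample_attrs)
        for model, weights in edge_weights.items()
    }
    # The model with the highest weighted sum is recommended.
    best = max(scores, key=scores.get)
    return best, scores
```

Only the edges incident to the sample's attributes contribute, which mirrors extracting the subgraph G_sub before summing edge weights per model node.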

Figure 6 .
Figure 6. The impact of the penalty factors α (W_U = 1, W_P = 0) and β (W_U = 0, W_P = 1) on the Acc and EDMS of the SIATR-BGR recommendation system under different search values.

Figure 7 . Figure 8 .
Figure 7. Heat map matrix of various methods under multi-class scenarios. The numerical values in each cell of the figure represent the Acc of the method in the corresponding scenarios.


Figure 9 .
Figure 9. Heat map matrix of various methods under multi-class scenarios. The numerical values in each cell of the figure represent the EDMS of the method in the corresponding scenarios.

Figure 10 .
Figure 10. The impact of varying values of W_U on the Acc and EDMS performance of the SIATR-BGR recommendation system under the conditions α = 0.4 and β = 0.6.

Figure 11 .
Figure 11. A statistical chart of the recommended frequencies concerning candidate models relative to their sizes. (a) The recommended frequencies of models under the conditions α = 0.4, W_U = 1, W_P = 0 and β = 0.6, W_U = 0, W_P = 1; (b) the resource consumption statistics of each model.

Table 7 .
The performance and resource consumption of the SIATR-BGR recommendation system under different combinations of candidate models when accuracy (W_U = 1) optimization is the objective.

Table 1 .
Some recent research on SATR tasks based on deep learning.

Table 2 .
IR attribute information of the dataset. Note: in the simulated environment, the number of samples for each class is equal.

Table 3 .
The selected IR attributes and dividing information for constructing the recommendation system.
2.2.2. Knowledge Construction
Knowledge construction aims to acquire the model-attribute weight matrix UP = [U_{j,a_t}, P_{j,a_t}] that carries the prior knowledge information. Part (a) of Figure 4 illustrates the calculation process of the weight matrix UP.

Table 4 .
Comparison of performance of various candidate CNN models on the test set. Note: since the number of samples in each class is equal in the test set, the Acc and Rec of each model are equal.

Table 5 .
Comparison of the prediction accuracy performance between the SIATR-BGR method and three other methods: MobileNet, Soft Voting, and Hard Voting.

Table 6 .
The performance metrics of the SIATR-BGR recommendation system with six candidate models when aiming to improve accuracy and credibility.

Table 8 .
The performance and resource consumption of the SIATR-BGR recommendation system under different combinations of candidate models when credibility (W_P = 1) optimization is the objective.

Table 9 .
The performance metrics of the SIATR-BGR recommendation system with three candidate models when aiming to improve accuracy and credibility.