Insect Mass Estimation Based on Radar Cross Section Parameters and Support Vector Regression Algorithm

Abstract: Radar cross section (RCS) parameters of insect targets contain information related to their morphological parameters, which is helpful for the identification of migratory insects. Several morphological parameter estimation methods have been presented. However, most of these estimates rely on polynomial fitting using only one or two parameters, which may limit the estimation accuracy. In this paper, a new insect mass estimation method is proposed based on support vector regression (SVR). Several RCS parameters were extracted for the estimation of insect mass. Support vector regression based on recursive feature elimination (SVRRFE) was used to obtain the optimal feature subset. Specifically, a dataset of 367 specimens was used to evaluate the performance of the proposed method. Fifteen features were extracted and ranked. The optimal feature subset contained six features and the optimal mass estimation accuracy was 78%. Additionally, traditional insect mass estimation methods were analyzed for comparison. The results indicate that the proposed method is more effective and accurate for insect mass estimation. It should be emphasized that the small number of experimental insects available may limit further improvement of the estimation accuracy.


Introduction
Countless insects undertake long-distance migrations every year, and the study of these large-scale movements can contribute to our understanding of insect migration [1]. Many major agricultural pests (such as Nilaparvata lugens [2], Mythimna separata [3] and Helicoverpa armigera [4,5]) have a strong migration capacity, which may lead to catastrophic crop losses and disease transmission between continents [1,6,7]. Effective monitoring and early warning systems for migratory insects are critically important.
Most insects are small and fly at night, making it difficult to observe their migrations [8]. The emergence of entomological radar has made it possible to monitor long-distance insect migrations effectively [9]. Entomological radar transmits a beam of electromagnetic waves toward a high-flying insect migrant, and the received signal can be used to extract its heading direction, velocity and trajectory [10,11]. Entomological radar has become a superior and irreplaceable tool in the study of insect migrations. Entomologists have conducted extensive research on migratory insects around the world using a variety of entomological radar types, such as scanning and vertical-looking radars (VLRs) [8,12,13]. VLRs, which have proved particularly effective for observing insect migrations, can provide information characterizing an insect target (i.e., the target's size, shape and parameters related to wing beating) [14].
Previous research discussed the relation between shape estimates and insect target identity and showed that the shape and wingbeat frequency parameters were potentially valuable for insect target identification [15-17]. However, only broad classes, such as locusts and moths, could be identified based on the estimated radar cross section (RCS) parameters [16]. The ability of entomological radar to distinguish different insect targets is insufficient, which is one of the most important issues for entomologists and pest managers [16]. In addition, mass has been shown to be the most important feature for insect identification [18]. Therefore, precise estimation of insect mass from radar data may contribute to research on insect target identification.
Several insect mass estimation methods have been presented based on RCS parameters retrieved from radar signals [14,19-22]. These methods choose only one or two RCS parameters and then adopt polynomial fitting or multiple linear regression to estimate insect mass. Each of these methods uses only a small part of the available information, which may limit the estimation accuracy of insect mass. Machine learning algorithms, which can jointly use multiple parameter features, should therefore be considered. The support vector machine (SVM), one of the most popular pattern recognition methods, was originally proposed by Cortes and Vapnik [23]. Support vector regression (SVR) is an important branch of SVM and can be applied to regression and prediction [24,25]. SVR has been applied in many research fields, including remote sensing [26]. Many studies demonstrate that SVR is superior to other regression methods in many cases [27,28]. In addition, SVR has the advantages of good generalization and high performance on datasets with relatively few samples [26,29]. Therefore, considering that the experimental data available for insect mass estimation are limited, SVR was chosen for the estimation of insect mass in our research.
In the present study, a dataset of 367 insect specimens was collated from several sources [14,21] and several RCS parameters related to insect mass were extracted from the backscattering signals measured in the X band in a microwave anechoic chamber. Fifteen features were extracted from the RCS parameters and selected as the input variables for SVR. The optimal feature subset was determined after feature ranking and then the optimal mass estimation accuracy was achieved. The proposed method can take full advantage of various RCS parameters to acquire a higher estimation accuracy, which may contribute to the species identification of migratory insects based on radar measurement data.

Experimental Datasets
Many X band measurements of ventral-aspect RCSs of insects have been reported [20,21,30]. Most of these measurements were made at 9.4 GHz. The measured RCS parameters of 207 specimens for which the morphological parameters were also available were compiled and summarized in previous research [21]. The estimated RCS parameters mainly consisted of three terms representing the target's radar reflectivity ($a_0$, $\alpha_2$ and $\alpha_4$), two terms representing the principal RCSs ($\sigma_{xx}$ and $\sigma_{yy}$) and two terms ($d$ and $\nu$) calculated from the scattering matrix [21]. These measured specimens were divided into three groups (datasets D, L and M) [21]. These datasets were also included in this paper for integrative analysis.
In addition, we carried out many experiments to measure the RCS parameters of insect specimens in a microwave anechoic chamber. As shown in Figure 1, the experimental setup mainly included a vector network analyzer (VNA), a pair of dual-polarization antennas working at X band and a horn-shaped enclosure with absorbing materials affixed on the inside. During each experiment, an insect target was glued to a short polyethylene line of 0.05 mm diameter and hung at a distance of ~2 m from the antennas, which ensured that the insect target was in the far field. The echo signal of the insect target was then acquired with this setup.
After the radar data were processed by step-frequency continuous-wave (SFCW) imaging and polarimetric calibration [22], the scattering matrix of each insect target was obtained. The RCS parameters mentioned above were then retrieved from the scattering matrix. Note that background subtraction was performed with the SFCW range profiles of the empty scene to eliminate clutter.
A total of 169 insects belonging to 22 species (denoted here as dataset K) were measured in our experiment (Table 1). All insects were caught in a light trap the night before the experiment, and only specimens with no physical damage were selected. An electronic balance with an accuracy of 0.1 mg was used to measure the mass of each specimen. The moisture content of insects changes greatly after death, which has an evident effect on the measured mass and electromagnetic scattering characteristics. Therefore, the measurements of insect mass and RCS were made with freshly dead specimens.
Information on the four datasets (D, L, M and K) is listed in Table 2. The four datasets were combined into one, so that 367 specimens in total were included in our study. The mass of the specimens ranges from 1.83 mg to 4120 mg. It is important to note that the RCS parameters of dataset K were measured at multiple frequency points, so only the RCS parameters at 9.45 GHz were used, for consistency with the other datasets.

Support Vector Regression
SVR is one of the most common application forms of SVM and can obtain the globally optimal solution from limited samples by minimizing the generalization error bound [24,29]. Consider a dataset of N samples $(x_i, y_i)$, $i = 1, 2, \ldots, N$, with $x_i \in \mathbb{R}^d$. Each vector $x_i$ contains d-dimensional features and $y_i$ represents the target value. The linear function $f(x_i)$ can be represented as [29]:

$$f(x_i) = w^{T} x_i + b \tag{1}$$

where $w$ represents the weight vector and $b$ represents the bias, both of which are determined from the training set. The deviation between the predicted value $f(x_i)$ and the original target value $y_i$ is required to be smaller than $\varepsilon$ for every sample, with larger deviations absorbed by slack variables. The SVR model is then obtained by solving the following convex optimization problem [24,26]:

$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right) \quad \text{s.t.} \quad \begin{cases} y_i - f(x_i) \le \varepsilon + \xi_i \\ f(x_i) - y_i \le \varepsilon + \xi_i^{*} \\ \xi_i,\ \xi_i^{*} \ge 0 \end{cases} \tag{2}$$

where $C$ is a constant that determines the trade-off between the flatness of the model and its tolerance of deviations larger than $\varepsilon$, and $\xi_i$, $\xi_i^{*}$ represent the slack variables [24].
In general, most regression problems are nonlinear and the kernel functions should be introduced to solve the problem by mapping the input space into a high-dimensional feature space. The radial basis function (RBF) kernel is the most popular and was used in this study [31,32].
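The two ingredients named above, the $\varepsilon$-insensitive deviation and the RBF kernel, can be illustrated with a minimal Python sketch. This is not the LIBSVM implementation used in the study: the function names, the default `gamma` and `epsilon` values, and the toy support vector are all assumptions chosen for illustration, and no training (solving the optimization problem) is performed here.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2); gamma is an assumed value."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def eps_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def kernel_predict(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Kernel form of the regression function: f(x) = sum_i coef_i * K(sv_i, x) + b.

    support_vectors, dual_coefs and bias would come from a trained model;
    here they are supplied by hand (hypothetical values).
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

# Toy usage with one hand-picked support vector.
print(kernel_predict([0.0], [[0.0]], [2.0], 0.5))   # kernel value is 1 at zero distance
print(eps_insensitive_loss(1.0, 1.05))              # inside the tube
print(eps_insensitive_loss(1.0, 1.5))               # outside the tube
```

The loss function shows why small residuals do not influence the fitted model, and the prediction function shows how the kernel replaces the inner product of the linear case.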
For an SVR model, the input features are critically important for the regression result. Therefore, feature importance assessment is an important and necessary step. Support vector regression based on recursive feature elimination (SVRRFE) is a popular wrapper feature selection method developed from SVM [33] and has been widely used in research [31,34]. SVRRFE obtains a ranking of features using backward feature elimination. Specifically, the feature selection algorithm starts with all the features and then recursively removes the feature with the smallest weight, one at a time, until only one feature remains [34].
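The backward elimination loop described above can be sketched in a few lines of Python. The sketch is generic: `rfe_ranking` accepts any `weight_fn` that returns one weight per remaining feature. In SVRRFE these weights would come from an SVR retrained at every round; the `abs_correlation_weights` scorer used here is only a stand-in (absolute Pearson correlation, not an SVR), and all names and the toy data are hypothetical.

```python
import math

def rfe_ranking(X, y, names, weight_fn):
    """Backward elimination: each round, drop the feature with the smallest
    squared weight. Returns feature names ordered from most to least important."""
    remaining = list(range(len(names)))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        weights = weight_fn(X_sub, y)          # one weight per remaining feature
        worst = min(range(len(remaining)), key=lambda k: weights[k] ** 2)
        eliminated.append(remaining.pop(worst))
    eliminated.append(remaining[0])            # the last survivor ranks first
    return [names[j] for j in reversed(eliminated)]

def abs_correlation_weights(X_sub, y):
    """Stand-in scorer (NOT an SVR): |Pearson correlation| of each column with y."""
    n = len(y)
    mean_y = sum(y) / n
    norm_y = math.sqrt(sum((v - mean_y) ** 2 for v in y))
    weights = []
    for j in range(len(X_sub[0])):
        col = [row[j] for row in X_sub]
        mean_x = sum(col) / n
        norm_x = math.sqrt(sum((v - mean_x) ** 2 for v in col))
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        weights.append(abs(cov / (norm_x * norm_y)) if norm_x > 0 and norm_y > 0 else 0.0)
    return weights

# Toy data: three candidate features with decreasing relevance to y.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [[1.0, 1.0, 3.0],
     [2.0, 3.0, 1.0],
     [3.0, 2.0, 4.0],
     [4.0, 5.0, 1.0],
     [5.0, 4.0, 5.0]]
ranking = rfe_ranking(X, y, ["strong", "medium", "weak"], abs_correlation_weights)
print(ranking)
```

With a model-based `weight_fn`, the weights change after every elimination because the model is refitted on the reduced feature set, which is what distinguishes recursive elimination from a one-shot ranking.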

Feature Extraction
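The polarization dependence of the insect RCS measured by a monostatic radar can be represented as:

$$\sigma(\varphi) = \left| h^{T} S h \right|^{2} \tag{3}$$

where $h = [\cos\varphi \;\; \sin\varphi]^{T}$ represents the normalized effective length for transmission and reception, $S$ is the scattering matrix and $\varphi$ represents the direction of linear polarization. The scattering of an insect target is linear, so the scattering matrix is symmetrical and can be represented as [19]:

$$S = \begin{bmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{bmatrix} \tag{4}$$

whose complex elements are described by their amplitudes and by the phase factors $\alpha$, $\beta$ and $\gamma$. $\gamma$ has no correlation with the RCS and is usually ignored. Therefore, the RCS versus polarization angle can be rearranged as [15]:

$$\sigma(\varphi) = a_0 + a_2 \cos\left[ 2(\varphi - \theta_0) \right] + a_4 \cos\left[ 4(\varphi - \theta_0) \right] \tag{5}$$

where $a_0$ represents the polarization-averaged RCS, $a_2$ and $a_4$ are non-negative coefficients determined by the radar scattering matrix $S$, and $\theta_0$ represents either the orientation of the insect or the perpendicular to its orientation. Two principal RCS terms (the parallel value $\sigma_{xx}$ and the transverse value $\sigma_{yy}$) of the target can be obtained [19]:

$$\sigma_{xx} = a_0 \left( 1 + \alpha_2 + \alpha_4 \right) \tag{6}$$

$$\sigma_{yy} = a_0 \left( 1 - \alpha_2 + \alpha_4 \right) \tag{7}$$

where $\alpha_2 = a_2 / a_0$ and $\alpha_4 = a_4 / a_0$ are dimensionless parameters.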
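As a numerical illustration of how the pattern coefficients relate to the scattering matrix, the following Python sketch samples the polarization pattern and recovers $a_0$, $a_2$, $a_4$ and the principal RCSs. It assumes the common convention $\sigma(\varphi) = |h^{T} S h|^{2}$ with $h = [\cos\varphi, \sin\varphi]^{T}$, and it estimates the harmonic amplitudes by a discrete Fourier projection, which is one possible way to obtain them and not necessarily the procedure used in the cited work. The function names and the example matrix are hypothetical.

```python
import cmath
import math

def rcs_at(S, phi):
    """Assumed convention: sigma(phi) = |h^T S h|^2 with h = [cos(phi), sin(phi)]^T."""
    (s11, s12), (s21, s22) = S
    c, s = math.cos(phi), math.sin(phi)
    return abs(s11 * c * c + (s12 + s21) * c * s + s22 * s * s) ** 2

def polarization_pattern(S, n=360):
    """Sample sigma(phi) over one period [0, pi) and project onto the
    constant, 2*phi and 4*phi harmonics."""
    phis = [math.pi * k / n for k in range(n)]
    sigma = [rcs_at(S, p) for p in phis]
    a0 = sum(sigma) / n                                   # polarization-averaged RCS
    c2 = 2 * sum(v * cmath.exp(2j * p) for v, p in zip(sigma, phis)) / n
    c4 = 2 * sum(v * cmath.exp(4j * p) for v, p in zip(sigma, phis)) / n
    theta0 = cmath.phase(c2) / 2                          # ill-defined if a2 is ~0
    return {
        "a0": a0,
        "a2": abs(c2),
        "a4": abs(c4),
        "theta0": theta0,
        "sigma_xx": rcs_at(S, theta0),                    # along theta0
        "sigma_yy": rcs_at(S, theta0 + math.pi / 2),      # perpendicular to theta0
    }

# Hypothetical real, diagonal scattering matrix (arbitrary units).
S_example = ((2.0, 0.0), (0.0, 1.0))
pattern = polarization_pattern(S_example)
print(pattern)
```

For this diagonal example the pattern is $(2\cos^2\varphi + \sin^2\varphi)^2 = 2.375 + 1.5\cos 2\varphi + 0.125\cos 4\varphi$, so the principal values are $a_0 + a_2 + a_4 = 4$ and $a_0 - a_2 + a_4 = 1$.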
Values of the parameters $a_0$, $\alpha_2$, $\alpha_4$, $\sigma_{xx}$ and $\sigma_{yy}$ have all been utilized for the estimation of insect mass in former research [14,19,20]. In addition, the invariant target parameters $d$ and $\nu$ calculated from the Graves power matrix were also explored to improve the estimation accuracy [21]. $d$ represents the scattering matrix's determinant and $\nu$ represents the RCS when the polarization direction is perpendicular to the insect's body axis. Please refer to [21] for the calculation of parameters $d$ and $\nu$. The selection of features is crucial to obtaining a good result. Therefore, the parameters that have proven effective in mass estimation were also selected as features in this research. The parameters $a_0$, $\sigma_{xx}$, $\sigma_{yy}$, $d$ and $\nu$ are often expressed logarithmically [21]; therefore, the logarithm-transformed parameters are also used in our research. Traditional research also demonstrates that better fitting results can be obtained if a third-order polynomial is employed to describe the relation between mass and RCS parameters [21]; consequently, the square and cube values of RCS parameters are also selected as features. Based on the traditional insect mass estimation methods, 15 variables are selected as inputs, as listed in Table 3.

Table 3. Extracted features for insect mass estimation based on support vector regression (SVR). Note: the features were ranked based on the feature importance in estimating insect mass.
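The logarithm, square and cube transformations described above can be sketched as follows. This is an illustrative layout only: it expands five log-transformed parameters into first, second and third powers, which happens to give 15 entries, but the actual feature list of Table 3 (which also involves $\alpha_2$) may differ. The function name, the dictionary keys and the input values are hypothetical, and all inputs are assumed to be positive so that the logarithm is defined.

```python
import math

def build_features(a0, sigma_xx, sigma_yy, d, nu):
    """Expand positive RCS parameters into log10, (log10)^2 and (log10)^3 terms.

    Illustrative layout only; not a reproduction of the paper's Table 3.
    """
    base = {"a0": a0, "sigma_xx": sigma_xx, "sigma_yy": sigma_yy, "d": d, "nu": nu}
    features = {}
    for name, value in base.items():
        log_value = math.log10(value)          # requires value > 0
        features[f"log10_{name}"] = log_value
        features[f"log10_{name}^2"] = log_value ** 2
        features[f"log10_{name}^3"] = log_value ** 3
    return features

# Hypothetical parameter values in arbitrary (but consistent) units.
features = build_features(a0=1.0, sigma_xx=10.0, sigma_yy=0.1, d=100.0, nu=1.0)
for key, val in features.items():
    print(f"{key:>18s} = {val:+.3f}")
```

Supplying the powers explicitly lets a regression model reproduce the third-order polynomial relations used by the traditional methods while still being free to combine several parameters.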

Mass Estimation Based on SVR
We studied the estimation of insect mass based on the measured RCS parameters of 367 insects. We selected 250 insect specimens as training samples and the remaining 117 were used as test samples. Based on the selected features listed in Table 3, a 15-dimensional dataset was constructed and input into the SVR model for training. The test data were then fed to the trained model to obtain the predicted result. In this study, the SVR algorithm was implemented with the LIBSVM toolbox in the MATLAB programming language [35].
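The data split and the error metric can be sketched as below. The 250/117 split follows the text, but the random shuffling, the seed and the function names are assumptions, since the source does not state how the training specimens were chosen. The mean relative error is computed in its usual form; reading the reported accuracy as one minus this error is also an assumption.

```python
import random

def train_test_split(samples, n_train, seed=0):
    """Shuffle a copy of the samples and cut it into training and test parts."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def mean_relative_error(y_true, y_pred):
    """Average of |prediction - truth| / truth; truths must be non-zero."""
    return sum(abs(p - t) / t for t, p in zip(y_true, y_pred)) / len(y_true)

# 367 specimen indices split into 250 training and 117 test samples.
train_ids, test_ids = train_test_split(range(367), n_train=250)
print(len(train_ids), len(test_ids))

# Toy masses in mg (hypothetical values, not from the datasets).
mre = mean_relative_error([100.0, 200.0], [110.0, 160.0])
print(f"mean relative error = {mre:.2%}")
```

A relative rather than absolute error is a natural choice here because the specimen masses span more than three orders of magnitude.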
The mean relative error of the estimated insect mass was 22.41%. However, it should be noted that this result may not be optimal. For a small sample, the generalization ability of the trained model tends to deteriorate as the number of features increases, which may yield an imprecise result. The SVRRFE model can export a variable importance score, which can be used to evaluate the influence of each variable on the dependent variable, so the optimal subset can be constructed based on SVRRFE. Table 3 shows the ranking of feature importance. The three most useful features are $\log_{10} d$, $\log_{10} \nu$ and $\alpha_2$, and the least important feature is $(\log_{10} \sigma_{yy})^{3}$.
Based on the ranking results, we deleted the lowest-ranked features one by one and trained and tested a new model after each deletion. The relation between estimation accuracy and feature number is shown in Figure 2. As unimportant features were removed, the estimation accuracy of insect mass increased overall, mainly because eliminating irrelevant and redundant features improved the performance of the model. After reaching its highest value of 78%, the estimation accuracy began to decline, because useful features were then being eliminated, which degraded the performance of the model. Therefore, to achieve the optimal estimation accuracy, the first six features listed in Table 3 should be selected.
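The search for the best subset size can be sketched as a simple loop over the ranked feature list. In this sketch, `evaluate` stands for "train on the given features and return the test accuracy"; it is a hypothetical callback, and the `toy_accuracy` curve below is synthetic and unrelated to the results in Figure 2.

```python
def best_subset_size(ranked_features, evaluate):
    """Drop the lowest-ranked feature one at a time and keep the size that
    gives the highest accuracy. `evaluate(subset)` must return a score."""
    best_k, best_acc = None, float("-inf")
    history = {}
    for k in range(len(ranked_features), 0, -1):
        acc = evaluate(ranked_features[:k])    # top-k features only
        history[k] = acc
        if acc >= best_acc:                    # on ties, prefer fewer features
            best_k, best_acc = k, acc
    return best_k, best_acc, history

def toy_accuracy(subset):
    """Synthetic score that peaks at three features (illustration only)."""
    return 1.0 - 0.1 * abs(len(subset) - 3)

ranked = ["f1", "f2", "f3", "f4", "f5"]        # hypothetical ranked names
best_k, best_acc, history = best_subset_size(ranked, toy_accuracy)
print(best_k, best_acc)
print(history)
```

Because the ranking is fixed in advance, only as many models need to be trained as there are features, rather than one per possible subset.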

Comparison with Traditional Methods
In this section, we compare the SVR model with traditional insect mass estimation methods, which are mostly based on polynomial fitting. Five traditional methods that have been shown to give relatively good results were selected for comparative analysis. The fitting results are shown in Figure 3 and the estimation results based on traditional methods are listed in Table 4. Third-order polynomials were adopted to characterize the relation between the logarithm of mass and the logarithm of a given feature (namely $a_0$, $\sigma_{yy}$, $\nu$ or $d$). The refitted empirical formulas were also calculated (see Equations (8)-(12) for details). Among the traditional methods, the estimation based on parameters $a_0$ and $\alpha_2$ achieved the best result, while the estimation based only on parameter $a_0$ produced a relatively poor result. Compared with the traditional methods, the SVR method achieved a better estimation result, which indicates that the model constructed by SVR can be employed for insect mass estimation.
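The single-parameter polynomial approach can be sketched as an ordinary least-squares fit of a cubic in log-log space. The sketch below uses only the Python standard library and solves the normal equations by Gaussian elimination; it is a generic fitting routine, not the refitted formulas of Equations (8)-(12), and the function names and the synthetic coefficients are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                        # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_cubic(x, y):
    """Least-squares [c0, c1, c2, c3] for y ~ c0 + c1*x + c2*x^2 + c3*x^3."""
    powers = [[xi ** k for k in range(4)] for xi in x]
    A = [[sum(p[i] * p[j] for p in powers) for j in range(4)] for i in range(4)]
    rhs = [sum(p[i] * yi for p, yi in zip(powers, y)) for i in range(4)]
    return solve_linear(A, rhs)

def predict_mass_mg(coefs, log_feature):
    """Evaluate the cubic in log10 space and convert back to mass in mg."""
    log_mass = sum(c * log_feature ** k for k, c in enumerate(coefs))
    return 10 ** log_mass

# Synthetic check: data generated from a known cubic (hypothetical coefficients).
true_coefs = [0.5, 1.0, -0.25, 0.1]
log_x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
log_m = [sum(c * v ** k for k, c in enumerate(true_coefs)) for v in log_x]
coefs = fit_cubic(log_x, log_m)
print([round(c, 6) for c in coefs])
```

Each such fit uses a single RCS parameter, which is the limitation that motivates combining several parameters in one regression model.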
Remote Sens. 2020, 12, 1903
• $\log_{10} \sigma_{yy}$ method (Aldhous et al. (1989) [19]).

Discussion
For decades, entomologists have paid close attention to the movement of migrant insects, especially their quantification and identification [8,16]. Great progress has been made with current radar technology. However, reliable identification of radar targets remains a key problem in insect migration research. For radar-based species identification of migratory insects, the main parameters currently available are the wingbeat frequency, mass and size of the insect. The wingbeat frequency can already be measured [36], but there are still obvious deficiencies in the measurement of insect mass. In this paper, we studied the estimation of insect mass using a variety of RCS parameters and their transformations with the SVR algorithm. The good performance of the proposed method can provide insights for research on species identification of migratory insects.
It should be emphasized that the number of insect specimens used in this paper is relatively small. In particular, the number of experimental insects heavier than 1000 mg is very limited. For machine learning methods such as SVR, the accuracy of the trained model generally improves as the number of specimens increases. Therefore, the RCS parameters of more insects will be measured in future studies, especially for insects heavier than 1000 mg. In addition, if enough experimental insects are available, more features can be considered and the estimation accuracy of insect mass may be further improved.
RCS parameters have been shown to have the potential to improve the estimation accuracy of insect mass. Mass and shape parameters of insects are important characteristics for recognizing their identity. Therefore, the features used for mass estimation can also be applied to the study of insect species identification. However, in this study most species have fewer than 20 samples, which limits further study of species identification based on RCS parameters. If a large number of samples of the species of interest become available, insect species identification based on machine learning methods can be investigated. More specimens of each species should be included in our future experiments.
In addition, it has been shown that feature parameters extracted from a multi-frequency scattering curve can also be used to estimate insect mass [22]. The scattering of almost all insects at Ku band is generally in the resonance region [22], where the variation of scattering with frequency is very complex and is difficult to describe with a single mathematical formula for various insects. Therefore, it is challenging to extract more useful information from Ku band for insect mass estimation. Machine learning methods may be a viable way to overcome this difficulty. However, too few multi-frequency scattering data of insects have been published to build a sophisticated machine learning model. If enough multi-frequency RCS data are accumulated in the future, the estimation of insect mass based on multi-frequency information can be considered.

Conclusions
In this paper, a method based on SVR is proposed for the estimation of insect mass. Fifteen features were extracted and evaluated. For insect mass estimation, the three most important features were $\log_{10} d$, $\log_{10} \nu$ and $\alpha_2$. The optimal estimation model was obtained when the six most important features were used, and the best estimation accuracy of insect mass was 78%. In conclusion, the proposed method can estimate insect mass more accurately than the traditional methods considered, which provides effective support for species identification with entomological radar. In the future, multi-frequency and full-polarization scattering information may be used together to improve insect species identification performance. In addition, if more experimental insects are available, the estimation accuracy of insect mass may be further improved.