Recognition of the Duration and Prediction of Insect Prevalence of Stored Rough Rice Infested by the Red Flour Beetle (Tribolium castaneum Herbst) Using an Electronic Nose

The purpose of this research is to explore the feasibility of applying an electronic nose to the intelligent monitoring of injurious insects in a stored grain environment. In this study, we used an electronic nose to sample rough rice at three degrees of red flour beetle (Tribolium castaneum Herbst) infestation, namely light degree (LD), middle degree (MD), and heavy degree (HD), over different durations, and manually counted the insects at the same time points. The manual counts show that, in all three rice treatments, the number of insects gradually decreased after infestation. When the insect population of stored rough rice fell below 13 insects per 60 g of rough rice, the natural decline of the population became very slow, which marks the best period for artificial insect killing. Linear discriminant analysis (LDA) performed well in identifying the infestation duration for MD and HD, but poorly for LD. Both k-means clustering analysis (K-means) and fuzzy c-means analysis (FCM) effectively identified the infestation duration of stored rough rice. The insect prevalence predictions of the back-propagation artificial neural network (BPNN) for the three degrees of rough rice infestation demonstrated that the electronic nose could effectively predict insect prevalence in stored grain (fitting coefficients were larger than 0.89). The predictive ability was best for LD, second best for MD, and least accurate for HD. This experiment demonstrates the feasibility of using electronic noses to detect both the duration and the prevalence of an insect infestation in stored grain and provides a reference for the intelligent monitoring of insect infestations in stored grains.


Introduction
Rice is the most important crop in China. Approximately 65% of Chinese people rely on rice as a staple food. China is also the largest rice-producing country in the world, accounting for approximately 30% of the world's total production [1]. Pest insects are one of the main causes of grain loss. Researchers have reported [2] that 5% of the total grain in the world is lost to insect infestation every year. If manpower, material resources, and technology cannot meet the needs of grain protection, losses can reach 20%-30% of total grain. Annual losses in grain depots in China were approximately 0.2% of total grain production. Pest insects must be accurately detected to purposely administer prophylaxis and

Experimental Materials
The experimental rough rice was of the "Meixiangzhan" variety and was harvested at a Baiyun test field, Zhongluotan, Guangdong province, China, in December 2014. After harvesting, the rice was dried naturally in sunlight until its water content stabilized between 12% and 14% (the best water content for rice storage). Then, unbroken and undamaged rough rice samples were selected for the test. Each experimental sample contained 60 g of rough rice and was placed in a 200 mL glass beaker. All of the beakers were cleaned using an ultrasonic cleaner and air-dried in a room with no abnormal smells before the rice was added. There was 1 control (no-treatment, NT) and 3 experimental treatments: light degree damage treatment (LD), middle degree damage treatment (MD), and heavy degree damage treatment (HD). Each treatment consisted of five replications. Red flour beetles were introduced manually: each LD sample received 10 beetles, each MD sample received 50 beetles, and each HD sample received 100 beetles. Each sample was placed in a plastic box (length × width × height = 23 × 17 × 14.5 cm). During storage, each sample was sealed with a gauze element, and the cover of the plastic box was closed. The first test took place 2 days before infestation (−2 d), and further tests followed 2 days, 7 days, 13 days, 22 days, 28 days, and 36 days after infestation (2 d, 7 d, 13 d, 22 d, 28 d, and 36 d). Each test combined electronic nose sampling with manual work detection.

Insect Situation Investigation
The manual work detection method was used in this experiment to investigate insect prevalence. To conduct the investigation, each rice sample was spread onto 2 sheets of white paper and the live insects were counted directly, both with the naked eye and with a magnifying glass. Afterwards, the rice samples were returned to their respective beakers for storage.
For electronic nose sampling, the gauze elements were first removed from the beakers, and the beakers were then sealed with double-layer plastic film for 1 h. The sampling parameters were set as follows: sampling interval, 1 s; flush time, 70 s; zero point trim time, 10 s; measurement time, 60 s; presampling time, 5 s; and injection flow, 300 mL/min. This experiment acquired 140 electronic nose measurements (4 degrees of damage × 5 samples for each degree × 7 durations for each sample = 140) from rough rice at the different degrees of infestation. The value recorded at 55 s for each sample was used as the feature value.
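The feature extraction described above can be sketched as follows. This is a minimal illustration, assuming each measurement is stored as a list of per-second sensor readings; the data layout, the names `response` and `feature_at`, and the use of three sensors are hypothetical and are not taken from the experiment.

```python
# Hypothetical layout: response[t - 1] holds the sensor readings at second t
# of the 60 s measurement window (sampling interval of 1 s).

def feature_at(response, second=55):
    """Return the sensor readings recorded at the given second (1-based)."""
    if not 1 <= second <= len(response):
        raise ValueError("requested time lies outside the measurement window")
    return response[second - 1]

# Synthetic response: 60 time steps from 3 imaginary sensors.
response = [[0.01 * t, 0.02 * t, 0.03 * t] for t in range(1, 61)]
feature = feature_at(response)  # readings at 55 s

# Size of the data set described in the text:
# degrees of damage x replicates x sampling time points.
n_measurements = 4 * 5 * 7
```

Taking a single late time point keeps one value per sensor for each measurement, so every sample becomes a short feature vector.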

Data Pre-Processing
Electronic noses are reported to be inevitably affected by drift noise when detection is based on a time series [17]. This drift noise is usually caused by changes in environmental temperature and humidity, or by changes in the sample itself. Thus, following the results of Yin et al. [18], this experiment used the reference signal removal method for data pre-processing. Specifically, we subtracted the average of the 5 NT electronic nose measurements taken at a given time from the LD, MD, and HD measurements taken at the same time. This removed the drift noise common to that time point and kept only the useful signal caused by insect infestation.
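A minimal sketch of this reference signal removal is given below, assuming each measurement has already been reduced to a feature vector with one value per sensor. The function names and the numbers are hypothetical and serve only to illustrate the subtraction.

```python
def mean_vector(rows):
    """Element-wise mean of equally long feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def remove_reference(sample, control_replicates):
    """Subtract the mean control (NT) response from one infested sample."""
    baseline = mean_vector(control_replicates)
    return [s - b for s, b in zip(sample, baseline)]

# Synthetic feature vectors (2 imaginary sensors) taken at the same time point.
nt_replicates = [[1.0, 2.0], [1.2, 2.2], [0.8, 1.8], [1.1, 2.1], [0.9, 1.9]]
ld_sample = [1.5, 2.6]

corrected = remove_reference(ld_sample, nt_replicates)
```

Because the control and the infested samples are measured at the same time, any drift shared by both cancels in the difference.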

Data Processing Method
Linear discriminant analysis (LDA) is a linear pattern recognition method that uses dimensionality reduction. LDA focuses on the distribution of the samples within each treatment and on the distances between treatments. It collects information from all of the sensors and projects the data onto discriminant directions chosen so that the samples within a treatment are condensed while samples from different treatments are kept far apart [19].
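For reference, the standard Fisher formulation of LDA can be written as follows. The notation is an assumption introduced here and does not appear in the original text: $C_k$ is the set of samples in treatment $k$, $n_k$ its size, $\mu_k$ its mean, and $\mu$ the overall mean.

```latex
\[
S_W = \sum_{k=1}^{K} \sum_{x \in C_k} (x - \mu_k)(x - \mu_k)^{\top},
\qquad
S_B = \sum_{k=1}^{K} n_k \, (\mu_k - \mu)(\mu_k - \mu)^{\top}
\]
\[
w^{*} = \arg\max_{w} \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w},
\qquad
S_W^{-1} S_B \, w_i = \lambda_i w_i
\]
\[
\text{contribution of the } i\text{-th discriminant factor}
= \frac{\lambda_i}{\sum_{j} \lambda_j}
\]
```

Under this formulation, the contribution reported for each linear discriminant factor is the share of the total between-treatment separation captured by that factor.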
K-means clustering analysis (K-means) is a clustering method based on partitioning. It attempts to find the K best clustering centers via iteration, allotting all sample data points to K clustering centers and minimizing the sum of the square errors of clustering at the same time [20].
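The iteration described above can be sketched in a few lines. This is a minimal illustration of Lloyd's algorithm on synthetic two-dimensional points, not the implementation used in the experiment; the initial centers are chosen by hand here, which is an assumption.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate an assignment step and a center update."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    # Sum of squared errors, the quantity K-means tries to minimize.
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse

points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
labels, centers, sse = kmeans(points, centers=[points[0], points[3]])
```

Each point receives exactly one label, and the result depends on the initial centers.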
Fuzzy c-means clustering (FCM) is an unsupervised clustering method that allows one data point to belong to two or more clusters. It is based on the minimization of the objective function Jm, where m is the weighted index of Jm and is a real number greater than 1. The value of m determines the form of the objective function. Thus, finding a suitable m value is important for accurate FCM classification [21].
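A minimal sketch of FCM is given below, using the standard alternating updates of memberships and centers. The synthetic points, the hand-picked initial centers, and the choice m = 2 are assumptions made for illustration and are not the settings used in the experiment.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(points, centers, m):
    """u[i][j]: degree to which point i belongs to cluster j (rows sum to 1)."""
    u = []
    for p in points:
        # Clamp distances so a point lying exactly on a center does not
        # cause a division by zero.
        inv = [max(squared_distance(p, c), 1e-12) ** (-1.0 / (m - 1.0))
               for c in centers]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def fcm(points, centers, m=2.0, n_iter=50):
    centers = [list(c) for c in centers]
    dims = range(len(points[0]))
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        # Each center becomes the mean of all points weighted by u ** m.
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            centers[j] = [sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                          for d in dims]
    u = memberships(points, centers, m)
    # Objective J_m: membership-weighted sum of squared distances.
    j_m = sum(u[i][j] ** m * squared_distance(points[i], centers[j])
              for i in range(len(points)) for j in range(len(centers)))
    return u, centers, j_m

points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
u, centers, j_m = fcm(points, centers=[points[0], points[3]], m=2.0)
```

Values of m close to 1 push the memberships towards 0 or 1, while larger values spread each point across clusters.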
Back-propagation artificial neural networks (BPNNs) are among the most commonly used neural networks and include input, hidden, and output layers. While a BPNN is being trained, the weights and threshold values of each layer are constantly revised. The BPNN adjusts the weights and threshold values repeatedly based on the differences between the expected outputs and actual outputs. Thus, a BPNN is a neural network that propagates information in the forward direction and propagates the error in the reverse direction. Training continues until the difference between the expected outputs and actual outputs falls within a preset range or until the preset number of training iterations is reached [22].
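The training loop described above can be sketched with a small network. This is an illustration only: it uses one hidden layer, a linear output, and plain gradient descent, and every hyperparameter and data value below is an assumption.

```python
import math
import random

def train_bpnn(inputs, targets, n_hidden=3, lr=0.1, epochs=2000, tol=1e-4, seed=0):
    rng = random.Random(seed)
    n_in = len(inputs[0])
    # Weights and threshold values (biases) of the hidden and output layers.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
             for j in range(n_hidden)]
        return h, sum(w2[j] * h[j] for j in range(n_hidden)) + b2

    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, t in zip(inputs, targets):
            h, y = forward(x)            # forward pass
            err = y - t                  # actual output minus expected output
            sq_err += err * err
            for j in range(n_hidden):    # backward pass
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(sq_err / len(inputs))
        if history[-1] < tol:            # error reduced to the preset range
            break
    return (lambda x: forward(x)[1]), history

# Synthetic, scaled data: one input feature and a declining target value.
inputs = [[0.0], [0.25], [0.5], [0.75], [1.0]]
targets = [0.9, 0.7, 0.5, 0.3, 0.1]
predict, history = train_bpnn(inputs, targets)
```

The two stopping conditions in the loop correspond to the preset error range and the preset number of training iterations.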

Change of the Insect Number at Different Degrees and Durations of Infestation
The average number of live insects across the five replications (rounded to a whole number) was used to represent the live insect number of each infestation degree at a specific time. The changes in insect number for the different infestation degrees of rough rice at different times are shown in Table 1. After infestation, the number of live red flour beetles gradually decreased in all of the LD, MD, and HD samples. The decrease in LD was slow and tended to stabilize after 28 d. The decrease in MD was rapid from 0 to 13 d after infestation, but slowed after 13 d. The decrease in HD was rapid from 0 to 22 d after infestation, but slowed after 22 d. From these trends, we can infer that insect numbers decline only slowly by natural means once populations in stored grain fall below 13 individuals/60 g, so this is the period in which manual insect killing becomes necessary.
The LDA results for infestation duration recognition of LD are shown in Figure 1a. The first linear discriminant factor (LD1) contributed 53.84%, and the second linear discriminant factor (LD2) contributed 19.79%, for a total contribution of LD1 and LD2 of 73.63%. None of the infestation durations overlapped with the others, so all could be classified. However, 13 d was close to 22 d, and the two could be confused in practical classification and recognition. The LDA results for the infestation duration recognition of MD are shown in Figure 1b. LD1 contributed 56.61%, while LD2 contributed 29.63%, for a total contribution of LD1 and LD2 of 86.24%. None of the infestation durations overlapped, and all could be classified effectively. The LDA results for the infestation duration recognition of HD are shown in Figure 1c. LD1 contributed 64.01%, and LD2 contributed 22.87%, for a total contribution of LD1 and LD2 of 86.88%. None of the infestation durations overlapped with the others, so all could be classified.
Thus, LDA classified the infestation duration well for MD and HD, but poorly for LD. This may be because the volatiles of the LD samples (the least infested) were weaker and changed more slowly than those of MD and HD.

Clustering Analysis for Duration Recognition of Different Infestation Degrees of Rough Rice
To further explore the duration recognition accuracy for different infestation degrees of rough rice, clustering analysis was applied in this research. Four clustering analysis methods are commonly used, namely, K-means, hierarchical clustering, SOM clustering, and FCM. However, research has shown that K-means and FCM often have better classification abilities than the other two methods [23]. Thus, this experiment used K-means and FCM to further classify the duration of different infestation degrees of rough rice. In addition, research shows that using the linear discriminant factor matrices of the LDA results as feature values in place of the original ones can usually further reduce redundant information and yield a better classification effect [24]. Thus, this experiment used the linear discriminant factor matrix of the LDA results as feature values to explore the feasibility of using an electronic nose for the duration recognition of different infestation degrees of rough rice.
The classification results of K-means for the duration of different infestation degrees of rough rice are shown in Table 2. The K-means classification accuracies for the infestation duration of LD, MD, and HD were 88.57%, 91.42%, and 91.42%, respectively. When FCM was used for classification, the weighted m values (weighted index of the objective function Jm) had significant impacts on the classification results. The m values selected after repeated analyses and the corresponding classification results are shown in Table 3. The FCM classification accuracies for the infestation duration of LD, MD, and HD were 94.29%, 100%, and 100%, respectively. FCM therefore classified the duration of different infestation degrees in rough rice better than K-means.

BPNN for the Prediction of the Live Insect Amount in Infested Rough Rice
We used BPNN to explore the feasibility of predicting insect prevalence in stored rough rice based on an electronic nose. Three infestation degrees were included in the analysis, namely, LD, MD, and HD. Each treatment included five samples. Thus, 105 electronic nose sample data, acquired over 7 infestation durations, were available for the insect prevalence prediction. For each treatment and infestation duration, we randomly chose 4 rough rice samples as the training set and kept the remaining sample as the test set. Thus, the training set of each treatment had 28 samples (4 samples per duration × 7 infestation durations = 28), and the test set of each treatment had 7 samples (1 sample per duration × 7 infestation durations = 7). In total, this experiment had 84 samples (28 samples per treatment × 3 treatments = 84) in the training set and 21 samples (7 samples per treatment × 3 treatments = 21) in the test set. After repeated training, the BPNN model parameters of LD, MD, and HD were as follows: the numbers of hidden-layer neurons were 21, 18, and 21, respectively; all of the treatments used the BPNN "trainlm" training algorithm; all of the treatments used 2 hidden layers; all of the BPNN output layers used the "tansig" function; and the BPNN iteration numbers and sampling frequencies were 2000 and 25, respectively, for all treatments.
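The split described above can be checked with a short sketch. The record layout, the variable names, and the fixed random seed are assumptions made for illustration; only the counts come from the text.

```python
import random

treatments = ["LD", "MD", "HD"]
durations = ["-2 d", "2 d", "7 d", "13 d", "22 d", "28 d", "36 d"]
replicates = [1, 2, 3, 4, 5]

rng = random.Random(0)  # fixed seed, an assumption made for repeatability
train, test = [], []
for treatment in treatments:
    for duration in durations:
        # Hold out one randomly chosen replicate; the other four are for training.
        held_out = rng.choice(replicates)
        for rep in replicates:
            record = (treatment, duration, rep)
            (test if rep == held_out else train).append(record)

train_per_treatment = {t: sum(1 for r in train if r[0] == t) for t in treatments}
test_per_treatment = {t: sum(1 for r in test if r[0] == t) for t in treatments}
```

Holding out one replicate at every duration ensures that the test set of each treatment covers all 7 time points.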
The matched curves of the BPNN prediction values and actual values for the live insect amounts in infested rough rice are shown in Figure 2. The BPNN prediction results for the LD, MD, and HD test sets are shown in Table 4. The average difference values and average relative errors between the predicted and actual values are smallest for LD, intermediate for MD, and largest for HD. The numbers of correct predictions for LD, MD, and HD were 4, 2, and 1, respectively. Thus, the prediction power is best for LD, intermediate for MD, and worst for HD. According to the experimental results, we can infer that the electronic nose method is most suitable for insect prevalence prediction in stored grain when the insect infestation first occurs.
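The error measures mentioned above can be computed as in the following sketch. The numbers are hypothetical and are not taken from Table 4. Counting a prediction as correct when its rounded value equals the actual count is also an assumption, since the criterion is not stated in the text.

```python
def prediction_errors(predicted, actual):
    """Return (mean absolute difference, mean relative error, number correct)."""
    diffs = [abs(p - a) for p, a in zip(predicted, actual)]
    # The relative error is undefined when the actual count is zero
    # (for example before infestation), so those samples are skipped.
    rel = [d / a for d, a in zip(diffs, actual) if a != 0]
    mean_abs = sum(diffs) / len(diffs)
    mean_rel = sum(rel) / len(rel) if rel else float("nan")
    # Assumed criterion: a prediction is correct if it rounds to the actual count.
    n_correct = sum(1 for p, a in zip(predicted, actual) if round(p) == a)
    return mean_abs, mean_rel, n_correct

# Hypothetical predicted and actual live insect counts for one test set.
predicted = [10.2, 8.6, 7.9, 6.0]
actual = [10, 9, 7, 6]

mean_abs, mean_rel, n_correct = prediction_errors(predicted, actual)
```

Reporting the relative error alongside the absolute difference makes treatments with very different insect counts easier to compare.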

Discussion
This paper explored the feasibility of using an electronic nose for inferring infestation duration and prevalence in stored grains. The results show that electronic noses can effectively predict the stored grain infestation duration and insect prevalence.
After infestation, the live insect numbers of all three infestation categories decreased gradually. Furthermore, the higher the degree of infestation, the more rapid the decrease in live insect number. Zhou et al. [25] reported that serious internecine phenomena occur if the stocking density is too high, which causes a gradual decrease in the insect survival rate. However, once the insect survival rate reaches a particular value, it tends to stabilize. The results of Zhou et al. are consistent with the changes in insect number found in this experiment.
The LDA, FCM, and K-means results show that MD and HD rough rice were recognized better than LD rough rice, which indicates that the heavier the insect infestation is, the more distinctive the odors in the storage environment will be [11]. In addition, FCM performed better than K-means for the infestation duration prediction. This is because K-means assigns each sample to only one class and cannot assign a sample to several classes at the same time [26]. K-means is also sensitive to noise and abnormal values and can easily become trapped in local optima during optimization. In contrast, FCM allows each sample to be assigned to several classes at the same time [27]. Compared with K-means, FCM is therefore less prone to certain problems, such as becoming trapped in local optima.
The BPNN results show that predicting insect prevalence in stored rough rice based on electronic nose data is feasible for all infestation degrees. The predictive ability was best for LD, intermediate for MD, and worst for HD. This may be because interference during the detection period increases as more insects occupy a fixed space, where the interference may arise from the smell of dead insects and the metabolism of live insects. How to overcome this interference during insect prevalence detection in stored grain is a valuable direction for future research. In practical insect prevalence detection, the infestation of an entire granary could be evaluated repeatedly from random samples by combining electronic nose detection with a BPNN model.

Conclusions
This experiment used an electronic nose (PEN3) to sample stored rough rice at different red flour beetle infestation degrees at different time points (2 days before infestation, 2 days, 7 days, 13 days, 22 days, 28 days, and 36 days after infestation). The manual work detection method was used to count the live insect numbers at the same time. The experimental results are as follows.
(1) The results of the manual work detection method show that live red flour beetle numbers in all of the LD, MD, and HD treatments gradually decreased. We can infer that the insect number naturally declines slowly when the insect population in stored grain is under 13 individuals/60 g, which is the period in which manual insect control becomes necessary. We can also infer that the electronic nose method is most suitable for insect prevalence prediction in stored grain when insects first infest the grain.
(4) This experiment supports the feasibility of predicting the infestation duration and insect prevalence in stored grains based on electronic nose measurements and provides a reference for the intelligent monitoring of stored grain insect infestations.