Classification of Amanita Species Based on Bilinear Networks with Attention Mechanism

Abstract: Accurate classification of Amanita supports research on its biological-control and medicinal value and can also help prevent mushroom-poisoning incidents. In this paper, we construct a bilinear convolutional neural network (B-CNN) with an attention mechanism, based on transfer learning, to classify Amanita species. During training, weights pre-trained on ImageNet are used for initialization, and the Adam optimizer updates the network parameters. During testing, images of Amanita at different growth stages were used to further assess the generalization ability of the model. Compared with other models, ours greatly reduces the number of parameters while achieving high accuracy (95.2%) and generalizes well. It is an efficient classification model that offers a new option for mushroom classification in regions with limited computing resources.


Introduction
Amanita is a genus of large fungi and an important part of natural medicine resources. At present, using the characteristics of amatoxins to control and treat tumors is a promising approach [1][2][3]. Amanita muscaria is a famous hallucinogenic mushroom that can be used to develop special drugs for anesthesia and sedation [4]. In terms of biological control, the toxins contained in Amanita albicans and Amanita muscaria have certain trapping and killing effects on insects and agricultural pests [3,5]. At present, there is no artificially cultivated Amanita, so the amatoxins needed in scientific research can only be extracted from fruiting bodies collected in the field [6][7][8]. Moreover, owing to a lack of knowledge and ability to identify poisonous mushrooms, a number of deaths from eating wild mushrooms occur every year [9][10][11][12][13]. In Europe, 95% of mushroom-poisoning deaths are caused by poisonous Amanita [14,15]. Therefore, accurate classification and identification are necessary both for exploiting their value and for preventing poisoning.
Many researchers have contributed to the classification of mushrooms. For example, Ismail [16] studied characteristics of mushrooms such as the shape, surface, and color of the cap, roots, and stems, used principal component analysis (PCA) to select the best features, and classified them with a decision tree (DT) algorithm. Pranjal Maurya [17] used a support vector machine (SVM) classifier to distinguish edible and poisonous mushrooms, with an accuracy of 76.6%. Xiao [18] used the ShuffleNetV2 model to quickly identify the toxicity of wild fungi; the model's accuracy is 55.18% for Top-1 and 93.55% for Top-5. Chen [19] used the Keras platform to build a convolutional neural network (CNN) for end-to-end model training and migrated it to Android to realize mushroom recognition on mobile devices, but the recognition performance of the model was poor. Preechasuk J [20] proposed a CNN model for classifying 45 types of mushrooms, including edible and poisonous ones. The main contributions of this paper are as follows: (1) A self-built Amanita dataset of 3219 Amanita images obtained from the Internet and divided into classes. (2) A bilinear convolutional neural network model was built and fine-tuned to make it more suitable for the dataset. (3) The bilinear convolutional neural network model is combined with an attention mechanism to improve it, allowing the most effective information to be obtained quickly.

Image Dataset
In this paper, the original dataset comes from two sources. The first is a mushroom dataset downloaded from the Kaggle platform. The data on Kaggle are a public data source with a certain degree of authority. Special thanks are owed to the Nordic Society of Mycologists, who provided the most common mushroom sources in the region on Kaggle and checked the data and labels. We selected the Amanita images based on the labels of the mushroom dataset. The second set of mushroom images was collected from http://www.mushroom.world (accessed on 24 September 2020). We searched the mushroom database on this website by mushroom name. Then, we recorded the color and structure of the cap of each Amanita (such as egg-shaped, unfolded, umbrella-shaped, or spherical) according to [29] to confirm the species again. Finally, the Amanita dataset was obtained, as shown in Table 1. To make the model more applicable to the wild environment, most of the pictures show Amanita growing in the wild, but some pictures of hand-picked Amanita are also included.

Data Augmentation
There are seven kinds of Amanita in the original dataset, a total of 3219 pictures. All samples are randomly divided into training and test datasets at a ratio of 8:2.
Training convolutional neural networks requires a large amount of image data to prevent overfitting. Therefore, in this paper, the input images are augmented using the built-in ImageDataGenerator [30,31] interface of TensorFlow 2.0. The number of images is increased by combining random rotation, translation, shearing, and other operations, as shown in Figure 1b-e. This method increases the number of images roughly sixfold.
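As a rough, library-free illustration of this kind of expansion (the paper itself uses ImageDataGenerator with random transforms; the fixed rotations and flips below are a simplified NumPy stand-in), each source image can be turned into several variants:

```python
import numpy as np

def augment_sixfold(img):
    """Return six simple variants of an (H, W, C) image: the original,
    three rotations, and two flips. A NumPy stand-in for the random
    rotation/translation/shearing done by ImageDataGenerator."""
    return [
        img,
        np.rot90(img, 1),
        np.rot90(img, 2),
        np.rot90(img, 3),
        np.fliplr(img),
        np.flipud(img),
    ]

image = np.random.rand(224, 224, 3)
variants = augment_sixfold(image)
print(len(variants))  # 6
```

In practice the random parameterization of ImageDataGenerator is preferable, since fixed rotations add less diversity than continuous random transforms.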

The EfficientNet Model
EfficientNet [32] was proposed by the Google team in 2019. It jointly optimizes network width, depth, and input image resolution to improve performance, achieving higher accuracy with fewer model parameters.
EfficientNet-B4 was selected after comprehensively considering the parameters and accuracy of the eight models (EfficientNet-B0 to B7) reported in the literature [33,34]. The network structure of EfficientNet-B4 is shown in Figure 2.
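The joint optimization of width, depth, and resolution can be made concrete. In the EfficientNet paper, a compound coefficient φ scales all three together via fixed constants α, β, γ; the sketch below uses the constants from the original paper, though note that the released B1-B7 checkpoints (including B4) use hand-tuned coefficients rather than these exact powers:

```python
# Compound scaling from the EfficientNet paper: depth = alpha**phi,
# width = beta**phi, resolution = gamma**phi, with the constraint
# alpha * beta**2 * gamma**2 ~= 2, so FLOPs grow roughly as 2**phi.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }

print(compound_scale(1))  # one scaling step: 20% deeper, 10% wider, 15% higher resolution
```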

Bilinear Convolutional Neural Networks
Bilinear convolutional neural networks (B-CNN) [35,36], proposed by Lin in 2015, are a representative weakly supervised approach to fine-grained classification. The network structure is shown in Figure 3. Two convolutional neural networks, A and B, extract two features at each position of the image; the features are combined by an outer product, and classification is finally performed by the classification layer. The two streams cooperate: CNN A locates the distinctive parts of the image, and CNN B extracts the features of the regions detected by CNN A [37]. In this way, the part-detection and feature-extraction tasks of fine-grained image classification are completed.
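The outer-product combination can be sketched numerically. Assuming two feature maps over the same spatial grid (the shapes below are illustrative, not the paper's actual dimensions), the bilinear descriptor is the sum of per-location outer products, typically followed by signed square-root and L2 normalization before the classification layer:

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, ca, cb = 7, 7, 16, 16           # illustrative spatial grid and channel counts
feat_a = rng.random((h * w, ca))      # stream A: one feature row per location
feat_b = rng.random((h * w, cb))      # stream B: features at the same locations

# Sum of outer products over all locations -> (ca, cb) bilinear feature
bilinear = np.einsum("lc,ld->cd", feat_a, feat_b)

# Common post-processing before the classification layer
desc = np.sign(bilinear) * np.sqrt(np.abs(bilinear))  # signed square root
desc = desc.ravel() / np.linalg.norm(desc)            # L2 normalization
```

The einsum is equivalent to `feat_a.T @ feat_b`; the pooled descriptor discards spatial layout, which is what makes the representation orderless yet sensitive to pairwise feature interactions.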

Visual Attention Mechanism
In this paper, we chose mixed attention, which combines multiple attention mechanisms and can bring better performance to a certain extent. Representative work in this area is the Convolutional Block Attention Module (CBAM) [38], which is based on a channel attention module (CAM) and a spatial attention module (SAM). CAM passes the max-pooling output and the average-pooling output through a shared network [39]. SAM generates two feature maps representing different information by performing global average pooling and global max pooling; after merging the two feature maps, feature fusion is performed by a 7 × 7 convolution with a larger receptive field, and finally a weight map is generated by a Sigmoid operation and applied to the original input feature map [40], enhancing the target area.
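A simplified NumPy sketch of the two modules follows; the shared two-layer network and the 7 × 7 convolution are stood in for by small matrix multiplications, so this illustrates the data flow rather than a faithful CBAM implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """CAM: pass avg- and max-pooled channel vectors through a shared
    two-layer network, sum, squash with a sigmoid, reweight channels."""
    avg = x.mean(axis=(0, 1))                      # (C,)
    mx = x.max(axis=(0, 1))                        # (C,)
    shared = lambda v: np.maximum(v @ w1, 0) @ w2  # shared MLP
    weights = sigmoid(shared(avg) + shared(mx))    # (C,)
    return x * weights

def spatial_attention(x, kernel):
    """SAM: pool over channels, stack the avg and max maps, fuse them
    (a per-pixel mix standing in for the 7x7 conv), squash, reweight."""
    avg = x.mean(axis=2, keepdims=True)            # (H, W, 1)
    mx = x.max(axis=2, keepdims=True)              # (H, W, 1)
    stacked = np.concatenate([avg, mx], axis=2)    # (H, W, 2)
    weights = sigmoid(stacked @ kernel)            # (H, W, 1)
    return x * weights

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8, 4))
out = spatial_attention(
    channel_attention(x, rng.standard_normal((4, 2)), rng.standard_normal((2, 4))),
    rng.standard_normal((2, 1)),
)
```

Applying CAM before SAM matches the sequential arrangement CBAM uses and that this paper adopts.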

Our Model
In this paper, a bilinear EfficientNet-B4 model is built and combined with the convolutional block attention module (CBAM). CBAM, shown in Figure 5, combines spatial and channel attention modules. The structure obtained by adding the attention mechanism to the bilinear model is shown in Figure 6.
The overall process of using the model is: (1) Use the EfficientNet-B4 network architecture to extract feature layers from the augmented input images. (2) Pass the output of the convolutional layers through CBAM: a channel attention module first produces a weighted result, a spatial attention module is then applied, and the final feature representation is obtained by weighting.

Parameters and Index
In the choice of optimizer, we compared stochastic gradient descent (SGD) [41] and adaptive moment estimation (Adam) [42] and found that model performance decreased by about 1% with SGD (learning rate 0.001, momentum 0.95), so we chose Adam with hyperparameters beta_1 = 0.9, beta_2 = 0.999, epsilon = 1 × 10^-8, and decay = 0.0 for all models. To compare the accuracy and efficiency of different models, we trained them with unified hyperparameters (Table 2). The simplest and most commonly used metric for evaluating classification models is accuracy, but precision and recall are also needed to evaluate model quality. Precision [43] is the number of correctly predicted images of an Amanita species divided by the number of images the classifier predicted as that species. Recall [44] is the proportion of correctly predicted images of an Amanita species among all images that actually belong to that species. F1-score is the harmonic mean of precision and recall. Accuracy, precision, recall, and F1-score are defined as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1-score = 2 × Precision × Recall / (Precision + Recall)

where TP and TN represent correctly classified Amanita images, and FP and FN indicate misclassified Amanita images.
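These definitions translate directly into code; the counts below are a toy example, not results from the paper:

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(p, r):
    return 2 * p * r / (p + r)  # harmonic mean of precision and recall

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# Toy counts: 90 true positives, 95 true negatives, 10 FP, 5 FN
p, r = precision(90, 10), recall(90, 5)
print(round(accuracy(90, 95, 10, 5), 3), round(f1_score(p, r), 3))  # 0.925 0.923
```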

Model Training
In this paper, the experiments were run on the Google Colaboratory platform with Tesla K80 GPU resources. The programming environment is Python 3, with Keras 2.1.6 and Tensorflow 1.6 as the framework.
The specific model training steps are as follows:
• Data loading: a batch of 32 Amanita pictures is randomly loaded from the training dataset for subsequent processing.
• Image preprocessing: resize each image to 224 × 224 × 3, then pass it through the TensorFlow 2.0 built-in ImageDataGenerator for data augmentation.
• Define the model structure and load the pre-trained weights: load the model (such as EfficientNet-B4) and fine-tune it. Replace the fully connected layer with a custom layer and set the Softmax layer to seven outputs according to the number of classes. Different layers are frozen depending on the model, and transfer learning [45] is used to initialize with ImageNet weights.
• Start training: before training, set the hyperparameters and optimizer for the network, then pass the training pictures (shown in Figure 7) to the neural network. The feature maps of the first convolutional layer are shown in Figure 8. After each round of training, the loss and accuracy on the training dataset are obtained.
• Stop training: to avoid overtraining the network, an early stopping strategy monitors the validation loss during training. When the validation loss does not improve within five rounds, or training reaches the preset number of epochs, training stops.
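The stopping rule above (in Keras, typically an `EarlyStopping(monitor='val_loss', patience=5)` callback) can be sketched as a simple loop over validation losses; the loss values below are illustrative:

```python
def train_with_early_stopping(val_losses, patience=5, max_epochs=100):
    """Stop when the validation loss has not improved for `patience`
    consecutive epochs, or when `max_epochs` is reached."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, wait = loss, 0   # improvement: reset the patience counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best  # stopped early
    return min(len(val_losses), max_epochs), best

losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.6]
print(train_with_early_stopping(losses))  # (8, 0.7): stops before reaching the late 0.6
```

Note the trade-off the example makes visible: a small patience can stop just before a late improvement, so patience is itself a hyperparameter.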

Comparison of Modeling Methods
To verify the performance of the model, we compared the proposed model with other CNN models on the dataset. The structures and parameters of these models are shown in Table 3. Table 3. Some parameters of these models.

Model
Classification Layer In this paper, the steps of the comparison experiment are: (1) Use the VGGnet model with 16 layers (VGG-16) [46], the Residual Network with 50 layers (ResNet-50) [47], compare these two network models with EfficientNet-B4. (2) Combine the bilinear model to build a bilinear EfficientNet-B4, compare the bilinear VGG-16 model (B-CNN(VGG-16, VGG-16)) and the B-CNN(VGG-16, ResNet-50) model. (3) Add an attention mechanism to the model, conduct a comparative experiment, and discuss the effect of adding an attention mechanism. Figure 9 shows the changes in loss and accuracy during training and testing. Table 4 compares the results of different methods. The comprehensive chart can be used to draw the following conclusions: (1) The EfficientNet-B4 is superior to VGG-16 and Resnet-50 in terms of accuracy, model parameters and model size.
(2) On this basis, the bilinear structure was studied and used, and it was found that B-CNN(VGG-16, ResNet-50) has good accuracy. However, it has the largest number of parameters in the model used, and the size of the model is also very large. However,

Comparison of Modeling Methods
In order to verify the performance of the model, we compared the proposed model with other CNN models on the dataset. The structure and parameters of these models are shown in Table 3. Table 3. Some parameters of these models.

Model
Classification In this paper, the steps of the comparison experiment are: (1) Use the VGGnet model with 16 layers (VGG-16) [46], the Residual Network with 50 layers (ResNet-50) [47], compare these two network models with EfficientNet-B4.  Figure 9 shows the changes in loss and accuracy during training and testing. Table 4 compares the results of different methods. The comprehensive chart can be used to draw the following conclusions: (1) The EfficientNet-B4 is superior to VGG-16 and Resnet-50 in terms of accuracy, model parameters and model size.
(2) On this basis, the bilinear structure was studied and used, and it was found that B-CNN(VGG-16, ResNet-50) has good accuracy. However, it has the largest number of parameters in the model used, and the size of the model is also very large. However, Bilinear EfficientNet-B4 has a good performance in accuracy, model size and number of parameters. (3) For EfficientNet-B4 (accuracy rate is 92.76%), after adding the attention mechanism, its accuracy rate is 93.53%, which improves the accuracy rate by 0.77%; after combining the bilinear structure and attention mechanism, its accuracy rate is 95.2%, an increase of 1.77%. In general, adding an attention mechanism to the model will increase the accuracy by about 1% and can reduce the time by 0.5 s.
Bilinear EfficientNet-B4 has a good performance in accuracy, model size and number of parameters. (3) For EfficientNet-B4 (accuracy rate is 92.76%), after adding the attention mechanism, its accuracy rate is 93.53%, which improves the accuracy rate by 0.77%; after combining the bilinear structure and attention mechanism, its accuracy rate is 95.2%, an increase of 1.77%. In general, adding an attention mechanism to the model will increase the accuracy by about 1% and can reduce the time by 0.5 s. Figure 9. The accuracy of models during training and testing. In addition, by comparing the first five rounds of training and testing in Figure 9, it can be found that the accuracy of the test is slightly higher than that of the training. The main reason is that the network is initialized with pre-trained weights. Therefore, the model has better feature extraction capabilities in the first few rounds of testing.
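The bilinear structure used in step (2) combines the feature maps of two backbone streams by outer-product pooling, followed by signed square-root and L2 normalization. A minimal NumPy sketch of this pooling step (random tensors stand in for the EfficientNet-B4 feature maps; this illustrates the standard B-CNN operation, not the authors' exact implementation):

```python
import numpy as np

def bilinear_pool(feat_a: np.ndarray, feat_b: np.ndarray) -> np.ndarray:
    """Bilinear pooling of two CNN feature maps.

    feat_a: (c1, h, w) feature map from stream A
    feat_b: (c2, h, w) feature map from stream B (same spatial size)
    Returns a normalized bilinear feature vector of length c1 * c2.
    """
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    # Outer-product pooling: average the per-location outer products.
    phi = (a @ b.T) / (h * w)                # (c1, c2)
    x = phi.reshape(-1)                      # flatten to (c1 * c2,)
    x = np.sign(x) * np.sqrt(np.abs(x))      # signed square-root normalization
    x = x / (np.linalg.norm(x) + 1e-12)      # L2 normalization
    return x

# Toy example with random "feature maps" in place of real backbone outputs.
rng = np.random.default_rng(0)
fa = rng.standard_normal((8, 7, 7))
fb = rng.standard_normal((16, 7, 7))
v = bilinear_pool(fa, fb)
print(v.shape)  # (128,)
```

The resulting vector is then fed to the classification layer; using two identical streams gives models such as B-CNN(VGG-16, VGG-16), while mixing backbones gives B-CNN(VGG-16, ResNet-50).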

Model Test Results
Not all objects are equally difficult to classify, so it is necessary to examine the accuracy of each category and the confusion between categories. The test dataset contains 638 images; classifying them with the bilinear EfficientNet-B4 with attention yields the confusion matrix shown in Figure 10.
Figure 10 shows that Amanita vaginata, Amanita bisporigera, and Amanita phalloides are difficult to distinguish from one another, giving Amanita vaginata the lowest precision and recall. Inspection of the dataset suggests three main reasons: (1) Amanita vaginata and the pure-white Amanita bisporigera are similar in shape and features, differing mainly in the color of the cap surface, and all three species are very similar in shape during their juvenile period. (2) Some images of Amanita vaginata are overexposed and appear white; their features then closely resemble Amanita bisporigera, so some Amanita vaginata images are classified as Amanita bisporigera. (3) The number of images of this category in the test dataset is small.
In addition, the precision and recall of Amanita muscaria reached 1.0 and 0.959, indicating that this model is best suited to identifying this species.
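The per-class precision and recall quoted here follow directly from the confusion matrix. A short sketch of that computation, using a made-up 3-class matrix rather than the actual counts from Figure 10:

```python
import numpy as np

def per_class_metrics(cm: np.ndarray):
    """Precision and recall per class from a confusion matrix.

    cm[i, j] = number of samples with true class i predicted as class j.
    """
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)  # column sums = predicted counts
    recall = tp / np.maximum(cm.sum(axis=1), 1)     # row sums = true counts
    return precision, recall

# Illustrative counts only (not the paper's data).
cm = np.array([
    [45,  4,  1],   # true class 0
    [ 3, 40,  7],   # true class 1
    [ 0,  2, 48],   # true class 2
])
p, r = per_class_metrics(cm)
print(np.round(p, 3), np.round(r, 3))
```

A precision of 1.0 for a class, as observed for Amanita muscaria, means its column in the matrix contains no misclassified samples from other classes.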
In order to test the robustness of the classifier, we also classified images of other mushroom species and of non-mushroom objects. Since no unknown class was included in the dataset during training, the classifier is forced to assign an image of an unknown category to one of the classes in the dataset; the robustness of the model can therefore be assessed from the prediction probabilities it assigns to such images.
The test results for non-mushroom images are shown in Table 5. Comparing the classification results with the dataset images shows that when an object's shape differs from that of an Amanita, it is classified according to its color. For example, a white cat image is classified as Amanita bisporigera because Amanita bisporigera is white; its predicted probability, however, is only 53%. Across the seven kinds of non-mushroom images tested, the predicted probabilities are uniformly low (<55%).

Table 5. Non-mushroom classification.

Common edible fungi were also used for classification prediction, with results shown in Table 6. Their predicted probabilities are generally higher than those in Table 5, because these mushrooms are more similar to Amanita in appearance. In general, however, images of the Amanita species in the dataset are predicted with probabilities above 97%. Therefore, when the prediction probability falls below a certain value, the prediction result can be treated as an unknown class.
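This rejection rule can be expressed as a simple threshold on the top softmax probability. A sketch of the idea, where the class names and the 0.90 threshold are illustrative assumptions (the paper reports only that known Amanita images score above 97% and unknowns below 55%, without fixing a threshold):

```python
import numpy as np

# Hypothetical class list and rejection threshold (assumptions for illustration).
CLASSES = ["A. muscaria", "A. bisporigera", "A. phalloides", "A. vaginata"]
THRESHOLD = 0.90

def predict_with_rejection(probs: np.ndarray) -> str:
    """Return the predicted class name, or 'unknown' when the top
    softmax probability falls below the rejection threshold."""
    top = int(np.argmax(probs))
    if probs[top] < THRESHOLD:
        return "unknown"
    return CLASSES[top]

# A confident in-distribution prediction vs. a low-confidence one
# (e.g. the white cat image classified as A. bisporigera at 53%).
print(predict_with_rejection(np.array([0.98, 0.01, 0.005, 0.005])))  # A. muscaria
print(predict_with_rejection(np.array([0.53, 0.20, 0.15, 0.12])))    # unknown
```

Any threshold between the two observed probability bands (55% and 97%) would separate the test cases reported here; a deployed system would tune it on held-out data.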

The test results of non-mushrooms are shown in Table 5. Combining the classification results with the data set pictures, it can be found that when, the shape is different from the Amanita, it will be divided according to the color of the object. For example, when a white cat image is used for the classification test, the test result is Amanita bisporigera. This is because the color of Amanita bisporigera is white. However, its predicted value is 53%. Using seven kinds of pictures for classification, it can be found that the predicted value is relatively low (<55%) Common edible fungi were used for classification prediction and the test results are shown in Table 6. We can see that when using other mushroom images for classification, their predicted values are generally higher than those in Table 5. This is because they have a higher degree of similarity in appearance. However, in general, when we use the pictures of the Amanita species in the data set for classification prediction, the prediction probability is often more than 97%. Therefore, when the prediction probability is less than a certain value, we can treat the prediction result as an unknown class.  Table 6. Classification of other mushrooms.

Varieties
It can be seen from Figure 10 that it is difficult to classify among the three types of Amanita vaginata, Amanita bisporigera and Amanita phalloides, resulting in the lowest accuracy and recall rate of Amanita vaginata. Observing the dataset found that there are three main reasons: (1) Amanita vaginata and pure white Amanita bisporigera are similar in shape and feature, except for the difference in color on the surface of the fungus cap. The shapes of Amanita vaginata, Amanita bisporigera and Amanita phalloides are very similar in their juvenile period. (2) Some pictures of Amanita vaginata are overexposed and the pictures are white. At this time, the characteristics are very similar to Amanita bisporigera, so part of Amanita vaginata is classified as Amanita bisporigera. (3) The base of this category in the test dataset is not large.
In addition, the accuracy and recall rate of Amanita muscaria reached 1.0 and 0.959, indicating that this model is most suitable for identifying this type of Amanita.
In order to test the robustness of the classifier, we used other types of mushroom pictures and non-mushroom pictures for classification. Since the unknown class is not added to the data set at the beginning of training, when the classifier classifies an unknown category, it will be forced to be classified as a class in the data set. Therefore, the robustness of the model can be identified according to the probability of the unknown class classification.
The test results of non-mushrooms are shown in Table 5. Combining the classification results with the data set pictures, it can be found that when, the shape is different from the Amanita, it will be divided according to the color of the object. For example, when a white cat image is used for the classification test, the test result is Amanita bisporigera. This is because the color of Amanita bisporigera is white. However, its predicted value is 53%. Using seven kinds of pictures for classification, it can be found that the predicted value is relatively low (<55%) Common edible fungi were used for classification prediction and the test results are shown in Table 6. We can see that when using other mushroom images for classification, their predicted values are generally higher than those in Table 5. This is because they have a higher degree of similarity in appearance. However, in general, when we use the pictures of the Amanita species in the data set for classification prediction, the prediction probability is often more than 97%. Therefore, when the prediction probability is less than a certain value, we can treat the prediction result as an unknown class. It can be seen from Figure 10 that it is difficult to classify among the three types of Amanita vaginata, Amanita bisporigera and Amanita phalloides, resulting in the lowest accuracy and recall rate of Amanita vaginata. Observing the dataset found that there are three main reasons: (1) Amanita vaginata and pure white Amanita bisporigera are similar in shape and feature, except for the difference in color on the surface of the fungus cap. The shapes of Amanita vaginata, Amanita bisporigera and Amanita phalloides are very similar in their juvenile period. (2) Some pictures of Amanita vaginata are overexposed and the pictures are white. At this time, the characteristics are very similar to Amanita bisporigera, so part of Amanita vaginata is classified as Amanita bisporigera. 
(3) The base of this category in the test dataset is not large.
In addition, the accuracy and recall rate of Amanita muscaria reached 1.0 and 0.959, indicating that this model is most suitable for identifying this type of Amanita.
In order to test the robustness of the classifier, we used other types of mushroom pictures and non-mushroom pictures for classification. Since the unknown class is not added to the data set at the beginning of training, when the classifier classifies an unknown category, it will be forced to be classified as a class in the data set. Therefore, the robustness of the model can be identified according to the probability of the unknown class classification.
The test results of non-mushrooms are shown in Table 5. Combining the classification results with the data set pictures, it can be found that when, the shape is different from the Amanita, it will be divided according to the color of the object. For example, when a white cat image is used for the classification test, the test result is Amanita bisporigera. This is because the color of Amanita bisporigera is white. However, its predicted value is 53%. Using seven kinds of pictures for classification, it can be found that the predicted value is relatively low (<55%) Common edible fungi were used for classification prediction and the test results are shown in Table 6. We can see that when using other mushroom images for classification, their predicted values are generally higher than those in Table 5. This is because they have a higher degree of similarity in appearance. However, in general, when we use the pictures of the Amanita species in the data set for classification prediction, the prediction probability is often more than 97%. Therefore, when the prediction probability is less than a certain value, we can treat the prediction result as an unknown class. It can be seen from Figure 10 that it is difficult to classify among the three types of Amanita vaginata, Amanita bisporigera and Amanita phalloides, resulting in the lowest accuracy and recall rate of Amanita vaginata. Observing the dataset found that there are three main reasons: (1) Amanita vaginata and pure white Amanita bisporigera are similar in shape and feature, except for the difference in color on the surface of the fungus cap. The shapes of Amanita vaginata, Amanita bisporigera and Amanita phalloides are very similar in their juvenile period. (2) Some pictures of Amanita vaginata are overexposed and the pictures are white. At this time, the characteristics are very similar to Amanita bisporigera, so part of Amanita vaginata is classified as Amanita bisporigera. 
(3) The base of this category in the test dataset is not large.
In addition, the accuracy and recall rate of Amanita muscaria reached 1.0 and 0.959, indicating that this model is most suitable for identifying this type of Amanita.
In order to test the robustness of the classifier, we used other types of mushroom pictures and non-mushroom pictures for classification. Since the unknown class is not added to the data set at the beginning of training, when the classifier classifies an unknown category, it will be forced to be classified as a class in the data set. Therefore, the robustness of the model can be identified according to the probability of the unknown class classification.
The test results for non-mushroom images are shown in Table 5. Combining the classification results with the dataset images, we found that when the shape of an object differs from that of Amanita, it is classified according to its color. For example, when an image of a white cat was used for the classification test, the result was Amanita bisporigera, because Amanita bisporigera is white; however, its predicted probability was only 53%. Across the seven kinds of non-mushroom images used, the predicted probabilities were consistently low (<55%).
Common edible fungi were then used for classification prediction, and the test results are shown in Table 6. When other mushroom images are classified, their predicted probabilities are generally higher than those in Table 5, because these mushrooms are more similar to Amanita in appearance. In general, however, when images of the Amanita species in the dataset are classified, the predicted probability is usually above 97%. Therefore, when the predicted probability falls below a chosen threshold, the prediction can be treated as an unknown class.
It can be seen from Figure 10 that Amanita vaginata, Amanita bisporigera, and Amanita phalloides are difficult to distinguish from one another, giving Amanita vaginata the lowest precision and recall. Inspection of the dataset suggests three main reasons: (1) Amanita vaginata and pure white Amanita bisporigera are similar in shape and features, differing mainly in the color of the cap surface, and the shapes of all three species are very similar in their juvenile period. (2) Some images of Amanita vaginata are overexposed and appear white; their features then closely resemble Amanita bisporigera, so some Amanita vaginata images are classified as Amanita bisporigera. (3) The number of images of this class in the test dataset is small.
In addition, the precision and recall of Amanita muscaria reached 1.0 and 0.959, respectively, indicating that this model is best suited to identifying this species of Amanita.
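The probability-threshold rejection described above can be sketched in a few lines. This is an illustrative example rather than the paper's code: the softmax/threshold logic follows the text, but the threshold value of 0.55 is inferred from the reported <55% predictions, and the class names beyond the four species discussed are hypothetical placeholders.

```python
import math

# Hypothetical 7-class label set; only the first four species are named in the text.
CLASSES = ["Amanita muscaria", "Amanita bisporigera", "Amanita vaginata",
           "Amanita phalloides", "Amanita sp. 5", "Amanita sp. 6", "Amanita sp. 7"]

def softmax(logits):
    """Convert raw classifier outputs into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict_with_rejection(logits, threshold=0.55):
    """Return the predicted class, or 'unknown' when the top
    probability falls below the rejection threshold."""
    probs = softmax(logits)
    p, idx = max((p, i) for i, p in enumerate(probs))
    return CLASSES[idx] if p >= threshold else "unknown"

# A confident prediction (one logit dominates) is accepted...
print(predict_with_rejection([9.0, 1.0, 0.5, 0.2, 0.1, 0.0, 0.0]))
# ...while a nearly flat distribution is rejected as unknown.
print(predict_with_rejection([1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]))
```

In practice the threshold would be tuned on held-out in-distribution and out-of-distribution images so that known Amanita species (typically predicted above 97%) are kept while non-mushroom inputs (below 55%) are rejected.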

Conclusions
In this paper, eight different convolutional neural networks were used to classify seven Amanita species. In order to select a model suited to classifying Amanita growing in the wild, the speed and accuracy of the eight classification models were compared. The results show that deep-learning-based classifiers are well suited to Amanita classification.
In this paper, we first used simple models (VGG, ResNet, EfficientNet) for classification and found that their accuracy was not particularly good, so a bilinear network (B-CNN) model was proposed. After building the B-CNN, we found that although its accuracy improved, the model was larger and training took longer. We therefore added an attention mechanism to the model to improve the speed and accuracy of training.
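The core of a B-CNN is bilinear pooling: the outer product of two backbone feature maps at every spatial location, averaged over locations and then normalized. The sketch below is a minimal NumPy illustration of that step under assumed dimensions (8-channel maps on a 7x7 grid); it is not the paper's implementation, which pairs two EfficientNet-B4 backbones.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two feature maps of shape (C1, H, W) and (C2, H, W):
    outer product of the feature vectors at each location, averaged over
    locations, followed by signed square-root and L2 normalization."""
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)               # (C1, HW)
    b = feat_b.reshape(c2, h * w)               # (C2, HW)
    phi = (a @ b.T) / (h * w)                   # (C1, C2) pooled outer product
    phi = phi.flatten()                         # (C1*C2,) bilinear descriptor
    phi = np.sign(phi) * np.sqrt(np.abs(phi))   # signed square-root
    norm = np.linalg.norm(phi)
    return phi / norm if norm > 0 else phi

# Two hypothetical backbone outputs sharing the same spatial grid.
rng = np.random.default_rng(0)
fa = rng.standard_normal((8, 7, 7))
fb = rng.standard_normal((8, 7, 7))
v = bilinear_pool(fa, fb)
print(v.shape)  # 8 * 8 = 64-dimensional descriptor fed to the classifier
```

Because the descriptor dimension is the product of the two channel counts, pairing two large backbones inflates the model, which is consistent with the larger size and longer training time observed before the attention mechanism was added.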
After a comprehensive comparison of the models, we found that the best model is the B-CNN (EfficientNet-B4, EfficientNet-B4) with CBAM added. After training, the accuracy on the training set is 99.3% and on the test set 95.2%, which to a certain extent solves the difficult problem of classifying Amanita images taken in complex wild environments. It can provide a basis for future classification and identification of mushrooms with high similarity. The model size is 130 MB, and it processes a picture in 4.56 s, which facilitates its application on mobile devices.
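For readers unfamiliar with CBAM, the module applies channel attention followed by spatial attention to a feature map. The NumPy sketch below is a simplified illustration only: the channel branch uses the standard shared two-layer MLP over average- and max-pooled descriptors, while the spatial branch replaces CBAM's learned 7x7 convolution with a fixed equal-weight mix of the channel-wise average and max maps. All dimensions and weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """CBAM channel attention: a shared two-layer MLP is applied to the
    average- and max-pooled channel descriptors and the outputs summed."""
    avg = feat.mean(axis=(1, 2))            # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))              # (C,) max-pooled descriptor
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0)   # ReLU hidden layer
    scale = sigmoid(mlp(avg) + mlp(mx))     # (C,) per-channel weights
    return feat * scale[:, None, None]

def spatial_attention(feat):
    """Simplified CBAM spatial attention: channel-wise average and max maps
    are mixed with fixed equal weights instead of a learned 7x7 convolution."""
    avg = feat.mean(axis=0)                 # (H, W)
    mx = feat.max(axis=0)                   # (H, W)
    mask = sigmoid(0.5 * (avg + mx))        # (H, W) attention map
    return feat * mask[None, :, :]

rng = np.random.default_rng(1)
x = rng.standard_normal((16, 7, 7))         # hypothetical feature map
w1 = rng.standard_normal((4, 16)) * 0.1     # reduction ratio 4
w2 = rng.standard_normal((16, 4)) * 0.1
y = spatial_attention(channel_attention(x, w1, w2))
print(y.shape)
```

The attention maps reweight informative channels and spatial regions while adding only a small number of parameters, which is consistent with the accuracy gain reported for the CBAM-augmented B-CNN.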