Wildfire Susceptibility Prediction Based on a CA-Based CCNN with Active Learning Optimization

Wildfires cause great losses to the ecological environment, economy, and people’s safety and belongings. As a result, it is crucial to establish wildfire susceptibility models and delineate fire risk levels. Remote sensing data, such as meteorological and topographical data, have been shown to be effective for predicting and evaluating wildfire susceptibility. Accordingly, this paper converts meteorological and topographical data into fire-influencing factor raster maps for wildfire susceptibility prediction. The continuous convolutional neural network (CCNN for short) based on coordinate attention (CA for short) can aggregate different location information into channels of the network so as to enhance the feature expression ability; moreover, for patches with different resolutions, the improved CCNN model does not need to change the structural parameters of the network, which improves the flexibility of the network application in different forest areas. To reduce the annotation of training samples, we adopt an active learning method to learn positive features by selecting high-confidence samples, which enhances the discriminative ability of the network. We use fire probabilities output from the model to evaluate fire risk levels and generate the fire susceptibility map. Taking Chongqing Municipality in China as an example, the experimental results show that the CA-based CCNN model has a better classification performance; the accuracy reaches 91.7%, and AUC reaches 0.9487, which is 5.1% and 2.09% higher than the optimal comparative method, respectively. Furthermore, if an accuracy of about 86% is desired, our method only requires 50% of labeled samples and thus saves about 20% and 40% of the labeling efforts compared to the other two methods, respectively. Ultimately, the proposed model achieves the balance of high prediction accuracy and low annotation cost and is more helpful in classifying fire high warning zones and fire-free zones.


Introduction
Wildfire poses a serious threat to ecosystems and human society. The frequency and intensity of wildfires are on the rise globally, which causes huge losses to the ecological environment, economy, and people's safety and belongings. To effectively manage forest areas and reduce fire risks, it has become an urgent and important task to establish efficient wildfire susceptibility models and delineate fire risk level zones.
Wildfire susceptibility is influenced by various factors, such as climate, topography, vegetation, and human factors. Among them, meteorological and topographical factors are the two most critical indicators [1][2][3][4]. Meteorological factors include precipitation, temperature, humidity, wind speed, etc., which are closely related to fire occurrence and its intensity. Topographical factors, such as elevation, slope, and aspect, play an important part in influencing the incidence, spread, and movement of wildfires [5,6]. During the past decades, meteorological and topographical information has been obtained through various channels, such as remote sensing for obtaining detailed information on vegetation conditions, soil moisture, and fire traces [7,8], while GIS is used for mapping fire dynamics and generating fire hazard maps [9,10]. Traditional machine learning methods such as random forests [11], logistic regression [12], support vector machines [8], multilayer perceptron neural networks [13], and artificial neural networks [14,15] have been used for predicting wildfire susceptibility. However, such strategies require the selection of specific features based on a particular region, and thus, a given model can only be applicable to the task of fire susceptibility prediction in a particular scene. In addition, when dealing with complex nonlinear problems, machine learning has a relatively weak generalization capability, which hampers its ability to capture complex relationships and distributions of data.
In recent years, the convolutional neural network (CNN) has found extensive applications in object recognition, target detection, robotics, and natural language processing, thanks to its powerful feature learning capabilities [16,17]. Zhang et al. [18] used Yunnan Province as a study area and predicted forest fire susceptibility using a CNN model. They verified the effectiveness of CNN in the task of modeling forest fire susceptibility. Although CNN can fully utilize neighborhood information and extract multilevel representations from input data, it still faces some challenges [19]. Since the convolution kernels of CNN are discrete, structural parameters, including the sizes of convolutional kernels and convolutional strides, need to be readjusted when facing input images with different resolutions. To address the above limitations, a CCNN architecture was built by equipping a CNN with continuous convolutional kernels [20]. CCNN eliminates the down-sampling layers and task-dependent structural parameter settings required by current CNN architectures and is therefore suitable for tasks on data of arbitrary resolutions, dimensions, and lengths.
The attention mechanism enhances the model's ability to express features of the input images by focusing more on key information in the image. In this work, we embed the CA module into the feature extraction network of CCNN so as to enhance the model's feature expression capability. In addition, we introduce an active learning method based on two types of complementary training strategies, which can achieve high prediction performance with as few training samples as possible, thus reducing the annotation cost. The proposed model predicts wildfire susceptibility. In the method, meteorological and topographical data are processed by ArcGIS software, including rasterization, projection, resampling, and other operations, and these data are converted into raster maps of fire-influencing factors. The raster maps are then partitioned into small patches as inputs of the proposed model. To our knowledge, the work most similar to ours is [18]; our method differs from it mainly in the following two points: (1) Instead of the CNN structure in [18], the proposed method adopts CCNN and improves the feature extraction network of CCNN based on the CA module. The model can process input patches of different sizes without modifying the structural parameters of the network, such as sizes of convolution kernels and convolution strides, which makes the network application more flexible. (2) We introduce an active learning method during the training process, which improves the discriminative ability of the network and reduces the annotation cost. The following points summarize the contributions of this work:

• To capture the spatial distribution characteristics of pixels and the correlation between local regions in the fire-influencing factor raster map, we embed the CA module into the feature extraction network of CCNN to aggregate different location information into channels and thereby model the global information of the fire-influencing factor raster map;
• We adopt an active learning approach based on two types of complementary learning strategies in order to improve the network's discriminative performance by prioritizing the selection of fire-influencing factor patches with high confidence. Compared with traditional learning methods, the model can achieve a balance of high prediction accuracy and low annotation cost;
• We adopt the CA-based CCNN to output the fire probability of fire-influencing factor raster patches. Instead of fixed input patches used in CNN, the proposed method allows different patch sizes without changing the structural parameters of the model, which contributes to improving the application flexibility of the wildfire susceptibility prediction model in different scenes.

Study Area
As shown in Figure 1, Chongqing Municipality is located in southwestern China, between latitudes 28°10′ N and 32°13′ N and longitudes 105°11′ E and 110°11′ E. Chongqing is densely vegetated, with more than 55 percent of the area covered by forests. Vegetation types are diverse, with subtropical evergreen broad-leaved forests and deciduous broad-leaved forests at low and middle altitudes and alpine mixed coniferous and broad-leaved forests, alpine meadows, and mixed coniferous and broad-leaved forests at high altitudes. Due to the rich vegetation, wildfires are relatively frequent. The terrain is complex, with mountains accounting for more than 70% of the topographical areas. Chongqing is well known as the 'stove' in China. In summer, the highest temperature can reach 42 °C, and the average temperature is above 32 °C; therefore, wildfires occur more frequently.

Data Sources
Historical wildfire data were sourced from the Fire Information for Resource Management System website (https://firms.modaps.eosdis.nasa.gov/map/) (accessed on 6 October 2023) provided by the National Aeronautics and Space Administration. Records were derived from the fire products provided by the Moderate-resolution Imaging Spectroradiometer (MODIS) on the Terra and Aqua satellites. The product covers information such as the time of fire occurrence, geographic location, and confidence of the fire point, and its spatial resolution is 1 km. We selected fire point records from 2012 to 2017, years with frequent wildfires.
In this work, topographical and meteorological factors are used for fire susceptibility prediction. The topographical data are obtained from the Geospatial Data Cloud site (https://www.gscloud.cn/) (accessed on 6 October 2023) of the Computer Network Information Center, Chinese Academy of Sciences, and are further processed with geographic data processing software to obtain elevation, slope, and aspect. Four meteorological factors, namely temperature, precipitation, wind speed, and relative humidity, are from the National Earth System Science Data Center, National Science & Technology Infrastructure of China (http://www.geodata.cn) (accessed on 6 October 2023). The spatial resolution is 1 km, and the temporal resolution is monthly. As the occurrence of wildfires in Chongqing shows obvious seasonality, we use the summer data, namely from June to August, for modeling.

Methods
Figure 2 illustrates the proposed model for wildfire susceptibility prediction. The raster maps of fire-influencing factors are generated by using the software ArcGIS, and then they are integrated into one raster image, which is segmented into small image patches. The proposed model takes the patches as the inputs and predicts the fire risk probability of each patch. The CA module is embedded into the backbone network in order to capture long-range dependencies while retaining precise location information. This design can more accurately capture the spatial distribution characteristics in image patches of fire risk raster maps, thereby more comprehensively capturing the dependency relationships between features. To solve the problems of few-sample prediction and massive sample annotation, an active learning optimization approach, i.e., two types of complementary training strategy, is used to prioritize the screening of uncertain patches and patches with high confidence, where high-confidence patches are automatically assigned pseudo-labels, which improves the model's learning capability while reducing the labeling cost.

Generation of Fire-Influencing Factor Raster Maps
The data of seven influencing factors are utilized for wildfire susceptibility modeling. For each meteorological factor, the 3-month average value (June to August) is used to generate its raster map. We convert all influencing factor data into a raster format with the same data type, coordinate system, and resolution. Subsequently, the 7 factor raster maps of each year are normalized and integrated into one raster map using the composite bands tool. All fire-influencing factor data from 2012 to 2017 are processed in the same way to generate 6 integrated raster maps. Taking the average temperature as an example, Figure 3 shows a raster map of this influencing factor in 2017.
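The normalization and band-compositing step can be illustrated outside ArcGIS with a small sketch. It is not the authors' workflow: the source only says "normalization", so min-max scaling to [0, 1] is an assumption here, and the function names `min_max_normalize` and `composite_bands` are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # one factor raster: rows x cols


def min_max_normalize(grid: Grid) -> Grid:
    """Rescale one factor raster to [0, 1]; a constant raster maps to all zeros.

    Min-max scaling is an assumption; the paper does not name its normalization.
    """
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / span for v in row] for row in grid]


def composite_bands(factors: List[Grid]) -> List[List[List[float]]]:
    """Normalize each factor raster and stack them into one rows x cols x c raster."""
    rows, cols = len(factors[0]), len(factors[0][0])
    for grid in factors:
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError("all factor rasters must share the same grid")
    normalized = [min_max_normalize(grid) for grid in factors]
    return [[[band[r][c] for band in normalized] for c in range(cols)]
            for r in range(rows)]
```

Calling `composite_bands` with seven aligned grids would give one 7-band raster per year, mirroring the six integrated maps described above.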

The Model Input
The pixels in the raster map correspond to the geographical coordinates on the map, so the fire susceptibility prediction model evaluates the fire risk level of location points in different geographical coordinates. The fire-influencing factor data of a pixel in the raster map are approximately equal to that of its neighboring pixels; that is, the pixels in an image block usually have the same or similar fire risk level. Therefore, we segment the raster map fusing 7 fire-influencing factor information into small patches. Each patch is expressed as a 3D array with the dimension of n × n × c, where n is the patch size and c denotes the number of influencing factors.
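A minimal sketch of the patch segmentation follows. The source does not say whether patches overlap or how edges are handled, so non-overlapping tiling with incomplete edge tiles dropped is an assumption, and `extract_patches` is a hypothetical name.

```python
from typing import List

Raster = List[List[List[float]]]  # rows x cols x c


def extract_patches(raster: Raster, n: int) -> List[Raster]:
    """Cut a rows x cols x c raster into non-overlapping n x n x c patches.

    Incomplete tiles at the bottom and right edges are dropped (an assumption).
    """
    rows, cols = len(raster), len(raster[0])
    patches = []
    for top in range(0, rows - n + 1, n):
        for left in range(0, cols - n + 1, n):
            patch = [[list(raster[r][c]) for c in range(left, left + n)]
                     for r in range(top, top + n)]
            patches.append(patch)
    return patches
```

With n = 25 and c = 7, each returned element would be a 25 × 25 × 7 array of the kind fed to the model.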

The Structure of the CA-Based CCNN Model
In the frame of the CA-based CCNN model, the input data have 7 channels corresponding to 7 factors. After the convolution of 7 channel directions is calculated separately, the point-by-point convolution is performed. Therefore, 7 feature maps are weighted and combined in the depth direction to generate the new feature map. The convolution kernel uses SepFlexConv [20] with a stride of 1 and the same padding mode. SepFlexConv has fewer parameters and computation costs compared with the conventional convolution. Each convolutional layer is succeeded by a batch normalization (BN) layer and a Gaussian error linear unit (GELU). A total of 6 CCNN blocks are subsequently connected. The CA module is embedded after the CCNN blocks and followed by a BN layer, a global average pooling layer, and a point-wise linear connection layer.
The distinctiveness of CCNN is the introduction of a continuous convolutional kernel to handle classification tasks with different resolutions, dimensions, and lengths. The continuous convolutional kernel regards the convolution kernel as a continuous function parameterized by a small neural network $G_{\mathrm{Kernel}}$ acting as a kernel generator network [21]. In this study, the kernel generator network is parameterized as a 3-layer MAGNet with 64 hidden units [22]. The neural network maps a coordinate $c_i \in \mathbb{R}^D$ ($D$ is the data dimension) to the value of the convolution kernel at the corresponding position. A convolution kernel of equal size is constructed by passing a vector containing $k$ coordinates $[c_i]_{i \in [1, \ldots, k]}$ through $G_{\mathrm{Kernel}}$. Thereafter, the input signal is convolved with the generated convolution kernel to obtain the output feature representation. Parameterizing the convolution kernel by a neural network can be viewed as a way of embedding data information in the neural network weights to construct continuous data representations [23,24]. The parameterization addresses the discrete characteristic of standard convolutional kernels, enabling the construction of convolutional kernels that are independent of resolution.
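To make the idea of a resolution-independent kernel concrete, the sketch below samples one continuous kernel function at different numbers of coordinates. It is a toy under stated assumptions, not the paper's network: it works in one dimension (D = 1), replaces the 3-layer MAGNet with a tiny sine-activated network, and uses random, untrained weights. The names `KernelGenerator`, `sample_kernel`, and `conv1d_same` are hypothetical.

```python
import math
import random
from typing import List


class KernelGenerator:
    """Tiny coordinate-to-value network R -> R with sine activations.

    Stand-in for the kernel generator network; weights are random, not trained.
    """

    def __init__(self, hidden: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-3, 3) for _ in range(hidden)]
        self.b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.w2 = [rng.uniform(-1, 1) / hidden for _ in range(hidden)]

    def __call__(self, coord: float) -> float:
        hidden = [math.sin(w * coord + b) for w, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hidden))


def sample_kernel(gen: KernelGenerator, k: int) -> List[float]:
    """Evaluate the continuous kernel at k evenly spaced coordinates in [-1, 1]."""
    if k == 1:
        return [gen(0.0)]
    coords = [-1.0 + 2.0 * i / (k - 1) for i in range(k)]
    return [gen(c) for c in coords]


def conv1d_same(signal: List[float], kernel: List[float]) -> List[float]:
    """Stride-1 cross-correlation, zero-padded so the output keeps the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out
```

The same generator yields a 3-tap or a 7-tap kernel simply by changing `k`; both are samples of one underlying function, which is the property that lets the network structure stay fixed when the input resolution changes.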
The CA module [25] is designed as follows: firstly, we adopt a mean pooling operation both horizontally and vertically to obtain two attention maps embedded with direction-specific information and then splice them into one attention map. Next, the spliced attention map is passed to a learning module with a convolutional layer, a BN layer, and a nonlinear connection layer. The learning module generates horizontal and vertical feature maps, each of which undergoes a convolution operation. Ultimately, the Sigmoid activation function is employed to compute the attention weights, and the final feature map is generated by multiplying the weights into the input features.
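The data flow of the CA module described above can be traced with a simplified sketch. Several simplifications are assumptions rather than the paper's design: the BN layer is omitted, ReLU stands in for the unspecified nonlinear layer, the convolutions are treated as 1 × 1 convolutions with weights supplied by the caller, and all names are hypothetical.

```python
import math
from typing import List

Vec = List[float]
Mat = List[List[float]]            # [out_channels][in_channels] weights of a 1x1 convolution
Tensor3 = List[List[List[float]]]  # channels x height x width


def conv1x1(weights: Mat, columns: List[Vec]) -> List[Vec]:
    """Apply a 1x1 convolution to a list of per-position channel vectors."""
    return [[sum(w * x for w, x in zip(row, col)) for row in weights] for col in columns]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_attention(x: Tensor3, w_shared: Mat, w_h: Mat, w_w: Mat) -> Tensor3:
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    # Directional mean pooling: one channel vector per row and one per column.
    pooled_h = [[sum(x[c][h]) / width for c in range(channels)] for h in range(height)]
    pooled_w = [[sum(x[c][h][w] for h in range(height)) / height for c in range(channels)]
                for w in range(width)]
    # Splice both directions and pass them through the shared transform.
    mixed = conv1x1(w_shared, pooled_h + pooled_w)
    mixed = [[max(0.0, v) for v in col] for col in mixed]  # ReLU stand-in; BN omitted
    # Split back, apply one convolution per direction, squash to (0, 1).
    att_h = [[sigmoid(v) for v in col] for col in conv1x1(w_h, mixed[:height])]
    att_w = [[sigmoid(v) for v in col] for col in conv1x1(w_w, mixed[height:])]
    # Reweight every input value by its row weight and its column weight.
    return [[[x[c][h][w] * att_h[h][c] * att_w[w][c] for w in range(width)]
             for h in range(height)] for c in range(channels)]
```

Here `w_shared` has shape [mid][channels] and `w_h`, `w_w` have shape [channels][mid]; the output keeps the input shape, so the module can be placed after the CCNN blocks without changing downstream layers.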

Active Learning Training Strategy
To solve the problem of large sample annotation and few-sample predictions, we introduce an active learning method based on two types of complementary learning strategies [26]. Let $D_U$ and $D_L$ denote the unlabeled sample set and the labeled sample set, respectively. At the beginning, $D_U$ contains all the unlabeled training samples, while $D_L$ is empty. Subsequently, two types of samples are used for training the network, i.e., uncertain samples and high-confidence samples, which are selected from $D_U$ according to their fire probability calculated by the model.
We determine the uncertain samples based on the criterion of least confidence. The maximum value between the fire probability and non-fire probability of a sample is defined as the confidence of the sample. The confidence $lc_i$ is calculated as
$$lc_i = \max_j \, p(y_i = j \mid x_i; w),$$
where $x_i$ denotes the $i$th sample and $j$ indexes the classes. $y_i = j$ means that the label of $x_i$ is class $j$, $w$ are the model parameters, and $p(y_i = j \mid x_i; w)$ denotes the probability that $x_i$ belongs to the $j$th class. The $K$ samples with the lowest confidence values are chosen as uncertain samples.
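The least-confidence selection can be sketched in a few lines. The input layout (one tuple of class probabilities per sample) and the function names are assumptions made for illustration.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Confidence of one sample: the largest of its class probabilities."""
    return max(probs)


def select_uncertain(prob_table: List[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k samples with the lowest confidence."""
    order = sorted(range(len(prob_table)),
                   key=lambda i: least_confidence(prob_table[i]))
    return order[:k]
```

For example, with probabilities `[(0.9, 0.1), (0.55, 0.45), (0.2, 0.8), (0.51, 0.49)]` and `k = 2`, the samples at indices 3 and 1 are returned, since their largest class probability is closest to 0.5.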
The high-confidence samples facilitate the learning of fire features as a way to improve the discriminative performance of the model. The model can easily discern samples with high confidence; thus, they can be automatically assigned pseudo-labels by the model. We select samples with high confidence from $D_U$ based on the entropy. The entropy $en_i$ is calculated as
$$en_i = -\sum_j p(y_i = j \mid x_i; w) \log p(y_i = j \mid x_i; w).$$
The sample $x_i$ in $D_U$ is chosen as a high-confidence sample if the entropy $en_i$ is less than the threshold $\delta$. The samples with high confidence are automatically assigned pseudo-labels of 1 or 0. The initial threshold $\delta_0$ is set to a large value to ensure high reliability of pseudo-label assignment. After each iteration, samples with high confidence are returned to $D_U$, their pseudo-labels are removed, and the threshold is updated.
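A short sketch of the entropy-based screening follows. The natural logarithm and the class order (index 0 for non-fire, index 1 for fire) are assumptions, as are the function names; the threshold update rule is not given in the text and is therefore left out.

```python
import math
from typing import List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log, an assumption) of one sample's class probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def pseudo_label(prob_table: List[Sequence[float]], delta: float) -> List[Tuple[int, int]]:
    """Return (sample index, pseudo-label) for samples whose entropy is below delta.

    The pseudo-label is the index of the most probable class
    (assumed order: 0 = non-fire, 1 = fire).
    """
    picked = []
    for i, probs in enumerate(prob_table):
        if entropy(probs) < delta:
            label = max(range(len(probs)), key=lambda j: probs[j])
            picked.append((i, label))
    return picked
```

Samples returned by `pseudo_label` would be used for one training round only, since the text states that they go back to the unlabeled set afterwards.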
The selected uncertain samples need to be manually labeled and then added to the labeled sample set $D_L$, while the model automatically assigns pseudo-labels to samples with high confidence. As the number of iterations increases, more uncertain samples are selected and labeled, and the model performance improves accordingly, which allows the model to achieve a balance between the number of sample annotations and model performance. Since we automatically assign pseudo-labels to high-confidence samples without manual labeling, the cost of manual labeling is greatly reduced.

Experiments
In this section, the configurations of experiments and parameters of the training process are provided, and then the metrics used in this work are introduced.

Implementation Details
Our framework is implemented using PyTorch Lightning 1.6.4. The configurations of the computer are as follows: a GPU (RTX 2080Ti) and a CPU (Intel i9-9900K). We employ AdamW [27] as the optimizer with a weight decay set to 0. The default learning rate is 0.01. In addition, we introduce a cosine annealing learning rate scheduler [28] along with a linear learning rate warm-up stage of 10 epochs for optimization. The dropout rate is set to 0.15 to prevent overfitting.
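The learning rate schedule described above (linear warm-up followed by cosine annealing) can be written as a plain function. This is a sketch, not the training code: the total number of epochs, the warm-up starting value, and the final learning rate are not given in the text, so `total_epochs` and `min_lr` are hypothetical parameters.

```python
import math


def learning_rate(epoch: int, total_epochs: int, base_lr: float = 0.01,
                  warmup_epochs: int = 10, min_lr: float = 0.0) -> float:
    """Linear warm-up to base_lr over warmup_epochs, then cosine annealing to min_lr.

    total_epochs and min_lr are assumptions; the paper states only the base
    learning rate (0.01) and the 10-epoch warm-up.
    """
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate rises to 0.01 by the last warm-up epoch, stays there at the first annealing epoch, and decays smoothly toward `min_lr` at `total_epochs`.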
The fire points listed on the inventory from NASA represent the origin or center of a fire region rather than the actual fire region; therefore, we adopted buffer analysis [29] to expand fire point data into a fire region. Drawing on the literature [18], a buffer zone was set within 5 km around the fire point, and pixels inside the buffer zone were labeled as 1 (i.e., fire), while pixels outside the buffer zone were labeled as 0 (i.e., non-fire). All fire point data from 2012 to 2017 were processed to generate six ignition raster maps. The annual raster maps from 2012 to 2016 are used for model training and are divided into n × n × 7 3D patches, where n is set to 15, 25, or 35. A total of 5000 patches, including 2500 fire samples and 2500 non-fire samples, are randomly chosen for model training. The training and validation sets consist of 4000 and 1000 samples, respectively. The data from 2017 served as the testing set. A total of 10% of the labeled samples in the training set are randomly selected for network initialization.
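The buffer labeling can be approximated on a pixel grid as sketched below. It assumes that fire points are already given as (row, column) pixel indices on the 1 km raster and that planar Euclidean distance between pixel centres is adequate; the actual buffer analysis in GIS software may differ, and `buffer_labels` is a hypothetical name.

```python
import math
from typing import List, Tuple


def buffer_labels(rows: int, cols: int, fire_points: List[Tuple[int, int]],
                  radius_km: float = 5.0, pixel_km: float = 1.0) -> List[List[int]]:
    """Label a pixel 1 if its centre lies within radius_km of any fire point, else 0."""
    labels = [[0] * cols for _ in range(rows)]
    reach = int(math.floor(radius_km / pixel_km))
    for fr, fc in fire_points:
        # Only pixels inside the bounding square of the buffer need checking.
        for r in range(max(0, fr - reach), min(rows, fr + reach + 1)):
            for c in range(max(0, fc - reach), min(cols, fc + reach + 1)):
                if math.hypot(r - fr, c - fc) * pixel_km <= radius_km:
                    labels[r][c] = 1
    return labels
```

The resulting 0/1 grid plays the role of one ignition raster map; a patch could then be treated as a fire or non-fire sample according to these labels.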

Metrics
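In this work, the prediction for wildfire susceptibility is realized by a binary classification method. Thus, we use sensitivity, specificity, positive predictive value (PPV for short), negative predictive value (NPV for short), and accuracy to evaluate the model performance [9]. With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, the five metrics follow their standard definitions:
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \quad \text{Specificity} = \frac{TN}{TN + FP}, \quad \text{PPV} = \frac{TP}{TP + FP}, \quad \text{NPV} = \frac{TN}{TN + FN}, \quad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$
The overall performance of the model is assessed using the area under the curve of the receiver operating characteristic (i.e., ROC-AUC) [30][31][32]. A higher AUC value, closer to 1, indicates better prediction performance [33]. AUC is calculated as follows: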
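The five classification metrics named above can be computed from predicted and true labels as in the following sketch, which uses their standard confusion-matrix definitions. The function name and the convention of returning NaN for an empty denominator are choices made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and accuracy for labels in {0, 1} (1 = fire)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # NaN marks an undefined metric (empty denominator).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }
```

Predicted labels would typically be obtained by thresholding the model's fire probability; the threshold itself is not stated in the text.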

Results
In this section, we verify that our method achieves high prediction accuracy at different patch sizes. At the same time, our method is compared with several state-of-the-art methods, and the corresponding wildfire susceptibility maps are generated. Finally, the effectiveness of the two types of complementary learning strategies in reducing training samples is verified.

Ablation Study
We conducted experiments with different sizes of patches, and as shown in Table 1, the best prediction performance was achieved at the patch size of 25 × 25; hence, the 25 × 25 patch was chosen for the subsequent experiments in this work. CNN is more suitable for classifying such small-sized input patches than other deep learning models; thus, we compared our method with the CNN-based method proposed by [18]. Additionally, we evaluated our method against CCNN, CCNN + SE, and CCNN + CBAM models. As shown in Table 2, our method outperforms these models across several metrics, including sensitivity, specificity, PPV, NPV, and accuracy. Figure 4 demonstrates the ROC curves for the five methods on the testing set. The AUC of our method is 0.9487, which is 2.09% higher than the CNN-based method and 0.73% higher than the CCNN method. Furthermore, the AUC of our method exceeds that of the CCNN models using the two other attention mechanisms. This indicates a better overall fit between the CA-based CCNN wildfire susceptibility prediction model and the testing set.

Comparison with State-of-the-Art Machine Learning Methods
In addition to CNN, we also use machine learning methods for comparative experiments. The experimental results are shown in Table 3 and Figure 5, where our method shows higher sensitivity (94.74%), specificity (88.74%), PPV (89.14%), and NPV (94.53%) than the four machine learning methods on the validation set. The overall accuracy of the method in this work reaches 91.7%, which is more than 15% higher than the machine learning methods. Similarly, CNN-based methods have achieved high performance. These results indicate that the raster-based scheme may outperform traditional machine learning methods in wildfire susceptibility tasks.

Wildfire Susceptibility Maps
Wildfire susceptibility was predicted using the testing set. We utilize fire probability predicted by the model to generate the wildfire susceptibility map and then classify wildfire susceptibility into five classes: extremely low, low, moderate, high, and extremely high, by selecting the Natural Breaks method in ArcGIS for the threshold setting.
Figure 6 displays wildfire susceptibility maps generated by our method and five state-of-the-art methods, respectively. In these maps, blue triangular arrows represent fire points. The extremely high fire susceptibility approximately indicates the occurrence of a fire. The areas with extremely high fire susceptibility predicted by our method are consistent with the actual fire location. However, the MLP, SVM, and RF methods incorrectly classify these areas as low- and medium-susceptibility zones. In comparison with the CNN-based method, the susceptibility map predicted by our method offers slightly more accurate delineation in the extremely low and extremely high susceptibility classes.

Verification of Training Efficiency of Two Types of Complementary Learning Strategy
In order to demonstrate the training efficiency of the active learning strategy used in this work, two training methods, i.e., randomly selecting samples and only selecting uncertain samples [34], were used for the comparison.The parameters of the active learning strategy are set as follows: δ 0 = 0.05 and K = 400.
Figure 7 displays the classification accuracy curves of these three methods under different percentages of labeled samples. The proposed method achieves higher prediction accuracy than the other two methods at the same percentage of labeled samples, and its advantage is more pronounced when the percentage of labeled samples is low (20-60%).
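The selection step of such a strategy can be illustrated with a minimal sketch. Since this section does not define the two parameters, the sketch assumes that δ₀ is an entropy threshold below which a prediction counts as high-confidence and receives a pseudo-label, and that K is the number of most uncertain samples sent for manual annotation in each round. The function names are hypothetical, and the authors' actual procedure may differ.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a fire / no-fire prediction with fire probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def select_samples(fire_probs, delta=0.05, k=400):
    """Split unlabeled samples into (to_annotate, pseudo_labeled).

    Assumption: delta is an entropy threshold and k is the number of
    uncertain samples queried per round.
    """
    entropy = {i: binary_entropy(p) for i, p in enumerate(fire_probs)}
    # Rank samples from most to least uncertain.
    ranked = sorted(entropy, key=entropy.get, reverse=True)
    to_annotate = ranked[:k]
    queried = set(to_annotate)
    # Confident predictions keep the model's own label (1 = fire, 0 = no fire).
    pseudo_labeled = {
        i: int(fire_probs[i] >= 0.5)
        for i in ranked
        if i not in queried and entropy[i] < delta
    }
    return to_annotate, pseudo_labeled
```

Under these assumptions, selecting only uncertain samples corresponds to discarding `pseudo_labeled`, and random selection corresponds to replacing the entropy ranking with a shuffle.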

Fire-Influencing Factors and Fire Susceptibility Prediction
Overall, wildfires in Chongqing occur mainly in forested areas. In particular, the southwest region of Chongqing has low elevation and high summer temperatures, which heightens the likelihood of fire occurrence. In the forest areas of the north-central region, wind speed increases with slope gradient. These two regions are therefore prone to frequent fires, which is consistent with the prediction results of our method. Hence, there is a direct causal relationship between fire susceptibility and the selected fire-influencing factors, including topographical and meteorological factors.

Advantages of CCNN
Compared with CNN, CCNN applies a continuous convolutional kernel, which enables the creation of convolutional kernels that are independent of resolution and thus increases the flexibility of the model when it is applied in different regions. As seen in Table 2 and Figure 6, the model parameters, such as the size of the convolutional kernel and the convolutional strides, were determined after training. Because continuous convolutional kernels are introduced into CCNN, the superior performance of the proposed method is not affected by the size of the input patches.
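The idea of a resolution-independent kernel can be illustrated with a minimal sketch: the kernel is defined as a continuous function of relative position and is only discretized when it is applied, so the same parameters serve any grid size. The tiny sine network, its fixed weights, and the function names are hypothetical stand-ins for the learned kernel generator, not the architecture used in this work.

```python
import math

def kernel_value(x, y, params):
    """Continuous kernel: maps a relative position (x, y) in [-1, 1]^2 to a weight."""
    hidden = [math.sin(wx * x + wy * y + b) for wx, wy, b in params["hidden"]]
    return sum(w * h for w, h in zip(params["out"], hidden))

def sample_kernel(size, params):
    """Discretize the continuous kernel on a size x size grid spanning [-1, 1]^2."""
    if size == 1:
        coords = [0.0]
    else:
        coords = [-1.0 + 2.0 * i / (size - 1) for i in range(size)]
    return [[kernel_value(x, y, params) for x in coords] for y in coords]

# Hypothetical, untrained parameters used only for illustration.
PARAMS = {
    "hidden": [(1.0, 0.5, 0.0), (-0.7, 1.2, 0.3), (2.0, -1.0, -0.5)],
    "out": [0.6, -0.4, 0.2],
}

k3 = sample_kernel(3, PARAMS)  # coarse sampling of the kernel
k5 = sample_kernel(5, PARAMS)  # finer sampling of the same kernel
```

Both grids sample the same underlying function, so they agree wherever their sample positions coincide (for example at the center and the corners). This is why changing the patch resolution does not require new structural parameters.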

The Basis for CA Module Selection
Currently, the widely employed attention mechanisms include the Squeeze-and-Excitation (SE) mechanism, the CBAM mechanism, and the CA mechanism. We introduce the CA mechanism in order to capture the spatial distribution characteristics of pixels in the fire-influencing factor raster map. As shown in Table 2, our method has higher prediction accuracy than CCNN + CBAM and CCNN + SE. Because CA aggregates different location information into the channels of the network, the correlation between local regions is captured [25]. Because the pixels within a patch are discrete, we use CA to model the long-range dependencies between them. The SE mechanism focuses solely on the encoded inter-channel information and overlooks the significance of positional information [35]. CBAM is limited to capturing local relationships [36]. Therefore, CA is selected to model the global information in the patches of the fire-influencing factor raster map.
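How CA embeds positional information into channel weights can be illustrated with a simplified sketch. Features are pooled separately along each spatial direction, passed through a shared channel-mixing step, and turned into one attention vector per direction. Batch normalization is omitted, ReLU replaces the original nonlinearity, and all weight matrices and function names are hypothetical, so this is an illustration rather than the module used in this work.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mix_channels(weights, feats):
    """1x1 convolution: feats is [C_in][L], weights is [C_out][C_in]."""
    length = len(feats[0])
    return [
        [sum(w * feats[c][p] for c, w in enumerate(row)) for p in range(length)]
        for row in weights
    ]

def coordinate_attention(x, w_shared, w_h, w_w):
    """Re-weight a [C][H][W] feature map with direction-aware attention."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # Directional pooling: average over the width, then over the height.
    pooled_h = [[sum(x[c][i]) / W for i in range(H)] for c in range(C)]
    pooled_w = [[sum(x[c][i][j] for i in range(H)) / H for j in range(W)]
                for c in range(C)]
    # Concatenate both descriptors and apply a shared transform with ReLU.
    joint = [ph + pw for ph, pw in zip(pooled_h, pooled_w)]  # [C][H + W]
    mid = [[max(0.0, v) for v in row] for row in mix_channels(w_shared, joint)]
    mid_h = [row[:H] for row in mid]
    mid_w = [row[H:] for row in mid]
    # One attention weight per (channel, row) and per (channel, column).
    a_h = [[sigmoid(v) for v in row] for row in mix_channels(w_h, mid_h)]
    a_w = [[sigmoid(v) for v in row] for row in mix_channels(w_w, mid_w)]
    return [
        [[x[c][i][j] * a_h[c][i] * a_w[c][j] for j in range(W)] for i in range(H)]
        for c in range(C)
    ]
```

Because each output value is scaled by a row weight and a column weight, every pixel is influenced by all pixels in its row and column, which is the long-range dependency referred to above. SE, by contrast, pools over both directions at once and therefore discards position.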

Labeling Cost
Figure 7 shows that if an accuracy of about 86% is desired, the methods of only selecting uncertain samples and of randomly selecting samples require about 70% and 90% of the training samples to be labeled, respectively. In contrast, our method requires only 50% labeled samples, saving about 20% and 40% of the labeling effort, respectively. Thus, the proposed method reduces the labeling cost when relatively high prediction accuracy is required. We also observe that the method of only selecting uncertain samples has slightly higher prediction accuracy than the other two methods when the proportion of labeled training samples is 70%. This is because the most informative training samples happened to be chosen; that is, this performance is incidental rather than systematic.
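The savings quoted above are differences in the share of the training set that must be labeled. A short calculation makes this explicit and also expresses the same savings relative to each baseline's own labeling requirement; the variable names are illustrative, and the percentages are the approximate values read from Figure 7.

```python
# Approximate share of labeled samples (%) needed for about 86% accuracy.
required = {"proposed": 50, "uncertain only": 70, "random": 90}

# Savings in percentage points of the training set.
saved_points = {
    name: pct - required["proposed"]
    for name, pct in required.items()
    if name != "proposed"
}

# The same savings relative to each baseline's own requirement (%).
saved_relative = {
    name: round(100 * saved_points[name] / required[name], 1)
    for name in saved_points
}
```

The 20% and 40% figures therefore correspond to percentage points of the training set; relative to the baselines' own labeling requirements, the reductions are about 28.6% and 44.4%.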

Wildfire Susceptibility Map
As shown in Figure 6, we mark the fire risk level on the map of Chongqing. The fire susceptibility map therefore includes the geographic information of the predicted area and visualizes the fire risk prediction. In addition, as seen in Figure 2, after the fire-influencing factor raster map is input into the model, it is segmented into small patches so that we can output the prediction results for all regions of Chongqing simultaneously. In contrast, traditional machine learning methods can only predict the fire risk of a small area and output the fire risk value of that area.
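The patch-wise prediction described above can be illustrated with a minimal sketch. It assumes a single-band raster and non-overlapping square patches, neither of which is specified in this section, and `predict` is a hypothetical stand-in for the trained model that returns one fire probability per patch.

```python
def iter_patches(raster, size):
    """Yield (row, col, patch) for non-overlapping tiles; edge tiles may be smaller."""
    height, width = len(raster), len(raster[0])
    for r in range(0, height, size):
        for c in range(0, width, size):
            yield r, c, [row[c:c + size] for row in raster[r:r + size]]

def susceptibility_map(raster, size, predict):
    """Apply predict(patch) to every tile and paint the result back onto the grid."""
    out = [[0.0] * len(raster[0]) for _ in raster]
    for r, c, patch in iter_patches(raster, size):
        prob = predict(patch)
        for i in range(len(patch)):
            for j in range(len(patch[0])):
                out[r + i][c + j] = prob
    return out

def mean_value(patch):
    """Toy predictor used only to exercise the tiling logic."""
    return sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
```

Because every tile keeps its row and column offset, the predicted probabilities remain georeferenced to the input raster, which is what allows the risk levels to be drawn directly on the map.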

Conclusions
In this work, we proposed a CA-based CCNN model with active learning optimization for wildfire susceptibility prediction. Owing to the CCNN architecture, the model is compatible with different input patch sizes, which makes it easier to apply the pre-trained fire susceptibility prediction model to different forest areas. In addition, the model adopted CA to build long-range dependencies between the discrete pixels in small fire-influencing factor patches. Experimental results indicated that the CCNN + CA model had higher prediction accuracy than the comparative state-of-the-art methods. Furthermore, compared with traditional machine learning methods, the model reduced the labeling cost by using an active learning method. It is worth mentioning that we used the model to generate a fire susceptibility map that visualizes the fire risk level together with geographic information. In future work, we will take into account the influence of anthropogenic activities on wildfire occurrence to reveal the interests and conflicts between wildfires and human activities. We will also carry the continuous convolution approach of CCNN over to lightweight convolutional networks so as to further reduce complexity.

Figure 1. Location of Chongqing Municipality, China, and distribution of wildfires in 2017.

Figure 2. The model framework for wildfire susceptibility prediction.

Figure 3. Raster map of the average temperature in Chongqing, China, in 2017.

Figure 4. ROC curves and AUC of the five methods.

Figure 5. Radar maps of metrics derived from six different models on the validation set.

Figure 7. Classification accuracy with different percentages of labeled samples.

... speed, and relative humidity, are from the National Earth System Science Data Center, National Science & Technology Infrastructure of China (http://www.geodata.cn) (accessed on 6 October 2023)

Table 1. Performance comparison with different input patch sizes.

Table 2. Performance comparison of our method with state-of-the-art methods.

Table 3. Performance comparison of our method with machine learning methods.