Spatial Fuzzy C-Means Clustering Analysis of U.S. Presidential Election and COVID-19 Related Factors in the Rustbelt States in 2020

Abstract: The rustbelt states play a key role in determining the outcome of U.S. elections. The current study uses the spatial fuzzy C-means method to analyze the 2020 U.S. presidential election in the rustbelt states. We explore the factors related to the election, including COVID-19-related factors such as the mask-wearing percentage and the COVID-19 death toll in each county of the rustbelt states. Unlike the related literature, the study uses education level, number of housing units, unemployment rate, household income, COVID-19-related factors, and the Republican vote share in the presidential election. The results indicate that spatial generalized fuzzy C-means analysis yields better clustering results than the C-means clustering method. Moreover, the COVID-19 death toll in each county did not affect the Republican vote share in the rustbelt states, while mask-wearing behavior in some regions had a negative impact on the Republican vote share. MSC: 03B52; 03C45


Introduction
The U.S. presidential election in 2020 was influenced by the COVID-19 pandemic, including rising infections, death tolls, and lockdowns. The previous literature indicated that political polarization is aggravated by intense fear during a disaster [1,2]. Some scholars argued that people seek reassurance by holding to their conservative political viewpoints and supporting the ruling party, while other scholars believed that some voters punish the political elite for poor management during a natural or man-made disaster. Since COVID-19-related policies were created in a very short period of time, without full deliberation, they could arouse public discontent [3]. People were more supportive of their governments during the early stage of the COVID-19 pandemic [4]. However, evaluations of the pandemic policies were influenced by two polarized mindsets. Some voters chose to punish politicians for conditions caused by the pandemic that were beyond the politicians' control, while other voters were attentive to the political elites' reactions and formed their opinions accordingly [5].
The previous literature about the U.S. presidential election in 2020 focused on the effects of COVID-19 on the election results. Hart (2021) stated that the COVID-19 pandemic seemed to have decreased the support for Trump among Democrats, while it increased the support among independent voters [6]. Baccini et al. [7] pointed out that COVID-19-related factors negatively affected Donald Trump's re-election, and that the effect was stronger in urban areas. They also observed that COVID-19 had a positive effect on voter mobilization for Joe Biden. The rustbelt states, which include Illinois, Wisconsin, Indiana, Michigan, Ohio, West Virginia, Pennsylvania, and New York, are traditionally "swing states" in U.S. presidential elections. Geographical and racial divergences increased in the counties of the rustbelt states in the past five years [8]. Geographical factors make these divergences more visible, and people tend to live in more politically polarized conditions [9]. The voting results of the rustbelt states have a pivotal influence on the whole country. However, few studies in the literature analyze the voting results of the rustbelt states. Gimpel [10] pointed out that some counties in the rustbelt states shifted their support to the Democrats in the presidential election in 2020. The factors influencing the voting results therefore need to be examined. In order to analyze the topic more thoroughly, we analyze the effects of the COVID-19 pandemic together with regional factors, related economic variables, and the Republican vote share in the 2020 U.S. presidential election.
The structure of this research is as follows: the Research Method Section presents our research design and related descriptive statistics of the variables. The Discussion Section presents the results of the research model. The research findings are listed in the Conclusions Section.

Research Method
The current study used the spatial fuzzy C-means clustering method to analyze the influence of COVID-19-related factors on the U.S. presidential election. In order to explore the impacts of COVID-19 and other factors, such as the social and geographical factors mentioned in the Introduction, the study also used education level, number of housing units, unemployment rate, and household income to create the clustering. The previous literature utilized daily experience sampling (ESM) to analyze the impact of COVID-19 on employee uncertainty [11]. Di Nardo et al. (2019) utilized the literature review method to provide useful information about COVID-19 infection in neonates and children [12]. Regarding the fuzzy clustering approach, Indelicato et al. (2022) used the method with the fuzzy TOPSIS model to analyze the determinants of immigrants in Cuenca, Ecuador [13]. Compared to previous research on the effects of COVID-19 on U.S. elections, the present study considered spatial factors and attempted to describe the regional differences under the influence of these variables.

Data Description
The study explored the influence of pandemic-related factors on the 2020 U.S. presidential election. The study used the Republican vote share (X1) in the 2020 U.S. presidential election as the election-related variable. The data were obtained from a web repository (https://github.com/tonmcg/US_County_Level_Election_Results_08-20 (accessed on 6 August 2022)), which collects the 2020 election results at the county level, scraped from the results published by Fox News, Politico, and the New York Times.
In order to measure mask-wearing behavior in the rustbelt states (X2), the study used the dataset collected by the survey firm Dynata. Dynata surveyed 250,000 respondents in the U.S. between 2 and 14 July 2020. The survey asked the respondents how often they wore face masks in public. The possible responses, in descending order of frequency, were "always", "frequently", "sometimes", "rarely", and "never".
The variables X3, X4, X5, and X6 were obtained from the datasets of the U.S. Census Bureau, which are released on a flow basis throughout each year.
The study also used the COVID-19 death toll (X7) before the U.S. presidential election as a COVID-19-related variable. Other variables included education level and household economic condition. The definitions and descriptive statistics of all the variables are listed in Tables 1 and 2.

Variable definitions:
X1: The Republican vote share in the 2020 U.S. presidential election
X2: The share of respondents who thought they wore face masks often
X3: The number of housing units
X4: The number of residents who were high-school graduates or above
X5: Unemployment rate
X6: Household income
X7: Death toll of COVID-19 cases

C-Means Clustering
Initially, the study used the classical C-means method to create the fuzzy unsupervised classification. The fuzziness degree (m) was set at 1.5 in order to obtain satisfactory results. The classical C-means method includes the following two equations. The first equation updates the membership values u_ik in each iteration [14]:

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{2/(m-1)} \right]^{-1} \tag{1}$$

The center of the cluster is as follows:

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{n} u_{ik}^{m}} \tag{2}$$

In Equations (1) and (2), x_k is the value of the kth observation, v_i is the center of cluster i, c is the number of clusters, n is the number of observations, and m is the fuzziness index.

Fuzzy C-Means Clustering
Fuzzy C-means clustering is an algorithm that permits a data point to belong to two or more clusters. Let X = {x_1, x_2, ..., x_n} represent an image with n pixels, where x_i is the gray value of the ith pixel. The objective function of the standard FCM algorithm is as follows:

$$J = \sum_{i=1}^{n} \sum_{k=1}^{K} u_{ki}^{m} \, \lVert x_i - v_k \rVert^{2} \tag{3}$$

In Equation (3), the center of the kth cluster is v_k (1 ≤ k ≤ K), and u_ki (1 ≤ k ≤ K, 1 ≤ i ≤ n) is the membership degree of the ith pixel in the kth cluster. u_ki also needs to meet the following constraints:

$$\sum_{k=1}^{K} u_{ki} = 1, \qquad 0 \le u_{ki} \le 1 \tag{4}$$

In Equation (3), the distance between x_i and v_k is the Euclidean distance, and the parameter m (m > 1) is a weighting parameter that controls the level of fuzziness of the resulting partition. Minimizing the objective function in Equation (3) yields the update equations of the membership degree u_ki and the cluster center v_k:

$$u_{ki} = \left[ \sum_{l=1}^{K} \left( \frac{\lVert x_i - v_k \rVert}{\lVert x_i - v_l \rVert} \right)^{2/(m-1)} \right]^{-1} \tag{5}$$

$$v_k = \frac{\sum_{i=1}^{n} u_{ki}^{m} \, x_i}{\sum_{i=1}^{n} u_{ki}^{m}} \tag{6}$$

The goal of these updates is to obtain suitable clusters for the data points.
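To make the alternating updates of memberships and centers concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation used in the study: the function names (`sq_dist`, `update_memberships`, `update_centers`, `fcm`), the toy data, and the stopping tolerance are all hypothetical choices made here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def update_memberships(points, centers, m=1.5):
    """Return u[k][i], the membership of point i in cluster k."""
    n, K = len(points), len(centers)
    u = [[0.0] * n for _ in range(K)]
    for i, x in enumerate(points):
        d = [sq_dist(x, v) for v in centers]
        if min(d) == 0.0:
            # The point coincides with a center: assign it crisply.
            hits = [k for k in range(K) if d[k] == 0.0]
            for k in hits:
                u[k][i] = 1.0 / len(hits)
            continue
        # With squared distances, the ratio form of the membership update
        # reduces to normalised powers d^(-1/(m-1)).
        inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
        total = sum(inv)
        for k in range(K):
            u[k][i] = inv[k] / total
    return u


def update_centers(points, u, m=1.5):
    """Weighted mean of the points, with weights u^m."""
    dim = len(points[0])
    centers = []
    for row in u:
        w = [uk ** m for uk in row]
        total = sum(w)
        centers.append([sum(wi * x[j] for wi, x in zip(w, points)) / total
                        for j in range(dim)])
    return centers


def fcm(points, centers, m=1.5, iters=100, tol=1e-9):
    """Alternate the two updates until the centers stop moving."""
    for _ in range(iters):
        u = update_memberships(points, centers, m)
        new_centers = update_centers(points, u, m)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return update_memberships(points, centers, m), centers


# Toy data (hypothetical): two well-separated groups of two points each.
pts = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
u, centers = fcm(pts, [[0.0, 0.0], [10.0, 10.0]])
```

With well-separated groups such as these, each point ends up with a membership close to 1 in its own cluster, and the memberships of every point sum to 1 across clusters, as the constraint on the membership degrees requires.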

Spatial Fuzzy C-Means Clustering
Fuzzy C-means clustering (FCM) is sensitive to noise. Some algorithms were developed to overcome this shortcoming by utilizing the spatial information obtained from the neighborhood window around each pixel. Mean spatial information and median spatial information are two prevalent types of local information. The mean spatial information of the ith pixel is denoted as follows [15]:

$$\bar{x}_i = \frac{1}{|S_i|} \sum_{j \in S_i} x_j \tag{7}$$

In Equation (7), S_i is the set of neighboring pixels in a window centered at the ith pixel, and |S_i| represents its cardinality. The median spatial information can be represented as:

$$\hat{x}_i = \operatorname{median}\{ x_j : j \in S_i \} \tag{8}$$

Most of the FCM algorithms utilize the above-mentioned local spatial information in the objective function. However, FCM algorithms with local spatial information obtain a good image segmentation performance only at a low noise level, because the local spatial information obtained from the pixels near a given pixel may itself be contaminated. In fact, there are many pixels with a similar neighborhood configuration in an image. It is more beneficial to obtain the spatial information from pixels with a neighborhood configuration similar to that of the given pixel than from only the neighboring pixels of the given pixel. Such spatial information can be taken as non-local spatial information. The non-local spatial information for the ith pixel x_i is calculated by the following equation [16]:

$$\tilde{x}_i = \sum_{j \in \omega_i^{r}} w_{ij} \, x_j \tag{9}$$

In Equation (9), ω_i^r represents the r × r search window centered at the ith pixel. The non-local spatial information of the ith pixel is computed by using the pixels in this window. The weight between the ith and jth pixels is denoted as w_ij (j ∈ ω_i^r), with 0 ≤ w_ij ≤ 1 and ∑_{j∈ω_i^r} w_ij = 1. The weight w_ij is defined as follows:

$$w_{ij} = \frac{1}{Z_i} \exp\left( -\frac{\lVert x(N_i) - x(N_j) \rVert_{2,\sigma}^{2}}{h^{2}} \right), \qquad Z_i = \sum_{j \in \omega_i^{r}} \exp\left( -\frac{\lVert x(N_i) - x(N_j) \rVert_{2,\sigma}^{2}}{h^{2}} \right) \tag{10}$$

In Equation (10), h is the filtering degree parameter, which controls the decay of the weight function w_ij, and Z_i is the normalizing constant. The weight w_ij depends on the similarity between the ith and jth pixels.
The similarity is computed by the Gaussian-weighted Euclidean distance ||x(N_i) − x(N_j)||²_{2,σ}, where the positive term σ is the standard deviation of the Gaussian kernel and x(N_i) is the gray-level vector of the s × s square neighborhood N_i centered at the ith pixel. A fuzzy clustering algorithm with spatial information uses the spatial information of the individual pixels to determine the spatial constraint term, and then adds this spatial constraint to the objective function of FCM.
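The difference between mean and median spatial information is easiest to see on a tiny example. The sketch below computes both for one pixel of a hypothetical 3 × 3 image containing a single noisy value. It assumes, for illustration only, that the window includes the center pixel and is clipped at the image border; the helper names are not from the source. The non-local weights are not shown.

```python
from statistics import mean, median


def window_values(img, r, c, half=1):
    """Gray values in the (2*half+1)-wide square window centred at (r, c).

    Assumption: the window includes the centre pixel and is clipped
    at the image border.
    """
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]


def mean_information(img, r, c, half=1):
    """Mean spatial information of the pixel at (r, c)."""
    return mean(window_values(img, r, c, half))


def median_information(img, r, c, half=1):
    """Median spatial information of the pixel at (r, c)."""
    return median(window_values(img, r, c, half))


# Hypothetical image: a flat background of 10 with one noisy pixel of 200.
img = [[10, 10, 10],
       [10, 200, 10],
       [10, 10, 10]]

mean_value = mean_information(img, 1, 1)      # (8 * 10 + 200) / 9
median_value = median_information(img, 1, 1)  # middle of the sorted window
```

Here the mean is pulled toward the outlier while the median stays at the background value, which illustrates why the choice of local statistic matters when the neighborhood itself is contaminated.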

Fuzzy C-Means and Generalized Fuzzy C-Means Clustering
The study used the classical K-means to determine the number of clusters. According to Figure 1, the four clusters can explain almost 40% of the original data variance. Then, the study used the "fclust" package of R language to analyze the quality of the classification [17]. The study also utilized the "geocmeans" package of the R language to compute the generalized version of the c-means algorithm [18]. The algorithm can accelerate convergence and obtain less fuzzy results by adjusting the membership matrix at each iteration. It needs an extra beta parameter controlling the effectiveness of the modification. The modification only influences the formula updating the membership matrix.
$$u_{ik} = \frac{\left( \lVert x_k - v_i \rVert^{2} - \beta_k \right)^{-1/(m-1)}}{\sum_{j=1}^{c} \left( \lVert x_k - v_j \rVert^{2} - \beta_k \right)^{-1/(m-1)}} \tag{11}$$

In Equation (11), β_k = β · min_j(||x_k − v_j||²) and 0 ≤ β ≤ 1. In order to choose an adequate value for this parameter, the study evaluated all the values between 0 and 1 in steps of 0.05. The resulting indices are listed for ascending β values in Table 3.
According to Table 3, the study chose beta = 0.8, which maintained a satisfactory silhouette index while increasing the Xie and Beni index and the explained inertia. The results of GFCM (the generalized version of fuzzy C-means clustering) and FCM are listed in Table 4. The results indicate that the GFCM provides a less fuzzy solution (with higher explained inertia and lower partition entropy), but keeps a good silhouette index and a lower Xie and Beni index. The study created maps of the two membership matrices and of the most likely group for each observation, using the mapClusters function from the geocmeans package in R. We set a threshold of 0.45: if all the membership values of an observation were below this threshold, the observation was marked as "undecided" (represented by transparency on the map).
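The effect of the beta adjustment and of the 0.45 threshold can be illustrated with a small sketch. It assumes that the shift applied to each observation is beta times its smallest squared distance to a center; the function names and the distances are hypothetical, and the code is not the geocmeans implementation.

```python
def gfcm_memberships(sq_dists, m=1.5, beta=0.8):
    """Membership rows from squared distances.

    sq_dists[k][i] is the squared distance from observation k to
    center i. Assumption: the shift is beta * min_i(sq_dists[k][i]).
    """
    u = []
    for d in sq_dists:
        shift = beta * min(d)
        adjusted = [di - shift for di in d]
        if min(adjusted) <= 0.0:
            # beta = 1 or a zero distance: fall back to a crisp assignment.
            hits = [i for i, a in enumerate(adjusted) if a <= 0.0]
            u.append([1.0 / len(hits) if i in hits else 0.0
                      for i in range(len(d))])
            continue
        inv = [a ** (-1.0 / (m - 1.0)) for a in adjusted]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u


def most_likely_group(row, threshold=0.45):
    """Index of the largest membership, or 'undecided' if all are below threshold."""
    best = max(range(len(row)), key=row.__getitem__)
    return best if row[best] >= threshold else "undecided"


# Hypothetical squared distances from one observation to four centers.
d = [[1.0, 1.2, 1.5, 2.0]]

fcm_row = gfcm_memberships(d, beta=0.0)[0]   # beta = 0 gives classical FCM
gfcm_row = gfcm_memberships(d, beta=0.8)[0]

fcm_label = most_likely_group(fcm_row)
gfcm_label = most_likely_group(gfcm_row)
```

For these distances, the classical memberships are spread so evenly that no cluster reaches the threshold, whereas the shifted distances concentrate the membership on the nearest center. This is the mechanism by which a larger beta can produce a less fuzzy partition with fewer undecided observations.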
In Figure 2, the left-hand-side graph is the fuzzy C-means clustering result, and the right-hand-side graph is the generalized fuzzy C-means clustering result. We can observe that the right-hand-side graph has fewer undecided parts.

Spatial C-Means and Generalized C-Means
The study used the SFCM function in R to execute spatial c-means clustering. The first step was to determine a spatial weight matrix indicating which observations were neighbors and the strength of their relationship. The study used a basic queen neighbor matrix (built with the spdep package in R). The matrix should be row-standardized to ensure that the interpretation of all the parameters remains clear.
The two following equations indicate how the functions updating the membership matrix and the centers of the clusters are modified:

$$u_{ik} = \frac{\left( \lVert x_k - v_i \rVert^{2} + \alpha \lVert \bar{x}_k - v_i \rVert^{2} \right)^{-1/(m-1)}}{\sum_{j=1}^{c} \left( \lVert x_k - v_j \rVert^{2} + \alpha \lVert \bar{x}_k - v_j \rVert^{2} \right)^{-1/(m-1)}} \tag{12}$$

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \left( x_k + \alpha \bar{x}_k \right)}{(1 + \alpha) \sum_{k=1}^{n} u_{ik}^{m}} \tag{13}$$

In Equations (12) and (13), x̄ is the spatially lagged version of x, and α ≥ 0. The SFCM (spatial fuzzy C-means) can be taken as a spatially smoothed version of the classical c-means, and alpha controls the degree of spatial smoothness. This smoothing can be taken as an attempt to reduce the spatial overfitting of the classical c-means.
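The role of the row-standardized weight matrix and of alpha can be sketched as follows. The example assumes four hypothetical units on a line, each neighboring the units next to it, and updates that add alpha times the squared distance of the spatially lagged observation to each center. All names and data are illustrative; this is not the code used in the study.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def row_standardise(neighbours):
    """Dense weight matrix in which each non-empty row sums to 1."""
    n = len(neighbours)
    W = [[0.0] * n for _ in range(n)]
    for k, ns in enumerate(neighbours):
        for j in ns:
            W[k][j] = 1.0 / len(ns)
    return W


def spatial_lag(W, X):
    """Neighbour-weighted average of the observations."""
    dim = len(X[0])
    return [[sum(W[k][j] * X[j][d] for j in range(len(X))) for d in range(dim)]
            for k in range(len(X))]


def sfcm_memberships(X, X_lag, centers, m=1.5, alpha=0.25):
    """u[k][i]: membership of observation k in cluster i."""
    exponent = -1.0 / (m - 1.0)
    u = []
    for x, x_lag in zip(X, X_lag):
        d = [sq_dist(x, v) + alpha * sq_dist(x_lag, v) for v in centers]
        inv = [max(di, 1e-12) ** exponent for di in d]  # guard zero distance
        total = sum(inv)
        u.append([w / total for w in inv])
    return u


def sfcm_centers(X, X_lag, u, m=1.5, alpha=0.25):
    """Centers as weighted means of x + alpha * lagged x, scaled by 1 + alpha."""
    dim = len(X[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = (1.0 + alpha) * sum(w)
        centers.append([sum(wk * (x[d] + alpha * xl[d])
                            for wk, x, xl in zip(w, X, X_lag)) / total
                        for d in range(dim)])
    return centers


# Hypothetical data: four units on a line with one attribute each.
neighbours = [[1], [0, 2], [1, 3], [2]]
W = row_standardise(neighbours)
X = [[0.0], [1.0], [9.0], [10.0]]
X_lag = spatial_lag(W, X)
start = [[0.5], [9.5]]

u_plain = sfcm_memberships(X, X_lag, start, alpha=0.0)
u_smooth = sfcm_memberships(X, X_lag, start, alpha=0.25)
new_centers = sfcm_centers(X, X_lag, u_smooth)
```

Unit 1 sits next to a unit of the other group, so its lagged value lies between the two centers. With alpha > 0 its membership in the first cluster is therefore slightly lower than with alpha = 0, which is the smoothing effect described above.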
The study chose the best alpha value in order to reduce spatial inconsistency as much as possible while maintaining a good classification quality. The relationship between the spatial inconsistency and the alpha value is shown in Figure 3: an increasing alpha value results in a decrease in the spatial inconsistency.
In Figure 4, the explained inertia decreased when the alpha value increased and again followed an inverse function. The classification searched for a compromise between the original and lagged values. However, the loss was only 3% between alpha = 0 and alpha = 2.
According to Figures 5 and 6, since a larger silhouette index and a smaller Xie and Beni index both indicate a better classification, the study retained alpha = 0.25 to provide a good balance between spatial consistency and classification quality.
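Two of the quality indices mentioned in this section can be written down compactly. The sketch below uses common textbook definitions of the partition entropy and the Xie and Beni index, with memberships raised to the power m in the latter; whether the packages used in the study apply exactly these variants is an assumption, and the data are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def partition_entropy(u):
    """Mean entropy of the membership rows; 0 for a crisp partition."""
    n = len(u)
    return -sum(p * math.log(p) for row in u for p in row if p > 0.0) / n


def xie_beni(X, centers, u, m=1.5):
    """Compactness divided by separation; smaller values are better.

    u[k][i] is the membership of observation k in cluster i.
    """
    compactness = sum(u[k][i] ** m * sq_dist(X[k], centers[i])
                      for k in range(len(X)) for i in range(len(centers)))
    separation = min(sq_dist(a, b)
                     for idx, a in enumerate(centers)
                     for b in centers[idx + 1:])
    return compactness / (len(X) * separation)


# Hypothetical example: two observations, two centers.
X = [[1.0], [9.0]]
centers = [[0.0], [10.0]]
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]

pe_crisp = partition_entropy(crisp)
pe_fuzzy = partition_entropy(fuzzy)
xb_crisp = xie_beni(X, centers, crisp)
```

A lower partition entropy indicates a less fuzzy solution, and a lower Xie and Beni index indicates clusters that are compact relative to the distance between their centers, which matches how the two indices are read in the text.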

Spatial Generalized Fuzzy C-Means (SGFCM)
In order to apply the SGFCM method, we needed to determine the alpha and beta values used in the updates of the membership matrix and the centers of the clusters.
The study used a multiprocessing approach to select suitable alpha and beta values. The impact of the alpha and beta values on the various indices is shown in Figures 7-9. Figures 7 and 8 indicate that some specific combinations of alpha and beta values generate good results in the range of 0.3 < alpha < 0.7 and 0.4 < beta < 0.6. Figure 9 shows that the selection of beta has no impact on spatial consistency.
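As a hedged sketch of how alpha and beta can enter the same update, one plausible form combines a beta-scaled shift of the squared distances with an alpha-weighted term for the spatially lagged observations. The exact equations are an assumption here, not a statement of what the study or any package implements. The symbols are: x_k the kth observation, x̄_k its spatially lagged value, v_i the center of cluster i, m the fuzziness index, c the number of clusters, and n the number of observations.

```latex
% Assumed SGFCM updates (illustrative, not taken from the source):
\begin{align}
  \beta_k &= \beta \cdot \min_{j} \lVert x_k - v_j \rVert^{2},
            \qquad 0 \le \beta \le 1, \quad \alpha \ge 0, \\
  u_{ik}  &= \frac{\left( \lVert x_k - v_i \rVert^{2} - \beta_k
                 + \alpha \lVert \bar{x}_k - v_i \rVert^{2} \right)^{-1/(m-1)}}
                {\sum_{j=1}^{c} \left( \lVert x_k - v_j \rVert^{2} - \beta_k
                 + \alpha \lVert \bar{x}_k - v_j \rVert^{2} \right)^{-1/(m-1)}}, \\
  v_i     &= \frac{\sum_{k=1}^{n} u_{ik}^{m} \left( x_k + \alpha \bar{x}_k \right)}
                {(1 + \alpha) \sum_{k=1}^{n} u_{ik}^{m}}.
\end{align}
```

Under this form, setting alpha = 0 removes the spatial term and leaves only the beta shift, while setting beta = 0 leaves only the spatial smoothing, so the two parameters can be tuned on a joint grid.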
Figure 6. Link between alpha value and silhouette index.

Based on Figures 7-9, the study selected beta = 0.5 and alpha = 0.25, which obtained better results for all the indices considered. With these alpha and beta values, the study obtained the SFCM and SGFCM results (see Table 5). The results of the SGFCM are better in both the semantic and spatial aspects, with lower partition entropy, Xie and Beni index, and Fukuyama and Sugeno index, and higher values of the other indices.
The SFCM and SGFCM clustering maps are shown in Figure 10, where the left-hand-side graph is the SFCM clustering map and the right-hand-side graph is the SGFCM clustering map. We can observe that the SGFCM clustering map has fewer undecided units.

Comparison of the Four Algorithms
The study performed a spatial analysis to compare the spatial consistency of the four classifications (FCM, GFCM, SFCM, SGFCM) (see Table 6). The Moran's I values computed on the membership matrices were higher for SFCM and SGFCM, representing stronger spatial structures in the classifications.
The study also checked whether the values of spatial inconsistency for SGFCM were significantly lower than those of SFCM. Using the 250 values obtained by permutations, we calculated a pseudo p-value = 0.032 > 1/250 = 0.004. This means that the SGFCM algorithm did not have a predominant advantage over the SFCM algorithm. However, the SGFCM clustering map had fewer undecided points than the SFCM clustering map.
Comparing Figures 2 and 10, we can observe that the spatial classifications in Figure 10 have fewer undecided parts.
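For readers unfamiliar with the spatial statistics used in this comparison, the following sketch computes Moran's I for one column of values and a permutation-based pseudo p-value. The p-value convention, (hits + 1) / (permutations + 1), is a common choice and an assumption here; it is not necessarily the procedure used in the study, and the neighbor structure and values are hypothetical.

```python
import random


def morans_i(values, W):
    """Moran's I for a list of values and a dense spatial weight matrix W."""
    n = len(values)
    centre = sum(values) / n
    z = [v - centre for v in values]
    denom = sum(zi * zi for zi in z)
    if denom == 0.0:
        raise ValueError("Moran's I is undefined for constant values")
    s0 = sum(sum(row) for row in W)
    cross = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / denom


def pseudo_p_value(values, W, n_perm=250, seed=0):
    """Share of random permutations whose Moran's I is at least the observed one."""
    rng = random.Random(seed)
    observed = morans_i(values, W)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, W) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical chain of six units with row-standardised neighbour weights.
neighbours = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
W = [[1.0 / len(ns) if j in ns else 0.0 for j in range(6)] for ns in neighbours]

clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
p_clustered = pseudo_p_value(clustered, W)
```

Spatially clustered values give a positive Moran's I and alternating values give a negative one, which is why higher Moran's I values on the membership matrices are read as stronger spatial structure.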

Discussion
The study used the spatial fuzzy C-means clustering method to analyze the relationship between COVID-19-related factors and the Republican vote share in the 2020 U.S. presidential election in the rustbelt states. The study found that spatial generalized fuzzy C-means clustering (SGFCM) produced better results compared to the other three algorithms according to Table 3. The study also found that the SGFCM clustering graph in Figure 10 presented better results, because it had fewer undecided parts (areas that were not assigned to any cluster) than the clustering results shown in Figure 2.
The descriptive statistics of the four clusters (Tables A1-A4) are listed in Appendix A. According to the four tables, the four clusters can be characterized as follows:
(1) First cluster: The cluster had lower X1 (mean < 0.5), higher X2, higher X4, lower X5, and higher X6 values. The other variables showed no obvious pattern. We can conclude that people in this region were not inclined to support the Republican candidate, often wore masks, included more high-school graduates or above, and had a lower unemployment rate and a higher income. The first cluster included a small part of southeastern Pennsylvania, New York state, and other scattered parts of the rustbelt states.
(2) Second cluster: The cluster had higher X1 (mean > 0.5), higher X2, lower X4, lower X5, and higher X6 values. The other variables showed no obvious pattern. We can conclude that people in this region were inclined to support the Republican candidate, often wore masks, included fewer high-school graduates, and had a lower unemployment rate and a higher income. The second cluster included the larger part of New York state, most of Michigan, and northern Illinois.
(3) Third cluster: The cluster had higher X1 (mean > 0.5), lower X2, lower X4, higher X5, lower X6, and higher X7 values. This means that people in this region tended to support the Republican candidate, wore masks less frequently, included fewer high-school graduates or above, and had a higher unemployment rate, a lower income, and a higher COVID-19 death toll. The cluster included some parts of Kentucky, West Virginia, and Ohio, and other scattered parts of the rustbelt states.
(4) Fourth cluster: The cluster had higher X1 (mean > 0.5), lower X2, lower X4, lower X5, higher X6, and higher X7 values. This means that people in this region tended to support the Republican candidate, wore masks less frequently, included fewer high-school graduates or above, and had a lower unemployment rate, a higher income, and a higher COVID-19 death toll. The cluster included the larger part of Indiana, Ohio, and part of Illinois.
The results seem to contrast slightly with the previous literature. Warshaw et al. (2020) found that COVID-19 fatalities decreased the support for Donald Trump in the 2020 presidential election [19]. However, our results show that the third and fourth clusters in the rustbelt states have higher COVID-19 death tolls together with higher Republican vote shares and residents less inclined to wear face masks. Meanwhile, the second cluster had higher Republican vote shares and residents who often wore face masks, while the COVID-19 death toll seemed unimportant. We can conclude that the COVID-19 death toll in each county did not affect the Republican vote shares in the rustbelt states, while the mask-wearing behavior in some regions had a negative impact on the Republican vote shares.
According to Figure 11, we can observe that cluster 2 accounts for the largest area in the rustbelt states, and cluster 1 accounts for the smallest area. The clustering results indicate that the U.S. presidential election-related factors and the COVID-19-related factors are closely related to the spatial grouping of the counties. These results enable researchers in the related field to conduct further studies.

Conclusions
The present study used spatial fuzzy C-means clustering to analyze the factors related to COVID-19 and the 2020 U.S. presidential election in the rustbelt states. The study found that the spatial generalized fuzzy C-means (SGFCM) method produced better clustering results. The SGFCM method divided the rustbelt states into four areas. The results imply that the COVID-19 death toll in each county did not affect the Republican vote shares in the rustbelt states, while the mask-wearing behavior in some regions had a negative impact on the Republican vote shares. It is worth conducting further research.
