Characteristics of Photochemical Reactions with VOCs Using Multivariate Statistical Techniques on Data from Photochemical Assessment Monitoring Stations

Abstract: This study assesses the concentrations of the 54 ozone precursors (all being volatile organic compounds (VOCs)) detected at the four photochemical assessment monitoring stations that are part of the air quality monitoring network in the Kaohsiung-Pingtung area in Taiwan. Factor and cluster analyses, two multivariate statistical techniques, are performed to explore the interrelationship among the 10 VOCs of relatively higher concentrations selected from the 54 ozone precursors and to identify significant factors affecting ozone pollution levels in the study area. Moreover, the multivariate statistical analysis can faithfully reflect why the study area has been affected by photochemical pollution. First, results of the factor analysis suggest that the factors affecting how photochemical reactions occur in the study area can be divided into the following: “pollution from mobile sources”, “pollution from stationary sources”, and “pollution from energy sources”. Among them, mobile sources have the greatest impact on photochemical pollution levels. Second, the impacts of photochemical pollution on air quality in the study area can be classified into five clusters via cluster analysis. Each cluster represents how the 10 VOCs affect air quality, with different characteristics, and how they contribute to photochemical pollution in the study area. If there are more types and samples of photochemical pollutants when performing a multivariate statistical analysis, the analysis results will be more stable. This study adopts data on VOC monitoring over a period of nearly two years, which can effectively improve the validity and reliability of the factor analysis results, while helping environmental agencies review the effectiveness of air quality management in the future and serving as reference for the effectiveness of reducing photochemical pollution in the atmosphere.


Introduction
Photochemical pollution is a new type of pollutant that came about after petroleum was used as a fuel. It first occurred in the third largest city in the United States, Los Angeles, in 1944. Since the 1950s, photochemical pollution has become a recurrent problem across the globe [1,2]. When the primary pollutants of automotive exhaust, such as hydrocarbons and NO2, are released into the atmosphere, they absorb energy from the sun under strong ultraviolet irradiation. The molecules in said substances will become unstable after they absorb energy from the sun, the original chemical bonds will be broken, and new substances are formed. Such a chemical reaction is called a photochemical reaction, and its product is the toxic photochemical smog. In areas where there are a high number of transport vehicles and industrial districts where factories emitting nitrogen oxides and hydrocarbons are concentrated, people need to be aware of photochemical pollution [3][4][5].
VOCs refer to any carbon-containing volatile compounds [6]. They can be found in many products, such as solvent-based paints, printing inks, consumer goods, organic solvents, and petroleum products [6,7]. Vehicles and vessels also emit VOCs, ultimately leading to air pollution and smog. VOCs play an important role in the formation of ozone in the atmosphere. Under sunlight, VOCs will react with the nitrogen oxides emitted from automobiles, power plants, and industrial activities to form ozone. It is clear that VOCs are harmful to human health. When the concentration of a VOC in a family room reaches a certain level, people will have headaches, nausea, vomiting, and limb weakness. In severe cases, they will have a seizure, fall into a coma, or have hypomnesia [8][9][10]. Their respiratory system, lungs, liver, kidneys, nervous system, hematopoietic system, and digestive system will be affected when they are exposed to an environment with high concentrations of VOCs. Some VOCs are carcinogenic. Long-term exposure to an environment with high concentrations of VOCs will increase the risk of developing cancer [8,10]. When a photochemical reaction occurs, VOCs in the atmosphere will hinder the growth of crops, including decreasing crop yields and increasing plants' susceptibility to disease [11]. Moreover, VOCs will cause pollution to the local environment as wind blows or rain falls, or they will enter bodies of water or soil when they move and diffuse.
Multivariate statistical analysis is a branch of statistics commonly used in the fields of management science, social science, and life science. It is used to analyze data containing multiple variables, explore data relevance, or clarify the data structure. It differs from traditional statistical approaches that emphasize parametric estimation and hypothesis testing [12]. As multivariate statistical analysis requires complicated and massive calculations, computers are used to perform said calculations. Among multivariate statistical analysis techniques, factor analysis aims to simplify a set of interrelated variables, turning them into a smaller number of meaningful dimensions or factors. In other words, a factor can be used to represent or replace some variables of similar nature among said variables. Thus, we hope to use a few main factors to cover a set of variables via factor analysis [13,14]. A cluster analysis classifies a group of individuals based on their variables or characteristics. When performing a cluster analysis, clusters of said individuals are not known beforehand, and individuals of similar characteristics are classified based on their variable data. Hence, as with factor analysis results, cluster analysis results are obtained by calculating the mathematical statistics of the provided data. Whether said results have substantial or theoretical meaning and how many clusters are to be adopted in the end must be judged by those performing the analysis [15][16][17].
This study performs factor and cluster analyses of the multivariate statistical analysis of 10 VOCs of relatively higher concentrations selected from the 54 ozone precursors whose concentrations at the four photochemical assessment monitoring stations in the study area have been assessed over the years. The goal is to properly classify the pollution caused by the emissions of the 10 VOCs. It is hoped that the exploration into the characteristics of the distribution of photochemical pollution caused by emissions of the 10 VOCs can immediately reflect the differences in the extent to which the photochemical assessment monitoring stations have been affected by photochemical pollution. Moreover, said exploration can serve as reference for environmental agencies when implementing air quality management in the future.

Locations of Photochemical Assessment Monitoring Stations and Data Selection
This study selected four photochemical assessment monitoring stations set up by Taiwan's Environmental Protection Administration in Kaohsiung and Pingtung based on local geographical features (i.e., Qiaotou station, Xiaogang station, and Linyuan station in Kaohsiung City and Chaozhou station in Pingtung). The study area is located in the Pingtung Plain at an elevation of approximately 30-100 m above sea level. Its average yearly temperature is 24-25 °C. The average temperature of the warmest months (July ~ August) is above 30 °C, and that of the coldest months (January ~ February) is above 18 °C. The annual precipitation is 1500 mm−2000 mm [6,10]. The study area has a tropical monsoon climate. The hottest and most humid months of the year are between May and October. It is particularly clear that June to August comprise the rainy season, during which the rainfall accounts for 82% of the annual rainfall in Kaohsiung. The coldest month in the study area is January, with an average low temperature of 19.9 °C; the hottest month is July, with an average high temperature of 29.4 °C. The least humid month is December, with an average relative humidity of 72.4%; the most humid months are between June and August, with an average relative humidity of 81.3%. Due to its geographical location, high population density (Kaohsiung City has a population of 2.77 million), and the fact that it is the most important heavy industrial area in Taiwan, the study area has a large number of motor vehicles and stationary sources (chimneys) at usual times. It is thus more likely to be affected by air pollution. Since Kaohsiung has a high population density and relatively frequent stationary and mobile sources, more photochemical assessment monitoring stations are located here. Ten VOCs of higher atmospheric concentrations among the 54 ozone precursors (all being VOCs), whose concentrations in the monitoring stations have been assessed over the years, are used as a basis for the analysis. 
These 10 VOCs are acetylene (AT), isopentane (IP), n-butane (NB), 1,2,4-trimethylbenzene (TB), toluene (TU), propene (PP), 2-methylhexane (MH), ethane (EA), isopropyl benzene (IB), and n-undecane (NC). Taiwan's Environmental Protection Administration adopted the monitoring and operating models of photochemical assessment monitoring stations in the USA for the stations in question and equipped them with the corresponding instruments and facilities. PerkinElmer's automatic VOC analysis system serves as the basic monitoring facility at the stations. This analytical instrument automatically collects samples and makes measurements every hour. During each measurement, VOCs in the air are retained in an electronically cooled concentration trap, which is filled with an appropriate adsorbent to obtain a fixed amount of cooled, concentrated C2-C11 VOCs. Monitoring data collected between 1 July 2020 and 31 May 2022 were selected; 24 complete VOC data entries per day (24 × 600 days = 14,400 entries) are used as the basis for the model analysis. Figure 1 shows the geographical locations of the four photochemical assessment monitoring stations. The selection of monitoring data collected over a period of one and a half years takes into consideration trends of seasonal and time-series variations in air pollutants (e.g., different concentration levels in the mornings and evenings). Statistical software (SPSS Statistics 26, IBM, Armonk, NY, USA) was adopted for this study.
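The selection of days with 24 complete hourly entries can be sketched as follows. This is a minimal illustration only: the record layout (a `(timestamp, value)` pair with `None` marking a missing measurement) and the function name `complete_days` are assumptions made here, not the monitoring network's actual data format.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def complete_days(records, hours_per_day=24):
    """Return the sorted dates that have a valid value for every hour.

    `records` is an iterable of (datetime, value) pairs; a value of None
    marks a missing hourly measurement (hypothetical layout).
    """
    hours_seen = defaultdict(set)
    for timestamp, value in records:
        if value is not None:
            hours_seen[timestamp.date()].add(timestamp.hour)
    return sorted(day for day, hours in hours_seen.items()
                  if len(hours) == hours_per_day)


# Illustrative input: one complete day and one day with a missing hour.
start = datetime(2020, 7, 1)
records = [(start + timedelta(hours=h), 1.0) for h in range(48)]
records[30] = (records[30][0], None)  # 06:00 on the second day is missing

kept = complete_days(records)   # only the first day survives the filter
n_entries = len(kept) * 24      # hourly entries available for the analysis
```

With 600 retained days, the same arithmetic gives 24 × 600 = 14,400 hourly entries, the count used in this study.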

Multivariate Statistical Analyses-Factor Analysis
Factor analysis aims to identify basic variables (referring to the 10 VOCs in this study) to explain the relevant modes demonstrated in the set of observed variables. It is typically used for the dimensionality reduction of data. The goal is to identify a few factors and use them to explain the variance observed among the numerous manifest variables. The advantage of factor analysis is that it can explain most variations by means of fewer variables. With principal component analysis (PCA), these variables can be integrated into indexes to reveal the common factors from the original data that are relevant to one another [18].
In factor analysis, each of the p observed variables x1, …, xp is expressed as a linear combination of q common factors f1, …, fq (q ≤ p) plus a specific factor εi.
The factor analysis model can be represented by:

$$x_i = \mu_i + \ell_{i1} f_1 + \ell_{i2} f_2 + \cdots + \ell_{iq} f_q + \varepsilon_i, \qquad i = 1, \ldots, p \tag{1}$$

where f1, …, fq are the common factors contained in each variable xi, μi is the mean of xi, εi is the specific factor contained only in the ith variable xi, and ℓij is the loading of the ith variable on the jth common factor fj.

[Figure 1. Geographical locations of the four photochemical assessment monitoring stations (map labels: Taiwan Strait, Taiwan, Pacific Ocean).]
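In matrix notation, the factor model and its consequence for the covariance structure can be written as below. The assumptions listed (zero-mean, uncorrelated common factors with unit variance, and specific factors that are uncorrelated with the common factors and with one another) are those of the standard orthogonal factor model; they are not stated explicitly in this paper, and the symbols L, Ψ, and Σ are introduced here for illustration only.

```latex
\begin{aligned}
\mathbf{x} &= \boldsymbol{\mu} + \mathbf{L}\,\mathbf{f} + \boldsymbol{\varepsilon},
\qquad \mathbf{L} = (\ell_{ij}) \in \mathbb{R}^{p \times q},\\
\mathrm{E}[\mathbf{f}] &= \mathbf{0},\quad
\mathrm{Cov}(\mathbf{f}) = \mathbf{I}_q,\quad
\mathrm{E}[\boldsymbol{\varepsilon}] = \mathbf{0},\quad
\mathrm{Cov}(\boldsymbol{\varepsilon}) = \boldsymbol{\Psi}
  = \mathrm{diag}(\psi_1,\ldots,\psi_p),\quad
\mathrm{Cov}(\mathbf{f},\boldsymbol{\varepsilon}) = \mathbf{0},\\
\boldsymbol{\Sigma} &= \mathrm{Cov}(\mathbf{x})
  = \mathbf{L}\mathbf{L}^{\top} + \boldsymbol{\Psi},
\qquad
\mathrm{Var}(x_i) = \sum_{j=1}^{q} \ell_{ij}^{2} + \psi_i .
\end{aligned}
```

The sum of squared loadings in the last expression is the communality of the ith variable, i.e., the part of its variance that is explained by the q common factors.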

Multivariate Statistical Analyses-Cluster Analysis
Cluster analysis is a method used to group similar samples among p samples or subjects into the same cluster based on their characteristics. The samples or subjects can thus be classified into several clusters, and the characteristics of each cluster can be analyzed. The objective of a cluster analysis is to divide samples into different clusters so as to maximize the homogeneity within each cluster and the heterogeneity among the clusters. Each cluster thus describes, in terms of the data collected, the class to which its members belong, and this description may be abstracted through use from the particular to the general class or type. Hierarchical agglomerative clustering is the most common approach, providing intuitive similarity relationships between any one sample and the entire dataset, and it is typically illustrated by a dendrogram (tree diagram) [19,20]. The dendrogram provides a visual summary of the clustering processes, presenting a picture of the groups and their proximity, with a dramatic reduction in dimensionality of the original data. The similarity between two samples is usually given by the Euclidean distance, which can be represented by the difference between the analytical values of the samples [20]. Ward's method uses an analysis of variance approach to evaluate the distances between clusters, attempting at each step to minimize the increase in the within-cluster sum of squares caused by merging any two clusters.
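The agglomerative procedure with Ward's criterion described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the SPSS routine used in this study: the function names and the four sample points are hypothetical, and the quadratic search over cluster pairs is only practical for small inputs.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def euclidean(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ward_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the clusters merge."""
    na, nb = len(cluster_a), len(cluster_b)
    d = euclidean(centroid(cluster_a), centroid(cluster_b))
    return na * nb / (na + nb) * d ** 2


def ward_agglomerate(points, n_clusters):
    """Merge singleton clusters until `n_clusters` remain (Ward's criterion)."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the cheapest pair
        del clusters[j]
    return clusters


# Hypothetical standardized samples forming two well-separated groups.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
groups = ward_agglomerate(samples, 2)
```

Recording the merge cost at each step would give the heights needed to draw the dendrogram mentioned above.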

VOC Data Processing
The data on the VOCs were collected over the period from 1 July 2020 to 31 May 2022, and a total of 14,400 data entries were used. Because the data on the 10 VOCs possess seasonal and periodic characteristics, this study standardized the source data before analyzing the series of the 10 VOCs with the multivariate model, as follows:

$$z_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i} \tag{2}$$

where x_ij is the measured value of the jth sample in the series of the ith VOC, x̄_i is the average of the series of the ith VOC, s_i is the standard deviation of the series of the ith VOC, and z_ij is the resulting standardized value.
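The standardization step can be illustrated with a short sketch. The concentration values below are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not say which form was applied.

```python
import statistics


def standardize(series):
    """Subtract the series mean and divide by its standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)  # sample standard deviation (assumption)
    return [(x - mean) / sd for x in series]


# Hypothetical hourly concentrations (ppb) of a single VOC.
concentrations_ppb = [2.0, 4.0, 6.0, 8.0, 10.0]
z_scores = standardize(concentrations_ppb)
```

After this step every VOC series has zero mean and unit standard deviation, so VOCs with high absolute concentrations do not dominate the factor and cluster analyses.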

Selecting the Results of the Factor Analysis
This study uses varimax rotation, an orthogonal rotation, to explain the characteristics of the factors. The analysis results show three factors with eigenvalues greater than 1, as shown in Table 1. The eigenvalue is used to indicate whether a factor is significant: if the eigenvalue is greater than 1, the factor is considered "significant"; otherwise, the factor is "insignificant" or is treated as white noise that can be ignored. These three factors can be used to explain the main factors affecting the generation of photochemical pollution in the study area. The cumulative explained variance of the three common factors is 83.437%, and their eigenvalues are 2.244, 1.682, and 1.161, respectively. Table 2 shows that the KMO value is 0.876. According to Kaiser, this result is suitable for a factor analysis since it is greater than 0.5. Moreover, as shown in Table 2, the chi-square value of Bartlett's test of sphericity is 2411.390 (with 22 degrees of freedom). Having reached the significance level, this value indicates that common factors exist among the correlation matrices of the population, so the data are suitable for a factor analysis.
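The eigenvalue-greater-than-1 rule and the cumulative explained variance can be checked with a small sketch. The eigenvalues are those quoted above, and the per-factor percentages are the ones reported for Factors 1 to 3 in this paper; the helper names are hypothetical.

```python
# Values quoted in the text (Table 1 itself is not reproduced here).
eigenvalues = [2.244, 1.682, 1.161]
explained_pct = [43.478, 26.087, 13.872]


def kaiser_retained(values, threshold=1.0):
    """1-based indices of the factors whose eigenvalue exceeds the threshold."""
    return [i for i, v in enumerate(values, start=1) if v > threshold]


def cumulative(values):
    """Running total of the explained-variance percentages."""
    total, running = 0.0, []
    for v in values:
        total += v
        running.append(round(total, 3))
    return running


retained = kaiser_retained(eigenvalues)
cumulative_pct = cumulative(explained_pct)
```

All three factors pass the threshold, and the running total ends at the reported cumulative explained variance of 83.437%.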

Determining the Number of Factors
The number of eigenvalues greater than 1, as shown in Table 1, helps to determine the number of main factors affecting the mechanism for the generation of photochemical pollution in the study area. Table 3 shows the component matrix after orthogonal rotation, which can be used to explain the characteristics of the three main factors. The three factors can also be used to describe the relevance among the 10 VOCs detected in the air quality monitoring network and the impacts of these VOCs on the generation of photochemical pollution in the study area. The following is a complete description of the characteristics of the formation of the three factors.

Factor 1
As shown in Table 3, Factor 1 consists of three VOCs: TU, TB, and IB. Table 1 shows that this factor explains 43.478% of the total variance. The impact of these VOCs on atmospheric photochemical pollution thus accounts for almost half of the total variance explained. It can be learned from Table 3 that TU, a constituent of Factor 1, has the highest loading of 0.879, followed by TB and IB with loadings of 0.792 and 0.741, respectively. The mechanism for the formation of these three VOCs is related to organic pollutants emitted by motor vehicles in the study area, which is attributable to the fact that the study area has a high population density. It has not only a huge number of permanent residents but also a huge number of motor vehicles, especially in the Kaohsiung metropolitan area, where motor vehicles are heavily used for commuting and for holiday travel. Mobile sources can readily affect photochemical pollution levels within a single day [3]. The study area is the major heavy industrial area in Taiwan; thus, considerable concentrations of VOCs emitted from stationary sources are detected here [17]. In addition, stationary sources in Kaohsiung such as oil refineries, power plants, and petrochemical plants usually release VOCs of higher concentrations into the atmosphere during the daytime. There is also a large number of vehicles traveling to and from the Kaohsiung metropolitan area at the usual times and during holidays. Thus, the concentrations of VOCs are higher here during the daytime. In contrast, the concentrations of pollutants from the various stationary and mobile sources are significantly lower at nighttime than during the daytime. In terms of seasonal variations, the study area lies in the low-wind wake zone of the northeast monsoon in winter. Air pollutants generated in the north are transported to the south by the monsoon, whereby photochemical pollution is more likely to occur.
At this time, the wind gradually blows toward the east in the local atmospheric wind field. As the area is on the leeward side, it is subject to lower average wind speeds and lower rainfall levels, which makes it difficult for VOCs to diffuse. Consequently, the VOCs contribute to air pollution for a longer period of time. The sources of TU, TB, and IB, however, are the emissions from motor vehicles. Moreover, there is a considerable number of industrial areas in Kaohsiung, along with the Port of Kaohsiung, which is the 15th largest port in the world. Large vehicles travel to and from Kaohsiung at the usual times, resulting in high concentrations of VOCs in the atmosphere even at off-peak times. The study area has long been affected by a higher concentration of TU emitted by motor vehicles. In spring, autumn, and winter, TU in the atmosphere can easily accumulate due to changes in weather conditions, low rainfall, and other factors (such as mixing height, boundary layer height, and suppressed convective mixing); consequently, it cannot be easily diffused [21]. Although the concentrations of TB and IB in the atmosphere are not as high as that of TU, they are mainly emitted by mobile sources in the study area. Moreover, they are the main sources of photochemical pollution.
To sum up, the analysis results suggest that the main sources and emission behaviors of TU, TB, and IB detected in the study area are closely related to exhaust emissions from motor vehicles. All three are aromatic hydrocarbons. They also constitute the main factors that contribute to the generation of photochemical pollution in the study area. As such, Factor 1 can be called "the factor of pollution from mobile sources".

Factor 2
As shown in Table 3, Factor 2 consists of three VOCs: IP, AT, and MH. Table 1 shows that this factor explains 26.087% of the total variance. It can also be learned from Table 3 that IP, a constituent of Factor 2, has the highest loading of 0.822, followed by AT and MH with loadings of 0.770 and 0.713, respectively. As mentioned earlier, Kaohsiung is a highly polluted industrial area. Besides being affected by pollution from mobile sources at the usual times, Kaohsiung is also affected by pollution from stationary sources, such as oil refineries, petrochemical plants, steel mills, and power plants. Higher concentrations of IP, AT, and MH among the VOCs emitted from stationary sources at specific times have been detected, in addition to those of the three VOCs that constitute Factor 1. Lower concentrations of IP, AT, and MH among the VOCs are detected at off-peak times since there are no emissions from most of the stationary sources. Currently, competent environmental authorities in the study area have formulated regulations governing the control of emissions from stationary sources [15,17]. However, stationary sources are distributed across the Pingtung Plain in the study area, whereby the dispersion of pollutants, such as malodorous substances, toxic chemical substances, or VOCs, from stationary sources is inevitable. When these pollutants react with sunlight, the photochemical reaction produces ozone [22]. The results of monitoring this reaction by Taiwan's Environmental Protection Administration over the years show higher concentrations of IP, AT, and MH in the atmosphere. As the three VOCs that constitute Factor 1 are emitted from mobile sources, they have a more significant impact on photochemical pollution. Thus, TU, TB, and IB are not parts of Factor 2. In other words, as Table 3 shows, TU, TB, and IB have higher loadings on Factor 1 of 0.879, 0.792, and 0.741, respectively.
The results show that these three VOCs can better represent and explain the pollution characteristics of Factor 1. Since TU, TB, and IB have much lower loadings on Factor 2 (0.211, −0.095, and 0.117, respectively), they cannot adequately represent and explain the pollution characteristics of Factor 2.
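The comparison of loadings described above amounts to assigning each VOC to the factor on which its absolute loading is largest. The sketch below uses only the Factor 1 and Factor 2 loadings quoted in the text; the Factor 3 loadings of these VOCs are not quoted and are therefore left out, and the helper name `dominant_factor` is hypothetical.

```python
# Rotated loadings quoted in the text for Factors 1 and 2 only;
# the Factor 3 loadings of these VOCs are not quoted and are omitted.
loadings = {
    "TU": {"Factor 1": 0.879, "Factor 2": 0.211},
    "TB": {"Factor 1": 0.792, "Factor 2": -0.095},
    "IB": {"Factor 1": 0.741, "Factor 2": 0.117},
}


def dominant_factor(row):
    """Name of the factor with the largest absolute loading."""
    return max(row, key=lambda name: abs(row[name]))


assignment = {voc: dominant_factor(row) for voc, row in loadings.items()}
```

Taking the absolute value matters for entries such as the −0.095 loading of TB: the sign gives the direction of the association, while the magnitude determines which factor the variable belongs to.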
The above analysis results suggest that the main sources and emission behaviors of the IP, AT, and MH detected in the study area are related to the numerous stationary sources of air pollution. Hence, Factor 2 can be called "the factor of pollution from stationary sources."

Factor 3
As shown in Table 3, Factor 3 consists of four VOCs: EA, PP, NB, and NC. Table 1 shows that this factor explains 13.872% of the total variance. It can also be learned from Table 3 that EA, a constituent of Factor 3, has the highest loading of 0.858, followed by PP, NB, and NC with loadings of 0.780, 0.733, and 0.681, respectively. All four VOCs are straight-chain aliphatic hydrocarbons. Among them, EA, PP, and NB are the major components of natural gas, while the long-chain NC is generally found in liquefied petroleum gas. Currently, the oil refineries in Kaohsiung include Dalin Refinery in Xiaogang District, the liquefied natural gas terminal in Yongan District, and the No. 3 and No. 4 Naphtha Cracking Plants in the petrochemical industrial zone in Linyuan. These oil refineries and natural gas storage facilities separate the hydrocarbons of different boiling points via heating during the refining processes every day. Many VOCs are generated from these processes; the four VOCs that constitute Factor 3 are the first air pollutants generated from the processes. They are evidently the primary sources of photochemical pollution during the refining processes [23]. The liquefied natural gas terminal in Yongan District is currently the largest liquefied natural gas terminal. It can store up to 4.5 million tons of liquefied natural gas per year, providing power plants in Taiwan with natural gas all year round. The volume of natural gas to be used in Taiwan must reach 50% by 2025. As its storage volume increases, pollution caused by emissions of VOCs will inevitably increase. Moreover, there are several Naphtha Cracking Plants in Kaohsiung. Naphtha produced from crude oil during the refining process will decompose into VOCs with smaller molecules, such as ethylene, propylene, and butadiene [24].
Since crude oil contains a high concentration of long-chain NC, as well as that of EA, PP, and NB, its refining process can easily lead to the dispersion of VOCs and indirectly cause photochemical reactions.
The above analysis results suggest that the formation of EA, PP, NB, and NC in the atmosphere in the study area has to do with the liquefied natural gas terminal and Naphtha Cracking Plants in operation. The four VOCs are generated from the processes of refining the basic elements of non-renewable energy. Thus, Factor 3 can be called the "factor of pollution from energy sources."

Results of Photochemical Pollution Characteristics Analyses-Cluster Analysis
A two-stage clustering process is used. First, hierarchical clustering is used for general clustering; then, the K-means method is used to test the number of clusters. Finally, five clusters are used to categorize the characteristics of the 10 VOC contaminants. Figure 2 shows the relationship between the clusters and the factors. Additionally, among the many VOC pollutants, some may share common characteristics. Understanding these characteristics will assist in analyzing the variations in the relationships among the various photochemical pollution characteristics. The following characteristics can be derived from Table 4:
(1) Cluster 1
As shown in Figure 2, Cluster 1 has the fourth highest scores on all three factors: pollution from mobile sources, pollution from stationary sources, and pollution from energy sources. In other words, the concentrations of VOCs distributed across the study area are low in this cluster. Table 4 shows that the concentration values of the 10 VOCs in this cluster are only higher than those in Cluster 2, indicating an insignificant impact of photochemical reactions on this cluster. Moreover, VOCs in this cluster are generally detected in Chaozhou, Pingtung, since Chaozhou's photochemical monitoring station is the only one of the four stations that is not located in Kaohsiung. Chaozhou is less affected by pollution from mobile sources since it has fewer permanent residents; areas near it are nearly unaffected by pollution from stationary sources. However, it is worth noting that Chaozhou can be easily affected by the northeast monsoon in autumn and winter, and it is constrained by terrain. Air pollutants generated in Kaohsiung are transported to Pingtung by monsoons and settle in Chaozhou after being obstructed by the Central Mountain Range. Thus, there remain certain percentages of concentrations of VOCs in the atmosphere [25]. Overall, this cluster accounts for 22.1% of the total data entries (3186 out of 14,400 entries).
Although the concentrations of VOCs in the atmosphere in this cluster are low, the impact of photochemical reactions on it is at a general level. This cluster is therefore classified as "the general photochemical pollution cluster."
(2) Cluster 2
As shown in Figure 2, Cluster 2 has the lowest scores on all three factors. Moreover, its factor scores are much lower than those in the other four clusters. Table 4 shows that the concentration values of the 10 VOCs in this cluster are lower than those in other clusters, and there is no sudden increase in the concentration values of VOCs, indicating a relatively insignificant impact of the atmospheric photochemical reactions on this cluster. In this cluster, the concentrations of VOCs reflected in the three factors are low; they are generally distributed across Qiaotou in Kaohsiung and Chaozhou in Pingtung. As mentioned earlier, areas near Chaozhou are less affected by pollution from stationary sources (i.e., there is no significant pollution from stationary sources) and that from mobile sources. Thus, the concentrations of VOCs detected in these areas are low. Although Qiaotou is close to the liquefied natural gas terminal in Yongan District, it is not significantly affected by pollution from energy sources. Moreover, significant stationary sources in Kaohsiung, such as Dalin Refinery, the No. 3 and No. 4 Naphtha Cracking Plants in the petrochemical industrial zone in Linyuan, and the Nanzih Export Processing Zone, are located south of Qiaotou. The air pollutants generated here are transported southward toward Pingtung by northeast monsoons in autumn and winter [3,24]. As Qiaotou is not a metropolitan area, pollution from mobile sources is not significant. Consequently, there is not a significant increase in the concentrations of VOCs detected in Qiaotou. This cluster accounts for 52.3% of the total data entries (7525 out of 14,400 entries).
In other words, the study area is not affected by photochemical reactions for approximately more than half the year. This cluster can then be said to be the cluster least affected by photochemical reactions among the five clusters. It is therefore classified as "the mild photochemical pollution cluster."
(3) Cluster 3
As shown in Figure 2, Cluster 3 has the third highest scores on the factors of pollution from mobile sources and stationary sources, as well as the highest score on the factor of pollution from energy sources. Table 4 shows that the concentration values of all the VOCs in this cluster are higher than those in Clusters 1 and 2, especially that of PP, which constitutes part of the pollution from energy sources, and is the highest among all the clusters. This result indicates that areas where the VOCs in this cluster are detected can be more easily affected by the concentration of PP. As mentioned earlier, PP is a by-product generated from processing natural gas or refining crude oil. Moreover, when boilers are in operation, they often emit PP [24]. Table 4 shows that the atmospheric concentrations of PP can reach 99.1 ppb, while those of NB can reach 89.2 ppb, indicating a relatively significant impact of this factor of pollution from energy sources on this cluster. The concentrations of VOCs in this cluster are generally distributed across Qiaotou and Linyuan, which are located north and south of Kaohsiung, respectively. Both are on the outskirts of the Kaohsiung metropolitan area, and thus they are not as affected by the pollution from mobile sources as Xiaogang is. As such, the concentrations of the three VOCs that constitute Factor 1 are not abnormally high. To sum up, this cluster accounts for 10.1% of the total data entries (1457 out of 14,400 entries). Although pollution from energy sources is the main source of photochemical pollution for this cluster, its total variance is the lowest among the three factors.
Its impact on photochemical pollution is not as significant as that of the other two factors, either. This cluster can thus be classified as "the cluster of moderate pollution from stationary sources and energy sources".
(4) Cluster 4
As shown in Figure 2, Cluster 4 has the highest score on the factor of pollution from mobile sources, as well as the second-highest scores on the factors of pollution from stationary sources and energy sources. Table 4 shows that the concentrations of TU, TB, and IB that constitute the factor of pollution from mobile sources can reach 112.4 ppb, 79.3 ppb, and 80.5 ppb, respectively, all of which are the highest among the five clusters. In addition to the factor of pollution from mobile sources, those of pollution from stationary sources and energy sources have an impact on this cluster, where the concentrations of VOCs are relatively high. The concentrations of VOCs in this cluster are distributed across Xiaogang due to the fact that it is within the Kaohsiung metropolitan area. There are fewer days that the concentration of ozone fails to comply with the air quality standards than those with an acceptable concentration of total suspended particulates (TSP) detected in the study area [26,27]. In the Kaohsiung metropolitan area, there is a high ratio of using motor vehicles for commuting or travel purposes during holidays. The concentrations of TU, TB, and IB are higher among the pollutants emitted from mobile sources. Moreover, there is a high number of heavy industrial areas near Xiaogang, and Taiwan's largest international port, the Port of Kaohsiung, is in Kaohsiung. There are large vehicles traveling to and from said industrial areas and the Port of Kaohsiung at the usual times. The VOCs emitted from mobile sources and stationary sources can easily have photochemical reactions. They will form hydroxyl radicals after they are oxidized under ultraviolet irradiation in the atmosphere. Following that, they will react with nitric oxide (NO) to give rise to high concentrations of ozone and peroxyacetyl nitrates (PANs) [21].
In addition to pollution from mobile sources and stationary sources, pollution from energy sources, such as the PP, NB, and NC emitted from the steel mills, power plants, and oil refineries in Xiaogang, also has a significant impact on this cluster. This cluster accounts for 7.1% of the total data entries (1024 out of 14,400 entries). Most of the VOC concentrations in this cluster are detected in autumn and winter. It can thus be classified as "the cluster of moderate to severe pollution from mobile sources".
(5) Cluster 5
As shown in Figure 2, Cluster 5 has the highest score on the factor of pollution from stationary sources, as well as the second-highest score on the factor of pollution from mobile sources, among the five clusters. Table 4 shows that the concentrations of IP, AT, MH, and EA, which constitute the factor of pollution from stationary sources, can reach 42.1 ppb, 50.5 ppb, 43.0 ppb, and 60.8 ppb, respectively, all of which are the highest among the five clusters. As mentioned for Cluster 4, Xiaogang in Kaohsiung has the highest number of heavy industrial areas and stationary sources in Taiwan. Coupled with the factor of pollution from mobile sources, which has the most significant impact on photochemical pollution, the concentrations of VOCs (especially those of aromatic hydrocarbons) from stationary sources detected in Xiaogang are generally higher than those detected in other areas. Moreover, there are two naphtha cracking plants and many stationary sources in Linyuan, which is located south of Xiaogang. The discussion of Cluster 3 covered the impact of the northeast monsoons on the VOCs detected in autumn and winter: these VOCs are transported southward toward Pingtung by the northeast monsoons [27]. However, the atmosphere in Linyuan is also affected by industrial production and pollution from mobile sources, and higher concentrations of VOCs are often detected there in autumn and winter.
This cluster accounts for 8.4% of the total data entries (1208 out of 14,400 entries), a proportion close to that of Cluster 4. To summarize, the primary driver of photochemical reactions in this cluster is the factor of pollution from stationary sources. This cluster can thus be classified as "the cluster of moderate to severe pollution from stationary sources".
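The cluster shares quoted for Clusters 4 and 5 can be checked with a few lines of arithmetic. The sketch below uses only the entry counts stated in the text; the variable and function names are illustrative and are not taken from the study.

```python
# Entry counts as stated in the text for Clusters 4 and 5.
TOTAL_ENTRIES = 14_400
cluster_counts = {"Cluster 4": 1024, "Cluster 5": 1208}


def share_percent(count, total=TOTAL_ENTRIES):
    """Share of the total data entries, in percent."""
    return 100.0 * count / total


# Round to one decimal place, matching the precision used in the text.
shares = {name: round(share_percent(n), 1) for name, n in cluster_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}% of {TOTAL_ENTRIES} entries")
```

Rounded to one decimal place, the shares come out at 7.1% and 8.4%, in agreement with the figures reported above.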

Conclusions
This study assessed the concentrations of 10 VOCs detected at four photochemical assessment monitoring stations that are part of the air quality monitoring network in the study area in southern Taiwan. Factor and cluster analyses, both multivariate statistical techniques, were used to explain the significant factors affecting ozone pollution levels and the mechanism of ozone formation in the study area. These influencing factors were then classified into clusters that represent photochemical reactions of different characteristics and different photochemical pollution levels in communities within the study area. Ten VOCs with relatively high concentrations were selected from the 54 ozone precursors as the target pollutants. The results of the factor analysis indicated three main factors affecting photochemical pollutants in the study area: pollution from mobile sources, pollution from stationary sources, and pollution from energy sources. K-means clustering was adopted for the cluster analysis, and several candidate numbers of clusters were tested. The conditions and levels of photochemical pollution in the study area can be classified into five clusters: "the general photochemical pollution cluster", "the mild photochemical pollution cluster", "the cluster of moderate pollution from stationary sources and energy sources", "the cluster of moderate to severe pollution from mobile sources", and "the cluster of moderate to severe pollution from stationary sources". Among the five clusters, Cluster 4 is most severely affected by the three VOCs that constitute the factor of pollution from mobile sources, and thus it reflects the most significant level of photochemical reactions. Data on the 10 VOCs were collected over a period of one year and nine months, because factor analysis results are more stable when more types and samples of pollutants are included; the long monitoring period thereby improves the validity and reliability of the factor analysis results.
The findings of this study provide a classification of the characteristics of photochemical pollution and clarify the mechanism by which photochemical pollutants form, both of which can serve as references for environmental agencies when implementing air quality management strategies.
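To illustrate the clustering step summarized above, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to three-dimensional factor scores, with each resulting cluster named after its dominant factor. It is not the authors' implementation. All scores are synthetic and generated only for illustration, the example uses three clusters rather than the five reported in the study, and the farthest-first seeding as well as every function and variable name are assumptions introduced here.

```python
import random

# Factor names follow the text; every number below is synthetic.
FACTORS = ("mobile", "stationary", "energy")


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point whose distance to its
    nearest already-chosen centroid is largest."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(sq_dist(p, c) for c in chosen)))
    return chosen


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centroids[c]
            )
        if updated == centroids:
            break
        centroids = updated
    # Final assignment so that labels match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def make_synthetic_scores(per_group=30, seed=0):
    """Made-up factor scores: each group is dominated by one factor."""
    rng = random.Random(seed)
    points = []
    for dominant in range(len(FACTORS)):
        for _ in range(per_group):
            points.append(tuple(
                rng.gauss(5.0 if j == dominant else 0.0, 0.2)
                for j in range(len(FACTORS))
            ))
    return points


scores = make_synthetic_scores()
centroids, labels = kmeans(scores, k=3)

# Name each cluster after the factor with the highest mean score.
cluster_names = [
    FACTORS[max(range(len(FACTORS)), key=lambda j: c[j])] for c in centroids
]
cluster_sizes = [labels.count(c) for c in range(len(centroids))]

for name, size in zip(cluster_names, cluster_sizes):
    print(f"dominated by {name} sources: {size} of {len(scores)} synthetic entries")
```

With real monitoring data, the input would be the factor scores of each hourly record rather than synthetic points, and the number of clusters would be chosen by comparing several candidate values, as described in the conclusions.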