Source characterization of volatile organic compounds affecting the air quality in a coastal urban area of South Texas.

Selected volatile organic compounds (VOC) emitted from various anthropogenic sources, including industries and motor vehicles, act as primary precursors of ozone, while some VOC are classified as air toxic compounds. Large VOC emission sources affect the air quality in Corpus Christi, Texas. This urban area is located in a semi-arid region of South Texas and is home to several large petrochemical refineries and industrial facilities along a busy ship channel. The Texas Commission on Environmental Quality (TCEQ) has set up two continuous ambient monitoring stations (CAMS 633 and 634) along the ship channel to monitor VOC concentrations in the urban atmosphere. The hourly concentrations of 46 VOC were acquired from TCEQ for a comprehensive source apportionment study. The primary objective of this study was to identify and quantify the sources affecting the ambient air quality within this urban airshed. Principal Component Analysis/Absolute Principal Component Scores (PCA/APCS) was applied to the dataset. PCA identified five possible sources accounting for 69% of the total variance in the VOC levels measured at CAMS 633 and six possible sources accounting for 75% of the total variance at CAMS 634. APCS identified natural gas emissions as the major source contributor at CAMS 633, accounting for 70% of the measured VOC concentrations. The other major sources identified at CAMS 633 were flare emissions (12%), fugitive gasoline emissions (9%), refinery operations (7%), and vehicle exhaust (2%). At CAMS 634, natural gas sources were identified as the major source category, contributing 31% of the observed VOC. The other sources affecting this site were refinery operations (24%), flare emissions (22%), secondary industrial processes (12%), fugitive gasoline emissions (8%), and vehicle exhaust (3%).


Introduction
Volatile organic compounds (VOC) emitted from anthropogenic and natural sources are of great interest from the standpoint of urban air quality. Based on epidemiological studies, the United States Environmental Protection Agency (EPA) has recognized some VOC as air toxics or carcinogens [1,2]. In the presence of sunlight and oxides of nitrogen, VOC undergo complex photochemical reactions to form ozone, a key component of urban smog. There are many sources of VOC in an urban environment. Identification of all possible sources is vital for the design and implementation of effective air emission control strategies and for assessing the impacts on human health. Numerous studies have been conducted in the United States, Europe and Asia for the source apportionment of VOC in the ambient air. Source receptor models such as Chemical Mass Balance (CMB), Positive Matrix Factorization (PMF), UNMIX, and Principal Component Analysis/Absolute Principal Component Scores (PCA/APCS) have been used in these studies.
Srivastava applied CMB 8.0 for the source apportionment of VOC measured in the city of Mumbai, India, and identified evaporative emissions as predominant [3]. In a similar source apportionment study conducted by Srivastava et al. for the city of Delhi, India, using a CMB approach with multiple linear least squares regression analysis, emissions from diesel internal combustion were found to be among the predominant sources [4].
In a study conducted by Guo et al., PCA/APCS was employed as a factor analysis technique and identified vehicular emissions and biofuel burning as the major sources of VOC affecting eastern China [5]. Jorquera et al. applied the receptor models UNMIX and PMF in source apportionment studies conducted in Santiago, Chile, and identified fuel evaporation and gasoline exhaust as the predominant sources [6]. In a similar source apportionment study conducted by Brown et al. for the Los Angeles area, evaporative emissions were identified as the major contributors [7]. Similar studies were conducted for monitored data from industrialized urban areas of Texas. Houston is a major industrialized urban area in Texas with several petrochemical refineries located along its ship channel. The ozone level in this region is above the National Ambient Air Quality Standards (NAAQS), and hence the region is designated as a non-attainment area by the EPA. Because VOC are a class of ozone precursors that affect Houston, Buzcu et al. conducted a source apportionment study of the measured VOC using PMF [8]. Their findings suggest that emissions from refineries, petrochemical operations and evaporative emissions are the major sources of VOC along the ship channel in Houston, Texas.
Corpus Christi is a growing industrialized urban area located along the Gulf of Mexico in South Texas and it is currently in attainment of the national ambient air quality standards for ozone. However, there has been a steady growth in activities related to mobile sources and industries over the past several years. These sources contribute to the overall air quality within the region. Thus, the primary focus of this study was to identify and quantify the various sources contributing to the VOC levels within the urban airshed. This is among the first source apportionment studies for VOC ever conducted for the Corpus Christi region. In this study, PCA, which is a basic factor analysis technique, was applied to identify the possible sources affecting this region. APCS with multiple linear regression was further employed to quantify the identified sources and evaluate the net impact of source contribution.

Study Area and Data
Corpus Christi is the eighth largest consolidated metropolitan statistical area located in South Texas and is a major tourist attraction. It is located on the coast and is home to the fifth largest port in the U.S. Over a dozen petrochemical industries and refineries are located along the ship channel.
The Center for Energy and Environmental Resources at the University of Texas currently operates and maintains automated gas chromatographs (auto-GCs) at two Continuous Ambient Monitoring Stations (CAMS), 633 (27° 49' 45" North, 97° 32' 32" West) and 634 (27° 47' 56" North, 97° 26' 02" West), located along the Corpus Christi ship channel. Fig. 1 shows the physical location of the monitoring sites within the Corpus Christi urban airshed. The two CAMS sites are located within the industrial cluster and are heavily influenced by local industrial and residential sources along with mobile sources. Several major transportation arteries and roads are also located near the monitoring sites. The auto-GCs collect hourly samples and analyze them for 67 VOC. However, data are reported for only the 46 detectable VOC. The project is funded by the Texas Commission on Environmental Quality (TCEQ) and the data it generates are provided online for public access. Hourly concentrations of the 46 VOC measured during the study period of March 2005 to December 2006 were acquired from the TCEQ's website for this source characterization analysis.

Methodology
Receptor models such as UNMIX, PMF, and PCA have been used for source apportionment studies of air pollutants. Principal Component Analysis (PCA) and Absolute Principal Component Scores (APCS) are the primary factor analysis and source apportionment techniques employed in this study.

Principal Component Analysis/Absolute Principal Component Scores (PCA/APCS)
PCA is a factor analysis technique that uses eigenvalue decomposition to apportion data sets and is widely used for source apportionment studies. It is a data reduction tool for large datasets, in which correlated data are reduced to a small number of independent factors, or principal components, that can explain the variance in the data. Rotational strategies are used to change the axes without changing the relative locations of the variables in space in order to obtain a clearer pattern of factor loadings. Accordingly, variance maximizing ("varimax") rotation is widely used in atmospheric data analysis to identify the principal components extracted [5,9]. In this study, varimax rotation was applied to identify the principal components with a clear pattern of factor loadings. Absolute principal component scores (APCS) were used for further quantification of the sources identified through PCA. The process begins by standardizing the data using the following equation.
$$Z_{ik} = \frac{E_{ik} - \bar{E}_i}{\sigma_i} \tag{1}$$
where i=1,…,m; k=1,…,n. $Z_{ik}$ is the standardized concentration, $E_{ik}$ is the concentration of the i th species in the k th observation, $\bar{E}_i$ is the arithmetic mean concentration of species i and $\sigma_i$ is its standard deviation. The PCA operation is described by the following equation.
$$Z_{ik} = \sum_{j=1}^{p} G_{ij} H_{jk} \tag{2}$$
where i=1,…,m; k=1,…,n; and p is the number of retained sources. $G_{ij}$ is the factor loading of the i th compound on the j th source, and is determined from the eigenvector decomposition matrix. $H_{jk}$ is the factor score of the j th source for the k th observation. Once the factor loadings, G, are determined, Eq. 2 is inverted and the factor scores, H, are calculated.
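As an illustration only, the standardization, factor extraction, varimax rotation and inversion for the factor scores described above can be sketched with NumPy. The function names, the synthetic two-source data and the use of the population standard deviation are assumptions made for this sketch; the study does not state which software or conventions were used, and the rotation shown is the common SVD-based iteration for the varimax criterion.

```python
import numpy as np

def standardize(E):
    """Standardize each species (row) of E, shape (m species, n observations)."""
    mean = E.mean(axis=1, keepdims=True)
    std = E.std(axis=1, keepdims=True)  # population standard deviation (assumption)
    return (E - mean) / std

def pca_loadings(Z, n_factors):
    """Return unrotated loadings G (m x p) and all eigenvalues, largest first."""
    corr = (Z @ Z.T) / Z.shape[1]            # correlation matrix of the species
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    G = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return G, eigvals

def varimax(G, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of the loadings (SVD-based iteration)."""
    m, p = G.shape
    R = np.eye(p)
    criterion = 0.0
    for _ in range(max_iter):
        L = G @ R
        target = L**3 - L @ np.diag((L**2).sum(axis=0)) / m
        u, s, vt = np.linalg.svd(G.T @ target)
        R = u @ vt
        if criterion > 0 and s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return G @ R

def factor_scores(G, Z):
    """Invert Z = G H in the least-squares sense: H = (G^T G)^-1 G^T Z."""
    return np.linalg.lstsq(G, Z, rcond=None)[0]

# Synthetic demonstration: 6 hypothetical species driven by 2 hypothetical sources.
rng = np.random.default_rng(0)
sources = rng.lognormal(size=(2, 500))
mixing = rng.random((6, 2))
E = mixing @ sources + 0.05 * rng.random((6, 500))

Z = standardize(E)
G, eigvals = pca_loadings(Z, n_factors=2)
G_rot = varimax(G)
H = factor_scores(G_rot, Z)
print("eigenvalues:", eigvals.round(2))
print("rotated loadings:\n", G_rot.round(2))
```

Because the rotation is orthogonal, the communality of each species (the row sum of squared loadings) is unchanged by it; only the distribution of the loadings across the factors changes.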
These factor scores are based on normalized data, so the true zero was calculated and used to rescore the factors [10]. These rescored factors are known as absolute principal component scores (APCS). The APCS were then used to determine source contribution estimates and source profiles using multiple linear regression analysis (MLRA), as shown in the following equation [9,10,11].
$$M_k = \zeta_0 + \sum_{j=1}^{p} \zeta_j \, \mathrm{APCS}_{jk} \tag{3}$$
where $M_k$ is the measured concentration in sample k and the sum runs over the p identified sources. $\mathrm{APCS}_{jk}$ is the rotated absolute principal component score for the j th source in the k th sample, $\zeta_j$ is the regression coefficient that converts this score into a concentration, so that $\zeta_j \mathrm{APCS}_{jk}$ is the contribution of the j th source to the k th sample. $\zeta_0$ is a constant term for mass contributions from unaccounted sources. Source profiles are then calculated from multiple linear regressions (MLR) between $E_{ik}$ and $\zeta_j \mathrm{APCS}_{jk}$, and these profiles are further used to calculate compound contributions from individual sources.
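The rescoring and regression steps can be sketched as follows. This is a minimal sketch that assumes the usual true-zero construction, in which an artificial sample with zero concentration for every species is standardized and scored, and its scores are subtracted from the factor scores of the real samples. The function names and the synthetic inputs are hypothetical and are not taken from the study.

```python
import numpy as np

def apcs(G, E, H):
    """Convert factor scores H (p x n) into absolute scores via the true zero.

    G: rotated loadings (m x p); E: concentrations (m species x n observations).
    """
    mean = E.mean(axis=1, keepdims=True)
    std = E.std(axis=1, keepdims=True)
    z0 = (0.0 - mean) / std                     # standardized zero-concentration sample
    h0 = np.linalg.lstsq(G, z0, rcond=None)[0]  # factor scores of that sample (p x 1)
    return H - h0

def source_contributions(M, A):
    """Regress measured totals M (n,) on the APCS A (p x n) with an intercept.

    Returns zeta_0, the coefficients zeta (p,), and the contributions (p x n).
    """
    X = np.column_stack([np.ones(A.shape[1]), A.T])
    coef = np.linalg.lstsq(X, M, rcond=None)[0]
    zeta0, zeta = coef[0], coef[1:]
    return zeta0, zeta, zeta[:, None] * A

# Hypothetical inputs: 5 species, 2 sources, 50 hourly samples.
rng = np.random.default_rng(1)
E = rng.random((5, 50)) + 1.0
G = rng.random((5, 2))
Z = (E - E.mean(axis=1, keepdims=True)) / E.std(axis=1, keepdims=True)
H = np.linalg.lstsq(G, Z, rcond=None)[0]

A = apcs(G, E, H)
zeta0, zeta, contrib = source_contributions(E.sum(axis=0), A)
print("mean contribution per source:", contrib.mean(axis=1).round(3))
print("unaccounted (intercept):", round(float(zeta0), 3))
```

Averaging each row of the contribution matrix over time gives a mean concentration per source, which is the kind of quantity a source contribution table reports.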

Results and Discussion
The hourly concentrations of the VOC measured by the auto-GCs for the study period of March 2005 through December 2006 were acquired from TCEQ for this source apportionment analysis. The initial dataset obtained for each of CAMS 633 and CAMS 634 consisted of 16,104 hourly observations. A detailed data check was performed, and hours with incomplete records or missing values were excluded from the initial dataset. The final dataset for CAMS 633 contained 10,762 observations, approximately 67% of the initial dataset. Similarly, the final dataset for CAMS 634 contained 12,343 observations, approximately 77% of the initial dataset.
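The observation counts above can be checked with a few lines of arithmetic. The sketch below assumes that the study period runs from 00:00 on 1 March 2005 to the end of 31 December 2006, and it uses a hypothetical helper that drops any hour with a missing value; the actual screening rules applied in the study are not described in more detail than given above.

```python
from datetime import datetime

import numpy as np

# Hours from 1 March 2005 00:00 up to (not including) 1 January 2007 00:00.
span = datetime(2007, 1, 1) - datetime(2005, 3, 1)
total_hours = span.days * 24

def keep_complete_hours(X):
    """Drop every hourly row of X (hours x species) that contains a NaN."""
    X = np.asarray(X, dtype=float)
    return X[~np.isnan(X).any(axis=1)]

def retention_percent(n_kept, n_total):
    """Share of the initial hourly records that survive the data check."""
    return 100.0 * n_kept / n_total

print(total_hours)                                # 16104
print(retention_percent(10762, total_hours))      # CAMS 633
print(retention_percent(12343, total_hours))      # CAMS 634
```

The 671 days in the assumed period give 16,104 hours, which matches the size of the initial dataset, and the two retention values round to the 67% and 77% quoted in the text.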

Meteorological Characteristics
Wind rose analysis was performed to study the prevailing meteorological conditions during the sampling days.
The hourly wind speed and wind direction measured by weather sensors at both CAMS 633 and CAMS 634, along with the measured total VOC concentrations for the study period of March 2005 to December 2006, were collected from TCEQ for this analysis. Total VOC concentrations and the corresponding meteorological parameters were used to characterize the meteorological conditions prevalent during high VOC days. This also allows for the identification of potential local sources affecting each of the monitoring sites. Fig. 2 and Fig. 3 show the results of the wind rose analysis performed for CAMS 633 and CAMS 634. As shown in Fig. 2, winds from the north and northeast as well as from the south-southeast predominantly affected CAMS 633. Petroleum storage tanks and refineries along with a major interstate highway (IH-37) are located to the north-northeast of the monitoring site, while other localized sources of VOC emissions, including mobile sources from major traffic arteries and natural gas related sources such as pipelines and residential and commercial sources, also affected this site. As shown in Fig. 3, predominant winds from the north and northeast were observed at CAMS 634. This strongly suggests the influence of emissions from refinery operations located north of the monitoring site. In addition, a significant impact of mobile source emissions from IH-37, located immediately to the north, and from Leopard Street, a major roadway located to the south of the monitoring site, was also noticed on these high VOC days.
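A wind rose is essentially a frequency table of wind direction sector against wind speed class. The following minimal sketch builds such a table with NumPy. The choice of 16 sectors, the speed class edges and the sample values are assumptions made for illustration; the text does not state the binning used for Fig. 2 and Fig. 3. Directions follow the meteorological convention (the direction the wind blows from), with sector 0 centred on north.

```python
import numpy as np

def sector_index(wind_dir_deg, n_sectors=16):
    """Map wind directions in degrees to sector numbers, sector 0 centred on north."""
    width = 360.0 / n_sectors
    d = np.asarray(wind_dir_deg, dtype=float)
    return np.floor(((d + width / 2.0) % 360.0) / width).astype(int)

def wind_rose_table(wind_dir_deg, wind_speed, speed_edges=(2.0, 5.0), n_sectors=16):
    """Relative frequency of each (direction sector, speed class) pair."""
    sector = sector_index(wind_dir_deg, n_sectors)
    speed_class = np.digitize(np.asarray(wind_speed, dtype=float), speed_edges)
    table = np.zeros((n_sectors, len(speed_edges) + 1))
    np.add.at(table, (sector, speed_class), 1.0)  # count hours in each cell
    return table / table.sum()

# Hypothetical hourly records (direction in degrees, speed in m/s).
directions = [0, 10, 350, 90, 180]
speeds = [1.0, 3.0, 6.0, 3.0, 1.0]
table = wind_rose_table(directions, speeds)
print(table.sum(axis=1).round(2))  # share of hours per direction sector
```

Restricting the input to hours above a chosen total VOC threshold before building the table gives the wind pattern associated with high VOC conditions.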

Principal Component Analysis (PCA)
Varimax rotated factor analysis was applied to each dataset, and only factor loadings greater than 0.4 were considered when identifying the principal components. The principal components with eigenvalues greater than 1 were considered for further analysis. Based on the factor loadings and eigenvalues, five and six possible sources were identified as contributing to the measured values at CAMS 633 and 634, respectively. The five sources identified at CAMS 633 explained approximately 69% of the total variance in the VOC data. Similarly, the six sources identified at CAMS 634 explained over 75% of the total variance. The eigenvalues and percentage variance explained by each source for both sites are shown in Table 1. The factor loadings estimated by the PCA were then used to determine possible sources affecting the Corpus Christi urban airshed.
At CAMS 633, PCA revealed nine possible sources based on the eigenvalues; however, only five of them could be resolved. Table 2 shows the factor loadings estimated by the PCA for the five sources. The first principal component (PC) accounted for 40 percent of the total variance. Major contributing species for this PC were similar to those of fugitive gasoline emissions. Fugitive gasoline emissions generally result from evaporation from storage tanks at refineries or gas stations and are characterized by the presence of butanes, pentanes, butenes, pentenes, benzene, xylenes and C5-C9 alkanes [8,9,12,13]. High factor loadings of butenes, pentenes, benzene, xylene and nonane were also observed. In addition, high factor loadings were observed for several octane isomers, which are used in gasoline to increase the octane rating of the fuel blend. The most likely sources include a large tank farm lying northeast of the sampling station and fuel evaporation from parked vehicles near a meat processing plant located across the street to the southwest. A detailed conditional probability frequency analysis of the concentrations along with the wind data validated this finding.
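The text does not define the conditional probability analysis it refers to. One common form, assumed here, is the conditional probability function, which for each wind sector divides the number of hours with a concentration above a threshold by the total number of hours with wind from that sector. The sketch below follows that assumed definition; the threshold quantile, sector count and sample data are hypothetical.

```python
import numpy as np

def conditional_probability(wind_dir_deg, conc, n_sectors=16, quantile=0.75):
    """Per-sector fraction of hours whose concentration exceeds the chosen quantile.

    Sectors with no observations are returned as NaN.
    """
    d = np.asarray(wind_dir_deg, dtype=float)
    c = np.asarray(conc, dtype=float)
    width = 360.0 / n_sectors
    sector = np.floor(((d + width / 2.0) % 360.0) / width).astype(int)
    high = c > np.quantile(c, quantile)
    n_theta = np.bincount(sector, minlength=n_sectors)        # all hours per sector
    m_theta = np.bincount(sector[high], minlength=n_sectors)  # high hours per sector
    return np.where(n_theta > 0, m_theta / np.maximum(n_theta, 1), np.nan)

# Hypothetical example: high concentrations arrive only with easterly winds.
directions = [0, 0, 0, 0, 90, 90, 90, 90]
concentrations = [1, 2, 3, 4, 10, 20, 30, 40]
print(conditional_probability(directions, concentrations, n_sectors=4, quantile=0.5))
```

A sector with a value close to 1 points toward the direction of a likely source, which is how such an analysis can support an attribution like the tank farm to the northeast.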
The second PC accounted for 12.5 percent of the variance. High factor loadings of alkanes such as ethane, propane, isobutane, n-butane, isopentane and n-pentane, along with 2,2-dimethylbutane, methylcyclopentane and cyclohexane, were observed; hence the source was associated with natural gas. The refining operations located directly to the northeast and the petrochemical facility located to the southwest were identified as the possible sources. The petrochemical plant to the southwest potentially contributed to the lighter hydrocarbons measured at the monitoring sites.
The third PC accounted for 6.5 percent of the variance. Based on the factor loadings, the key species identified included 1,3-butadiene, benzene, and toluene. Ethylene, acetylene, and 1,3-butadiene are key components of exhaust from internal combustion in vehicular sources. Benzene, toluene and the xylene congeners are some of the common species present in vehicle exhaust. Hence the third PC was classified as vehicle exhaust. CAMS 633 is located near an interstate highway (IH-37), which is aligned from the northwest to the southeast and is located due north of the sampling station. Leopard Street, one of the major traffic arteries, is also located adjacent to the monitoring site. Emissions from vehicles passing on IH-37 and Leopard Street are the major sources affecting CAMS 633. The vehicles at the meat processing plant to the south could also have contributed to this source profile.
Refinery operations were identified as the fourth PC, accounting for approximately 5.5 percent of the total variance. Species with significant factor loadings for this PC included n-hexane, n-heptane and methylcyclohexane. The fifth source profile accounted for 4.8 percent of the variance and was associated with industrial flare emissions. The species with significant factor loadings in this PC were n-nonane, 1,3,5-trimethylbenzene and n-decane. Several flares are located to the east and northeast of CAMS 633, and the petrochemical plant to the southwest also has several large flares. Overall, nearly 70 percent of the variance was explained by the top five source profiles at this monitoring site.
Six possible sources affecting the ambient air quality were identified at CAMS 634. The source profiles were similar to those identified at CAMS 633. Table 3 shows the factor loadings estimated by the principal component analysis. The first PC was identified as refinery operations, which explained about 48.5 percent of the total variance. The factor loadings for refinery operations at CAMS 634 were similar to those at CAMS 633 and were characterized by the presence of alkanes and BTEX compounds along with methylcyclopentane, 2-methylhexane, 2,3-dimethylpentane, 3-methylhexane, methylcyclohexane, 2-methylheptane, and 3-methylheptane. Factor loadings for hexane, heptane and methylcyclohexane were equal to or above 0.85 for both CAMS sites.
The second PC for CAMS 634 was identified as flare emissions and accounted for approximately 10.6 percent of the total variance. This source profile was characterized by the presence of butanes, pentenes, propylene, n-octane, n-nonane and n-decane. A similar signature was observed for flare emissions at CAMS 633. The third PC was classified as vehicular exhaust and was similar to the profile identified at CAMS 633. It accounted for about 6.3 percent of the total variance and was characterized by high factor loadings of 1,3-butadiene. The site is located between three major traffic arteries: IH-37 to the north, Leopard Street to the south and Up River Road to the east. Hence the emissions from mobile sources on these arteries are major contributors impacting this site. The fourth profile, characterized by the presence of cyclopentane, isopentane, n-pentane and 2,2-dimethylbutane, accounted for about 4.2 percent of the total variance. Secondary industrial facilities such as paint and body shops, metal fabricators and processing units are located southwest and southeast of the monitoring site, and their localized emissions are probable sources contributing to this profile. Most of these small industries are affiliated with major refineries and petrochemical industries in Corpus Christi, and these relatively small facilities are located closer to CAMS 634. The fifth PC for CAMS 634 was characterized by 1,3,5-trimethylbenzene, n-decane and 1,2,3-trimethylbenzene, and it accounted for about 3.1 percent of the total variance. Based on the factor loadings, it was identified as fugitive gasoline emissions.
Natural gas source emissions characterized the sixth PC and accounted for 2.8 percent of the variance. The predominant species of this profile were ethane, ethylene, propane, isobutane and n-butane. Overall, slightly over 75 percent of the variance was explained by the top six source profiles at this monitoring site.
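The per-component variance percentages quoted in this section can be summed to confirm the reported totals, and they can be related to the eigenvalue-greater-than-1 retention rule. The sketch below assumes that PCA was run on the correlation matrix of all 46 species, in which case the eigenvalues sum to 46 and an eigenvalue of 1 corresponds to about 2.2% of the variance. That assumption and the helper name are not taken from the study, whose actual eigenvalues are given in Table 1.

```python
# Percent variance per retained component, as quoted in the text.
cams633 = [40.0, 12.5, 6.5, 5.5, 4.8]
cams634 = [48.5, 10.6, 6.3, 4.2, 3.1, 2.8]

N_SPECIES = 46  # assumed number of variables entering the PCA

def eigenvalue_from_percent(percent, n_variables=N_SPECIES):
    """Eigenvalue implied by a percent of variance when eigenvalues sum to n_variables."""
    return percent / 100.0 * n_variables

total_633 = round(sum(cams633), 1)
total_634 = round(sum(cams634), 1)
threshold_percent = 100.0 / N_SPECIES  # percent of variance carried by an eigenvalue of 1

print(total_633, total_634)
print(round(threshold_percent, 2))
print([round(eigenvalue_from_percent(p), 2) for p in cams634])
```

Under this assumption every retained component at both sites lies above the threshold, and the totals agree with the approximately 69% and slightly over 75% stated in the text.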

Absolute Principal Component Scores (APCS)
Absolute principal component scores (APCS) calculated based on the principal component scores identified by the PCA and the true zero principal component were used for further source quantification.
Multiple linear regression analysis was performed for each species using the measured concentrations and the calculated APCS. The regression coefficients were used to estimate the concentration of each VOC contributed by the sources identified by the PCA. The resulting time series of species concentrations were then used to estimate the mean concentrations contributed by each of the identified sources affecting the monitoring sites, as shown in Tables 4 and 5. Based on these mean VOC concentrations computed for CAMS 633, natural gas sources were identified as the dominant source, contributing approximately 70% of the apportioned mass. Flare emissions accounted for nearly 12% of the concentrations and were the second largest contributor of VOC. The other sources included fugitive gasoline emissions (9%), refinery operations (7%) and vehicle exhaust (2%); these are highlighted in Fig. 4. Emissions from natural gas sources, including pipelines and production units, were identified as the largest contributor to the measured VOC concentrations at CAMS 634, accounting for nearly 31%. Emissions from refinery operations contributed nearly 24%, followed by flare emissions (22%). The other sources included secondary industrial processes (12%), fugitive gasoline emissions (8%) and vehicle exhaust (3%); these are highlighted in Fig. 5. To evaluate the performance of the PCA/APCS model, a detailed correlation analysis between the measured total VOC concentrations and the model predicted concentrations was performed. The coefficient of determination (R²) estimated by this analysis was 0.94 for CAMS 633 and 0.88 for CAMS 634, indicating statistically sound model predictions.
The uncertainty of the model predictions was quantified for each species at the monitoring sites. The model predictions for alkanes were robust, while some of the alkenes and alkynes were slightly underpredicted. Overall, the model performance satisfied most qualitative and statistical checks used in source apportionment studies.
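Two small calculations underlie the figures in this section: converting mean source contributions into percentage shares, and computing R² between measured and predicted totals. The sketch below illustrates both. The helper names and the numeric inputs to the helpers are hypothetical, since the underlying values of Tables 4 and 5 are not reproduced here, and R² is computed as the squared Pearson correlation, which is an assumption about the statistic used.

```python
import numpy as np

def percent_shares(mean_contributions):
    """Convert mean concentrations per source into percentages of their sum."""
    values = np.asarray(list(mean_contributions.values()), dtype=float)
    shares = 100.0 * values / values.sum()
    return dict(zip(mean_contributions, shares))

def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted totals."""
    r = np.corrcoef(measured, predicted)[0, 1]
    return r ** 2

# Shares reported in the text (percent); each set should close to 100.
reported_633 = {"natural gas": 70, "flare": 12, "fugitive gasoline": 9,
                "refinery": 7, "vehicle exhaust": 2}
reported_634 = {"natural gas": 31, "refinery": 24, "flare": 22,
                "secondary industrial": 12, "fugitive gasoline": 8,
                "vehicle exhaust": 3}
print(sum(reported_633.values()), sum(reported_634.values()))

# Hypothetical mean contributions (arbitrary units) and totals.
print(percent_shares({"source A": 1.0, "source B": 3.0}))
print(r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.9]))
```

Both sets of reported shares sum to 100, which is consistent with the percentages being expressed relative to the apportioned mass rather than to the total measured mass including unaccounted sources.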

Conclusion
Source apportionment analysis was conducted using hourly VOC concentrations measured at CAMS 633 and 634 from March 2005 through December 2006. Only varimax rotated factor loadings greater than 0.4 were considered in this study. PCA identified five major sources affecting CAMS 633, and these explained nearly 70% of the total variance. The major source affecting CAMS 633 was found to be emissions from natural gas sources, including pipelines and production units, which accounted for nearly 70% of the measured VOC levels. The other predominant sources affecting CAMS 633 were flare emissions (12%), fugitive gasoline emissions (9%), refinery operations (7%), and vehicle exhaust (2%), based on the measured VOC levels. At CAMS 634, PCA identified nine possible sources, of which six were resolved. Approximately 75% of the total variance was explained at this site. The major source category impacting the measured VOC concentrations at this site was emissions from natural gas production, which accounted for approximately 31%, noticeably smaller than the share observed at CAMS 633. The other sources identified included refinery operations (24%), flare emissions (22%), secondary industrial processes (12%), fugitive gasoline emissions (8%) and vehicular exhaust (3%).
Overall, the source apportionment analysis confirmed that VOC from industrial and mobile sources were major contributors at both monitoring sites within the Corpus Christi urban airshed. Source apportionment analysis of VOC provides a reasonable estimate of the major source categories that affect the ambient air quality in an urban airshed, and this approach can serve as an effective tool for air quality planners and decision makers in assessing the environmental health related impact of hydrocarbon compounds in the ambient atmosphere.