Evaluation of PM2.5 Particulate Matter and Noise Pollution in Tikrit University Based on GIS and Statistical Modeling

In this paper, we assess the extent of environmental pollution in terms of PM2.5 particulate matter and noise at Tikrit University, located in Tikrit City, Iraq. Geographic information systems (GIS) technology was used for data analysis. We built two multiple linear regression models (based on two different data inputs) for the prediction of PM2.5 particulate matter, using maximum and minimum noise, temperature, and humidity as explanatory variables. The maximum coefficient of determination (R²) of the best model was 0.82, with a validated (via testing data) R² of 0.94. While the actual PM2.5 values ranged from 35–58 μg/m³, our best model predicted values between 34.9–60.6 μg/m³. The overall air quality was determined to be between moderate and harmful. In addition, the overall detected noise ranged from 49.30–85.79 dB, which places the study area in the noisy-zone category despite it being an educational institution.


Introduction
Air pollution is one of the major issues plaguing the world today, and it correlates strongly with large-scale industrialization, in addition to the already existing particulate matter pollutants [1][2][3]. Air pollution varies in composition, but most of it contains PM2.5 and PM10 particulate matter. PM2.5 in particular (particles with a diameter of 2.5 µm or less) has been shown to be harmful to humans in several epidemiological studies [3][4][5][6]. Because of this, experts emphasize PM2.5 when performing air quality monitoring.
In general, particulate concentration measurements are done by dedicated monitoring stations that are geographically dispersed [7][8][9]. Jumaah et al. [10] found such dispersions to be problematic as insufficient samples might be collected to come up with any meaningful analysis. This is supported by Zhao et al. [11] and Hsu et al. [12], where both assert the importance of more comprehensive area coverage to allow reliable and continuous sampling.
In addition to location, the level of a particular pollutant over time, such as PM2.5, is highly influenced by atmospheric factors and noise. Studying the influence of such variables on particulate matter, or their correlation with it, can provide insights for better pollution monitoring [13]. The work in [14], for instance, asserts that the correlation between particulate matter and traffic noise should be examined for better air pollution insights. This is supported by [15], which demonstrated how various air contaminants along with noise led to worsened air quality. In addition to traffic (the main cause of noise pollution), weather can be an influential periodic factor that produces seasonal variations in noise [16].

Noise Pollution Mapping
Chandrappa and Das [17] define noise as undesired sound. As of 2016, the World Health Organization (WHO) puts noise contamination as the third greatest environmental contamination after air and water [18]. Studies have shown the adverse impacts of noise on human health, such as being destructive to human hearing [19]. Therefore, it is sensible to reduce (or even eliminate) unnecessary noise for the overall well-being of humanity [20,21].
Advancements in noise pollution mapping have facilitated noise pollution assessment. Since such maps display the spatial distribution of noise, one can map noise at different times of the day, such as in the morning, mid-day, evening, and night [22]. This allows researchers and authorities to evaluate locations with possible noise pollution and decide on necessary actions. One practical application is the analysis and assessment of traffic noise contamination [23]. In other research, noise mapping is performed by interpolating data from different monitoring stations, where noise pollution can then be assessed through equivalent sound pressure data [24]. An example of such work was done by Harman et al. [25], where the noise map of Isparta city was generated using inverse distance weighted (IDW), Kriging, and multiquadric interpolation methods with various parameters. Location-wise noise was then assessed against national environmental noise thresholds. The most widely applied methodology has been to gather as much information as possible on the features of sample points in a particular geographic region, and then to predict the value at an unobserved point from the values at known points through spatial interpolation [26].

Geographical Information Systems (GIS)
Environmental modeling has a long history of development and has many applications in ecology-related problems [27]. Urban ecological problems involve the study of wide areas, which benefits from geographical information systems (GISs) [28]. GISs are able to incorporate various information sources, allowing data interpretation through various modeling and visualization techniques [29]. Hence, GISs can be considered decision support systems that help the relevant authorities perform assessments and make decisions [30][31][32][33]. The use of spatial modeling and statistics has risen in recent years [34]. Multiple new techniques for statistical assessment and pattern modeling have recently been developed [35]. For example, modeling air quality is helpful in controlling air pollution problems [36].
In this study, we evaluate air quality and determine noise distributions in silent zones inside Tikrit University, Tikrit, Iraq. Building upon GIS techniques, we apply the least-squares model to investigate the impact of noise and meteorological factors on air pollution and PM2.5 prediction. Specifically, the correlation of noise and climatic parameters (i.e., the independent variables) with PM2.5 is examined in a multivariate regression model for estimating the PM2.5 particulate quantity. Our data consist of daily manual measurements around Tikrit University (during July 2019) as well as NASA satellite remotely sensed data of PM2.5. The month of July is characterized by high temperatures throughout Iraq, and the university is devoid of students. Therefore, it is possible to determine the extent of noise pollution with the least amount of sound and the extent of the impact of atmospheric factors at peak times. Moreover, we can determine the extent of noise from the university's surroundings. We expect this study to offer insights for the proposal of future air quality management protocols in the study area and the city of Tikrit.

Study Area
Tikrit University is chosen as the study area. It is located in Tikrit city, situated in the Salahuddin province of Iraq, which is 155 km from the capital city of Baghdad [37]. The study area (Figure 1) lies between 43° 38′ 56.4″-43° 39′ 35.5″ E and 34° 40′ 33.3″-34° 41′ 2.4″ N. The main goal of this study is to evaluate the environmental impacts of air and noise pollution occurring inside the university area. Tikrit University is selected due to the fact that it is an important educational region.

Data Acquisition and GIS Techniques
Measurements were taken along the study area, which consist of the following: 1. PM2.5 particulate mass (µg/m³). Source: NASA Worldview data; 2.
The overall amassed dataset was then processed using ArcGIS 10.3. Two methods were used to represent the data distribution, namely IDW and least squares modeling (LSM). Figure 2 shows the details of data acquisition/data types, along with the measuring devices that were used (i.e., mini sound meter and air quality multimeter). In this study, noise levels were measured using the mini sound meter. Fieldwork involved measuring the maximum and minimum noise levels inside the university and surrounding areas, and was conducted throughout July 2019. For each sampling site, noise measurements were continuously taken for 30 days. The data collected from each location were processed for statistical analysis.
The noise pollution data are shown in Figure 2b, which depicts the average values of maximum and minimum noise levels in the silence zone of Tikrit city at various times of day (i.e., 9:00 a.m., 11:00 a.m., 2:30 p.m., and 4:30 p.m.). Weather data (temperature and humidity) were measured by the air quality multimeter. To validate our results, historical data of PM2.5 levels were obtained from Air Matters, a global air quality service provider, and downloaded from https://air-matters.com/ (accessed on 20 August 2021).
To assess and evaluate the influence of noise and air pollution, the IDW interpolation technique was employed. This method was chosen due to its suitability for flat terrain where the variables are spatially uniform. As IDW is a distance-weighted interpolation technique, each known point is assumed to affect the magnitude of unknown points, with nearer points contributing more. Therefore, the values of points near the known points can be calculated [4,38]. Note that the unknown points' values can be deduced, but with the risk of low accuracy, because the values of the neighboring points searched by IDW can vary significantly. Therefore, interpolated point values were collected in small and closely adjacent areas, ensuring higher point distribution accuracy. The IDW estimate of z at location x can be written as Equation (1):

$$z(x) = \frac{\sum_{i} w_i z_i}{\sum_{i} w_i} \quad (1)$$

where $z_i$ denotes the control value for the ith sample point, and $w_i$ is a weight that defines the relative importance of the specific control point $z_i$ in the interpolation process [30]. The IDW analysis is based on the concept of spatial dependence, making it a reliable interpolation process for air pollution status prediction. IDW also measures the ratio of the dependency relationships between adjacent and discrete features and specifies the result of the cell in the segment that requires metadata. After performing IDW, two empirical linear models are applied to the results: the first model is constructed based on 100 points from the field data, whereas the second model is constructed based on remotely sensed PM2.5 data. As for the image properties, moderate resolution imaging spectroradiometer (MODIS) images were downloaded at a 30 m resolution per pixel, since 7 July 2019 with WGS 84 projections.
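As a rough illustration of Equation (1), the following Python sketch estimates a value at an unsampled location from nearby samples. It assumes weights of the form w_i = 1/d_i^p with power p = 2, a common default that the text does not specify. The function name `idw_estimate` and the sample readings are hypothetical; they are neither the study's data nor the ArcGIS implementation.

```python
import math

def idw_estimate(x, y, samples, power=2.0):
    """Estimate z at (x, y) from (xi, yi, zi) samples using inverse-distance weights.

    Assumption: w_i = 1 / d_i**power, where d_i is the Euclidean distance
    from (x, y) to sample i. The power value is an illustrative default.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            # The query point coincides with a sample: return the measured value.
            return zi
        w = 1.0 / d ** power
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical PM2.5 readings (x, y, value in µg/m³), for illustration only.
samples = [(0.0, 0.0, 40.0), (1.0, 0.0, 50.0), (0.0, 1.0, 45.0)]

center = idw_estimate(0.5, 0.5, samples)     # equidistant from all three samples
at_sample = idw_estimate(1.0, 0.0, samples)  # exactly on the second sample
print(round(center, 2), at_sample)
```

Because the estimate is a weighted average with positive weights, it always lies between the smallest and largest sampled values, which is one reason closely spaced samples matter for accuracy.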

Linear Regression
Linear regression is a statistical prediction technique that models the linear relationship between a response variable and one or more other variables (known as explanatory variables). Based on training/observed data, a linear fit (i.e., the linear model) is estimated through an iterative process of parameter/coefficient updates. These parameters are updated to reduce the error between each iteration's predicted value (i.e., hypothesis) and the respective known value in the dataset. Once the overall error is minimized, the linear model is considered converged and ready to be deployed. The model we generated relates PM2.5 to the noise and weather data (i.e., the two groups of explanatory variables) to predict PM2.5 values in the specified region inside the university. This means that the model takes as input real-world values of the explanatory variables (previously unseen by the model during training) to estimate the PM2.5 response. If the purpose is to interpret how changes in the response relate to changes in the explanatory features, linear regression can be used to determine the strength of the correlation between the dependent and the explanatory features. It can also be used to ascertain whether some of the explanatory features have no linear relation with the response at all, or to identify whether any subset of explanatory features includes irrelevant information about the response [39]. The regression model can be written as Equation (2):

$$z_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + \varepsilon \quad (2)$$

where $i$ represents a point location, $z_i$ is the estimated factor at $i$, $x_{1i}, \dots, x_{ki}$ are the explanatory factors at $i$, $\beta_0$ is the intercept term, $\beta_1, \dots, \beta_k$ are the factor coefficients, and $\varepsilon$ is the error term.
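The least-squares fit of the linear model described above can be sketched in dependency-free Python. This sketch solves the normal equations directly instead of iterating, which is a substitute technique chosen for brevity and not necessarily what the GIS software does. The function names and the synthetic humidity, temperature, and minimum-noise values are hypothetical, and the response is generated from known coefficients so the fit can be checked.

```python
def fit_linear_model(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of rows of explanatory values; returns [b0, b1, ..., bk],
    where b0 is the intercept.
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("explanatory variables are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(n):
            if k != col:
                factor = A[k][col] / A[col][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(coeffs, row):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one row of explanatory values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))

# Synthetic rows of (humidity, temperature, minimum noise); NOT the study's data.
X = [[20, 40, 50], [25, 42, 55], [30, 45, 52],
     [22, 47, 60], [28, 41, 65], [35, 44, 58]]
true_coeffs = [10.0, 0.5, 0.2, 0.1]  # made-up intercept and coefficients
y = [predict(true_coeffs, row) for row in X]

coeffs = fit_linear_model(X, y)
print([round(c, 6) for c in coeffs])
```

With noise-free synthetic data the fit recovers the generating coefficients up to rounding. Real measurements would leave a non-zero residual, which is what the error term accounts for.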

Resultant Distribution Maps
We generated the map of the PM2.5 distribution in the study area; the PM2.5 concentration (as an air quality evaluation) was in the range of moderate to unhealthy/harmful. Figure 3a,b shows the PM2.5 distribution maps of Tikrit University, indicating field dataset values between 35.01 and 58 µg/m³. Moreover, the satellite imagery distribution validated this dataset, with values ranging between 38 and 58 µg/m³.
The analyzed noise dataset was subsequently mapped to visualize clusters that exhibit different levels of noise (in dB). Two maps of maximum and minimum noise distributions were produced for Tikrit University, as shown in Figure 4. It is important to note that these maps were produced through field measurements at pre-defined sites. Maximum noise levels ranged from 53.20 to 85.79 dB. In particular, higher maximum noise was observed at location points 48, 76, 77, 78, 79, and 88. Point 48, for example, is near the engineering college laboratories, where loud noise can be due to the electrical generator present on the premises. Points 76 and 77 are near a pharmacy college and a restaurant. Point 78 is located at the sub road between the pharmacy and the science colleges, where a generator set is accompanied by road noise. Points 79 and 88 are near the science colleges. Some collected points are positioned near the abandoned buildings in the university, which might explain why low maximum noise levels were recorded at these locations (i.e., points 41 and 42).
Minimum noise levels were recorded between 49.30 and 75.09 dB. Higher minimum noise levels were recorded at locations 94, 106, 107, and 108. For point 94, the louder minimum noise is due to its location near construction laboratories. Point 106 is on the main road of the university. Point 107 is on the sub-road between the veterinary and science colleges. Point 108 is located near the sports hall and education college lecture rooms. Low minimum noise levels were noticed at points 24, 38, 41, 42, 43, and 44, which were in open space areas and near abandoned buildings.

Generated Regression Maps
The regression results and model performance are shown in Figure 5. This figure presents the PM2.5 prediction maps at Tikrit University. The predicted PM2.5 values ranged between 34.9 and 48.1 µg/m³ based on 100 measured points in the study area. On the other hand, the predicted PM2.5 values based on satellite imagery ranged between 38.8 and 60.6 µg/m³. Table 1 shows the regression statistics and Table 2 shows the modeling synopsis outputs. In Table 1, Multiple R refers to the correlation coefficient of the estimated equation. It is equal to 0.71 and 0.90 for the field dataset and the remotely sensed data, respectively. Previous studies indicate that a good correlation has a Multiple R value of 0.70 or greater [43]. The R square (R²) values are 0.51 and 0.82 for the field dataset and the remotely sensed data, respectively. R² is the proportion of the variance in PM2.5 that is explained by the other parameters in the regression equation. In our study, the remotely sensed data explain more of the variance, where the model relates PM2.5 to the noise and weather data to predict PM2.5 values in the specified region within the university area. The Adjusted R squared is similar to R², but adjusted for the number of predictors in the regression model. The Standard Error is the average estimation error of the model.
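To make the statistics named above concrete, here is a minimal Python sketch that computes them from observed and predicted values. It assumes the conventional definitions: R² = 1 − SS_res/SS_tot, Multiple R as the square root of R² (valid for a least-squares fit with an intercept), and the standard error as sqrt(SS_res/(n − k − 1)). The function name and the toy numbers are hypothetical and are not taken from Table 1.

```python
import math

def regression_stats(y_true, y_pred, k):
    """Summary statistics for a fitted linear model.

    k is the number of explanatory variables (intercept not counted).
    """
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)             # total
    r2 = 1.0 - ss_res / ss_tot
    return {
        "multiple_r": math.sqrt(max(r2, 0.0)),
        "r_squared": r2,
        "adjusted_r_squared": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
        "standard_error": math.sqrt(ss_res / (n - k - 1)),
    }

# Toy observed/predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
stats = regression_stats(observed, predicted, k=1)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, the reported pairs are consistent with each other: 0.71² ≈ 0.50 and 0.90² = 0.81, which match the stated R² values of 0.51 and 0.82 up to rounding of Multiple R.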
One of the synopsis outputs in Table 2 is the p-value. In our work, a variable whose p-value exceeds 0.05 is deemed statistically insignificant and is excluded from the model. As a result, the maximum noise variable is not included in the regression equation for the field dataset, as expressed in Equation (3), but is included in the equation for the remotely sensed data (Equation (4)). Each coefficient isolates the role of its respective variable from all of the other variables. The intercept term is simply the y-intercept where the fitted regression line crosses the y-axis.

$$\mathrm{PM2.5}_{\mathrm{Predicted}} = I + C_H H + C_T T + C_{N_{\min}} N_{\min} \quad (3)$$

$$\mathrm{PM2.5}_{\mathrm{Predicted}} = I + C_H H + C_T T + C_{N_{\max}} N_{\max} + C_{N_{\min}} N_{\min} \quad (4)$$

where $I$ is the intercept term, $C_H$ is the humidity coefficient, $C_T$ is the temperature coefficient, $C_{N_{\max}}$ is the maximum noise coefficient, $C_{N_{\min}}$ is the minimum noise coefficient, $H$ is humidity, $T$ is temperature, $N_{\max}$ is the maximum noise, and $N_{\min}$ is the minimum noise.
Figure 6a,b shows the linear regression models generated using the field and remotely sensed datasets, respectively. For Figure 6a, the resultant R² based on the field dataset is 0.51, meaning that the model describes at most 51% of the variability, which is low. On the other hand, the R² based on the remotely sensed dataset is 0.82, meaning that at most 82% of the variability is described, which also indicates higher predictive power.

Model Testing
We present the model testing results for both datasets in Figure 7. The R² values were 0.91 and 0.94 for the model based on field data and the model based on the remotely sensed points, respectively. The results indicate that both models fit the test data well and fall within the confidence range.

Discussion
Based on both validated models (Figure 5a,b), the predicted PM2.5 levels were between 34.9-48.1 µg/m³ and 38.8-60.6 µg/m³, respectively. This indicates moderate to harmful air quality (as suggested by the model based on field data) and harmful to unsafe air quality (as suggested by the model based on remotely sensed data). Although the two models differed slightly in their predicted values, the air quality appears to be at an unsafe level that falls outside standard safety limits, as mentioned in [44,45].
The model in Figure 5a had a low maximum accuracy (R²) of 51%, obtained by fitting to the field data. However, increasing the number of samples could potentially increase the accuracy. On the other hand, the fit of the model based on remotely sensed data had a maximum accuracy of 82%. This study tried to establish at least a minimal correlation between noise and the PM2.5 prediction.
Geospatial analysis with GIS technology is one of the important and effective methods for determining emissions of pollutants into the air. At the heart of statistical processes, regression and estimation models serve as decision-making tools [46]. This study also looked at noise pollution. Environmental noise at Tikrit University was measured and then compared with the recommended health standards from the WHO. Figure 4 shows that overall maximum noise levels ranged from 53.20 to 85.79 dB, while minimum noise levels ranged from 49.30 to 75.09 dB. Note that the WHO guidelines state that noise pollution occurs when levels are above 65 dB [15]. Moreover, according to [47], silent zones should not exceed 45 dB. Based on these standards, the results clearly show that Tikrit University is categorized as noisy, which is the opposite of what a university should be (i.e., a silent zone). Based on conclusions derived from [48,49], university noise levels should be within 35-45 dB. Our findings are crucial as high noise levels can cause non-auditory impacts on students, lecturers, and others alike [50]. Moreover, this can lead to attention deficits and impaired learning and communication [51]. The overall outcomes on noise pollution in different areas of Tikrit University indicate that noise levels were high across the sampling locations. We posit that this was due to the surrounding activities, the large number of motor vehicles, and the presence of generators.
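The comparison against the two cited thresholds can be expressed as a short Python sketch. The 45 dB and 65 dB limits and the four extreme values come from the text above; the function name and the category labels are hypothetical and chosen only for illustration.

```python
SILENT_ZONE_LIMIT_DB = 45.0  # silent zones should not exceed this level [47]
NOISE_POLLUTION_DB = 65.0    # levels above this count as noise pollution [15]

def classify_noise(level_db):
    """Label a noise level against the two thresholds cited in the text."""
    if level_db <= SILENT_ZONE_LIMIT_DB:
        return "within silent-zone limit"
    if level_db <= NOISE_POLLUTION_DB:
        return "above silent-zone limit"
    return "noise pollution"

# Extremes of the measured ranges reported for Tikrit University (dB).
reported_extremes = {
    "lowest minimum": 49.30,
    "highest minimum": 75.09,
    "lowest maximum": 53.20,
    "highest maximum": 85.79,
}

labels = {name: classify_noise(v) for name, v in reported_extremes.items()}
for name, label in labels.items():
    print(f"{name}: {reported_extremes[name]:.2f} dB -> {label}")
```

Even the lowest reported minimum (49.30 dB) exceeds the 45 dB silent-zone limit, which is why the campus as a whole falls outside the silent-zone category.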

Conclusions
Undoubtedly, air and noise pollution have harmful effects on human health. However, they are unavoidable environmental elements in most urban settings. Researchers are now actively measuring and analyzing both forms of pollution to gain insight into their causes, as well as to propose solutions where possible. Remotely sensed information and distribution maps can be useful tools for such tasks.
In this paper, air pollution (PM2.5 particulate matter) and noise pollution were investigated. Specifically, two multiple linear prediction models were generated based on PM2.5 particulate matter measurements and environmental variables (i.e., humidity, temperature, maximum noise, and minimum noise). This study applied the IDW technique and regression analysis based on field measurements and remotely sensed data. Both trained regression models indicate that PM2.5 and noise pollution are at undesirable levels at Tikrit University, which could lead to negative consequences if not mitigated. Moreover, and worryingly, this study designated the university as a noisy zone, rather than its supposed designation as a silent zone.
In terms of the generated linear models, the model based on remotely sensed data had a higher validation accuracy of 82%, and model testing showed an accuracy of 94%. However, its final prediction values did not differ significantly from those of the field-data model (51% accuracy and 91% testing accuracy).
In conclusion, we believe that mitigation measures can be taken to decrease noise pollution, such as planting more trees on both sides of the roads, maintaining the roads properly, and ensuring that road pavement follows standard specifications. Moreover, the electrical generators can be fitted with silencers to further reduce noise.