Toward a Survey-Based Assessment of Wind Turbine Noise: The Impacts on Wellbeing of Local Residents

Abstract: As a renewable energy source, wind energy harvesting offers a desirable way to address the environmental concerns associated with energy production while meeting the growing global demand. The penetration of wind turbines has grown rapidly over the years; however, turbine noise has correspondingly become a major concern. Recent studies indicate that the noise emitted by operating turbines could increase the risk of nuisance, which might further affect the well-being of local residents. However, the main factors affecting turbine noise assessment, and the extent to which they contribute to the assessment, remain unclear. In this study, a survey-based approach is developed to identify these major factors and to explore the interactions between the factors and the assessment results. Principal component analysis was adopted to extract key factors; reliability assessment, validity analysis, descriptive assessment, and correlation analysis were then conducted to test the robustness of the proposed methodology and to examine the interactions between variables. Finally, regression analysis was employed to measure the impacts of the key factors on the results. The findings indicate that the key factors, namely physical condition, control capacity, and subjective opinions, have a significant impact on residents' responses to wind turbine noise, with subjective opinions contributing predominantly to the assessment results. Further validation also indicates that the proposed approach is robust and can be applied to survey-based assessments in other fields.


Introduction
Given the environmental impacts and sustainability concerns of fossil-based energy production, such as wastes and the associated greenhouse gas (GHG) emissions, wind energy harvesting has been found to be a cleaner and more sustainable way to satisfy the increasing global energy demand at lower marginal operating costs. Wind turbine penetration has therefore grown significantly over the years to promote non-fossil energy production. A survey from the European Union indicated that global wind energy capacity increased by 9.6% to 591 GW, satisfying 14% of the electricity demand in 2018 [1]. As the largest wind power market in the world, China has led wind energy harvesting in both cumulative and newly installed wind power capacity over the past years [2]. According to a study conducted by the International Energy Agency (IEA), wind energy harvesting worldwide could satisfy over 6% of global energy demand by 2023 [3].
Despite its merits as a cleaner, renewable, and sustainable source, concerns over the noise emission associated with wind turbine operation cannot be overlooked. According to the World Health Organization, health is defined as 'a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity'. Therefore, noise annoyance and sleep disturbance resulting from wind turbine operation are widely discussed as a potential health issue [4]. Poulsen et al. found that people over 65 years old are more likely to fill prescriptions for sleep medication when exposed to high levels of wind turbine noise [5]. This is further supported by Abbasi et al., who claimed that noise exposure had a significant effect on the general health, sleep disturbance, and annoyance of people living near wind farms [6]. Michaud et al. indicated that annoyance might increase as noise levels around the wind turbines rise [7]. Onakpoya et al. also demonstrated that the odds of sleep disturbance increased significantly with greater exposure to wind turbine noise [8]. Additionally, Van Renterghem stated that wind turbine noise becomes more annoying when mixed with local road traffic noise [9]. On the other hand, a study funded by the Danish government claimed that there is a lack of direct connection between cardiovascular disease (associated with short-term exposure) and wind turbine noise [5]. Although wind turbine noise might increase the risk of noise nuisance, the identification and validation of interactions between turbine noise exposure and symptoms (e.g., tinnitus, hearing loss, dizziness, and headache) still lack scientific evidence [10].
Over the years, a considerable number of studies have examined the impacts of turbine noise on local residents' health, and indicators including visual effect, self-control of noise, physical condition, attitude, and subjective opinions, as well as sensory acuity and sensitivity, have been defined [7,11,12]. Among these indicators, individual sensitivity to turbine noise is highly controversial: some survey-based studies indicate that people living in higher noise-exposed areas are more sensitive to turbine noise [13,14], while others argue that it is the people living in lower noise-exposed areas who are more sensitive [15,16]. The visibility of wind turbines was also found to be an important factor affecting the assessment of turbine noise. One study demonstrated that the distance from wind farms is the essential visual factor, followed by the color of the wind turbines and then the number of turbines installed [17]. This is further supported by Schäffer et al., who indicated that the annoyance level might increase with the visibility of wind turbines but could be relieved by an increase in landscape visibility [5]. In addition, other factors such as attitude toward turbine deployment (including feedback on noise emission and visual landscaping), as well as the lifestyle preferences and expectations of local residents, also play an important role in noise assessment [18,19]. An existing study revealed that even people from the same local community have different attitudes and responses to turbine noise, owing to diverse lifestyle preferences, demands, and expectations [12].
Therefore, combining noise experiments with questionnaires could contribute to a comprehensive understanding of the impacts on turbine noise response and help identify the indicators that can be employed in the assessment and quantitative analysis of turbine noise effects.
In light of the above, although a considerable number of studies have been conducted to identify indicators for the ecological and sustainable assessment of wind energy harvesting with turbines, there is still a gap in developing methods for the quantitative analysis of turbine noise impacts. In this study, an integrated modeling strategy was developed in which three indicators, namely physical condition, control capacity, and subjective opinions, were employed to support the quantitative analysis of turbine noise. The rest of this paper is organized as follows: the research method and the development of the survey questions are introduced in Section 2, followed by the formulation of indicators in Section 3. Data analysis and discussion are presented in Section 4, and concluding remarks and recommendations for future studies are given in Section 5.

Study Setting and Framework
The survey was conducted with 178 participants recruited by random sampling. Because some reports were submitted with unanswered questions and others were highly similar to one another, 168 of the 178 reports were finally selected for further processing and case studies. As shown in Figure 1, the research was implemented in six interlinked phases. In Phase 1, a background analysis is conducted for an initial understanding of the participants' backgrounds (e.g., age group and knowledge of wind turbine systems and wind energy harvesting); at this stage, personal factors that might affect the assessment are eliminated. Phase 2 identifies the exploratory factors and explores the interactions between the questionnaire items and the factors. Factors are identified using PCA with maximum variance rotation, together with the KMO and Bartlett tests, and the interactions between items and factors are then examined with the SPSSAU tool. Reliability analysis and validity testing are conducted in Phase 3 to validate the reliability of the sample data and to assess whether the research items effectively represent the concepts behind the research variables. Factors that pass the reliability and validity tests enter Phase 4 for a descriptive analysis, which presents the data distribution and identifies outliers and typos. Correlation analysis is employed in Phase 5 to explore the relationships among variables. In the final phase, regression analysis is carried out to examine the quantitative relationships among variables.
Energies 2020, 13, x FOR PEER REVIEW

Questionnaire Development
In this study, a three-step data quality control strategy was applied to extract critical information from the questionnaires. First, at the beginning of the survey, participants were given a brief introduction to the research background and the related requirements. Second, a question and answer session was held to ensure that all participants fully understood the requirements. Finally, the text of the questionnaires was checked for clarity to avoid misunderstanding and confusion. The contents of the valid questionnaires are shown in Appendix A together with the response data from participants. The questionnaire is composed of two sections: the pre-experiment section (Q1 to Q7) was developed to identify the potential critical factors reflecting turbine noise impacts, while the after-experiment section (Q8 to Q13) was designed to examine how turbine noise affects participants' wellbeing. During the survey, participants were required to score the questions to evaluate the levels of impact.
The noise signal used for testing was collected from a large wind farm located in Hunan Province, China, and was recorded during normal operation of the wind turbines. Figure 2 presents the amplitude spectrum of this noise signal.

Background Analysis
Sample size and data quality assurance are critical in survey-based assessments and statistical analysis. A larger sample size generally improves the precision of factor identification and parameter estimation, but it also increases cost and time. In light of this, a sample size of 300 was selected based on a target confidence level of 99% and an effect size (defined as the difference between the sample statistics divided by the standard error) of 0.99. The 300 questionnaire copies were distributed to 257 contributors involved in this study, and 178 responses were finally selected for further processing and analysis after accounting for errors such as incomplete answers, high similarity, and selection bias in responses.
Following the initial selection of samples, the backgrounds of the 178 participants were analyzed, confirming that the participants are mainly between 20 and 25 years old and that none has industrial experience or a professional background in wind energy harvesting.
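The screening of responses mentioned in this section can be illustrated with a small Python sketch. The text does not state how "high similarity" was judged, so the sketch assumes the simplest possible rule (an answer sheet identical to an earlier one is dropped); the function name `screen_responses` and the sample data are hypothetical.

```python
from typing import List, Optional, Sequence, Set, Tuple

# One score per question; None marks a question left unanswered.
Response = Sequence[Optional[int]]


def screen_responses(responses: Sequence[Response]) -> List[Tuple[int, ...]]:
    """Drop incomplete responses and exact repeats of an earlier response.

    Treating "high similarity" as an exact repeat is an assumption made
    for illustration only; the study may have used a different rule.
    """
    kept: List[Tuple[int, ...]] = []
    seen: Set[Tuple[int, ...]] = set()
    for answers in responses:
        if any(a is None for a in answers):
            continue  # incomplete questionnaire
        key = tuple(answers)
        if key in seen:
            continue  # identical to a response that was already kept
        seen.add(key)
        kept.append(key)
    return kept


raw = [
    (3, 2, 4, 1),
    (3, 2, None, 1),  # one unanswered question
    (3, 2, 4, 1),     # repeat of the first response
    (1, 1, 2, 2),
]
screened = screen_responses(raw)
print(screened)
```

With short questionnaires and a four-level scale, identical answer sheets can occur honestly, so a rule of this kind would need to be checked against the actual data before use.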

Exploratory Factor Analysis
Exploratory factor analysis (EFA) is a statistical approach widely used in multivariate statistics to explore the hidden relationships among a large number of measured variables [20]. Q1 to Q7 in the questionnaire were not designed with a prior hypothesis about the factors and patterns of the measured variables; therefore, EFA was employed in this study to explore the interactions between indicators. The current stage of this study focuses on identifying the indicators that might affect the response of local residents to turbine noise, leaving the quantitative assessment of turbine noise to future studies.
In the questionnaire study, the principal component analysis (PCA) method was used to extract key factors from the samples, and the maximum variance (varimax) method was employed for rotation. PCA is widely used to extract low-dimensional features from a high-dimensional data set while retaining trends and patterns [21]. By converting the data set into a limited number of dimensions containing the essential features, this method is considered more robust and desirable for one-way results. The data processor SPSSAU rotates the factor space so that the extracted factors can be matched with interpretable factors. The literature offers a considerable number of methods for factor rotation; however, the maximum variance method is the most popular in questionnaire research [22]. Therefore, SPSSAU with the maximum variance method was used for the exploratory analysis in this research. Tables 1 and 2, combined with Figures 1 and 2, present the results of the exploratory analysis. Table 1 presents the KMO and Bartlett test results, which check whether Q1 to Q7 are suitable for exploratory factor analysis. The KMO value and the p-value of the Bartlett test are the main indicators. The data are considered suitable when the KMO value is over 0.6 and the p-value is less than 0.05; in that case, the hypothesis that the items are uncorrelated is rejected and factor analysis can be applied. As shown in Table 1, the requirements for factor analysis are fully satisfied, with a KMO value of 0.851 and a p-value of approximately 0. Table 2 gives the variance interpretation, which specifies the number of extracted factors, the characteristic root (eigenvalue) of each factor, the ratio of variance interpretation, and the overall ratio of variance interpretation.
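To make the two suitability checks concrete, the following Python sketch computes the KMO measure and the Bartlett sphericity statistic from a correlation matrix using their standard textbook definitions. It is not the SPSSAU implementation; the 3 x 3 correlation matrix is hypothetical, the function names are illustrative, and the sample size of 168 is taken from the number of reports retained in this study. The p-value is not computed here; the statistic would be compared with a chi-square distribution with the returned degrees of freedom.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def invert_with_det(a: Matrix) -> Tuple[Matrix, float]:
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def kmo(corr: Matrix) -> float:
    """Kaiser-Meyer-Olkin measure: correlations versus partial correlations."""
    inv, _ = invert_with_det(corr)
    n = len(corr)
    sum_r2 = sum_a2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_a2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_a2)


def bartlett_statistic(corr: Matrix, n_obs: int) -> Tuple[float, int]:
    """Bartlett sphericity chi-square statistic and its degrees of freedom."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Hypothetical correlation matrix for three items (not data from this study).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
kmo_value = kmo(corr)
chi2, dof = bartlett_statistic(corr, n_obs=168)
print(round(kmo_value, 3), round(chi2, 2), dof)
```

For this toy matrix the KMO value passes the 0.6 threshold and the Bartlett statistic is far above the 5% chi-square critical value for 3 degrees of freedom (7.815), so both checks would point toward factor analysis being applicable.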
Here, the ratio of variance interpretation gives the amount of information represented by a given factor (e.g., 30.234% means that 30.234% of the overall information is reflected by the factor), while the overall ratio of variance interpretation gives the percentage of the total information that can be interpreted by all the factors employed for analysis. It is widely accepted that an overall ratio of no less than 50% is acceptable, while one greater than 60% is desirable. In practice, the initial analysis often starts with a single factor. As shown in Table 3, with a single factor involved in the analysis, all the characteristic root values are greater than 1, demonstrating high inconsistency with the 3 selected factors. In addition, the overall ratio of variance interpretation is 33.699%, which is less than 50%. EFA was therefore repeated with 3 factors, and the results indicated that 94.204% of the total information can be covered and interpreted. Therefore, 3 factors were finally selected as the indicators for this study.
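The arithmetic behind the overall ratio can be checked with a few lines of Python. The three per-factor ratios used here (33.699%, 33.519%, and 26.986%) are the rotated values reported in the validity assessment of this paper; the helper names and the label "insufficient" are illustrative assumptions.

```python
from typing import List, Sequence


def cumulative_ratios(ratios_percent: Sequence[float]) -> List[float]:
    """Running total of per-factor variance interpretation ratios (percent)."""
    total = 0.0
    out: List[float] = []
    for ratio in ratios_percent:
        total += ratio
        out.append(round(total, 3))
    return out


def judge(overall_percent: float) -> str:
    """Apply the 50% / 60% rule of thumb quoted in the text."""
    if overall_percent > 60.0:
        return "desirable"
    if overall_percent >= 50.0:
        return "acceptable"
    return "insufficient"


# Rotated ratios of the three factors as reported in this paper.
three_factor = cumulative_ratios([33.699, 33.519, 26.986])
print(three_factor, judge(three_factor[-1]))
# Single-factor solution, using the 33.699% quoted in the text.
print(judge(33.699))
```

The running total reaches 94.204% with three factors, which matches the overall ratio quoted in the text, while the single-factor value stays below the 50% threshold.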
The scree plot shown in Figure 3 further supports the determination of the number of indicators to be extracted for analysis. As shown in the diagram, the line flattens as the indicator number increases, and the inflection point gives the number of indicators required for a full interpretation of information with the related precision in analysis [23]. In this study, the desirable number of indicators included in assessment and analysis is three.
The factor loading coefficients reflect the strength of the relationship between items and factors and are normalized within the interval [−1, 1]. A higher absolute value means a stronger relationship between the item and the factor. It is widely accepted that 0.4 is the threshold value, and a value greater than 0.4 represents a strong relationship. The factor loading coefficients can be calculated with the following steps. Firstly, a factor loading matrix L is obtained by applying PCA with the model expressed as Equation (1):

$$L = \left(\sqrt{\lambda_1}\,\eta_1,\ \sqrt{\lambda_2}\,\eta_2,\ \cdots,\ \sqrt{\lambda_m}\,\eta_m\right) \tag{1}$$

where $m$ is the number of common factors; $\lambda_1, \lambda_2, \cdots, \lambda_m$ are the eigenvalues of the sample correlation coefficient matrix; and $\eta_1, \eta_2, \cdots, \eta_m$ are the corresponding orthonormal eigenvectors of the sample correlation coefficient matrix.
After that, the common factor variance (common degree) of item $i$ can be obtained from Equation (2):

$$h_i^2 = \sum_{j=1}^{m} l_{ij}^2 \tag{2}$$

where $l_{ij}$ is the entry of $L$ for item $i$ and factor $j$. Finally, the variance ratio of factor $j$ can be expressed as Equation (3):

$$\frac{\sum_{i=1}^{p} l_{ij}^2}{\operatorname{tr}(R)} \tag{3}$$

where $p$ is the number of items and $\operatorname{tr}(R)$ is the trace of the correlation matrix $R$.
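The three steps (loading matrix, common degree, variance ratio) can be sketched in Python as follows. This is a minimal illustration of unrotated PCA loadings, not the SPSSAU procedure: it omits the rotation step, uses a plain Jacobi eigenvalue routine written for this example, and runs on a hypothetical 3 x 3 correlation matrix. All function names are assumptions.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def jacobi_eigen(sym: Matrix, sweeps: int = 50) -> Tuple[List[float], Matrix]:
    """Eigenvalues and eigenvectors (columns) of a symmetric matrix."""
    n = len(sym)
    a = [list(row) for row in sym]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def pca_loadings(corr: Matrix, m: int) -> Tuple[Matrix, List[float], List[float]]:
    """Loadings (Eq. 1), common degrees (Eq. 2), and variance ratios (Eq. 3)."""
    eigvals, eigvecs = jacobi_eigen(corr)
    p = len(corr)
    top = sorted(range(p), key=lambda j: eigvals[j], reverse=True)[:m]
    loadings = [[math.sqrt(max(eigvals[j], 0.0)) * eigvecs[i][j] for j in top]
                for i in range(p)]
    common_degree = [sum(l * l for l in row) for row in loadings]
    trace = sum(corr[i][i] for i in range(p))
    ratios = [sum(loadings[i][k] ** 2 for i in range(p)) / trace for k in range(m)]
    return loadings, common_degree, ratios


# Hypothetical correlation matrix for three items (not data from this study).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
loadings, common_degree, ratios = pca_loadings(corr, m=1)
print(loadings, common_degree, ratios)
```

The sign of each loading column is arbitrary, because an eigenvector and its negative are equally valid; only the absolute values should be compared with the 0.4 threshold.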
As discussed in the previous section, three factors/indicators were selected in this study for questionnaire setting and analysis. The results indicate that Factor 1 is closely related to Q1, Q2, and Q7; this factor is therefore termed physical condition according to the information presented by the three questions. Similarly, Factor 2 and Factor 3 are termed subjective opinions and control capacity, respectively.

Reliability Assessment
Reliability refers to the consistency of a measurement over time, across different observers, and across parts of the test itself, and it plays a predominant role in case studies, especially in survey-based and statistics-based research [24]. Therefore, reliability analysis of the variables (the influence of wind turbine noise and the three selected factors) was conducted after the EFA. The Cronbach α reliability coefficient is an indicator widely used to test data reliability. In this study, the α-coefficients of the variables were calculated and summarized by the data processing program SPSSAU using the standardized Cronbach model expressed as Equation (4):

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^2}{S_T^2}\right) \tag{4}$$

where $k$ is the number of items involved in the questionnaire, $s_i^2$ is the intra-question variance of the score for item $i$, and $S_T^2$ is the total score variance of all items. As shown in Table 3, all the α-coefficients are greater than the acceptable threshold interval [0.6, 0.7], which demonstrates that the collected data are highly reliable. The corrected item-total correlation (CITC) is an indicator representing the interaction between items, with the threshold often set at 0.4 [25]. In this study, all the CITC values are significantly higher than the threshold, which indicates a strong correlation between items.
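As an illustration of the reliability coefficient, the sketch below computes Cronbach's α from item variances and the total-score variance, following the definitions of k, the per-item variance, and the total score variance given above. The score table is hypothetical, and the function name `cronbach_alpha` is an assumption; SPSSAU may apply additional standardization that is not reproduced here.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; scores[r][i] is respondent r's score on item i."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five respondents to three items on a 1-4 scale.
scores = [
    [4, 4, 3],
    [3, 3, 3],
    [2, 2, 1],
    [4, 3, 4],
    [1, 2, 1],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

In this toy table the item variances sum to 4.2 and the total-score variance is 11, giving α = 1.5 × (1 − 4.2/11) ≈ 0.927, which would be read as high reliability under the [0.6, 0.7] threshold mentioned in the text.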

Validity Assessment
Validity refers to how accurately a method measures what it is intended to measure, and its assessment is another important method widely employed to evaluate the quality of research. Validity is measured by testing how well the results correspond to established theories and to other measures of the same concept [26]. Validity assessment can generally be classified into content validity and structure validity. Content validity is evaluated empirically by researchers to meet the requirements of the subsequent evaluations, while structure validity is assessed with statistical methods. In this study, exploratory factor analysis was employed for the structure validity [27], with the results summarized in Table 4. Although the contents used in this section are similar to those of the EFA presented in the previous section, the goal is different: structure validity focuses on whether the items truly and effectively reflect the measured information. Given that the strong correlation between items and factors shown in the EFA results is in line with professional knowledge, and that each item effectively expresses the concept of a factor (namely, the factor loading coefficient is high), the structure validity can be confirmed. As shown in Table 4, the common degrees (the degree to which the factors explain the information of an item) of all items are over 0.4, which illustrates that information can be effectively extracted from the research items. This is further supported by a KMO value of 0.851, which is well above the threshold of 0.6. Moreover, the variance interpretation rates of the three factors are 33.699%, 33.519%, and 26.986%, respectively, and the cumulative variance interpretation rate after rotation is 94.204% (over 50%), which also validates the efficient extraction of research information.
In light of the above, it is reasonable to conclude that a questionnaire has been effectively developed by this research.

Descriptive Analysis
Descriptive analysis is the first step in conducting statistical analyses. Because it helps to present the data distribution, identify outliers and typos, and explore the associations among variables, it is the foundation for further statistical analyses. Moreover, the attitude of the samples can be assessed by examining the overall score and mean value of the quantitative data. In this study, the questionnaire was developed as a four-level rating scale, with the options from 1 to 4 indicating attitudes ranging from the worst to extremely good. The mean value and each individual score are interpreted in this manner. As shown in Figure 5, the results of the descriptive analysis illustrate that participants of higher tolerance to wind turbine noise scored 3.10 on the item of influence results after experiments, while the item of control capacity was scored 2.60. This result indicates that the participants' ability to bear noise is poor, which might be attributed to the large body of low-frequency elements in turbine noise resulting in a loss of control capacity. This could also contribute to the understanding of the participants' primary states.
Figure 5. Results of the descriptive analysis.
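A descriptive pass of this kind can be sketched in a few lines of Python. The example reports the mean score, the count per scale level, and any entries outside the four-level scale (which would be treated as typos). The scores are hypothetical and the function name `describe` is an assumption.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

SCALE = (1, 2, 3, 4)  # four-level rating scale used by the questionnaire


def describe(scores: Sequence[int]) -> Tuple[float, Dict[int, int], List[int]]:
    """Mean of valid scores, counts per scale level, and out-of-scale entries."""
    valid = [s for s in scores if s in SCALE]
    invalid = [s for s in scores if s not in SCALE]
    counts = {level: valid.count(level) for level in SCALE}
    return mean(valid), counts, invalid


# Hypothetical scores for one item; 33 stands for a data-entry typo.
item_mean, counts, typos = describe([3, 4, 2, 3, 33, 4, 3, 1])
print(round(item_mean, 2), counts, typos)
```

Out-of-scale entries are excluded from the mean here; whether to correct or discard them in practice is a judgment that depends on the original answer sheets.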

Correlation Analysis
As a simple and practical method, correlation analysis is widely applied to measure the relationship between quantitative data and to examine the correlations between variables [28]. In this study, correlation analysis is employed to identify the relationships between variables and to quantify their strength. Scatter diagrams are presented first in Figures 6-8 to illustrate how the variables relate to one another, followed by formal correlation analysis.
The scatter diagrams in Figures 6-8 show approximately linear positive correlations between physical condition, control capacity, subjective opinions, and the influence results. These correlations can be quantified precisely with correlation coefficients. Two kinds of correlation coefficients are widely employed in correlation analysis: the Spearman correlation coefficient and the Pearson correlation coefficient [29]. Compared to the Spearman correlation coefficient, the Pearson correlation coefficient (also known as the product-moment correlation), proposed by the British statistician Karl Pearson in the 1890s, is more popular in correlation analysis. Chok's results show that for continuous non-normal data without obvious outliers, the Pearson correlation coefficient may have a clear advantage [30]. The Pearson correlation coefficient takes values in the interval [−1, 1]: values greater than 0 indicate positive correlations, and values less than 0 indicate negative correlations.
In this study, Pearson correlation coefficients were calculated using Equation (5), and the results are listed in Table 5:
$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E\left[(X - \mu_X)(Y - \mu_Y)\right]}{\sigma_X \sigma_Y} \tag{5}$$
where E is the mathematical expectation, cov is the covariance, and μ and σ denote the mean and standard deviation of the corresponding variable.
Table 5. Pearson coefficient of correlation.
Given the above correlation analysis results, it can be concluded that all three selected factors are positively correlated with the influence results. In fact, all the values are close to the upper bound of the interval [−1, 1], which indicates that the three factors and the influence results are strongly and positively correlated.
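The calculation of a Pearson coefficient as covariance divided by the product of standard deviations can be sketched in a few lines of Python. This is a minimal illustration only: the function name `pearson` and the two score lists are hypothetical, invented for the example, and are not taken from the survey data or from Table 5.

```python
import math


def pearson(x, y):
    """Pearson coefficient: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance written as the expectation E[(X - mu_X)(Y - mu_Y)]
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sigma_x * sigma_y)


# Hypothetical scores, invented for illustration only (not survey data)
factor_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
influence_scores = [1.2, 1.9, 3.2, 3.8, 5.1]
rho = pearson(factor_scores, influence_scores)
print(f"rho = {rho:.3f}")  # a value close to +1 indicates a strong positive correlation
```

A coefficient near the upper bound of [−1, 1], as in this toy data, corresponds to the strong positive correlation described in the text; a constant input list would make a standard deviation zero and raise a `ZeroDivisionError`.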

Regression Analysis
Following the reliability assessment, validity analysis, descriptive assessment, and correlation analysis, which tested the robustness of the methodology and examined the interactions between variables, a regression analysis is presented in this section to measure the impacts of the three factors on the results. Over the years, a considerable number of regression methods have been developed, such as linear regression, logistic regression, and Poisson regression [31]. Among them, linear regression is the most popular owing to its simple description of the interactions between variables: the line that most closely fits the data according to a specific mathematical criterion is found to represent the relationships between variables. As numerical dependent and independent variables are included for quantitative analysis, multiple linear regression is a suitable choice for the analysis in this study [32].
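The idea of finding the line (or, with several independent variables, the plane) that most closely fits the data can be sketched as a small multiple linear regression. The Python example below is a minimal illustration under stated assumptions: it takes ordinary least squares as the fitting criterion, which the text does not name explicitly, and the function name `fit_ols` and the demo data are hypothetical rather than drawn from the survey.

```python
def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Solves the normal equations (X'X) b = X'y with Gauss-Jordan elimination
    and returns the coefficients [b0, b1, ..., bk].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    for c in range(p):
        # Partial pivoting for numerical stability
        pivot = max(range(c, p), key=lambda k: abs(A[k][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        A[c], A[pivot] = A[pivot], A[c]
        scale = A[c][c]
        A[c] = [v / scale for v in A[c]]
        for r in range(p):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][p] for i in range(p)]


# Hypothetical noise-free data generated from y = 1 + 2*x1 + 3*x2 (not survey data)
X_demo = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y_demo = [1, 3, 4, 6, 8]
coefficients = fit_ols(X_demo, y_demo)  # approximately [1.0, 2.0, 3.0]
```

Because the demo data contain no noise, the fit recovers the generating coefficients up to floating-point error; with real questionnaire scores the coefficients would only approximate the underlying relationship.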
A typical linear regression analysis requires the interpretation of indicators from the F-test (also known as the ANOVA test), the R-squared value, the significance tests of the independent variables, as well as the Durbin-Watson (D-W) statistic and the variance inflation factor (VIF). The F-test is applied to examine whether the dependent variable is affected by any of the independent variables; in this way, the feasibility of the model is evaluated and its physical meaning is explored [33]. In linear regression analysis, the determination coefficient, which represents the explanatory ability of the independent variables, can be obtained by squaring the sample correlation coefficient [34,35]. The formulation and procedure can be stated as follows. Given a data set $y_1, y_2, \ldots, y_n$ and the corresponding model prediction values $f_1, f_2, \ldots, f_n$, the residual is defined as $e_i = y_i - f_i$, and the mean observation value is determined by Equation (6):
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \tag{6}$$
The summation of squares can be obtained by Equation (7):
$$SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{7}$$
Then, the regression summation of squares is determined by Equation (8):
$$SS_{reg} = \sum_{i=1}^{n} (f_i - \bar{y})^2 \tag{8}$$
The summation of squared residuals can be expressed by Equation (9):
$$SS_{res} = \sum_{i=1}^{n} (y_i - f_i)^2 = \sum_{i=1}^{n} e_i^2 \tag{9}$$
Finally, the determination coefficient is identified by Equation (10):
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \tag{10}$$
The t-test on each regression coefficient is employed to measure the effect of the corresponding independent variable on the dependent variable. If the p-value obtained from the t-test is less than 0.05, the impact of that independent variable on the dependent variable cannot be overlooked [36]. In addition, the multicollinearity of a regression model can be examined with its VIF value: a VIF value closer to 1 indicates weaker multicollinearity. The VIF can be calculated by Equation (11):
$$VIF = \frac{1}{1 - R^2} \tag{11}$$
where $(1 - R^2)$ is the tolerance.
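The quantities referred to in Equations (6)-(11) can be computed directly from observed and predicted values. The sketch below uses the standard textbook definitions, which are assumed here to match the paper's equations; the function names and the sample numbers are hypothetical. For the VIF, the R-squared passed in is assumed to come from regressing one independent variable on the remaining ones, which is the usual convention.

```python
def determination_coefficient(y, f):
    """Return (R^2, SS_tot, SS_reg, SS_res) for observations y and predictions f."""
    n = len(y)
    y_bar = sum(y) / n                                     # mean observation, Eq. (6)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)            # total sum of squares, Eq. (7)
    ss_reg = sum((fi - y_bar) ** 2 for fi in f)            # regression sum of squares, Eq. (8)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))   # residual sum of squares, Eq. (9)
    r2 = 1.0 - ss_res / ss_tot                             # determination coefficient, Eq. (10)
    return r2, ss_tot, ss_reg, ss_res


def variance_inflation_factor(r2_aux):
    """VIF = 1 / (1 - R^2), Eq. (11); (1 - R^2) is the tolerance.

    r2_aux is assumed to be the R^2 obtained by regressing one independent
    variable on the other independent variables.
    """
    return 1.0 / (1.0 - r2_aux)


# Hypothetical observations and predictions (not survey data)
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]
r2, ss_tot, ss_reg, ss_res = determination_coefficient(y_obs, y_fit)
print(f"R^2 = {r2:.2f}")
print(f"VIF for an auxiliary R^2 of 0.5: {variance_inflation_factor(0.5):.1f}")
```

An auxiliary R-squared of 0 gives a VIF of exactly 1 (no multicollinearity), and the VIF grows without bound as the auxiliary R-squared approaches 1.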
Detailed results of the regression analysis, with physical condition, control capacity, and subjective opinions as independent variables and the influence results as the dependent variable, can be found in Table 6. The R-squared value of the model in Table 6 is 0.936; therefore, the selected factors (physical condition, control capacity, and subjective opinions) can explain 93.6% of the variation in the influence results. This is supported by a study in the literature, in which the non-acoustic factors involved accounted for 92% of the influence on results [14]. This result is also consistent with the observations from earlier Swedish and Dutch cross-sectional studies. The results also show that the multicollinearity between the independent variables is weak. In addition, the model passed the F-test with an F value of 800.389 and a reported p-value of 0.000, which shows that at least one of the factors affects the influence results. The model for calculating the influence results can be formulated as: Influence results = 0.642 + 0.246 * Physical condition + 0.080 * Control capacity + 0.566 * Subjective opinions. The final analysis indicates that the regression coefficients of all three selected factors show significant positive impacts on the influence results. A larger regression coefficient means a stronger impact on the results.
As shown in Figure 9, physical condition, control capacity, and subjective opinions all have a positive influence on people's evaluation of wind turbine noise, with subjective opinions affecting the results most strongly.
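The fitted model reported in this section can be evaluated for a given set of factor scores as in the short Python sketch below. The intercept and coefficients are those stated in the text; the function and variable names are hypothetical, and the input scores in the usage line are invented for illustration, since the scoring scale is not restated here.

```python
# Intercept and regression coefficients as reported in the text
INTERCEPT = 0.642
COEFFICIENTS = {
    "physical_condition": 0.246,
    "control_capacity": 0.080,
    "subjective_opinions": 0.566,
}


def predict_influence(physical_condition, control_capacity, subjective_opinions):
    """Evaluate the reported linear model for one respondent's factor scores."""
    return (
        INTERCEPT
        + COEFFICIENTS["physical_condition"] * physical_condition
        + COEFFICIENTS["control_capacity"] * control_capacity
        + COEFFICIENTS["subjective_opinions"] * subjective_opinions
    )


# The factor with the largest coefficient has the strongest impact on the result
dominant_factor = max(COEFFICIENTS, key=COEFFICIENTS.get)

# Hypothetical factor scores, for illustration only
example = predict_influence(3.0, 3.0, 4.0)
print(dominant_factor, round(example, 3))
```

Raising the subjective-opinions score by one unit changes the predicted influence result by 0.566, compared with 0.246 for physical condition and 0.080 for control capacity, which is the sense in which subjective opinions dominate.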

Conclusions and Future Studies
Noise emission is a major concern for the well-being of local residents, which in turn affects the deployment of wind turbines and wind energy harvesting. This paper presented a survey-based approach for identifying the major concerns/factors based on the responses of participants. A questionnaire was developed, and 178 participants were involved in the wind turbine noise listening experiments. Three factors, namely physical condition, control capacity, and subjective opinions, were selected for modeling and assessment using a method combining EFA, PCA, and scree plot analysis in the data processing program SPSSAU. After that, reliability and validity assessments were integrated with descriptive and correlation analysis to explore the correlations among factors. The R-squared value of the regression analysis indicates that the three selected factors can explain 93.6% of the response to turbine noise. Compared with the other two factors, the correlation between subjective opinions and noise evaluation is much stronger, which means that subjective opinions are the major element in turbine noise assessment.

Since both objective and subjective human factors may involve diverse aspects, and owing to the lack of sufficiently detailed data, the health impacts of wind turbine noise, including medical and mental health investigations, are not covered in depth at the current stage. These aspects remain part of the focus for further research and model refinement. It would be meaningful for future studies to identify how the factors affect turbine noise assessment through a combination of noise listening experiments and questionnaires. This would help in exploring the factors most relevant to target-oriented noise assessment and mitigation strategies. In addition, subjective opinions were found to be the main influencing factor, which also helps explain the controversy over wind turbine noise: different survey groups hold different subjective opinions about wind turbines and thus reach different evaluation results on wind turbine noise. Future studies on turbine noise assessment that consider multiple subjective factors are also highly recommended for a more objective view of turbine noise impacts on local residents' wellbeing.

Conflicts of Interest:
The authors declare no conflict of interest.

Nomenclature
A: answer
cov: covariance
CITC: corrected item-total correlation
DOF: degrees of freedom
E: mathematical expectation value
e_n: residual
EFA: exploratory factor analysis
f_n: corresponding model prediction values
h_n: common factor variance (common degree)
K: number of items in the questionnaire
KMO-value: simple correlation coefficients between comparison variables
L_n: factor loading matrix
P: statistical significance
PCA: principal component analysis
Q: question
R^2: determination coefficient
S_i^2: intra-question variance of the score for each item
SPSSAU: statistical product and service software automatically
SS_tot: summation of squares
SS_reg: regression summation of squares
SS_res: summation of squared residuals
S_T^2: the total score variance of all items
tr(R): the trace of the correlation matrix
VIF: variance inflation factor
ȳ: mean observation value
y_n: data set
ρ: Pearson correlation coefficients
λ_n: eigenvalue of the sample correlation coefficient matrix
η_n: standard orthogonal feature vector

Appendix A
Table A1. Questionnaire survey of people's influence on wind turbines.

Q1 A1
The condition of your body at this moment? A. very poor (0) B. poor (9)  You can control your annoyances when exposed in a noisy environment.

Q8 A8
Did you feel dizzy during the testing? A. strongly agree (0) B. agree (4)  Do you agree that wind turbine noise is a kind of hazard after the testing?