Improving the Professional Level of Managers Through Individualized Recommendation to Enhance the Quality of Air Pollutant Management in China

With the rapid development of the economy and fossil fuel consumption that lacks systematic emission controls, China has experienced substantially elevated concentrations of air pollutants, which not only degrade regional air quality but also have significant impacts on public health. However, despite the demand for a large number of experts in air pollution control, people with real expertise in air pollutant management are difficult to find. Individualized recommendation is therefore an effective and sustainable method for enhancing the professional level of managers and improving the quality of air pollutant management. This paper proposes a novel framework to recommend strengths in air pollutant management. The framework comprises four stages: first, data are preprocessed; second, after ability classifications and ability assessment strategies are constructed, activity experiences are transformed into corresponding ability values; third, a multilayer perceptron deep neural network (MLP-DNN) is used to predict potential types according to these ability values; finally, a hybrid system is constructed to recommend suitable and sustainable potential managers for air pollutant management. The experiments indicate that the proposed method can assess the full picture of people's strengths and can inform the construction of a scientific and rational specialty recommendation system for governments and schools. By enhancing the professional level of managers with regard to air pollutant management, this method can have significant effects on pollutant emission reduction.


Introduction
Air pollution in China has attracted considerable attention from the public, scientists, and policymakers [1,2], as the hazards of ultrafine particles affect human health. Air pollution can directly or indirectly affect human health, causing physical discomfort, leading to disease or even death [3]. Regarding air pollution and management, current research is focused on four parts: sources [4][5][6][7][8], relationships with health [9][10][11][12], extreme events [13][14][15][16][17], and control methods [18][19][20][21][22][23]. For example, Ashbaugh [4] used statistical methods to categorize air pollution sources in the United States. Qiu et al. [6] showed that emissions from motor vehicles are one of the main sources of air pollution. Li et al. [7] found that China's air pollution is correlated with rapid industrialization. Cong [9] found that ambient air pollution from waste gas emissions was associated with multiple cancer incidences in Shanghai in a retrospective population-based study. Zhang et al. [14] used remote sensing data and numerical simulations to analyze a heavy air pollution event in Chengdu and found the sources and causes of this event. Lawrence et al. [16] conducted research on air pollution control engineering studying professional courses is reduced; meanwhile, some of those courses are in constant change, which might affect the future employment of students [34]. (2) In colleges, students are cultivated in an undifferentiated educational mode, which makes it difficult for schools to discover the students' strengths or weaknesses and provide individualized guidance [35,36]. (3) Cultivating feedback is lagging. Under large-category cultivation, the school only focuses closely on students from a higher level and with more granularity [37][38][39]. However, this could lead to a disconnection between school education and student feedback. 
The students are unsure about their abilities, and the school knows nothing about their students' abilities either. This has made a significant impact on the full development of students [40].
Air pollution management in China needs a large number of new professionals every year, and talent in this area requires long professional training to be ready for future jobs. However, large-category enrollment is not specialized, and it shortens professional learning by one and a half years. It is therefore very difficult for students to master professional knowledge in the short time that remains, which means that students in this major need a certain aptitude for air pollution management. In China, however, most freshmen do not know what they are good at. Thus, there is an urgent need for a set of models that can analyze students' daily data and make recommendations accordingly. The problem is twofold: air pollution management needs a large number of professionals, while many students who are gifted in this area do not know how to choose a profession or which professional knowledge to acquire. This paper tries to find an effective solution to this problem by proposing a novel individualized recommendation framework for improving the level of air pollutant management. First, a large amount of data is collected anonymously. Second, a new ability evaluation model is established. Third, a multilayer perceptron deep neural network (MLP-DNN) is used to predict the cultivating types of people. Finally, cultivating types and ability information are integrated to make individualized recommendations for each person.
The main contributions of this paper are summarized as follows: (1) Aiming to accurately recommend talent for each person from numerous data, MLP-DNN classification information, user-based information, and content-based information are integrated to generate individualized recommendations, which accurately and efficiently learn abilities about different kinds of individuals. Moreover, by analyzing students' first-view data, we can more objectively identify their talents and reduce the influence of subjective factors, which would provide considerable help for air pollutant management level improvement. (2) Faced with the chaotic and irregular mass of survey data, a multidimensional ability evaluation model is proposed to acquire the abilities and talents of different people in different aspects. It formulates targeted countermeasures for improving the level of managers by finding people with talent in the field of air pollutant management. (3) Both user-based and content-based information are taken as important information. They are combined in a hybrid way to recommend suitable and sustainable potential managers for air pollutant management.
The main content of this paper is organized as follows. The proposed framework is discussed in Section 2. The experiments and results are presented in Section 3. Section 4 concludes the paper. Some definitions in this paper are given as follows.
• Ability assessment strategy: A set of methods used to evaluate students' daily performance across all abilities. It is composed of an experience-oriented set, a result-oriented set, and bonus rules. More specifically, the ability assessment strategy has the following characteristics. (1) It clearly specifies which items can be regarded as bonus items. (2) It gives an absolute credit value that can be precisely calculated.
• Cultivating type: The direction in which students are recommended to progress during their undergraduate education; for instance, "academic type" is aimed at cultivating students to become academic talents, and "design type" aims to cultivate students to become design talents.
• Cultivating content: The activities or competitions that students have to experience for educational reasons, such as reading books, doing homework, watching courses online, and participating in competitions.
• Experience-based ability growth: This refers to activities in which students participate to improve a certain aspect of their abilities [40,41]. This paper refers to the set of activities that meet the conditions as the experience-oriented set, which is denoted by $EC$, where $EC = \{c_1, c_2, c_3, \ldots, c_i, \ldots, c_n\}$ and $c_i$ indicates an activity. Each $c_i$ corresponds to a bonus $p_i$, and the bonus rule for experience-based ability growth is represented by $BR_{EC}$, where $BR_{EC} = \langle EC, p_1, p_2, p_3, \ldots, p_i, \ldots, p_n \rangle$.
• Results-based ability growth: This refers to a student's participation in a competition to certify a certain ability. In this paper, the set of competitions that meet the conditions is called the results-oriented set, which is denoted by $RC$, where $RC = \{c_1, c_2, c_3, \ldots, c_j, \ldots, c_m\}$ and $c_j$ denotes a competition. Each $c_j$ corresponds to a bonus point $p_j$, and the bonus rule for results-based ability growth is represented by $BR_{RC}$, where $BR_{RC} = \langle RC, p_1, p_2, p_3, \ldots, p_j, \ldots, p_m \rangle$.

Methods
As shown in Figure 1, the proposed framework is composed of four stages: data, evaluation model, MLP-DNN, and recommendation. (1) Data. We collected data from 4328 questionnaires. (2) Evaluation model. This paper formulates nine abilities, which are scored and counted through an ability assessment strategy. (3) MLP-DNN. MLP-DNN is used to predict cultivating types for each student. (4) Recommendation. According to the predicted results of MLP-DNN, both user-based and content-based information are combined to construct a recommendation system, recommending suitable information for each student.

Evaluation Model
To facilitate the evaluation of students' abilities, this paper defines a student ability evaluation process that classifies abilities into nine categories: computer technology, design ability, English ability, mathematical ability, scientific research ability, writing ability, innovation ability, cooperation ability, and academic performance. Table 1 provides detailed information on the ability classification.

Table 1. Classification description.

Single Ability | Ability Description
Computer technology | Measures the performance of students in understanding and writing code
Design ability | Measures the ability of students to present in design thinking, multimedia design and implementation
English ability | Measures students' English learning level and English application level
Mathematical ability | Measures the ability of students in basic mathematics and applied mathematics
Scientific research ability | Measures students' interest, potential and objective strength in the direction of scientific research
Writing ability | Measures the ability of students in language organization, logical expression (focusing on technology)
Innovation ability | Measures student thinking creativity
Cooperation ability | Measures the awareness, ability, and results of students seeking to work with others in their studies and life
Academic performance | Measures the learning performance of students in basic school courses

We propose an ability assessment strategy that divides ability growth into two categories, experience and results, and defines an assessment strategy for each. The student ability evaluation result is represented by $T$, where $T = \{t_1, t_2, \ldots, t_i, \ldots, t_9\}$ and $t_i$ represents one of the nine abilities. The evaluation of ability is composed of three steps. (1) Input the student experience set $S$ and process its elements in turn. (2) Classify each experience in $S$ into the correct ability classification: the experience is compared against the experience-oriented set (EC) and the result-oriented set (RC) to determine its classification, and the corresponding bonus rule, $BR_{EC}$ or $BR_{RC}$, is then queried to determine the bonus score of the experience [42,43]. This process is repeated until all experience elements in $S$ have been scored. (3) The ability result $T$ is obtained.
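The three evaluation steps can be illustrated with a minimal Python sketch. The activity names, the abilities they map to, and the bonus points below are hypothetical placeholders, not values from the paper; the sketch also assumes that each experience contributes to exactly one ability and that unrecognized experiences earn no bonus.

```python
# The nine abilities listed in Table 1.
ABILITIES = [
    "computer technology", "design ability", "English ability",
    "mathematical ability", "scientific research ability", "writing ability",
    "innovation ability", "cooperation ability", "academic performance",
]

# Hypothetical bonus rules: experience -> (ability, bonus points).
# BR_EC covers the experience-oriented set, BR_RC the results-oriented set.
BR_EC = {
    "online programming course": ("computer technology", 2.0),
    "english reading club": ("English ability", 1.0),
}
BR_RC = {
    "math modeling contest award": ("mathematical ability", 5.0),
    "poster design competition award": ("design ability", 4.0),
}


def evaluate(experiences):
    """Return the ability result T as a dict over the nine abilities."""
    T = {ability: 0.0 for ability in ABILITIES}
    for item in experiences:  # step 1: process each element of S in turn
        # Step 2: decide which set the experience belongs to and query its rule.
        if item in BR_EC:
            ability, points = BR_EC[item]
        elif item in BR_RC:
            ability, points = BR_RC[item]
        else:
            continue  # assumption: unknown experiences are skipped
        T[ability] += points
    return T  # step 3: the ability result


S = ["online programming course", "math modeling contest award", "chess club"]
T = evaluate(S)
```

Keeping the two bonus rules as separate lookup tables mirrors the distinction between experience-based and results-based growth; a real deployment would presumably load them from the assessment strategy rather than hard-code them.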
To further analyze the ability evaluation result, a multilayer analysis of the comprehensive ability T is carried out from the following three perspectives. (1) Comprehensive ability changing trend. The comprehensive ability of students grows gradually as their experience accumulates, which reflects student growth at a macroscopic level. (2) Ability distribution. Students have their own strengths and weaknesses, and the ability distribution analysis shows students' abilities more intuitively, which helps to identify those strengths and weaknesses. (3) Single ability changing trend. Observing the changing trend of a single ability independently along the time dimension reflects the growth of students' single abilities more specifically and deepens the understanding of students. Through the above steps, the scores of every student's abilities are obtained.

MLP-DNN Classifier
MLP is a supervised learning algorithm that learns a function $f(\cdot): R^m \to R^o$ by training on a dataset [43,44], where $m$ is the number of input dimensions and $o$ is the number of output dimensions. Given a set of features $X = \{x_1, x_2, \ldots, x_n\}$ and a target $y$, where $x_n$ is a single ability score of a student and $y$ is a cultivating type, it can learn a nonlinear function approximator for either classification or regression. MLP-DNN is used to classify the types of all students.
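As a rough illustration of such a function, the sketch below runs one forward pass of a small network that maps nine ability scores to four class probabilities, using hyperbolic tangent hidden layers and a softmax output as in the configuration this section adopts. The hidden-layer width of 16, the random untrained weights, and all function names are assumptions made for the example; training is not shown.

```python
import math
import random


def tanh_layer(x, W, b):
    """Hidden layer: affine map followed by the hyperbolic tangent."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def softmax(z):
    """Softmax over a list of scores, shifted by the max for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative, untrained)."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b


def forward(x, layers):
    """Apply the tanh hidden layers, then a softmax output layer."""
    h = x
    for W, b in layers[:-1]:
        h = tanh_layer(h, W, b)
    W, b = layers[-1]
    z = [sum(w * hi for w, hi in zip(row, h)) + bi for row, bi in zip(W, b)]
    return softmax(z)


rng = random.Random(0)
# 9 ability scores -> four hidden layers (width 16 is assumed) -> 4 cultivating types
sizes = [9, 16, 16, 16, 16, 4]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
probs = forward([0.5] * 9, layers)
```

The output `probs` is a probability distribution over the four types; the predicted type would be the index with the largest probability.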
As shown in Figure 2, the process of MLP-DNN is composed of three steps. (1) Dataset preparation. This paper consults experts and professors in universities to identify students and determine their future cultivating types. The cultivating types are classified into four: "academic type", "R&D type" (short for research and development), "design type", and "social type". (2) Training the MLP-DNN model. An MLP-DNN with four hidden layers is built. In the hidden layers, the hyperbolic tangent function is taken as the activation function [45], as shown in Equation (1). In the output layer, the softmax function is taken as the activation function [45], as shown in Equation (2). (3) Classification. The students are classified into the four types: "academic type", "R&D type", "design type", and "social type".

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \qquad (1)$$

$$\mathrm{softmax}(z)_i = \frac{\exp(z_i)}{\sum_{l=1}^{K} \exp(z_l)} \qquad (2)$$

where $z$ is the output of the previous layer, $z_i$ is the $i$-th element of the input to softmax, which corresponds to class $i$, and $K$ is the number of classes. The loss function for our model is cross-entropy, given in Equation (3), where $y$ represents the target value, $\hat{y}$ is the estimate of $y$, $\alpha\|W\|_2^2$ is an L2-regularization term that penalizes complex models, and $\alpha > 0$ is a hyperparameter that controls the magnitude of the penalty. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) [46,47] algorithm is used to perform parameter updates during model training. More details about the experiments are given in Section 3.

Recommendation
For further individualized recommendations, we build a hybrid system that combines user-based collaborative filtering (CF) and content-based recommendation [48]. As shown in Figure 3, our method is composed of three stages.
First, for the input part, we prepare three main matrices. The student evaluation result matrix is composed of the evaluation results of students who share the same cultivating type, as classified by MLP-DNN at the previous stage. Each row of the matrix is defined as $t^{s_j} = (t^{s_j}_1, t^{s_j}_2, \ldots, t^{s_j}_9)$, where $s$ indicates a student and $t^{s_j}_i$ represents the $i$-th ability value of the $j$-th student. Then, the content ability requirement matrix is constructed. Different activities or competitions commonly have different ability requirements for participants. We take advantage of this idea and build a matrix that contains the ability requirements of well-known activities or competitions that are beneficial to students' growth and development; each row of this matrix holds the ability requirements of one cultivating content.

Second, for the user-based CF, we first build the student similarity matrix. The Euclidean distance is used for the similarity calculation, as defined by Equations (4) and (5).
In Equations (4) and (5), O.dist is the Euclidean distance between the ability values $t$ of two different students, and Sim is the similarity of the two students' abilities, which takes a value between 0 and 1. The larger the value, the more similar the students' abilities are. Once the student similarity matrix is built, the recommended value of each cultivating content is calculated by Equation (6), where $W_{d_m}$ is the recommended value, $n$ is the number of students currently in the same cultivating type, $f$ is the feedback value, $j$ and $i$ are student indices, and Sim is the similarity.

For the content-based recommendation, we first build the cultivating content similarity matrix. The Euclidean distance is again used for the similarity calculation, as defined in Equations (4) and (5). After building the similarity matrix, we use a content-based recommendation algorithm to calculate the recommended values by Equation (7), where $W_{d_m}$ is the recommended value, $d_n$ is a cultivating content that has already been experienced, $d_m$ is the present cultivating content, $f_n$ is the corresponding feedback value, and Sim is the similarity between cultivating contents.
Finally, the user-based CF results and the content-based recommendation results are combined for the final output by a weighted sum, defined as Equation (8).

$$W = \alpha W^{\mathrm{user}}_{d_m} + \beta W^{\mathrm{content}}_{d_m} \qquad (8)$$

where $W$ is the final recommended value, $W^{\mathrm{user}}_{d_m}$ and $W^{\mathrm{content}}_{d_m}$ are the recommended values calculated by user-based CF (Equation (6)) and content-based recommendation (Equation (7)), respectively, and $\alpha$ and $\beta$ are the two recommendation weights, which can be customized according to personal preferences. This paper regards the user-based CF results and the content-based recommendation results as equally important and sets $\alpha = \beta = 0.5$.
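The hybrid scheme can be sketched in a few lines of Python. Only the weighted combination with equal weights of 0.5 is taken directly from the text. The distance-to-similarity mapping `1 / (1 + distance)` and the similarity-weighted averages used for the user-based and content-based values are assumptions chosen for illustration, since they are one common way to obtain a similarity between 0 and 1 and a feedback-weighted score. The three-element vectors are shortened stand-ins for the nine-ability vectors, and all names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two ability (or requirement) vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sim(a, b):
    # Assumed mapping from distance to similarity in (0, 1];
    # identical vectors give a similarity of 1.
    return 1.0 / (1.0 + euclidean(a, b))


def weighted_average(pairs):
    """Similarity-weighted mean of feedback; pairs = [(similarity, feedback), ...]."""
    total = sum(s for s, _ in pairs)
    return sum(s * f for s, f in pairs) / total if total else 0.0


def user_based(target, peers, content):
    """peers: list of (ability_vector, {content_name: feedback}) in the same type."""
    pairs = [(sim(target, vec), fb[content]) for vec, fb in peers if content in fb]
    return weighted_average(pairs)


def content_based(content_vec, experienced):
    """experienced: list of (requirement_vector, feedback) the student already has."""
    pairs = [(sim(content_vec, vec), f) for vec, f in experienced]
    return weighted_average(pairs)


def hybrid(w_user, w_content, alpha=0.5, beta=0.5):
    """Weighted combination of the two recommended values."""
    return alpha * w_user + beta * w_content


target = [3.0, 1.0, 2.0]
peers = [([3.0, 1.0, 2.0], {"contest": 5.0}), ([0.0, 4.0, 0.0], {"contest": 1.0})]
experienced = [([3.0, 1.0, 2.0], 4.0)]

w_user = user_based(target, peers, "contest")
w_content = content_based([3.0, 1.0, 2.0], experienced)
score = hybrid(w_user, w_content)
```

In this toy input the first peer has the same abilities as the target student, so that peer's high feedback dominates the user-based value.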

Dataset
A total of 4328 questionnaires were distributed in the survey. Men and women accounted for 42.88% and 57.12% of the respondents, respectively, so the gender ratio of the sample was reasonable. The average age of the respondents was 18 years, and 18-year-old students accounted for 69.83% of the questionnaires. In terms of household registration, rural registrations accounted for 36.99% and nonrural registrations for 63.01%.

Evaluation
The students' experiences were input, and the ability evaluation results were calculated. The comprehensive ability was calculated with Equations (9) and (10), where $E$ is the comprehensive ability score, $e$ represents each single score, and $S$ is the total score of comprehensive ability. Figure 4 indicates that different students show different strengths; in other words, each student has his or her own strengths in some aspect.
We selected students A, B, C, and D from our real student dataset, each representing a certain type of student, and performed our evaluation process. As shown in Figure 5, three aspects of ability were examined: the changing trend of comprehensive ability, the distribution of abilities, and the changing trend of single abilities. For the comprehensive ability changing trend in Figure 5a, there were significant differences among students in the starting point, the increase per unit time, and the increase over different time periods. Student A had a higher initial ability value but grew slowly at each stage. Student B had a lower initial ability value but increased rapidly in the subsequent stages. Student C had a high initial value and grew steadily during the growth phase. Student D rose rapidly in the initial stage but increased less in the later period. The growth of the comprehensive ability of different students was closely related to their experience. The ability distribution in Figure 5b shows how different students are distributed over the single abilities. Student A performed well in academic performance, research, and innovation. Student B had outstanding performance in mathematics and programming. Student C had obvious advantages in design, English, and innovation ability. Student D had excellent ability in teamwork, writing, and English. This indicates that the ability distribution is a personalized characteristic of each student. For the single ability changing trend in Figure 5c, there are great differences in upload time for different single abilities. Different single abilities develop at different rates in different students, and the periods in which single abilities surge and grow also differ.
This reflects the growth characteristics of single abilities of different students.

MLP-DNN
The dataset was split into a training set and a testing set: 75% of the dataset was randomly selected as the training set, and the rest was taken as the testing set. To better evaluate the performance of the MLP-DNN model, we used precision (P), recall (R), and the F1-measure (F1), which are defined in Equations (11)-(13), respectively [49].

$$P_i = \frac{B^{+}_i}{A_i} \qquad (11)$$

$$R_i = \frac{B^{+}_i}{B_i} \qquad (12)$$

$$F_{1,i} = \frac{2 P_i R_i}{P_i + R_i} \qquad (13)$$

where $B^{+}_i$ is the number of correctly classified students of the $i$-th cultivating type, $A_i$ is the number of students that were classified into the $i$-th cultivating type by MLP-DNN, and $B_i$ is the number of students whose real cultivating type in the dataset is the $i$-th type. As $F_1$ considers both precision and recall, it was mainly used to evaluate the performance. As shown in Figure 6, MLP-DNN achieves a precision of 0.97 and a recall of 0.96. Figure 6 also compares MLP-DNN with other methods to confirm the classification effect: MLP-DNN yields better results than naïve Bayes, KNN, logistic regression, and SVM.
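For concreteness, the per-type counts and the three metrics can be computed as in the following Python sketch. The label lists are made-up toy data rather than results from the paper, and the function name is hypothetical.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one cultivating type."""
    # Correctly classified students of this type (B+_i in the text).
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    # Students assigned to this type by the classifier (A_i).
    predicted = sum(1 for p in y_pred if p == label)
    # Students whose real type is this type (B_i).
    actual = sum(1 for t in y_true if t == label)

    precision = correct / predicted if predicted else 0.0
    recall = correct / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels for illustration only.
y_true = ["academic", "academic", "design", "social", "R&D", "design"]
y_pred = ["academic", "design", "design", "social", "R&D", "design"]

p, r, f1 = per_class_metrics(y_true, y_pred, "design")
```

Here three students are predicted as "design" and two of them really are, so precision is 2/3 while recall is 1, which shows why F1 is the more balanced summary.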

Recommendation
We randomly selected 50 students who were classified as different types for evaluation. First, we ran our model to produce recommendation results for each surveyed student and selected the top 10 results from our method. Second, we asked students to give every recommendation result a score from 1 to 5, where 1 represented "useless", 2 "not so good", 3 "not bad", 4 "helpful", and 5 "great help". Last, based on those feedback scores, we used the mean absolute error (MAE) to test the recommendation effects [50]. It is defined as Equation (14).

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| f_i - y_i \right| \qquad (14)$$

where $f_i$ is the student score value, $y_i$ is the value recommended by our method, and $N$ is the number of scored recommendation results. The final result is shown in Figure 7: our method outperforms both user-based CF and content-based recommendation. Moreover, the MAE is relatively low among the tested students, and for some of them it is below 0.1, which shows that the results are very close to the students' real preferences. In summary, the results reveal that our predictions closely match students' preferences.
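As a small worked example of the MAE, the Python sketch below compares four hypothetical student scores with four hypothetical recommended values; the numbers are invented for illustration and do not come from the experiment.

```python
def mean_absolute_error(scores, predictions):
    """Average absolute gap between student scores and recommended values."""
    if len(scores) != len(predictions) or not scores:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - y) for f, y in zip(scores, predictions)) / len(scores)


student_scores = [5, 4, 3, 5]          # feedback on a 1-5 scale
recommended = [4.5, 4.0, 3.5, 5.0]     # values produced by a recommender

mae = mean_absolute_error(student_scores, recommended)
```

The absolute gaps are 0.5, 0, 0.5, and 0, so the MAE for this toy input is 0.25.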
As shown in Figure 8, it is very encouraging that 47% of students considered our recommendation results to be of great help to them. This indicates that our method can provide good recommendations that help students who have not yet chosen a major to develop their abilities.

Conclusions
This paper proposes a new approach to improving air pollution management that complements existing lines of research on sources, relationships with health, extreme events, and control methods. Compared with previous models, the proposed method targets long-term and sustainable air pollution management and may not be effective in the short term. However, it establishes a series of personnel training mechanisms that span talent discovery, talent cultivation, and talent recommendation. People currently face the rapid development and impact of big data and artificial intelligence, and finding talented individuals who are truly suitable for air pollution management among the vast crowd is very important. In the near future, China will experience the largest enrollment model reform in its history, involving millions of people and hundreds of colleges and universities every year, including more than 100 colleges and thousands of students for air pollution management majors alone. The new student training model proposed in this paper will play an important role in improving the overall educational level of China and will have a far-reaching impact on the talent training model. Because China's enrollment education reform started only one year ago, the number of experimental subjects is still small. In the future, we will extend our method to more students and more majors for experiment and promotion.