Tool for Predicting College Student Career Decisions: An Enhanced Support Vector Machine Framework

Abstract: The goal of this research is to offer an effective intelligent model for forecasting college students' career decisions in order to give a useful reference for career decisions and policy formation by relevant departments. The suggested prediction model is mainly based on a support vector machine (SVM) that has been modified using an enhanced butterfly optimization approach with a communication mechanism and Gaussian bare-bones mechanism (CBBOA). To get a better set of parameters and feature subsets, first, we added a communication mechanism to BOA to improve its global search capability and balance exploration and exploitation trends. Then, Gaussian bare-bones was added to increase the population diversity of BOA and its ability to jump out of the local optimum. The optimal SVM model (CBBOA-SVM) was then developed to predict the career decisions of college students based on the parameters and feature subsets optimized by CBBOA. In order to verify the effectiveness of CBBOA, we compared it with some advanced algorithms on all benchmark functions of CEC2014. Simulation results demonstrated that the performance of CBBOA is indeed more comprehensive. Meanwhile, comparisons between CBBOA-SVM and other machine learning approaches for career decision prediction were carried out, and the findings demonstrate that the provided CBBOA-SVM has better classification and more stable performance. As a result, it is plausible to conclude that the CBBOA-SVM is capable of being an effective tool for predicting college student career decisions.


Introduction
With the advancement of technology and the development of society, the world has become more challenging and uncertain. We are now arguably in the VUCA era, which is characterized by volatility, uncertainty, complexity, and ambiguity. The frequent "black swan events" of recent years and the COVID-19 epidemic that swept the world this year are two prominent examples [1]. In such an uncertain era, it is crucial for everyone to find their own positioning and future development direction. As a special group, college students are the backbone of China's future society and the group responsible for realizing China's dream of great rejuvenation. Thus, strong career development ability is a requirement of their comprehensive quality and professionalism, and it also reflects what they have learned in college. In September 2019, the Ministry of Education issued "the Opinions on Deepening the Reform of Undergraduate Education and Teaching to Comprehensively Improve the Quality of Talent Training", proposing to "deepen the reform of the education and teaching system", "develop personalized training programs and academic career plans" for college students, and "build a professional setting management system oriented to economic and social development and students' career development needs". The career development of college students is closely related to their academic and professional careers, and it is also related to the results of undergraduate education and teaching reform and the quality of talent cultivation, both of which are valued at the national level.
In recent years, especially since the Ministry of Education issued the notice of "Teaching Requirements of Career Development and Career Guidance Course for College Students", career planning education has been in full swing in all colleges and universities, and the career development of college students has received attention from the state, society, colleges and universities, and scientific research institutions at all levels. There are many studies on the career development of college students, but the existing research mainly focuses on career education and guidance and on career theory and application, while empirical research and model construction for college students' career development remain underexplored. In particular, there is still a lot of room for combining the latest theoretical research results with the characteristics of Chinese students. Since 2018, "post-00" students have been entering colleges and universities, and they now account for half of the student population. The "post-00s" college students have a strong sense of autonomy, which is reflected in their desire to choose their own learning style, major direction, and life circle according to their own interests. They have a strong sense of self-awareness and self-identity, which is reflected in their courage to express and insist on their own opinions. They also have pragmatic and rational life goals and realization paths, which is reflected in their belief that success mainly depends on personal effort. After sorting out these characteristics, we found that Self-Determination Theory (SDT), which is currently a focus of academic attention, matches both the characteristics of college students as the research target and the characteristics of the times.
Self-determination theory was first proposed in the 1980s by the well-known American psychologists Edward L. Deci and Richard M. Ryan and is a cognitive-motivational explanation of human self-determined action [2]. Proponents of the theory hold that people are active organisms with an inbuilt capacity for self-determination and psychological growth. This potential leads people to engage in interest-oriented behaviors that are conducive to the development of their abilities, and this innate motivation for self-determination constitutes an intrinsic motivation for human behavior. After several decades of research and development, self-determination theory has gradually formed a relatively complete theoretical system on human motivation and personality, which has been widely applied in the fields of organizational management, sports, psychological medicine, and educational counseling [3].
"Autonomy needs" concern the psychological freedom to do things of one's own choice, "competence needs" concern having control over one's environment and growing as a person, and "belonging needs", also called relationship needs, concern having a sense of connection with other people. These three needs are important for people to grow, internalize, and be happy [4]. The main assumption of the theory is that when the three basic psychological needs of autonomy, competence, and relationship are met, people are more willing and able to participate in activities, which leads to more sustained and higher-quality behaviors, better behavioral outcomes, and better physical and mental health. At the same time, although the three basic psychological needs are highly individualized, they exist widely across cultures and situations. A literature review indicates that the core assumptions of self-determination theory are consistent with the characteristics of college students; therefore, it is feasible to construct a career development model for college students based on self-determination theory.
Many studies have investigated the career development of college students. Using interview data from product development interns at a single engineering business, Powers et al. [5] contributed insights into the particular abilities that interns describe gaining in their internship and identified linkages between school and work learning. Kim et al. [6] used a sample of 420 South Korean college students to analyze the cultural validity of the family impact scale in order to determine the degree to which family plays a role in college students' career development within collectivistic societies. Kiselev et al. [7] addressed the social constructivism foundations of machine learning approaches in career advising, as well as the relevance of social networks in psychological research. Chung et al. [8] employed random forests to predict which students are at risk of dropping out so that these students can be identified and assisted. Luo et al. [9] looked at how stereotyped attitudes about STEM occupations influenced STEM self-efficacy and STEM career-related result expectancies, as well as how these constructs predicted STEM career desire in upper primary pupils. Nauta et al. [10] looked at the effects of interpersonal interactions on the job decisions of gay, lesbian, bisexual, and heterosexual college students. Park et al. [11] investigated the impact of a future time perspective on job selections, which comprised three sessions: opportunity, value, and connectivity. By collecting 558 completed questionnaires, Lee et al. [12] investigated the influence of several significant factors in professional decision-making (i.e., advisers, industry mentors, parents, faculty members, and social media) on students.
Therefore, in order to construct a career development model for college students based on self-determination theory, this paper proposes a support vector machine (SVM) combined with an improved butterfly optimization algorithm (BOA), named CBBOA, which adds two mechanisms to BOA. First, a communication mechanism (CM) was added, which enhances the exploitation ability and improves the convergence accuracy of the original BOA. Second, a Gaussian bare-bones mechanism was introduced into the original BOA. The mutation mechanism in Gaussian bare-bones increases the diversity of the population and helps the algorithm avoid falling into local optima. In addition, in view of the shortcomings of SVM, we propose a new CBBOA-SVM model, which can substantially improve the classification accuracy of the original SVM by optimizing its parameters. To verify the effectiveness of the proposed CBBOA, we conducted a series of experiments on benchmark functions. Simulation results illustrate that the algorithm performs better than the original BOA. To better study the career decision factors of college students, we also conducted comparative experiments between CBBOA-SVM and other algorithms. The results show that CBBOA-SVM produces more accurate classification results and greater stability on the four indicators studied than all other comparison methods.
The most significant contributions of this work are as follows:
• An enhanced BOA (CBBOA) is proposed, in which a communication mechanism boosts the exploitation ability and convergence accuracy, and a Gaussian bare-bones mechanism raises the diversity of the population and the capacity to avoid falling into local optima.
• A new CBBOA-SVM model with feature selection is developed to predict the future career decisions of college students, which can not only help jobless college students select suitable occupations but also support the government's macromanagement of the college student employment market.
• The performance of CBBOA is experimentally verified by comparing it with high-quality peer algorithms; the results show that CBBOA performs better than its peers.
• To further validate the performance of CBBOA-SVM with feature selection, we compared it with five other similar methods; the results indicate that it performs better than these methods and can be used to study the impact of career decisions.
The remainder of the paper is organized as follows. The proposed CBBOA model and the CBBOA-SVM model are described in detail in Sections 2 and 3, respectively. Section 4 introduces the data source and simulation settings. The experimental results of CBBOA on the benchmark functions and of CBBOA-SVM on the real-world dataset are discussed in Section 5. Section 6 discusses the improved algorithm and its implications. The final section presents the conclusions and recommendations.

Proposed CBBOA
Compared with the original BOA, CBBOA adds two mechanisms: the communication mechanism and Gaussian Bare-Bones. Both mechanisms and the main flow of the proposed CBBOA are described in detail in the remainder of this section.

Communication Mechanism (CM)
The communication mechanism (CM) is inspired by DE [13] and was proposed in recent work [13]. CM selects one individual in the population to communicate with the two best individuals in the population, thereby generating the position of a new individual. This update enhances the exploitation capability. In the CM update formula, Equation (1), $a$ is a random number in the range [0, 1]; $X_i^t$ represents the $i$-th individual in the population at the $t$-th iteration; and $S(1)$ and $S(2)$ represent the two best individuals in the population.

Gaussian Bare-Bones
Gaussian Bare-Bones has been widely used in other algorithms. For example, Kennedy et al. [14] proposed bare-bones particle swarm optimization (BBPSO), and Omran et al. [15] proposed bare-bones differential evolution (BBDE) through this mechanism. The form of Gaussian Bare-Bones used in these algorithms has been adapted and improved over time. In this paper, one of the update formulas of Gaussian Bare-Bones adopts the Gaussian variation strategy proposed by Wang et al. [16]. In this strategy, Equation (2), the new component is drawn from $N(\mu, \sigma)$, a Gaussian distribution with mean $\mu = (x_{best,j} + x_{i,j})/2$ and standard deviation $\sigma = |x_{best,j} - x_{i,j}|$. In addition to this Gaussian variation, a second formula adopts a mutation strategy based on DE. Together, the two mutation methods can effectively increase the population diversity and prevent the algorithm from falling into local optima. In the complete Gaussian Bare-Bones update formula, Equation (3), $CR$ is the predetermined mutation probability; $x^t_{i_1,j}$, $x^t_{i_2,j}$, and $x^t_{i_3,j}$ are the $j$-th components of the $i_1$-th, $i_2$-th, and $i_3$-th individuals, respectively; and $i_1$, $i_2$, and $i_3$ are three random integers in $[1, N]$ with $i_1 \neq i_2 \neq i_3$.
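The two-branch mutation described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the Gaussian branch follows the mean and standard deviation given in the text, while the exact form of the DE-based branch, the scale factor `k`, the default value of `cr`, and the function name `gaussian_bare_bones` are hypothetical choices made for illustration.

```python
import random


def gaussian_bare_bones(pop, best, i, cr=0.5, k=0.5, rng=random):
    """Build a trial vector for individual i (illustrative sketch).

    With probability cr, component j is sampled from N(mu, sigma) with
    mu = (best[j] + x[j]) / 2 and sigma = |best[j] - x[j]|.
    Otherwise a DE-style difference step is used; its exact form and the
    scale factor k are assumptions, not taken from the paper.
    """
    x = pop[i]
    # Three mutually distinct random individuals (i1 != i2 != i3).
    i1, i2, i3 = rng.sample(range(len(pop)), 3)
    trial = []
    for j in range(len(x)):
        if rng.random() < cr:
            mu = (best[j] + x[j]) / 2.0
            sigma = abs(best[j] - x[j])
            trial.append(rng.gauss(mu, sigma))
        else:
            trial.append(pop[i1][j] + k * (pop[i2][j] - pop[i3][j]))
    return trial
```

Because the standard deviation shrinks as an individual approaches the best solution, the Gaussian branch samples widely early in the search and narrowly later, which is one way to read the claimed balance between diversity and convergence.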

Description of CBBOA
In this section, we propose CBBOA based on the above two mechanisms and describe its entire process. CBBOA adds both mechanisms to BOA. After the population is updated, CBBOA enters the communication mechanism stage, in which Equation (1) is used to generate a temporary individual V. Then, Equation (3) is used to generate a temporary population S. Afterwards, the algorithm compares the candidates with the original population and decides whether to replace it. Finally, the optimal solution is updated, which completes one iteration. Figure 1 shows the concrete process.

Proposed CBBOA-SVM Method
Two factors affect the classification accuracy of SVM: the setting of the hyperparameters and the selection of the feature set. The hyperparameters, namely the penalty factor C and the kernel parameter γ, greatly affect the classification accuracy. In addition, feature subsets are often simply the entire feature set or a random selection, which results in low efficiency and accuracy. Based on this, we propose CBBOA-SVM, which optimizes the SVM by searching for the optimal hyperparameters as well as a subset of features. Next, we apply the model to two realistic scenarios to test its superiority. Figure 2 depicts the framework of CBBOA-SVM. The model consists of two key components. In the right half of the framework, the classification accuracy (ACC) of the optimized SVM is obtained using 10-fold cross-validation, in which nine folds are used for training and the remaining fold for testing.
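One common way to let a single optimizer search hyperparameters and a feature subset together is to encode both in one position vector. The sketch below illustrates this idea only; the encoding, the search ranges for C and γ, the 0.5 threshold, and the helper names `decode` and `select_features` are assumptions made for illustration and are not taken from the paper.

```python
def decode(position, n_features, c_range=(2 ** -5, 2 ** 15),
           gamma_range=(2 ** -15, 2 ** 5)):
    """Map a position vector in [0, 1]^(n_features + 2) to (C, gamma, mask).

    position[0] and position[1] are rescaled linearly to the assumed
    search ranges; the remaining entries select a feature when > 0.5.
    """
    def rescale(u, lo, hi):
        u = min(max(u, 0.0), 1.0)  # clip to the unit interval
        return lo + u * (hi - lo)

    c = rescale(position[0], *c_range)
    gamma = rescale(position[1], *gamma_range)
    mask = [u > 0.5 for u in position[2:2 + n_features]]
    return c, gamma, mask


def select_features(row, mask):
    """Keep only the attributes of one sample whose mask entry is True."""
    return [value for value, keep in zip(row, mask) if keep]
```

Under this encoding, the fitness of an individual would be the cross-validated ACC of an SVM trained with the decoded C and γ on the selected attributes.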

Experiments

Collection of Data
In this paper, a random web survey was conducted using Questionnaire Star to randomly select students from general undergraduate colleges and universities (comprehensive category), Sino-foreign cooperative colleges and universities, higher vocational colleges and universities (comprehensive category), and general undergraduate colleges and universities (specialist category). A total of 557 questionnaires were collected, of which 310 were from male and 247 from female respondents. Taking into account the volume of questions and the speed of answering, questionnaires with an answer time of 120 seconds or more were classified as valid, totaling 445. The distribution of majors included science and technology, medicine and health, literature, history and philosophy, arts, and sports. By examining the gender, education, grade, major, and place of origin of the respondents, together with the Chinese version of the Basic Psychological Needs Satisfaction Scale (nine attributes) from self-determination theory and the Student Career Construct Questionnaire (25 attributes) (see Table 1), we explored the importance of these attributes and their intrinsic connections and established a prediction model for college students' career decisions on this basis.

Experimental Setup
The experiment was carried out in MATLAB R2018. Before classification, the data were scaled to [−1, 1].
In computational science, a fair experimental setting plays a very important role in the comparison of methods, for example in molecular signature identification [17,18], drug discovery [19,20], and recommender systems [21–24]. The data were split using k-fold cross-validation (CV), with k set to 10 [25,26].
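The preprocessing and data split described above can be sketched in Python as follows. The paper's experiments were run in MATLAB, so this is only an illustrative equivalent under assumptions: column-wise min-max scaling is assumed as the scaling method, constant columns are mapped to the midpoint of the range, and the function names and the shuffling seed are hypothetical.

```python
import random


def scale_columns(rows, lo=-1.0, hi=1.0):
    """Min-max scale every column of a list of rows to [lo, hi]."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    scaled = []
    for row in rows:
        new_row = []
        for v, mn, mx in zip(row, mins, maxs):
            if mx == mn:
                # Constant column: map to the midpoint (assumption).
                new_row.append((lo + hi) / 2.0)
            else:
                new_row.append(lo + (hi - lo) * (v - mn) / (mx - mn))
        scaled.append(new_row)
    return scaled


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]
```

With k = 10, each sample is used for testing exactly once and for training nine times, so the averaged ACC uses every valid questionnaire.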

Benchmark Function Validation
This section introduces and discusses the experiments used to verify the performance of CBBOA. We compared CBBOA with other advanced algorithms on the selected benchmark functions. In addition, to explore the impact of the added mechanisms on CBBOA, balance and diversity analysis experiments were also carried out.

Test Conditions and Benchmark Functions
It is important to set fair test conditions for a comparison experiment [27–31]. All experiments were run in the same environment: the dimension, the population size, and the number of independent runs were each set to 30, the maximum number of evaluations was 300,000, and CEC2014 was chosen as the test function set [32]. Table A1 shows the selected 30 benchmark functions, which have been used in many recent studies [33,34]. In the table, the last three columns indicate the dimensionality, the upper and lower bounds of the search space, and the optimal solution of the corresponding function. The functions are divided into unimodal, simple multimodal, hybrid, and composition functions. Different types of functions were chosen in order to evaluate the performance of the algorithm more comprehensively.
In the table, Avg represents the average result, and Std represents the standard deviation. The best result on each function is shown in bold. CBBOA ranks first on nine functions. On F1~F16, CBBOA ranks in the top three on 15 functions. On F17~F30, which have more complicated function structures, CBBOA ranks in the top three on nine functions. In contrast, the original BOA performed well only on F23, F24, and F25 and ranked last on the other functions. This shows that the improvement to BOA is effective: CBBOA achieves good results on different kinds of functions, so its performance is more comprehensive and versatile.
To analyze the experimental results more comprehensively, we used the Wilcoxon signed-rank test [45]. In this test, a p-value below 0.05 indicates that the difference between the two algorithms is statistically significant, which is interpreted here as a significant improvement of one algorithm over the other.
Table A3 shows the p-values of CBBOA compared with the other algorithms. Values greater than or equal to 0.05 are indicated in bold. The table shows that CBBOA improves significantly over CESCA on all functions. Compared with BMWOA, only one function has a p-value not less than 0.05. Compared with the original BOA, CBBOA shows significant improvements on all functions other than F24~F27. This again indicates that the performance of CBBOA is superior and more comprehensive.
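As an illustration of how such a p-value can be computed from the paired results of two algorithms over repeated runs, the following Python sketch implements a two-sided Wilcoxon signed-rank test with a normal approximation. It is a simplified sketch, not the procedure used in the paper: zero differences are dropped, tied absolute differences receive average ranks, and no tie or continuity correction is applied to the variance. The function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Return (W, p) for paired samples a and b (normal approximation)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while absolute differences are tied.
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        avg_rank = (pos + end) / 2.0 + 1.0
        for t in range(pos, end + 1):
            ranks[order[t]] = avg_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return w, p
```

With 30 independent runs per algorithm, as in the test conditions above, the normal approximation is usually considered acceptable; for very small samples an exact test would be preferable.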
The convergence curves of this experiment are shown in Figure 3. On F1, F2, F6, and F17, the convergence curve of CBBOA has an obvious advantage. On F9, algorithms such as ALCPSO and GL25 stabilize in the early stage of iteration, while CBBOA continues to decline. This shows that CBBOA does not easily fall into a local optimum, which is where CBBOA improves on BOA.

Mechanism Comparison Experiment
We conducted a mechanism comparison experiment to assess the effects of the two mechanisms added in CBBOA. To isolate each mechanism, we also built variants that use only a single mechanism, so that the following four algorithms were tested: CBBOA, BOABB, BOACM, and BOA. BOABB adds only Gaussian Bare-Bones to the original BOA, BOACM adds only the communication mechanism, CBBOA adds both mechanisms, and BOA is the original version. The four algorithms were tested on the same 30 benchmark functions. Table A4 records the experimental results, with the best solution for each function shown in bold. Among the 30 functions, CBBOA ranks first on 19. In addition to Avg, the Std values indicate that CBBOA is more stable than the other algorithms. This shows that the combination of the two mechanisms is more effective than either mechanism alone.
Figure 4 shows the convergence curves of the above experiment. On F1 and F11, CBBOA clearly obtains a better solution. On F24 and F25, the optimal solutions of the four algorithms are relatively similar; however, CBBOA converges faster than BOABB and BOA on these two functions. These analyses all show that the combination of the two mechanisms helps BOA achieve better performance.

Qualitative Analysis
The purpose of this part is to undertake a qualitative analysis of CBBOA. First, a feasibility study is carried out on CEC2014 functions. The findings of the feasibility study of CBBOA and BOA are shown in Figure 5. The figure has five columns, which show, from left to right, the three-dimensional distribution, the two-dimensional distribution, and the one-dimensional distribution of the search trajectory of CBBOA in the multidimensional space, the variation of the average fitness, and the convergence curve. In Figure 5b, the red dot shows the position of the optimum solution, and the black dots represent the positions searched by CBBOA. The fact that the black dots are dispersed over the whole search plane indicates that CBBOA is capable of traversing the solution space extensively. The black dots are densest near the best solution, indicating that CBBOA can locate the target region and exploit it further. The trajectory curve in Figure 5c fluctuates greatly in the early period and tends to stabilize in the later period. The fluctuation indicates that the algorithm searches extensively in the early period; once the algorithm finds the target area, the trajectory becomes stable. In Figure 5d, the average fitness decreases during the iteration. The average fitness drops to a low value by the middle of the run, indicating that CBBOA has a good convergence speed. In Figure 5e, the convergence curve of CBBOA is lower than that of BOA, which shows that the solutions found by CBBOA are of better quality.

Figure 6 shows the results of the balance analysis of CBBOA and BOA, tested on the same functions as the diversity analysis. Each graph contains three curves in red, blue, and green. Red indicates the exploration capability of the algorithm, blue indicates the exploitation capability, and green is the increment-decrement curve, which describes the trend of the red and blue curves: when the curve rises, exploration dominates; when it falls, exploitation dominates. From the figure, we can see that the two behaviors are at the same level when the increment-decrement curve reaches its maximum.
From the selected graphs, we can see that the added mechanisms have a great influence on the balance of the original BOA. The original BOA maintains a trend of high exploration and low exploitation on these functions, and the proportion of exploration is too high. For example, on F10, exploration accounts for more than 90%. This may prevent BOA from obtaining a high-quality solution. In contrast, exploitation accounts for a higher proportion of the behavior of CBBOA, which means that it spends most of its time exploiting the target area.
Figure 7 shows the results of the diversity analysis. The diversity curves are all decreasing. The reason is that the algorithm generates the population randomly at the beginning, so the initial diversity is large, and the continuous narrowing of the search range during iteration makes the population diversity decrease. As can be seen from Figure 7, the population diversity of BOA remains at a high value on multiple functions. This shows that BOA stays in a large search range and cannot determine the target area, which sometimes leaves the algorithm unable to find high-quality solutions. The diversity of CBBOA declines steadily, indicating that it has determined the region where the optimal solution is located and exploits it further.

Predicting Results of Employment Stability
In this subsection, to investigate the impact of career decisions, we evaluated CBBOA-SVM with feature selection (CBBOA-SVM-FS) on real collected datasets. The results in terms of accuracy (ACC), Matthews correlation coefficient (MCC), sensitivity, and specificity are given in Table 2. ACC is the proportion of correctly classified samples among all classified samples and represents the overall classification performance of the model. Specificity evaluates the ability of a binary classification model to correctly identify normal (negative) cases, while sensitivity evaluates its ability to detect abnormal (positive) cases. The MCC is used to assess the overall effectiveness of the classification model; because it takes all four entries of the confusion matrix into account, it gives a more objective evaluation than percentage scores alone. CBBOA-SVM-FS achieves an ACC of 94.2%, an MCC of 88.9%, a sensitivity of 94.5%, and a specificity of 94%. These four metrics support the feasibility of using CBBOA-SVM-FS to study the impact of career decisions for college students.

To further validate the performance of CBBOA-SVM-FS, we compared it with five other methods, namely CBBOA-SVM, BOA-SVM, ANN, RF, and KELM. The ACC, sensitivity, specificity, and MCC results of each method are shown in Figure 8. For ACC, the most important metric, CBBOA-SVM-FS obtained 94.20%, which is 1.90% better than CBBOA-SVM, 2.40% better than BOA-SVM, 8.80% better than ANN, 2.90% better than RF, and 4.90% better than KELM. Therefore, CBBOA-SVM-FS performs best among all the compared methods. Moreover, the variance of the ACC is 0.037 for CBBOA-SVM-FS, 0.076 for CBBOA-SVM, 0.071 for BOA-SVM, 0.095 for ANN, 0.083 for RF, and 0.075 for KELM. The comparison of variances shows that CBBOA-SVM-FS is also the most stable with respect to ACC.
In terms of sensitivity, CBBOA-SVM obtains the best result of 98.20%, and CBBOA-SVM-FS is second with 94.50%. The results of the other methods are 93.30% for BOA-SVM, 92.60% for RF, 87.70% for KELM, and 78.20% for ANN, which shows that the proposed CBBOA-SVM-FS and CBBOA-SVM are also superior to the other methods in terms of sensitivity.
Further, the variance results also show that CBBOA-SVM-FS and CBBOA-SVM are more stable than the other methods. In terms of specificity, CBBOA-SVM-FS achieves 94%, which is 8.00% better than CBBOA-SVM, 4.00% better than BOA-SVM, 1.00% better than ANN, 4.00% better than RF, and 3.00% better than KELM, showing that CBBOA-SVM-FS is the best. Moreover, in terms of stability, CBBOA-SVM-FS is also better than CBBOA-SVM, BOA-SVM, RF, and KELM.
In the MCC results, CBBOA-SVM-FS is the best with 88.90%. CBBOA-SVM, BOA-SVM, ANN, RF, and KELM obtain 85.30%, 84.30%, 72.60%, 83.00%, and 79.20%, respectively, which shows that CBBOA-SVM-FS is 3.60% better than the next best method, CBBOA-SVM, and 16.30% better than the worst, ANN. Therefore, CBBOA-SVM-FS also performs best on MCC, and Figure 8 shows that its stability is the best as well. In summary, CBBOA-SVM-FS performs better than the other similar methods and can be used to study the impact of career decisions.
The proposed CBBOA not only optimizes the hyperparameters of the SVM but also selects the best feature subset in the process. We used ten-fold cross-validation (CV) for this purpose. Figure 9 depicts the frequency distribution of the main features selected by CBBOA-SVM over the 10-fold CV procedure.

As seen in Figure 9, the features that appeared most frequently were "Education" (F2), "Accept and win difficult challenges" (F10), "Feel empowered by what you do" (F11), "Decide what is more important to me" (values) (F17), "Set an example in your mind" (F19), "Decide what you want to do" (F29), and "Reconfirmation of a wise career choice" (F33). The five most frequent characteristics appeared 9, 9, 8, 7, and 8 times, respectively. As a result, the study concluded that these attributes may play an important role in predicting the effect of career decisions.
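A feature-frequency count of the kind plotted in Figure 9 can be sketched as follows. This is a minimal illustration that assumes each CV fold yields a list of selected feature labels; the function name `feature_frequency` and the three example folds are hypothetical and do not reproduce the study's selections.

```python
from collections import Counter

def feature_frequency(selected_per_fold):
    """Count in how many folds each feature was selected."""
    counts = Counter()
    for selected in selected_per_fold:
        # Use a set so that a feature is counted at most once per fold.
        counts.update(set(selected))
    return counts

# Hypothetical selections from three folds, for illustration only.
folds = [
    ["F2", "F10", "F11"],
    ["F2", "F10", "F17"],
    ["F2", "F11", "F33"],
]
freq = feature_frequency(folds)

# Rank by descending frequency, breaking ties by feature label.
ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

With ten folds, a feature selected in nine of them would receive a count of 9, which is how the frequencies quoted above should be read.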

Discussion
By analyzing the experimental results for the questionnaire, we find that the six most important features among the 39 attributes are F2, F10, F11, F17, F29, and F33, which have a more prominent impact on the career decision-making ability of college students.
The survey data show that college students' career decision-making ability differs by education level (F2): students with master's degrees are stronger than those with bachelor's degrees, who in turn are stronger than those with college (higher vocational) education. That is, the higher the education level, the stronger the career decision-making ability. This is because college students with strong career decision-making ability have clearer and more targeted goals and clearer directions for their studies and future development, and they are more committed to improving their education.
F10 and F11 are two characteristics that measure how well competence needs, one of the basic psychological needs of individuals, are satisfied. College students who have accepted and won difficult challenges and who feel competent in what they do have stronger career decision-making ability, because individuals whose competence needs are satisfied have more confidence in their abilities in various respects, including career decision making. At the same time, the successes achieved and the difficulties overcome in career decision making promote the individual's sense of competence.
F17 reflects the values in the individual's self-concept, and it can be seen that the clearer university students are about their own values, the stronger their career decision-making ability. This is because values are the overall evaluation of the meaning, role, effect, and importance of objective things (including people, things, and events) and of the results of one's own behavior. They are the principles and standards that drive and guide one's decisions and actions, and they are one of the core elements of the psychological structure of personality. The clearer one's values are, the clearer one is about what one wants and needs, and the clearer and more determined one's decisions will be; therefore, the clarity of one's values also determines the strength of one's career decision-making ability. For F29, we find that the more firmly students decide what they want to do, the clearer their career decisions and goals and the stronger their career decision-making ability, because deciding what to do and how to do it is part of decision-making ability.
F33 is a reconfirmation and reinforcement of the individual's decision, which can be regarded as a recognition of one's own decision-making ability and can enhance the individual's self-efficacy for decision making.
The research still needs to be improved. First, the number of samples collected for model construction needs to be increased so that the model is more comprehensive and accurate. Second, the sample collection is currently concentrated in one city, so it is influenced by factors such as the urban environment and the college environment. It is necessary to cover different geographical areas, especially different provinces; to include colleges and universities in different cities, including first-tier, new first-tier, second-tier, third-tier, and fourth-tier cities; and to include different types of colleges and universities to enrich the model. Third, more attributes that affect college students' career decision-making ability can be added, and more influential attributes sought, to make the model more convincing.
This study on forecasting college students' career decisions has considerable practical value: it can help unemployed college students select suitable occupations and can also support the government's macro-management of the college student employment market. Prediction studies on college students' future career choices can therefore provide very valuable information.

Conclusions and Future Work
In this study, we developed an effective hybrid CBBOA-SVM model to predict career decisions for college students. We proposed an improved BOA, called CBBOA, by introducing the Gaussian bare-bones and communication mechanisms, which effectively improve the exploitation ability and convergence accuracy of BOA. To evaluate the performance of CBBOA, we compared it with other algorithms on the CEC2014 benchmark functions. The experimental results show that CBBOA improves greatly on BOA for most functions and is competitive with some advanced algorithms. Meanwhile, optimizing SVM with CBBOA yields better parameter combinations and feature subsets than previous approaches. Compared with other machine learning approaches, the proposed method predicts college students' career decisions more accurately and more stably.
In future work, given the strengths of the proposed CBBOA-SVM model, it will be generalized in China and applied to other prediction problems, such as medical diagnosis and financial risk prediction. Furthermore, the CBBOA method may be extended to new application domains, such as photovoltaic cell optimization and the optimization of deep learning network nodes.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The generalized radial basis kernel function is employed in this work, and it is formulated as follows:
$$k(x_i, x_j) = e^{-\gamma \|x_i - x_j\|} \tag{A4}$$
where $\gamma$ is a kernel parameter that specifies the interaction breadth of the kernel function and is another factor that strongly affects the classification performance of SVM.
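As a minimal sketch, the kernel in Equation (A4) can be evaluated as below. The code follows the equation as printed here, with the unsquared Euclidean distance in the exponent; the widely used Gaussian RBF variant squares the distance instead, which the hypothetical `squared` flag covers. The function name `rbf_kernel` and the sample vectors are assumptions for illustration.

```python
import math

def rbf_kernel(x, y, gamma, squared=False):
    """Evaluate exp(-gamma * d) for two equal-length vectors.

    d is the Euclidean distance (as in Equation (A4) as printed) or, when
    squared=True, the squared Euclidean distance (common Gaussian RBF form).
    """
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    d = dist_sq if squared else math.sqrt(dist_sq)
    return math.exp(-gamma * d)

# Identical points always give a kernel value of 1.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5))
# A larger gamma makes the kernel decay faster with distance.
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], gamma=0.5))
```

A larger $\gamma$ narrows the region in which two samples are treated as similar, which is why this parameter is tuned together with the SVM penalty parameter.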
The butterfly optimization algorithm (BOA) [108] is a swarm intelligence optimization algorithm proposed in 2018. Since its introduction, BOA has been applied to many problems, such as fault diagnosis [109], disease diagnosis [110], optimal cluster head selection for wireless sensor networks [111], parameter identification of photovoltaic models [112], image segmentation [113], and feature selection [114]. BOA searches for the optimal solution of a problem by imitating the foraging behavior of butterflies. Butterflies have chemoreceptors on their bodies, with which they can smell the fragrance of food. Therefore, BOA defines a function f representing fragrance, calculated by Equation (A5), where c represents the sensory modality, a is the power exponent that depends on the sensory modality, and I is the stimulus intensity, which corresponds to fitness in the algorithm. The value of c is updated by Equation (A6), where t represents the current iteration number.
There are two situations in BOA during the search phase. In the first, when one butterfly produces a fragrance that other butterflies perceive, the other butterflies move in that direction. This phase is the global search. In the second, when butterflies cannot perceive the fragrance produced by other butterflies in the search space, they take random steps to search around. This phase is the local search. These two situations give the two update formulas of BOA. The global search phase is given by Equation (A7), where $x_i^t$ represents the solution vector of the i-th butterfly in the t-th iteration, $g^*$ represents the current optimal solution, and r is a random number in the range [0, 1].
The local search phase is given by Equation (A8), where $x_j^t$ and $x_k^t$ are the positions of the j-th and k-th butterflies in the population, j and k are random indices in the range [1, N], N represents the population size, and r is a random number in the range [0, 1].
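For reference, the standard BOA formulation in [108] defines the fragrance and the two position updates as shown below, using the same symbols as the definitions above. These forms come from the general BOA literature; attaching the labels (A5), (A7), and (A8) to them assumes that this appendix follows [108] without modification. The update rule for c, Equation (A6), is not reproduced because its exact form varies between implementations.

```latex
\begin{align}
  f &= c\, I^{a} \tag{A5} \\
  x_i^{t+1} &= x_i^{t} + \left( r^{2} \times g^{*} - x_i^{t} \right) \times f_i \tag{A7} \\
  x_i^{t+1} &= x_i^{t} + \left( r^{2} \times x_j^{t} - x_k^{t} \right) \times f_i \tag{A8}
\end{align}
```

Here $f_i$ denotes the fragrance of the i-th butterfly, so the step size of each butterfly scales with its own fitness.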
To switch between global search and local search while searching for the optimal solution, BOA uses a probability parameter p. The algorithm chooses which formula to use for the update according to p; in other words, p controls whether BOA performs global search or local search. The new individuals generated by BOA are stored in the agent. After their fitness is calculated, the algorithm decides whether to replace the original individuals: a replacement takes place only when the newly generated individual is better than the original one. The above steps constitute a complete iteration of BOA. Figure A1 shows the concrete process.
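The iteration described above can be sketched in a few lines. This is a minimal illustration of the standard BOA step for a minimization problem, not the authors' implementation: the function names `boa_step` and `sphere`, the default values of c, a, and p, and the test population are all assumptions. It also assumes non-negative fitness values so that the fragrance $c I^{a}$ is real, and it uses zero-based indices rather than the range [1, N].

```python
import random

def sphere(x):
    """Simple non-negative test objective (assumed for illustration)."""
    return sum(v * v for v in x)

def boa_step(population, fitness, objective, c=0.01, a=0.1, p=0.8, rng=random):
    """Perform one BOA iteration with greedy replacement (minimization)."""
    n = len(population)
    best = population[min(range(n), key=lambda i: fitness[i])]
    new_pop, new_fit = [], []
    for i, x in enumerate(population):
        f = c * (fitness[i] ** a)      # fragrance f = c * I^a
        r = rng.random()
        if rng.random() < p:
            # Global search: move relative to the current best solution.
            cand = [xi + (r * r * gi - xi) * f for xi, gi in zip(x, best)]
        else:
            # Local search: random step based on two random butterflies.
            xj = population[rng.randrange(n)]
            xk = population[rng.randrange(n)]
            cand = [xi + (r * r * vj - vk) * f
                    for xi, vj, vk in zip(x, xj, xk)]
        cand_fit = objective(cand)
        # Keep the new individual only if it improves on the original one.
        if cand_fit < fitness[i]:
            new_pop.append(cand)
            new_fit.append(cand_fit)
        else:
            new_pop.append(x)
            new_fit.append(fitness[i])
    return new_pop, new_fit

rng = random.Random(0)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
fit = [sphere(x) for x in pop]
new_pop, new_fit = boa_step(pop, fit, sphere, rng=rng)
print(min(fit), min(new_fit))
```

Because of the greedy replacement, the fitness of each individual never gets worse from one iteration to the next, which matches the replacement rule described in the text.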

Figure 1. Flowchart of CBBOA.
The time complexity of CBBOA is governed by the maximum number of iterations (T), the number of dimensions (dim), and the population size (N). The overall time complexity of CBBOA is O(CBBOA) = O(initialization) + O(calculation fitness) + T * (O(update position by BOA) + O(communication mechanism) + O(Gaussian Bare-Bones) + O(update the optimal solution)). Initializing the population has a time complexity of O(N * dim). Computing the fitness is O(N). Updating the positions with the BOA formula is O(N). The communication mechanism (CM) is O(N). The Gaussian Bare-Bones mechanism is O(N * dim), and updating the optimal solution is O(N). Therefore, the final complexity is O(CBBOA) = O(N * dim) + O(N) + T * (O(N) + O(N) + O(N * dim) + O(N)) = O(N * dim) + O(N) + T * (3O(N) + O(N * dim)), which is dominated by the O(T * N * dim) term.

Figure 5. (a) Three-dimensional location distribution of CBBOA, (b) Two-dimensional location distribution of CBBOA, (c) Trajectory of CBBOA in the first dimension, (d) Average fitness of CBBOA, (e) Convergence curves of CBBOA and BOA.

Figure 8. Classification results of five models in terms of four metrics.

Figure 9. Frequency of the features chosen from CBBOA-SVM through the 10-fold CV procedure.

Table 1. Descriptions of each attribute.

Table 2. Classification results of CBBOA-SVM-FS in the light of four metrics.

Table A2. Comparison results of CBBOA and other algorithms.

Table A3. The p-value of CBBOA versus other algorithms.

Table A4. Comparison results of CBBOA and other algorithms.