The Impact of Secondary Education Choices on Mathematical Performance in University: The Role of Non-Cognitive Skills

Mathematics 2021, 9, 2744

Abstract: (1) Background: this study evaluates the most relevant factors affecting the performance in mathematics of university undergraduates, focusing on the mathematical background of students. Spanish secondary education provides an opportunity to develop this analysis, since students can choose between two secondary education tracks with different mathematical content and depth. (2) Methods: a survey was conducted covering personal characteristics, socioeconomic status, academic choices and academic achievement, as well as a set of questions aimed at uncovering attitudes towards mathematics. Students who show a preference for mathematics are prone to choose the track with more mathematical content, which creates a potential confounding between training and attitudes towards mathematics. We propose an index of non-cognitive skills related to mathematics to account for this problem. (3) Results: prior background in mathematics plays a role in mathematical performance at university even after correcting for non-cognitive skills related to mathematics. The effects are heterogeneous with respect to gender. (4) Conclusions: choosing a more mathematically oriented itinerary in secondary education seems to give an edge to students. Our results shed light on the implications of students' choice of secondary school track. Furthermore, they are meant to serve as a guide to improve the design of remedial courses.


Introduction
The objective of the present paper is to analyze the factors affecting results in mathematics. In particular, we draw on data from university undergraduate programs offered by the School of Business and Economics, Universidad de León (henceforth ULE), Spain. Specifically, we focus on the mathematical background of students when entering university.
This research relates to a large body of literature that analyses schooling using models and statistical methods of economic analysis such as the Educational Production Function [1][2][3]. In this kind of analysis, the output of schooling, measured as student's performance, is a function of a set of inputs such as individual, family and school attributes. As a result, it is possible to measure how the effect on the student's performance of policy variables such as school quality is affected by individual and family characteristics. A frequent finding is that the role of the policy variable diminishes substantially once individual and family characteristics are taken into account [4,5].
Previous empirical literature has focused on student performance in mathematics [6][7][8] and student performance in university [9]. Of particular interest for our objective are studies analyzing performance in degrees related to Business and Economics [10][11][12][13][14][15]. In these papers, the mathematical background of students is included as an explanatory variable of student performance. This approach is interesting from a policy angle since mathematical background could be modified by changes in secondary education but also by remedial courses at university [11,12]. The effects of mathematical background on student performance are analyzed using a survey on the students' views [15], a natural experiment [12] and administrative (non-experimental) data [13,14]. They all find evidence of a substantial positive effect of the student's mathematical background on performance at the university level.
Non-experimental data (administrative or survey data) are almost the norm in social science. In this setting, Spanish secondary education provides an opportunity to analyze the effects of mathematical background using non-experimental data. In Spain, students can choose between two upper secondary school tracks with different mathematical content and depth, namely the Social Science (SS) track and the Science and Technology (ST) track. Therefore, it is possible to compare the performance of two groups of students with different mathematical backgrounds [13].
On the other hand, it is reasonable to expect that non-cognitive skills affect performance in achievement tests in general [16][17][18] and results in mathematics in particular [19]. Since these skills are also likely to condition the choice of upper secondary school track, the analysis in our setup is likely to be affected by self-selection bias [20,21], which generates a potential confounding between more training in secondary education and a particular attitude towards mathematics. Concerns about the possibility that students self-select into the ST track, together with the size and direction of the estimation bias created by the self-selection problem, have been discussed in the literature related to this study [13].
In this paper, we aim to mitigate the self-selection bias described above by including an index of attitudes and feelings towards mathematics. A proxy for students' attitudes towards mathematics has been used to analyze the performance of finance students [22]. In this regard, we conducted a more elaborate survey on attitudes towards mathematics and propose an index that aggregates the information in the survey [23].
Our study contributes to this literature by combining the most significant characteristics of the aforementioned approaches. On the one hand, we follow the standard approach in the literature applying an Educational Production Function, which controls for the most important observable characteristics at this stage of the academic career. On the other hand, our methodology allows us to account for individual unobserved heterogeneity associated with non-cognitive skills, thus reducing potential selection bias issues. Finally, we further examine the heterogeneity of our results in the context of gender-specific characteristics.

Materials and Methods
This section contains a description of the survey, the Index of Non-Cognitive Mathematical Skills proposed in the paper and the empirical model estimated.

Description of the Survey
The subjects of the survey are students at the School of Business and Economics at ULE in Northwestern Spain, specifically first-year students enrolled in four-year degrees in Business, Finance, International Trade and Marketing. The university also offers a Degree in Economics with similar mathematical content in the first year. However, we decided to exclude students enrolled in this degree from the analysis since their grades were much lower than in the other degrees. In our analysis, we consider five groups of students: three groups composed of students enrolled in the degrees in Finance, International Trade and Marketing, respectively, and two groups of students enrolled in the degree in Business who attend morning and afternoon classes, respectively.
All students are required to take mathematics in their first semester at university. The survey was given in the classroom, right before a lecture on Economics, at the beginning of the second semester. The number of students registered for the class in Economics (potential participants in the survey), the number of participants in the survey and the participation rate are shown in Table 1.

Table 1. Registered students, survey participants and participation rate.

Group        Registered   Participants   Participation rate (%)
INT. TRADE   57           40             70
MARKETING    62           44             70
FINANCE      53           25             47
BUSINESS 1   62           48             77
BUSINESS 2   46           28             60
TOTAL        280          185            66

Source: Own elaboration using data from ULE.
The questions in the survey cover personal characteristics, socioeconomic status, academic choices and academic results in both Secondary Education and University (see Appendix A for the survey questions). Of particular importance is a set of questions aimed at uncovering feelings about mathematical work and students' perceptions of their own mathematical skills [24].
A control group of 10 third-year students was surveyed a few days in advance. The objective of distributing the survey to the control group was to uncover issues in the survey that could be spotted only at the time of answering the questions. Respondents in the control group made minor comments on the original wording of a few questions and were able to answer all the questions in ten minutes. As a result, we edited those questions in order to address the students' comments and decided to allocate fifteen minutes to answer the survey.

A Proposal for an Index of Non-Cognitive Skills Related to Mathematics
We propose an index of non-cognitive skills related to mathematics, which comes close to the concept of emotional intelligence (EI). Our measure is evaluated using rating scales, which require test-takers to rate their agreement with a series of statements about themselves in order to assess self-rated ability. There is evidence of a relationship between academic performance and EI measured by means of rating scales [18]. Therefore, we asked students to declare their degree of agreement with 15 statements describing their feelings about mathematical work and perceptions of their own mathematical skills [24] (see Appendix A). There are four degrees of agreement (Not at All, Slightly, Quite and A lot), coded with integers ranging from 1 to 4.
We propose to use the following measure of non-cognitive skills. First, we run the regression in Equation (1) of the value of the item "I am good at Mathematics" against all other 14 items. In this regression, the index $i$ denotes individuals in the sample, $z_{1i}$ is the value of the item "I am good at Mathematics", $z_{ji}$ ($j = 2, \ldots, 15$) are the values of the other items, $a_j$ ($j = 2, \ldots, 15$) are coefficients to be estimated and $w_i$ is a random disturbance. Second, after estimating Equation (1) by OLS, we use the fitted values of the regression, $\hat{z}_{1i}$, as a measure of non-cognitive skills related to mathematics.

We choose the degree of agreement with the statement "I am good at Mathematics" as a focal point since it is reasonable to expect a positive correlation between agreement with this statement and non-cognitive skills related to mathematics. However, we are aware that the self-assessment of any item in the survey could be subject to a whole set of upward and downward biases. Therefore, we try to reduce self-reporting bias by combining in a single index the information provided by students about their attitudes towards mathematics in the 15 items of the survey. For that purpose, we propose $\hat{z}_1$ as an index that aggregates the information in 14 items of the survey, using as weights the partial correlations of those 14 items with the item "I am good at Mathematics". Our choice of $\hat{z}_1$ is related to the well-known result that the fitted value of a regression equation is the best linear predictor of a variable ("I am good at Mathematics") for given values of other correlated variables (other items in the survey) [25].
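The two steps described above can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation. It assumes Equation (1) has the form $z_{1i} = a_1 + \sum_{j=2}^{15} a_j z_{ji} + w_i$, where the intercept $a_1$ is our assumption (the text only defines $a_j$ for $j = 2, \ldots, 15$). It also uses two made-up items instead of 14, and the helper names `solve`, `ols` and `noncognitive_index` are hypothetical.

```python
from statistics import mean, pstdev


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(y, columns):
    """OLS of y on an intercept (assumed) plus the given regressor columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return coef, fitted


def noncognitive_index(focal, other_items):
    """Fitted values of the focal item on the other items, standardized (mean 0, sd 1)."""
    _, fitted = ols(focal, other_items)
    mu, sd = mean(fitted), pstdev(fitted)
    return [(f - mu) / sd for f in fitted]


# Hypothetical answers on the 1-4 scale for six students (not survey data).
good_at_math = [1, 2, 2, 3, 4, 4]  # focal item, z_1
math_is_easy = [1, 1, 2, 3, 3, 4]  # one of the other items
has_talent = [2, 1, 3, 2, 4, 4]    # another of the other items
index = noncognitive_index(good_at_math, [math_is_easy, has_talent])
```

Standardizing with the population standard deviation is a choice made for this sketch; the text does not say which convention was used.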

Empirical Model
The empirical model in Equation (2) analyses the linear effect of a set of explanatory variables on grades in mathematics in the first semester of university. In this model, $m_i$ is the grade in mathematics of student $i$, $T_i$ is a binary variable that takes the value 0 if the student chose the SS track in upper secondary education and 1 if the student chose the ST track, $A_i$ is the index of non-cognitive skills defined above, $x_{ij}$ ($j = 1, \ldots, k$) denotes $k$ control variables, $u_i$ is a random disturbance term with the usual properties and the $\alpha$'s and $\beta$'s are parameters to estimate.
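For reference, the model described in this paragraph can be written out as below. This is a reconstruction from the definitions in the text rather than a quotation: the intercept $\alpha_0$ and the summation index $j$ are assumptions.

```latex
m_i = \alpha_0 + \alpha_1 T_i + \alpha_2 A_i + \sum_{j=1}^{k} \beta_j x_{ij} + u_i \qquad (2)
```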
The key parameters for the objective of the present paper are $\alpha_1$ and $\alpha_2$. Since $T_i$ is a binary variable, $\alpha_1$ can be written as $\alpha_1 = E(m_i \mid T_i = 1, A_i, x_{i1}, \ldots, x_{ik}) - E(m_i \mid T_i = 0, A_i, x_{i1}, \ldots, x_{ik})$. In other words, $\alpha_1$ measures the difference in the expected grade in mathematics between a student who chooses the ST track ($T_i = 1$) and a student who chooses the SS track ($T_i = 0$) while all other explanatory variables are kept constant. In turn, $\alpha_2$ measures the effect on the average grade in mathematics of a one-unit increase in the index of non-cognitive skills while other variables are kept constant. In order to have a more intuitive interpretation of $\alpha_2$, we use the standardized value of $\hat{z}_1$ as the index of non-cognitive skills $A_i$. As a result, $\alpha_2$ can be interpreted as the effect on the grade in mathematics of increasing non-cognitive skills by one standard deviation.
Finally, we are aware of the ex-post nature of the proposed measure of non-cognitive skills. In this regard, we interpret the index as a proxy control [25]. Therefore, the inclusion of the index, although it is affected by schooling, will partially control for (unobservable) non-cognitive factors, thus helping to mitigate potential self-selection issues. Furthermore, considering that the association between schooling and "late" non-cognitive skills is positive, the inclusion of our index will lead to an underestimate of $\alpha_1$, thus setting a lower bound on the true effect.

Index of Non-Cognitive Skills Related to Mathematics
In this section, we show the estimation results associated with the index of non-cognitive skills in Equation (1). First, we present the coefficient estimates of Equation (1), which relate the item "I am good at Mathematics" to the rest of the items that evaluate non-cognitive skills.
The coefficients significantly different from zero have the expected sign. The items "my mind is well suited to mathematics", "I find mathematics to be easy" and "I feel that I have talent for solving mathematical problems" have positive coefficients, meaning that each one is positively correlated with "I am good at mathematics" when all other variables are kept constant.
As discussed in Section 2.1, we use as an index of non-cognitive skills the fitted value of the item "I am good at mathematics" given by the coefficients of the linear regression in Table 2. This predicted value uses the partial correlations measured by the regression coefficients as weights to aggregate the different items measuring non-cognitive skills. We use the standardized value of the index.

Table 2 note: Equation (1) estimated using data from ULE. *** p < 0.01, ** p < 0.05, * p < 0.1.

Descriptive Statistics of Participants
In Table 3, we provide sample statistics of the variables used in the empirical analysis, including the standardized index of non-cognitive skills computed above. Grades in the Spanish education system are on a 0 to 10 scale. In Table 3, the average grade in mathematics is above the threshold passing grade of 5. In turn, the averages of the binary variables show that females make up 55% of participants while 20% of participants chose the ST track in their upper secondary education.
The last row of the table shows the descriptive statistics of the index of non-cognitive skills. The minimum and maximum values suggest that the sample distribution is slightly skewed to the right. In other words, most students have more than average non-cognitive skills related to mathematics.
In Table 4 above, we show the mean values of grades and non-cognitive skills stratified by gender, upper secondary education track and university degree. We show as well the value of an F test of differences in means (ANOVA) and the probability that a variable following an F distribution is greater than the value of the F test.
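The F test of differences in means mentioned above can be illustrated with a short computation. This is a sketch on made-up grades, not the survey data, and `anova_f` is a hypothetical helper. It returns only the F statistic; the reported probability would come from the upper tail of an F distribution with (k - 1, n - k) degrees of freedom, which is not computed here.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of samples (one list per group)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical grades on the 0-10 scale for two small groups.
grades_ss = [4.0, 5.0, 6.0]
grades_st = [6.0, 7.0, 8.0]
f_stat = anova_f([grades_ss, grades_st])  # between = 6 / 1, within = 4 / 4, so F = 6.0
```

With only two groups, as in the comparisons by gender or by track, this F statistic equals the square of the usual two-sample t statistic with pooled variance.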
Average grades in mathematics are higher for females and the difference is statistically significant at the 5% level. In turn, the index of non-cognitive skills is around one-tenth of a standard deviation higher for males than for females. In this case, the difference is not statistically significant at conventional levels.
Average grades in mathematics are substantially higher for students who chose the ST track in upper secondary education. As expected, students who chose the ST track also have a substantially higher average value of the index of non-cognitive skills, and the difference is statistically significant at the 1% level.
The average grade in mathematics ranges from 4.70 in International Trade to 5.74 in Business 2. The null hypothesis of mean equality across degrees is not rejected at conventional significance levels against the alternative that the mean is different in at least one degree. Finally, the average index of non-cognitive skills ranges from −0.44 in Marketing to 0.27 in Finance. In this case, the null hypothesis of mean equality across degrees is rejected at the 5% significance level against the alternative that the mean is different in at least one degree.

Estimates of the Empirical Model
In Table 5, we show the estimates of the coefficients of three versions of the linear model in Equation (2). The model estimates the linear effect of the upper secondary education track and non-cognitive skills on the grades in mathematics. We estimate the model with two sets of control variables. The first set contains only the binary variables that indicate the degree in which the student is enrolled and gender (shown in Column 1), while the second set includes all control variables (shown in Columns 2, 3 and 4). See Appendix B for the full list of control variables. Among all control variables, we chose to show only the coefficient of the variable SEX due to its size and significance. Furthermore, in order to analyze the impact of non-cognitive skills on grades by gender, we also report the estimated coefficients of Equation (2) augmented with the interaction between SEX and non-cognitive skills (shown in Column 4). The estimates of all the coefficients are shown in Appendix C.

The first and second columns of Table 5 contain the estimates of the coefficients of the model in Equation (2) without the index of non-cognitive skills. The results of the first two columns can be summarised in two equations, one for each column. In both cases, the coefficient of TRACK provides a gross measure of the effect of studying the ST track on grades in mathematics, that is, a measure that disregards the fact that students who choose the ST track have, on average, a higher level of non-cognitive skills. The differences between the estimates in the first and second equations stem from the number of control variables included in the estimation.
The first equation shows the estimates of the model in Equation (2) when only the five binary variables of degree and the binary variable SEX are included. In this case, the estimates show that, on average, the grade of a female student is 0.75 points higher than the grade of her male counterpart. In turn, choosing the ST track increases the grade in mathematics by 1.24 points with respect to a student choosing the SS track. The estimates of the same model after including all the control variables are shown in the second equation. The estimates change moderately when family and academic characteristics are added as control variables. Specifically, the coefficient of the variable SEX decreases while the coefficient of the variable TRACK increases.
At any rate, after including all control variables, we find a substantial and statistically significant gross effect of studying the ST track in upper secondary education: an increase of 1.34 points associated with the choice of the ST track.
In the third column of Table 5, we show the estimates obtained when the index of non-cognitive skills is included as an explanatory variable. The results in the third column can be represented by a single equation. In this case, the effect on grades of choosing the ST track decreases considerably: choosing the ST track now increases the grade in mathematics by 0.64 points. In turn, an increase of one standard deviation in non-cognitive skills increases the grade in mathematics by 0.83 points.
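The drop in the TRACK coefficient between the second and third columns can be related to the coefficient of the index through the standard omitted-variable decomposition for OLS. The sketch below assumes that both columns are estimated on the same sample with the same controls. The symbol $\hat{\delta}$ is introduced here for the coefficient of $T_i$ in an auxiliary regression of $A_i$ on $T_i$ and the controls, which the paper does not report.

```latex
\hat{\alpha}_1^{\text{gross}} = \hat{\alpha}_1^{\text{net}} + \hat{\alpha}_2 \, \hat{\delta}
\quad \Longrightarrow \quad
\hat{\delta} = \frac{1.34 - 0.64}{0.83} \approx 0.84
```

Under these assumptions, students on the ST track would score roughly 0.84 standard deviations higher on the index than otherwise comparable SS students, and about 0.64 / 1.34 ≈ 48% of the gross effect would remain after controlling for the index. Both figures are approximate because they are computed from rounded coefficients.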
These results provide substantial evidence on the role played by mathematical background in university results and its relationship with non-cognitive skills. As expected, the results show that more mathematical background (namely, choosing the ST track) increases the grades in mathematics at university. However, the coefficients in the third column of Table 5 show that a sizeable portion of that increase is due to the superior non-cognitive skills of the students who choose the itinerary with more mathematical training.

In the fourth column of Table 5, we show coefficient estimates after the inclusion of the interaction between the variables SEX and NON-COGNITIVE. The results in the fourth column can be represented by a single equation. The negative sign of the interaction coefficient indicates that non-cognitive skills are somewhat less important for females in determining grades in mathematics, although the coefficient is not statistically significant.
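The role of the interaction term can be made explicit with a short derivation. The form below is an assumption based on the description of Column 4, not the equation printed in the paper. The symbol $\alpha_3$ and the name $S_i$ (equal to 1 for female students, as SEX is defined in Appendix B) are introduced here for illustration, and $S_i$ itself is taken to be one of the controls $x_{ij}$.

```latex
\begin{align}
m_i &= \alpha_0 + \alpha_1 T_i + \alpha_2 A_i + \alpha_3 (S_i \times A_i)
       + \sum_{j=1}^{k} \beta_j x_{ij} + u_i \\
\frac{\partial E(m_i)}{\partial A_i} &=
  \begin{cases}
    \alpha_2 & \text{if } S_i = 0 \text{ (males)} \\
    \alpha_2 + \alpha_3 & \text{if } S_i = 1 \text{ (females)}
  \end{cases}
\end{align}
```

A negative $\alpha_3$ therefore means that a one standard deviation increase in the index is associated with a smaller increase in grades for female students than for male students.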
The coefficient of the gender indicator is shown among all other control variables because it is quite large and significantly different from zero. In fact, such a large effect of the control variable SEX led us to estimate the model separately for male and female students. This result is related to recent research showing that gender affects math performance both directly and through the choice of itinerary in secondary education [26]. We show the estimates of the gender-specific regressions in Table 6.

Table 6 note: Equation (2) estimated by gender using data from ULE. *** p < 0.01, ** p < 0.05, * p < 0.1.
The coefficients were estimated with all control variables. We choose to show only the coefficients of the variables TRACK and NON-COGNITIVE. The coefficients of all control variables are shown in Appendix C.
The results by gender without the index of non-cognitive skills (first and third columns in Table 6) can be summarized in two equations, one for each gender. The coefficient of TRACK for females (1.476) is larger than for males (1.188). In other words, the gross effect on grades of choosing the ST track is larger for female students.
The results by gender with the index of non-cognitive skills (second and fourth columns in Table 6) can likewise be summarized in two equations. The inclusion of the index of non-cognitive skills as an explanatory variable produces noticeable changes in the estimates of the coefficients. On the one hand, the coefficient of TRACK decreases for both male and female students. This is an expected result, similar to the one in Table 5. On the other hand, the coefficient of non-cognitive skills is almost twice as large for males. Moreover, students with more non-cognitive skills, who tend to choose the ST track with its greater mathematical content, obtain higher grades, which indicates that non-cognitive skills are a mechanism through which tracking affects grades in mathematics. In particular, this effect is much stronger for male students.
Finally, we test the hypothesis that students self-select into the ST track on the basis of non-cognitive skills. In Table 7, we show the estimates of two coefficients (SEX and NON-COGNITIVE) of a Probit model of the choice of the secondary education track. The coefficients of all control variables are shown in Appendix C. Non-cognitive skills clearly increase the probability of choosing the ST track. Furthermore, female students are more likely to choose the SS track.
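The self-selection test can be illustrated with a small probit fit. This is a sketch on simulated data, not the estimation behind Table 7. It includes only an intercept and a single standardized skill index, whereas the paper's model also includes SEX and the other controls. The data-generating coefficients (-0.8 and 0.6) are arbitrary, and `fit_probit` is a hypothetical helper written for this example using Fisher scoring.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_probit(x, y, iterations=25):
    """Fit P(y = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            p = min(max(norm_cdf(eta), 1e-12), 1.0 - 1e-12)  # guard against 0 and 1
            d = norm_pdf(eta)
            score = d * (yi - p) / (p * (1.0 - p))  # d log-likelihood / d eta
            weight = d * d / (p * (1.0 - p))        # expected information for eta
            g0 += score
            g1 += score * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        # Solve the 2x2 system (information matrix) * step = score vector.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated students: a standardized skill index and a latent-variable track choice.
rng = random.Random(0)
skills = [rng.gauss(0.0, 1.0) for _ in range(2000)]
chose_st = [1 if -0.8 + 0.6 * a + rng.gauss(0.0, 1.0) > 0 else 0 for a in skills]
b0, b1 = fit_probit(skills, chose_st)
```

A positive estimated slope `b1` corresponds to the finding that higher non-cognitive skills raise the probability of choosing the ST track.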

Robustness Checks
We further examine the robustness of our results using Principal Component Analysis (PCA) with Varimax (orthogonal) rotation, which exploits all the information included in the survey regarding attitudes and feelings towards mathematics (see Appendix A). Recall that our Index of Non-Cognitive Mathematical Skills basically includes information regarding the statistically significant coefficients shown in Table 2. PCA allows us to generate three new items: self-assessment of mathematical capacity (F1), positive attitudes towards mathematics (F2) and negative attitudes towards mathematics (F3). Note that higher values of F3 imply that the negative attitude towards mathematics is weaker. These variables measure how eager and motivated students are to learn mathematics. Basically, F1 replicates the information contained in the Index of Non-Cognitive Mathematical Skills presented above, whereas F2 and F3 cover a broader range of non-cognitive skills related to mathematics.

Table 8 shows regression results when we replicate the specification presented in the last column of Table 5 using the variables generated by PCA instead of the Index of Non-Cognitive Skills discussed above. The main findings of the paper are consistent with the analysis of non-cognitive skills using PCA. The regression estimates using F1 as an explanatory variable are strikingly similar to the ones obtained using our proposed index of non-cognitive skills. The use of F2 and F3 as explanatory variables leads to an increase in the estimated effect of the secondary education track, to the point that, with F3 as an explanatory variable, the effect of the secondary education track is close in value to the gross effect reported in Table 5 (without controlling for non-cognitive skills). Hence, the results are consistent with those reported in the previous section and robust to the methodological approach used to carry out the analysis.

Discussion
In this paper, we find that choosing the ST track in secondary education has a positive effect on performance in mathematics at the university level. This result is consistent with previous findings in the literature, linking university performance with the student's background to the extent that the choice of the ST track improves the student's mathematical background. For example, good results in exams required for university entrance (SAT, A levels) have been found to have a positive effect on performance in mathematics at university [10,12]. Additionally, those students that have taken calculus in high school and have obtained high scores in a basic math test show a better performance in mathematics at university, while being required to take remedial math in university has a negative effect [10].
Our results are also in line with literature that considers the type of high school attended [14] or the itinerary chosen in secondary education [13] as a factor explaining the student's university performance. In fact, a positive effect on mathematical performance in university has been found for students who attend a scientifically oriented high school in Italy [14] or who choose the ST track in Spain [13]. Therefore, it seems that the training provided by certain secondary education programs has a positive effect on the student's performance in mathematics at the university level. However, the effect of the high school track can be the consequence of specific training but also of non-cognitive skills, since the best students seem to choose the track with more mathematical content [13]. In other words, the training effect of attending a specific high school or program is blurred by the different non-cognitive skills of the students who choose that high school or program. In this study, we find evidence of students self-selecting into the ST track, since non-cognitive skills increase the probability of choosing the ST track.
In order to deal with this problem in regression analysis, it would be necessary to include a measure of non-cognitive skills. By doing so, the effect of choosing the ST track could be interpreted as an effect of training. Including pre-university academic results as an explanatory variable alongside the high school or track choice can be seen as a move in the direction of controlling for student's non-cognitive skills [13,14]. However, pre-university academic results are a crude measure of non-cognitive skills since they are likely to be the result, besides non-cognitive skills, of previous training [12]. The direction and size of the bias created by using pre-university results as an explanatory variable instead of a more refined measure of non-cognitive skills has been previously analyzed [13].
In this sense, our paper goes one step further in trying to disentangle the effects of non-cognitive skills and training. For that purpose, we first extend previous research related to students' mathematical ability [22] by asking a battery of questions on their feelings about mathematical work and their perception of their own mathematical skills [24]. Second, we combine the information obtained in a single variable that is included in the regression as a measure of non-cognitive skills. As a result, we expect to improve the measurement of the training component of choosing the ST track.
The inclusion of a measure of non-cognitive skills as an explanatory variable in the regression produces a substantial reduction in the effect of choosing the ST track in secondary education. However, roughly half of the effect estimated before remains after the inclusion of the non-cognitive skills variable. This result is qualitatively similar to the inclusion of pre-university academic results as an explanatory variable in the regression [13].
Our robustness-check analysis shows that the result of including our index of non-cognitive skills is consistent with the results obtained using items developed by Principal Component Analysis.
A caveat should be mentioned again about the finding that choosing the ST track has a positive effect on mathematical results at the university level even after controlling for non-cognitive skills. The answers to the questions in our survey can be affected by the previous training and effort of students. However, as discussed in the section on methodology, the direction of the bias indicates that our estimations can be interpreted as a lower bound [25].
Additionally, our results show that, on average, female students obtain better grades than males. Similar results have been reported in the literature related to our paper [12][13][14], although the opposite result has been found as well [10]. In this regard, recent literature shows that these contrasting results are conditioned by the direct gender effect on math performance as well as the indirect effect implied by the different secondary school choices of males and females [26].
The effect of being a female student is almost as large as the effect of choosing the ST track. This result prodded us to carry out a more detailed analysis that, to the best of our knowledge, is absent in the previous literature related to our study. For that purpose, we include an interaction between the binary variable SEX and the index of non-cognitive skills and run different regressions for male and female students. Both exercises show that non-cognitive skills have a larger effect on grades for males while choosing the ST track has a larger effect on grades for female students. Therefore, our results depict a picture of females relying more on the training component of choosing the ST track for increasing their grades while males rely on non-cognitive skills to get better grades.

Institutional Review Board Statement: It is our understanding that our research was conducted in accordance with the ethics requirements of our universities and is exempt from research ethics committee oversight. The reason is that the subjects in our survey cannot be identified in any way or exposed to risks, liabilities, or reputational damage.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data and code used in this study can be found at https://t.ly/q1uu (accessed on 28 October 2021).
Acknowledgments: Javier Valbuena would like to thank the support from the project S23_20R (Grupo de Investigación de Referencia en Economía Pública, Gobierno de Aragón).

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A Survey
Personal characteristics and socioeconomic status of students

Employment status of parents
Tick the alternative which better defines the employment status of your parents when you were in Primary and Secondary school

Appendix B
List of control variables used in the analysis

Binary variables denoting the degree in which the student is enrolled: International Trade, Marketing, Finance, Business 1 and Business 2. There are five binary variables that take the value of 1 when the student enrolls in the degree and 0 otherwise.
INTACT is a binary variable that takes the value of 1 when the student lived with both parents in Secondary School and 0 otherwise.
SIBLINGS is a binary variable that takes the value of 1 when the student has siblings and 0 otherwise.
BOOKS is a binary variable that takes the value of 1 when there were more than 100 books at home and 0 otherwise.
MOTHER ED is a binary variable that takes the value of 1 when the mother graduated from university and 0 otherwise.
FATHER ED is a binary variable that takes the value of 1 when the father graduated from university and 0 otherwise.
CHARTER is a binary variable that takes the value of 1 when the student attended a charter school and 0 otherwise.
REPEAT is a binary variable that takes the value of 1 when the student had to repeat a year during Secondary Education and 0 otherwise.
SCHOLARSHIP is a binary variable that takes the value of 1 when the student has applied for financial aid and 0 otherwise.
OTHERUNIVERSITY is a binary variable that takes the value of 1 when the student wanted to attend a different university and 0 otherwise.
OTHERDEGREE is a binary variable that takes the value of 1 when the student wanted to enrol in a different degree and 0 otherwise.
SEX is a binary variable that takes the value of 1 for female students and 0 for males.

Appendix C
Tables with all coefficients in the model.