1. Introduction
National-scale questionnaires are used in many countries to capture information about student experiences of tertiary education and to inform potential future stakeholders [1]. These include the “student satisfaction” approach of the UK’s National Student Survey (NSS) and, as an alternative, student “engagement” questionnaires such as the National Survey of Student Engagement (NSSE; Indiana University, Indiana, USA). The NSS provides an end-user view of undergraduate experiences in terms of agreement with positive statements about the student experience. Around 60 percent of predominantly final-year undergraduates return the survey. The survey already excludes students who have left the course before their third undergraduate year, and there are potential biases among the participants; for example, females are more likely to respond [2]. Student surveys have been criticised as tools with insufficient practical value [3], and there remains a need for greater contextualisation of the results in order to draw value from them [4]. Previous analyses have highlighted subject differences, in particular how different types of feedback were valued differently by students in different subject disciplines. This work pointed to a need for greater depth of analysis, with more robust techniques to explore what best predicts the overall satisfaction of respondents [4].
Attempts to decipher the essence of the notion of “satisfaction” [5] reveal general agreement that the concept is complex and multidimensional, with no fully accepted general measurement scale for Higher Education [6]. It has been suggested that student satisfaction measures relate well to the quality of learning experiences [7,8], but more recent evidence suggests that engagement measures are much better surrogate measures for the “quality” of education and that satisfaction metrics are largely unrelated to educational quality/gains [9]. It is noteworthy that there is ongoing debate around the term “satisfaction” in the context of students’ ratings of their perceptions of teaching quality (and other elements of their tertiary-level educational experiences); for a useful insight into this discussion, refer to Richardson’s 2005 review [8]. The term “satisfaction” is used in the current article to distinguish the NSS approach from “engagement” surveys such as the NSSE used by US institutions (www.nsse.iub.ed).
With the use of the UK’s NSS scores in calculating “Good University” metrics and in the new “Key Information Sets” (or “KIS” data; http://unistats.direct.gov.uk/) provided to guide the course choices of prospective higher education students, there seems to be greater value than ever placed on student ratings of educational experiences. The value of nationally based satisfaction surveys remains in dispute [9,10] and there continues to be debate about the best ways to respond to national-level student evaluations [11]. It is therefore timely to use more sophisticated approaches to data analysis, and bespoke knowledge of differences between subjects and institutions, to contextualise NSS outcomes.
It is well established that a variety of approaches can be used to interpret national datasets [12,13,14]. Thomas and Galambos [15] applied data-mining approaches to identify patterns in such data sets. The CHAID (chi-squared automatic interaction detector) algorithm they used showed the strengths of this type of analytical approach and identified key factors associated with US students’ satisfaction ratings. For example, intellectual stimulation was perceived as very important by students who recorded high intellectual growth, but this was not true of other subgroups; because it was not a significant factor in the overall regression model, it would normally have been interpreted as unimportant using more “traditional” statistical approaches. Other significant factors unearthed by this approach included “academic experiences”, which differentiated between more and less satisfied students, while “non-academic” aspects of their experiences were most important to “less academically engaged” students (for definitions see [15]).
The current study uses robust, readily available modeling techniques to contextualise the quantitative outcomes of NSS ratings for science and engineering subjects and begins to explore what a “level playing field” means in terms of satisfaction metrics. The study highlights differences between subjects and institutional types as a starting point for exploring the comparability of NSS ratings. It also determines how the individual survey questions differ in predicting overall satisfaction, to identify the survey items most influential on respondents’ overall satisfaction. The modeling technique used is robust, has yet to be applied to mass educational datasets, and accounts for variability between the factors that shape the “playing field” of NSS metrics. A short introduction to the statistical technique, which is available in licence-free software, is provided.
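The Results report predictor importance as the percentage increase in mean squared error under permutation, which is characteristic of random-forest models. As an illustrative sketch only (the NSS data and the study’s actual software are not reproduced here; the data below are synthetic, and the dependence of Q22 on Q15 and Q1 is invented to echo the pattern reported later), a %IncMSE-style calculation can be written as:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for NSS returns: rows are course-level records,
# columns are percentage agreement with items Q1-Q21.
n = 400
X = rng.uniform(60, 95, size=(n, 21))
# Invented relationship: overall satisfaction (Q22) driven mainly by
# Q15 (organisation, column 14) and Q1 (teaching, column 0).
y = 0.6 * X[:, 14] + 0.3 * X[:, 0] + rng.normal(0, 2, size=n)

forest = RandomForestRegressor(n_estimators=500, random_state=0)
forest.fit(X, y)

# %IncMSE-style permutation importance: shuffle one predictor at a
# time and record the resulting percentage increase in MSE.
base_mse = np.mean((forest.predict(X) - y) ** 2)
importance = np.empty(21)
for j in range(21):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    perm_mse = np.mean((forest.predict(Xp) - y) ** 2)
    importance[j] = 100 * (perm_mse - base_mse) / base_mse

print("Most important item: Q%d" % (np.argmax(importance) + 1))
```

Reference implementations typically compute this on out-of-bag samples rather than the in-sample predictions used here for brevity; the ranking of predictors is what carries over to the tables below.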
3. Results
In general terms, levels of overall satisfaction within subject areas were relatively constant across the surveys (
Table 1), with a trend of increasing numbers of returns from Higher Education Institutions (HEIs) over time.
Table 1.
Mean percentages of overall satisfaction scores and number of HEIs returned for each subject grouping for surveys scored as 4 or 5 (indicating satisfaction).
![Education 03 00193 i001]()
Scatterplots revealed clear, consistent differences between subject groupings, with one cluster of subjects (Figure 1) receiving overall satisfaction scores 5–10% higher than the rest. The higher-scoring group comprised Chemistry, Mathematics and Statistics, Physical Geography and Environmental Science, Human Geography, and Biology; the lower-scoring group comprised Electronic and Electrical Engineering, Others in Subjects Allied to Medicine, Mechanical, Production and Manufacturing Engineering, and Computer Science.
Figure 1.
Satisfaction scores from 2007 (Q22) plotted against the satisfaction scores for 2008 and 2009 to highlight consistency of patterns through the clusters of subjects that have lower (four subject groupings) or higher (five subject groupings) levels of satisfaction.
The percentage of explained variance was consistently large (when considered in light of the many subjects included in this analysis) and very similar for all models/years (for 2007, Model 1: 72.3%, Model 2: 72.2%; for 2008, Model 1: 74.3%, Model 2: 74.2%; for 2009, Model 1: 79.9%, Model 2: 79.9%; note that Model 3, which explores university groupings, is examined further in the next section). The inclusion of subject as a predictor (Model 2) did not result in larger values for the explained variance, and subject was never an important predictor. The similarities between these models suggest that robust conclusions can be drawn from the pattern of importance statistics. The model structure was very similar across all six models (Table 2), with Q15 clearly the most important predictor. It is noteworthy that Q15 asks students whether their course is “well organised and running smoothly” and is therefore double-barrelled, covering many potential issues and problems.
Questions are often combined into their “dimensions” for a more general insight into the survey’s outcomes; see [14,21]. From Table 3 it is clear that the “Teaching” dimension (Q1–Q4 considered together) is the most important predictive dimension for Q22 in the science subjects investigated, followed by questions about “Organisation” (unsurprising given the prominence of Q15) and “Support”. Scores associated with feedback issues were the lowest, and thus the poorest predictors of students’ overall satisfaction (but note the differences between subjects in [4]), as were many of the other questionnaire items associated with assessment. Although their importance remains quite low, there is evidence that the “Resources” questions are increasing in prominence; note also the differences between the two areas of assessment that are often considered separately [21].
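The dimension-level scores in Table 3 are aggregates of per-question importances. A minimal sketch of that aggregation follows, using invented importance values (chosen only to echo the pattern described above, with Q15 the single strongest item but “Teaching” the strongest dimension) and an assumed question-to-dimension mapping (only Q1–Q4 = “Teaching” is stated explicitly in the text):

```python
# Hypothetical per-question importance scores (% increase in MSE);
# these values are invented, not taken from the study's tables.
importance = {
    "Q1": 40.0, "Q2": 34.0, "Q3": 30.0, "Q4": 38.0,           # Teaching
    "Q5": 8.0, "Q6": 7.0, "Q7": 5.0, "Q8": 4.0, "Q9": 6.0,    # Assessment
    "Q10": 20.0, "Q11": 18.0, "Q12": 22.0,                    # Support
    "Q13": 15.0, "Q14": 18.0, "Q15": 62.0,                    # Organisation
    "Q16": 10.0, "Q17": 9.0, "Q18": 11.0,                     # Resources
    "Q19": 12.0, "Q20": 13.0, "Q21": 11.0,                    # Personal dev.
}

# Assumed question-to-dimension mapping for the thematic areas named
# in the text.
dimensions = {
    "Teaching": ["Q1", "Q2", "Q3", "Q4"],
    "Assessment": ["Q5", "Q6", "Q7", "Q8", "Q9"],
    "Support": ["Q10", "Q11", "Q12"],
    "Organisation": ["Q13", "Q14", "Q15"],
    "Resources": ["Q16", "Q17", "Q18"],
    "Personal development": ["Q19", "Q20", "Q21"],
}

# Mean importance per dimension, ranked from strongest to weakest.
dim_importance = {
    name: sum(importance[q] for q in qs) / len(qs)
    for name, qs in dimensions.items()
}
ranking = sorted(dim_importance, key=dim_importance.get, reverse=True)
print(ranking)
```

With these invented numbers the ranking reproduces the qualitative pattern in the text: “Teaching” leads despite Q15 being the single most important item, because Q13 and Q14 drag the “Organisation” mean down.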
Table 2.
Strength of National Student Survey (NSS) questions as predictors of overall satisfaction (2007–2009) for the three models (1, 2 and 3). Higher numbers (percentage increase in mean squared error) indicate greater influence of the first 21 questionnaire items on Q22. “Subject” and “Group” are included, but these factors were not as influential as the most prominent questions (e.g., Q15, Q4 and Q1).
![Education 03 00193 i002]()
All predicted satisfaction values were highly correlated with the actual values (r > 0.85 in all cases), and residual means were always close to 0, indicating no systematic under- or over-prediction of satisfaction scores; the large correlation coefficients confirmed that actual and predicted satisfaction scores remained in approximately the same rank order. Given the similarities in the models, it is unsurprising that between-year predictions were good, with correlations between actual and predicted values of 0.86 or more. The between-year residuals had slightly larger means and were always negative, i.e., the models tended to over-predict overall satisfaction, but only by 1–2%. Across all three models over three years there was a striking similarity between years/models (see Table 2), and a model that combines all years and excludes subject/group is shown in Table 4.
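The between-year check described above amounts to fitting on one year’s course-level returns, predicting the next year, and then inspecting the correlation and the mean residual. A sketch under stated assumptions (synthetic data, an assumed random-forest regressor, and invented effect sizes; sample sizes merely echo the n values in Table 5):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

def make_year(n_courses, rng):
    """Synthetic course-level returns: Q1-Q21 agreement (%) and Q22."""
    X = rng.uniform(60, 95, size=(n_courses, 21))
    # Invented dependence of Q22 on Q15 (col 14) and Q1 (col 0).
    q22 = 0.6 * X[:, 14] + 0.3 * X[:, 0] + rng.normal(0, 2, size=n_courses)
    return X, q22

X_2007, y_2007 = make_year(405, rng)
X_2008, y_2008 = make_year(443, rng)

# Fit on one year, predict the next.
model = RandomForestRegressor(n_estimators=500, random_state=0)
model.fit(X_2007, y_2007)
pred_2008 = model.predict(X_2008)

# Correlation checks rank agreement; the mean residual checks for
# systematic over- or under-prediction.
r = np.corrcoef(pred_2008, y_2008)[0, 1]
mean_residual = float(np.mean(y_2008 - pred_2008))
```

A mean residual near zero with a high correlation is exactly the pattern the study reports; a consistently negative mean residual would indicate over-prediction, as observed for the between-year models.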
Table 3.
How well NSS question thematic areas predict overall satisfaction. Higher values indicate greater importance in predicting Q22; “Teaching” and “Organisation” were the best predictors of overall satisfaction across all the surveys. “Assessment” is shown both as an overall measure and separately as the commonly distinguished sub-themes of “Fairness” and “Feedback”.
![Education 03 00193 i003]()
Table 4.
Overall effectiveness of NSS questionnaire items (Q1–Q21) as predictors of Overall Satisfaction for all years combined. Predictors are presented hierarchically with the best predictors at the top, i.e., those with the highest percentage increase in Mean Squared Error (%MSE), which may be greater than 100%.
![Education 03 00193 i004]()
Table 5 shows mean satisfaction scores across all subjects and university groupings over the three years explored (Table 5a–c). Subjects are ordered from left (lowest scores) to right (highest scores) after accounting for all university groups. Accounting for subjects, the university groupings are shown in ascending order of satisfaction, with the “1994 Group” attaining the highest overall satisfaction, followed by the Russell Group; the Million+ and Alliance groups had the lowest satisfaction responses. In terms of subjects, Computer Science, the Engineering subjects and Subjects Allied to Medicine were consistently the least satisfied, while Chemistry, Mathematics and Statistics, and the Geographical subjects were consistently the most satisfied.
Table 5.
Mean overall satisfaction scores (percentage responses indicating overall satisfaction) for subject and university groupings. The table is ordered by row (lowest scores at the top) and column (lowest scores to the left) means for the three years of data (a–c).
2007 data

| Group | n | Comp Sci | Allied Med | MPM Eng | Elec Eng | Hum Geog | EGS | Maths | Biol | Chem | All |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Million+ | 46 | 72.2 | 77.6 | 80.5 | 66.2 | N/A | 86.3 | 71.0 | 86.4 | N/A | 75.6 |
| Alliance | 75 | 72.8 | 79.7 | 77.7 | 79.5 | 89.4 | 88.5 | 92.3 | 87.0 | 74.0 | 80.7 |
| None | 117 | 80.6 | 75.7 | 76.3 | 85.3 | 91.1 | 87.7 | 86.9 | 84.6 | 89.0 | 83.0 |
| Russell | 117 | 87.3 | 83.1 | 86.0 | 85.2 | 85.2 | 87.6 | 86.8 | 90.7 | 91.2 | 87.2 |
| 1994 | 64 | 81.5 | 80.0 | 85.5 | 86.2 | 86.7 | 91.1 | 93.3 | 93.1 | 91.2 | 88.9 |
| All | 405 | 78.2 | 78.9 | 80.6 | 80.8 | 87.6 | 88.3 | 88.4 | 88.7 | 89.3 | 84.0 |
| n | | 92 | 48 | 33 | 41 | 35 | 44 | 49 | 53 | 25 | |
2008 data

| Group | n | Mech Eng | Comp Sci | Allied Med | Elec Eng | Biol | Maths | EGS | Hum Geog | Chem | All |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Million+ | 46 | 57.0 | 76.1 | 75.2 | 77.3 | 79.0 | 95.0 | 85.0 | N/A | N/A | 76.6 |
| Alliance | 89 | 81.6 | 75.4 | 83.0 | 78.7 | 87.3 | 89.3 | 90.8 | 91.5 | 93.0 | 83.2 |
| None | 108 | 77.8 | 82.5 | 80.7 | 81.4 | 85.3 | 88.3 | 91.0 | 91.9 | 91.5 | 84.9 |
| Russell | 129 | 81.1 | 85.1 | 89.3 | 86.4 | 90.9 | 87.0 | 89.6 | 87.8 | 90.4 | 87.4 |
| 1994 | 71 | 84.0 | 83.2 | 89.3 | 89.4 | 90.0 | 88.5 | 91.6 | 91.9 | 92.0 | 88.8 |
| All | 443 | 78.6 | 80.1 | 81.5 | 83.0 | 87.8 | 88.2 | 90.0 | 90.4 | 91.3 | 85.4 |
| n | | 48 | 93 | 37 | 42 | 44 | 39 | 56 | 50 | 34 | |
2009 data

| Group | n | Mech Eng | Comp Sci | Allied Med | Elec Eng | Biol | Maths | EGS | Hum Geog | Chem | All |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Million+ | 85 | 72.6 | 71.2 | 74.2 | 71.9 | 75.9 | 92.0 | 84.3 | 91.1 | N/A | 75.7 |
| Alliance | 117 | 75.9 | 71.3 | 79.8 | 74.3 | 84.1 | 85.8 | 84.8 | 87.9 | 89.1 | 80.0 |
| None | 151 | 79.0 | 82.2 | 81.7 | 80.7 | 87.1 | 90.4 | 89.6 | 86.3 | 89.3 | 84.2 |
| Russell | 139 | 86.8 | 85.7 | 83.7 | 86.1 | 91.3 | 85.6 | 88.7 | 84.7 | 90.7 | 87.3 |
| 1994 | 84 | 83.4 | 88.0 | 89.8 | 90.0 | 89.7 | 90.3 | 88.6 | 91.5 | 92.7 | 88.9 |
| All | 576 | 78.7 | 79.7 | 80.6 | 80.7 | 86.4 | 87.6 | 87.6 | 87.7 | 90.5 | 84.5 |
| n | | 65 | 115 | 56 | 57 | 63 | 48 | 69 | 61 | 42 | |
Inclusion of an overall satisfaction measure provides an opportunity to explore how this holistic assessment can be predicted by the elemental parts surveyed (Q1–Q21). It was anticipated that at least some universities/subjects would over- (or under-) perform in such models, given the large number of institutions in the analysis. Since differences in the subject composition of each institution will contribute to the model’s outcome, the findings are summarised together with subject groupings in Table 6.
Table 6.
Prediction of Q22 (Overall Satisfaction) by the survey’s main questions (Q1–Q21) for (a) university groupings and (b) subject groupings. Shaded numbers indicate groupings/subjects that perform better than the model predicts. Higher numbers (positive or negative) indicate greater deviation from the model prediction.
![Education 03 00193 i005]()
In general, Russell Group institutions perform 3–4% better than the model predicts, whereas the Million+ group performs 1–5% worse. In terms of subjects, Computer Sciences and Subjects Allied to Medicine perform slightly worse than predicted (though generally by <1%), whereas Maths & Stats and Chemistry always do better than predicted (by 1–3%).
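The over/under-performance quoted above is simply the mean of (actual − predicted) overall satisfaction within each mission group. A minimal sketch with invented institution-level residuals (the real values come from the study’s models, not from this illustration):

```python
# Invented (actual - predicted) overall-satisfaction residuals, in
# percentage points, tagged with each institution's mission group.
records = [
    ("Russell", 3.5), ("Russell", 4.1), ("Russell", 3.0),
    ("Million+", -2.0), ("Million+", -4.5), ("Million+", -1.3),
    ("1994", 0.8), ("1994", 1.4),
    ("Alliance", -0.6), ("Alliance", 0.2),
]

# Collect residuals per group, then average them; a positive mean
# means the group out-performs the model's prediction.
group_residual = {}
for group, resid in records:
    group_residual.setdefault(group, []).append(resid)

group_mean = {g: sum(v) / len(v) for g, v in group_residual.items()}
print(group_mean)
```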