Predictive Modeling of VO2max Based on 20 m Shuttle Run Test for Young Healthy People

Abstract: This study presents mathematical models for predicting VO2max based on a 20 m shuttle run and anthropometric parameters. The research was conducted with data provided by 308 young healthy people (aged 20.6 ± 1.6). The research group includes 154 females (aged 20.3 ± 1.2) and 154 males (aged 20.8 ± 1.8). Twenty-four variables were used to build the models, including one dependent variable and 23 independent variables. The predictive methods of analysis include the classical model of ordinary least squares (OLS) regression, regularized methods such as ridge regression and Lasso regression, and artificial neural networks such as the multilayer perceptron (MLP) and radial basis function (RBF) network. All models were calculated in R software (version 3.5.0, R Foundation for Statistical Computing, Vienna, Austria). The study also involved variable selection methods (Lasso and stepwise regressions) to identify optimum predictors for the analysed study group. In order to compare and choose the best model, leave-one-out cross-validation (LOOCV) was used. The paper presents three types of models: for females, males and the whole group. The analysis revealed that the models for females (RMSE_CV = 4.07 mL·kg⁻¹·min⁻¹) are characterised by a smaller error than the models for males (RMSE_CV = 5.30 mL·kg⁻¹·min⁻¹). The model accounting for sex generated an error of RMSE_CV = 4.78 mL·kg⁻¹·min⁻¹.


Introduction
Cardiorespiratory fitness (CRF) is a strong health indicator [1][2][3][4][5][6]. A low CRF level entails the risk of adverse cardiovascular events [2]. In adults, CRF is regarded as a predictor of general mortality and it is negatively correlated with hypertension and diabetes [7][8][9][10][11][12][13]. Low CRF in young people is related, among other things, to obesity, which increases the risk of cardiovascular diseases [14] and metabolic syndrome characteristics [15,16]. Studies have also revealed that improved CRF leads to better health outcomes regardless of body mass index (BMI) [9]. A higher physical fitness index in adults significantly reduces the metabolic risk for a particular body fat level [17]. Preventive measures against cardiovascular diseases which are intended to optimise CRF are also implemented in low-risk groups (10% risk of ischaemic heart disease occurrence within the next 10 years) [18][19][20]. The authors of many studies emphasise the significance of lifelong physical activity to improve or maintain the appropriate level of CRF [19,21,22]. In early adulthood, a high CRF level provides the most benefit related to survival. Maximal oxygen consumption (VO2max) is the main criterion measure of CRF [15]. A direct measurement of VO2max is regarded as the best indicator of aerobic fitness [23,24]. Since the direct measurement of VO2max is quite complex, maximal oxygen uptake tends to be predicted from indirect measurements, using predictive models [23,25,26]. A 20 m shuttle run test (20 m SRT) is often used to measure cardiovascular fitness [22,23,27,28]. Therefore, it seems reasonable to model the index using various mathematical methods. A predictive model may help to identify the level of CRF without costly equipment and specialised research time. The prediction of VO2max may become an important element of health monitoring and fitness improvement in young healthy people.
In the literature, there are many papers that focus on the prediction of this indicator using, among other factors, the results of resistance tests, anthropometric parameters or physical activity indicators [15,23,29]. VO2max predictive models may therefore be classified as either exercise or non-exercise models. Exercise models make predictions based on the result of an endurance test with maximal or submaximal intensity [15,23,30-34]. In the literature, there are also prediction models based on ergometric tests [35,36]. The most popular predictors include age, sex and BMI [25]. Most of the published models are based on multiple regression or machine learning methods.
In this paper, the authors present different models to predict VO2max based on the result of the 20 m shuttle run test and anthropometric parameters. The aim of the work is to determine the optimal predictive model to estimate VO2max for young healthy people (students). The models may be used, for example, by a physical education teacher for CRF monitoring. To the best of our knowledge, this is the first study of such a large group (n = 308) using the direct measurement of oxygen uptake. Data were collected at five large universities in various parts of Poland.

Participants
The study was cross-sectional. It covered a group of 1097 healthy students (aged 19.7 ± 1.4; 487 males aged 20.0 ± 1.6 and 610 females aged 19.5 ± 1.3) studying in five large academic centres in Poland (Krosno State College, University of Rzeszow, Maria Curie Skłodowska University in Lublin, Cracow University of Technology and Poznań University of Life Sciences). The study group was selected in two stages. In the first stage, randomly selected students at the bachelor's level participated in a 20 m SRT. The median distance covered was Me = 880 m (44 sections) for females and Me = 1520 m (76 sections) for males. People who completed the 20 m SRT both below and above the median distance were selected for the second stage to ensure the highest possible differentiation of results. Finally, 154 women (aged 20.3 ± 1.2) and 154 men (aged 20.8 ± 1.8) participated in the last stage. They performed a 20 m SRT during which VO2max was measured using a portable gas analyser (K4b2). Before the cardiac stress test, a skilled study team made anthropometric measurements. Body height was measured with a stadiometer (SECA 213, Hamburg, Germany) with an accuracy of up to 1 mm. Waist circumference was measured with non-elastic flexible tape according to two protocols: the WHO (World Health Organization) STEPS protocol (approximate midpoint between the lower margin of the last palpable rib and the top of the iliac crest) and the US NIH (United States National Institutes of Health) protocol (measurement at the top of the iliac crest) [37]. The hip circumference was measured around the widest portion of the buttocks. Body weight and body components (Fat: body fat percentage, FFM: fat-free mass, TBW: total body water) were measured by means of a Tanita TBF 300 Body Composition Analyzer (Tokyo, Japan). Table 1 presents the somatic indexes used, together with their equations.
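As an illustration of the kind of equations listed in Table 1, the sketch below computes several somatic indexes from the raw measurements. The formulas are commonly used textbook definitions and are assumptions here: they may differ from those in Table 1, in particular the BSA variant (the Mosteller formula is used) and the BAI formula. The function and argument names are hypothetical.

```python
import math

def somatic_indexes(weight_kg, height_cm, waist_cm, hip_cm, fat_percent):
    """Return commonly used somatic indexes (assumed definitions, see lead-in)."""
    height_m = height_cm / 100.0
    fat_mass_kg = weight_kg * fat_percent / 100.0   # fat mass from body fat percentage
    ffm_kg = weight_kg - fat_mass_kg                # fat-free mass
    return {
        "BMI": weight_kg / height_m ** 2,           # body mass index [kg/m^2]
        "WHR": waist_cm / hip_cm,                   # waist-hip ratio
        "WHtR": waist_cm / height_cm,               # waist to height ratio
        "FMI": fat_mass_kg / height_m ** 2,         # fat mass index [kg/m^2]
        "FFMI": ffm_kg / height_m ** 2,             # fat-free mass index [kg/m^2]
        "BAI": hip_cm / height_m ** 1.5 - 18.0,     # body adiposity index (assumed formula)
        "BSA": math.sqrt(height_cm * weight_kg / 3600.0),  # Mosteller formula (assumption) [m^2]
    }

# Hypothetical participant, values chosen only for illustration.
for name, value in somatic_indexes(70.0, 175.0, 80.0, 100.0, 20.0).items():
    print(f"{name}: {value:.2f}")
```

With these definitions FMI and FFMI add up to BMI, which gives a quick consistency check on the inputs.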
Generally, the measured anthropometric parameters (body height and weight, waist circumference, waist-hip ratio (WHR), waist to height ratio (WHtR), body mass index (BMI) and body adiposity index (BAI)) comply with the mean values of these parameters in the population of Polish students presented in studies by other authors [38,39].
A description of the variables (e.g., VO2max level, age, heart rate and anthropometric parameters) together with the basic statistics (mean value and standard deviation) is given in Table 2. The models were calculated on the basis of 23 independent variables (x0-x22) and one dependent variable (y). Independent variables include: sex (x0), parameters of the 20 m shuttle run (x1-x4), age (x5), anthropometric features (x6-x10), somatic indexes (x11-x19), and body components (x20-x22). The VO2max result (y) obtained during the 20 m shuttle run test with a telemetry gas exchange system (Cosmed K4b2) was the dependent variable.
HRmax: maximal heart rate, HRR: recovery heart rate, WHtR: waist to height ratio, WHR: waist-hip ratio, BMI: body mass index, FMI: fat mass index, FFMI: fat-free mass index, BAI: body adiposity index, BSA: body surface area, Fat: body fat percentage, FFM: fat-free mass, TBW: total body water.

Field Assessment of VO2max
The 20 m SRT was conducted according to established procedures [40]. The speed in the first minute was 8.5 km/h and it increased by 0.5 km/h every minute. VO2max was measured during the 20 m shuttle run test using a portable gas analyzer K4b2 (Cosmed, Rome, Italy). During the graded test, inspired and expired gases were continuously monitored, breath by breath. Prior to testing, the Cosmed K4b2 was warmed up for a minimum of 20 min. Following the warm-up period, the K4b2 was calibrated with standard gases in accordance with the manufacturer's specifications. Heart rate (HR, beats/min) was continuously monitored (Polar coded transmitter) with a focus on the maximal heart rate (HRmax) and the recovery heart rate in the first (HRR1) and fourth (HRR4) minutes after the test. The criteria for attaining VO2max included any two of the following: volitional exhaustion; attainment of at least 90% of the age-predicted HRmax (220 beats/min minus the age of the subject in years); a respiratory exchange ratio (RER) equal to or greater than 1.10; and a levelling off of VO2 despite an increase in intensity [41,42].
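The protocol parameters and the attainment criteria described above can be written down as a short sketch. The speed schedule, the 20 m section length, the 90% threshold, the 220 minus age formula and the RER limit of 1.10 come from the text; the function and argument names are hypothetical, and the boolean inputs (exhaustion, VO2 plateau) are assumed to have been judged by the test operator.

```python
def stage_speed_kmh(minute):
    """Running speed in a given minute of the test (1-based): 8.5 km/h, +0.5 km/h each minute."""
    if minute < 1:
        raise ValueError("minute must be >= 1")
    return 8.5 + 0.5 * (minute - 1)

def distance_m(sections):
    """Distance covered, given the number of completed 20 m sections."""
    return 20 * sections

def vo2max_attained(volitional_exhaustion, hr_max, age_years, rer, vo2_plateau):
    """True if at least two of the four criteria listed in the text are met."""
    criteria = [
        volitional_exhaustion,                    # subject stopped due to exhaustion
        hr_max >= 0.90 * (220 - age_years),       # >= 90% of age-predicted HRmax
        rer >= 1.10,                              # respiratory exchange ratio
        vo2_plateau,                              # VO2 levelled off despite higher intensity
    ]
    return sum(criteria) >= 2

# Median results reported for the first selection stage.
print(distance_m(44), distance_m(76))
# Hypothetical 20-year-old participant.
print(vo2max_attained(True, 185, 20, 1.05, False))
```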

Predictive Methods
Multiple-input single-output (MISO) models were used in the study. Classic regression models, regularised regression models and artificial neural networks were calculated. All predictive models were calculated in R software [43]. Apart from the linear models, two types of artificial neural networks (ANNs) were applied. The implemented methods included:
• Ordinary least squares (OLS) regression, in which the weights are calculated by minimizing the sum of the squared errors. The function lm was used to calculate the OLS model. The criterion of performance J(w) takes the form:
$$J(\mathbf{w}) = \sum_{i=1}^{n}\Bigl(y_i - w_0 - \sum_{j=1}^{p} w_j x_{ij}\Bigr)^2$$
where $n$ is the total number of data, $p$ is the total number of inputs, $x_{ij}$ is the value of the $j$-th input variable for the $i$-th observation, $y_i$ is the output variable, and $w_j$ are the unknown weights (parameters) of the model, with $w_0$ denoting the intercept.
• Ridge regression, calculated using the function lm.ridge from the "MASS" package. In Ridge regression [44], the criterion of performance includes a penalty for increased weights and takes the form:
$$J(\mathbf{w}, \lambda) = \sum_{i=1}^{n}\Bigl(y_i - w_0 - \sum_{j=1}^{p} w_j x_{ij}\Bigr)^2 + \lambda \sum_{j=1}^{p} w_j^2$$
The parameter $\lambda \ge 0$ decides the size of the penalty: the greater the value of $\lambda$, the bigger the penalty.
• Multilayer perceptron (MLP), implemented with the RSNNS package [45]. MLPs are fully connected feedforward networks and the most popular architecture used in applications. Training was performed by error backpropagation, and the logistic function was used as the activation function of the hidden layers.
• Artificial neural networks with radial basis functions (RBF), implemented in the same way as the MLP with the RSNNS package [45]. RBF networks are also feedforward networks, but they have only one hidden layer. The network output is a linear combination of radial basis functions.
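To make the role of the ridge penalty concrete, consider a toy case with a single centred predictor, where minimizing the penalized sum of squares gives the slope w(λ) = Σ x_c y_c / (Σ x_c² + λ), with x_c and y_c denoting the centred input and output. The sketch below uses this closed form; it is an illustration only, not the lm.ridge implementation, which may centre and scale the inputs so that its λ values are not directly comparable. The function name and the data are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Slope of a one-predictor ridge fit on centred data; lam = 0 gives the OLS slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / (s_xx + lam)   # the penalty lam inflates the denominator

# Toy data lying exactly on y = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
for lam in (0, 10, 20):
    print(lam, ridge_slope(x, y, lam))
```

The slope shrinks towards zero as λ grows, which is the sense in which a greater λ means a bigger penalty.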
The presented methods were used to calculate models from all variables (Table 2) and from subsets of variables obtained with selected procedures for reducing the set of input variables [46,47]. Lasso regression was the first method analysed. Its mechanism assigns a penalty to the weights; some of them are shrunk to zero, and in this way the corresponding variables are eliminated from the equation. The Lasso regression was obtained with the function enet from the "elasticnet" package. In Lasso regression [48], the $L_1$ norm is used in the penalty, i.e., the sum of absolute values:
$$J(\mathbf{w}, \lambda) = \sum_{i=1}^{n}\Bigl(y_i - w_0 - \sum_{j=1}^{p} w_j x_{ij}\Bigr)^2 + \lambda \sum_{j=1}^{p} \lvert w_j \rvert$$
Classic "variable selection" procedures are used when there is a high number of potential predictors of the response variable (Y), which means a high number of potentially calculable equations (combinations of inputs) [47]. The procedures introduce or eliminate individual variables from the equation and examine only a subset of all possible equations. In addition to Lasso, the following three stepwise procedures were used in the analysis:
• Stepwise forward regression: The forward selection procedure begins with an equation which contains only a constant term. The first variable in the equation is the one which has the highest correlation with the Y variable. If the regression coefficient of the variable differs significantly from zero, the variable remains in the equation and another variable is added. The second variable introduced into the equation is the one which has the highest correlation with Y after Y has been adjusted for the effect of the first variable. If the regression coefficient is significant, the next variable is added in the same way.
• Stepwise backward regression: The backward elimination procedure begins with an equation containing all variables. One variable is removed in each step. The variables are dropped depending on their contribution to the reduction in the error sum of squares. The variable with the lowest contribution to this reduction, i.e., the one which has the smallest t-statistic in the equation, is removed first. Assuming that there are one or more variables with non-significant t-tests, the procedure removes the variable with the smallest non-significant t-statistic. The procedure is completed when all t-tests are significant or when all variables have been removed.
• Stepwise regression (bidirectional): The stepwise method combines the mechanisms of both abovementioned methods. Generally, it is a forward selection procedure with an extra mechanism that enables the removal of variables at any stage, similarly to backward selection. In this procedure, a variable which was added to the equation earlier can be removed later. The calculations made to add and remove variables are the same as in the forward and backward procedures.
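The forward procedure can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the routine used in the study: a variable enters when its partial F statistic (the square of its t-statistic) exceeds a fixed threshold of 4.0, which is an arbitrary stand-in for a formal significance test; the OLS fit solves the normal equations directly and does not handle collinear inputs. All function names and the toy data are hypothetical.

```python
def ols_rss(columns, y):
    """Residual sum of squares of an OLS fit (with intercept) on the given predictor columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[p] * r[q] for r in design) for q in range(k)]
         + [sum(r[p] * yi for r, yi in zip(design, y))] for p in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(c + 1, k):
            f = a[r][c] / a[c][c]
            for q in range(c, k + 1):
                a[r][q] -= f * a[c][q]
    # Back substitution for the weights.
    w = [0.0] * k
    for c in reversed(range(k)):
        w[c] = (a[c][k] - sum(a[c][q] * w[q] for q in range(c + 1, k))) / a[c][c]
    return sum((yi - sum(wj * xj for wj, xj in zip(w, row))) ** 2
               for row, yi in zip(design, y))

def forward_select(xs, y, f_to_enter=4.0):
    """Greedy forward selection; returns indices of selected columns in order of entry."""
    n = len(y)
    selected = []
    rss_current = ols_rss([], y)          # intercept-only model
    while len(selected) < len(xs):
        # Candidate whose addition gives the smallest residual sum of squares.
        candidates = [j for j in range(len(xs)) if j not in selected]
        rss_by_j = {j: ols_rss([xs[i] for i in selected + [j]], y) for j in candidates}
        best = min(candidates, key=lambda j: rss_by_j[j])
        rss_new = rss_by_j[best]
        gain = rss_current - rss_new
        dof = n - (len(selected) + 1) - 1  # residual degrees of freedom after adding
        if dof <= 0 or gain <= 0:
            break
        if rss_new > 0 and gain / (rss_new / dof) < f_to_enter:
            break                          # coefficient not "significant": stop
        selected.append(best)
        rss_current = rss_new
    return selected

# Toy data: y depends on columns 0 and 2 only.
x0 = [1, 1, 1, 1, -1, -1, -1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
y = [5 + 3 * a + 2 * c + 0.1 * a * b * c for a, b, c in zip(x0, x1, x2)]
print(forward_select([x0, x1, x2], y))
```

Backward elimination follows the same pattern in reverse: start from all columns and repeatedly drop the one whose removal increases the residual sum of squares least, as long as its statistic stays below the threshold.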
All models determined in the study were tested by means of leave-one-out cross-validation (LOOCV). During the LOOCV, RMSE_CV was calculated, which has the form:
$$\mathrm{RMSE}_{CV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_{-i}\bigr)^2}$$
where $n$ is the total number of data and $\hat{y}_{-i}$ is the output of the model calculated after removing the pair $(x_i, y_i)$.
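A minimal sketch of the LOOCV procedure is given below for the simplest possible case, a straight-line model with one predictor. It only illustrates how the cross-validated RMSE is computed: each observation is left out in turn, the model is refitted on the remaining data, and the left-out point is predicted. The function names and the data are hypothetical, and the models in the study use many more inputs.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope

def loocv_rmse(x, y):
    """Leave-one-out cross-validated RMSE of the straight-line model."""
    n = len(x)
    squared_errors = 0.0
    for i in range(n):
        # Refit without the i-th pair, then predict the left-out observation.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        prediction = intercept + slope * x[i]
        squared_errors += (y[i] - prediction) ** 2
    return math.sqrt(squared_errors / n)

# Hypothetical toy data.
print(loocv_rmse([0.0, 1.0, 2.0, 3.0], [1.0, 2.9, 5.2, 7.1]))
```

Because every prediction is made for an observation the model has not seen, the resulting value estimates the generalisation error rather than the fitting error.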

Results
All models were evaluated using the prediction error RMSE_CV calculated during cross-validation (Tables 3 and 4). Figure 1 presents validation errors in relation to the parameters of the models. For Ridge models, this parameter is λ ∈ [0, 20] with a step of 1; for Lasso regression, it is the parameter s ∈ [0, 1] with a step of 0.1. Artificial neural networks (MLP and RBF) were evaluated for the number of neurons in the hidden layer m ∈ {2, 3, ..., 11}.
An analysis of the models identified on the basis of all variables (Table 3) shows that the most accurate model for women was obtained using the RBF neural network with eight neurons in the hidden layer (RMSE_CV = 4.31 mL·kg⁻¹·min⁻¹). The best model for men based on the whole set of variables generates an error of RMSE_CV = 5.50 mL·kg⁻¹·min⁻¹; it was calculated with Ridge regression for λ = 20. Similarly, the most accurate model for all data was obtained with Ridge regression (λ = 12), which generates the smallest error (RMSE_CV = 4.89 mL·kg⁻¹·min⁻¹). The next stage of the analysis involved variable selection methods in order to improve the predictive capacity of the presented models, determine optimal input sets, and consequently identify the factors which determine the VO2max prediction in the analysed group. The models determined by means of Lasso and stepwise regression were evaluated using LOOCV (Table 4). The application of variable selection methods improved the predictive capacity of the OLS model. The error for the female group was RMSE_CV = 4.11 mL·kg⁻¹·min⁻¹ when the bidirectional method was used, while for males the model obtained with forward regression turned out to be more precise (RMSE_CV = 5.35 mL·kg⁻¹·min⁻¹). For all data, the forward and bidirectional methods selected the same inputs, for which the error generated by the model amounted to RMSE_CV = 4.78 mL·kg⁻¹·min⁻¹. The identified input sets were used to calculate new neural networks (Table 4). Neural networks for selected variables are characterised by a smaller prediction error than the networks identified on the basis of all variables. RBF models generate RMSE_CV = 4.07 mL·kg⁻¹·min⁻¹ for women, RMSE_CV = 5.30 mL·kg⁻¹·min⁻¹ for men and RMSE_CV = 4.80 mL·kg⁻¹·min⁻¹ for all participants.
The RBF models obtained for females and males show the smallest error among all analysed models, while the RBF model for all data is slightly worse than the OLS model (forward, bidirectional). The equations for the optimal linear models are presented in Table 5.
All: all participants, F: female, M: male, Lasso: least absolute shrinkage and selection operator regression, OLS: ordinary least squares regression, MLP: multilayer perceptron, RBF: artificial neural network with radial basis function.
Table 5. VO2max linear predictive equations.

Discussion
The paper presents mathematical models for the prediction of VO2max based on a 20 m SRT and anthropometric parameters. All models were developed and cross-validated using a sample of 308 healthy young people (aged 19-27). The models obtained are classified as "maximal models" [49], so their errors were compared with the errors of models of this group presented by other authors. The majority of papers with VO2max maximal predictive models present a common model for females and males [15,30,31]. An analysis of errors for predictive models in which sex was an input variable revealed that the error RMSE_CV = 4.78 mL·kg⁻¹·min⁻¹ does not deviate from the errors presented in other papers (Table 6). The calculated model is more accurate than the models proposed by Mahar [30] and Silva [15]. It is, however, less accurate than the models proposed in papers by Akay [31,50]. The optimal model for the whole group used the following factors for prediction: sex (x0), distance (20 m SRT) (x1), body height (x7) and content of adipose tissue (x20).
In young males and females, regular physical exercise improves CRF by increasing VO2max and decreasing body fat percentage, leading to a better quality of life [51][52][53][54]. The VO2max level varies significantly among individuals and depends mainly on genetic factors, sex, age, anthropometric properties, health, lifestyle and training status [51,52,55-59]. Reference values may change over time and should be regularly updated and validated [60]. Such data may be found in papers analysing the results of cardiac stress tests carried out on large groups of healthy people [8,60,61]. Typical values of VO2max amount to about 50 mL·kg⁻¹·min⁻¹ in young healthy male students and to about 40 mL·kg⁻¹·min⁻¹ in women [62]. CRF in terms of aerobic capacity is affected by the composition of the body. Low CRF in young adults with increased body fat could be a factor in the development of cardiovascular comorbidities later in middle age and old age [63]. Bioimpedance is an alternative method of estimating the percentage of body fat; it shows a high level of agreement with DXA (dual-energy X-ray absorptiometry), which is the gold standard method [64]. The ease of use, lack of radiation and relatively low cost of bioelectrical impedance analysis suggest that it is a feasible method of body fat measurement, especially in large populations [65]. Besides a common model for females and males, this paper determines separate models for each group. It may be observed that the models for females are characterised by a considerably smaller error than those for men. The RBF neural network with eight neurons in the hidden layer turned out to be the most accurate. It generates an error of RMSE_CV = 4.07 mL·kg⁻¹·min⁻¹. Bandyopadhyay [66] presented a similar study. He described a multiple regression model whose prediction error amounted to 1.27 mL·kg⁻¹·min⁻¹ (Table 6).
The model used speed and body height as predictors. Other studies concerning prediction among females were published by Chatterjee [23], who calculated a model generating an error of 0.53 mL·kg⁻¹·min⁻¹. The model used only the maximum speed. When the errors obtained in this paper are compared with those obtained by other authors, it may be concluded that their errors were significantly smaller. It should be emphasized, however, that the population analysed in this study was much larger and more diversified in terms of the VO2max level (44.9 ± 7.0 mL·kg⁻¹·min⁻¹). Moreover, cross-validation was used for evaluation in this paper, while in the papers discussed the model quality was evaluated by means of standard errors. That is why direct comparisons are impossible. The model calculated for women has the simplest structure and includes: distance (20 m SRT) (x1), body height (x7) and body fat (x20). The model contains the same variables as the model for all participants except for sex, which is constant within the group.
Similarly to the female group, the RBF network turned out to be the best model for males, although in the male group the optimal network consists of only two neurons in the hidden layer. The network generates an error of RMSE_CV = 5.30 mL·kg⁻¹·min⁻¹. Comparing the obtained results with other papers where models were calculated only for males, it may be observed that this value does not deviate significantly from the published results (Table 6). The optimal model for males calculated in the study generated greater errors than the models presented by Machado and Denadai [67] (4.10 mL·kg⁻¹·min⁻¹) and Bandyopadhyay [68] (1.41 mL·kg⁻¹·min⁻¹). The error obtained is still smaller than the error generated using the Costa model [69] (7.20 mL·kg⁻¹·min⁻¹).
When comparing the errors of the calculated models, it should be emphasized that a direct comparison of the errors obtained in different studies is valid only when the same maximal test is used and when a similar study sample is employed [70]. Therefore, it seems that the results obtained may only be compared with models which use the 20 m SRT, are developed on a population of healthy people aged 19-27 years, and are cross-validated. Cross-validation evaluates the generalisation error rather than the fitting error.
The model determined for males consisted of the following variables: distance (20 m SRT) (x1), waist circumference (NIH protocol) (x9), WHR (NIH protocol) (x14), FFMI (x17) and BSA (x19). The obtained set of predictors is not incidental, and the significance of each variable for VO2max is reflected in the literature. Waist circumference and WHR are popular indicators used for the evaluation of cardiovascular diseases [71]. Waist circumference is also a predictor of visceral fat and an indicator of central obesity [72]. There remains no uniformly accepted measurement protocol, resulting in a variety of techniques employed throughout the published literature [73,74]. The most commonly used are four measurements of waist circumference defined by specific anatomic landmarks: (1) immediately below the lowest ribs, (2) at the narrowest waist, (3) the midpoint between the lowest rib and iliac crest, and (4) immediately above the iliac crest [75,76]. Measurements made at the umbilicus are also commonly used in clinical and research settings [74]. Besides DXA and bioimpedance, different indicators based on anthropometric measurements are used for the evaluation of body fat content. The indicator that is often used in obesity studies is BMI [39,71,77]. However, it has some limitations. It neither evaluates the content of adipose tissue and its distribution in the body nor differentiates body fat content depending on age, sex and ethnic origin [71,78]. The identification of adipose tissue location is necessary to detect visceral (central) obesity, which increases the risk of cardiovascular diseases [79]. For this reason, the classical concept of BMI may be supplemented qualitatively by creating reference values for FFMI for a given age category [80]. BSA, a measure of body size, is commonly used in medicine as a biometric unit [81,82].
The use of a predictive model makes it possible to determine the level of CRF without using expensive equipment and specialized research teams. The prediction of VO2max may be useful as an element of monitoring the health and physical performance of students and other young healthy people.
OLS: ordinary least squares regression, MLP: multilayer perceptron, SVM: support vector machines, RBF: artificial neural network with radial basis function.
The strongest points of the presented paper are:
• a large research group representing the population,
• VO2max measurements made using direct methods,
• the use of a large set of variables to determine the optimal predictors for VO2max estimation,
• a relatively small VO2max estimation error.
The limitations of the study relate to the use of the models in practice. VO2max estimation with the proposed models should be performed only for healthy people of the same age as the research group.