A Multivariate Analysis of “Metabolic Phenotype” Patterns in Children and Adolescents with Obesity for the Early Stratification of Patients at Risk of Metabolic Syndrome

Background: Metabolic syndrome (MS) is closely linked to obesity; however, not all individuals with obesity will develop obesity-related complications and a metabolically healthy obesity (MHO) group is also described. Objective: To perform a multivariate analysis (MVA) of the anthropometric and biochemical data in pediatric patients with obesity to reveal a “phenotype” predictive for MS. Methods: We analyzed 528 children with obesity (OB) and 119 normal-weight pediatric patients (NW). Adiposity indices were recorded, and MS was detected. MVA was performed. Results: Analysis of the structure of correlation of the variables showed that the variables of waist circumference (WC), body mass index (BMI), and estimated fat mass (eFM) were positively correlated with each other as a whole. In addition, the variables of the triglycerides (TG), triglyceride–glucose (TyG) index, and visceral adiposity index were positively correlated with each other as a whole, although none were correlated with the variables of BMI z-score, waist-to-height ratio, WC, eFM, or weight. The variables that related to insulin resistance (IR) and dyslipidemia were crucial for the early stratification of patients at risk of MS. Conclusions: Independently of body weight, IR, dyslipidemia, hypertriglyceridemia, and fat distribution seem to be the strongest MS risk factors. The early detection of and intervention in these modifiable risk factors are useful to protect children’s health.


Introduction
Childhood obesity is one of the most serious public health issues in pediatrics [1][2][3]. The problem is global, and it is increasingly affecting many low-and middle-income countries, especially in urban settings. According to the World Health Organization (WHO), overweight and obesity are defined as abnormal or excessive fat accumulation that presents a risk to health [2]. Obesity-related complications can be identified at an early age, with important health and economic consequences [4].
In particular, metabolic syndrome (MS) has been closely linked to overweight and obesity in pediatrics. The prevalence of MS increases with the severity of obesity [5], and is a predictor of type 2diabetes (T2D), cardiovascular diseases (CVDs), and all-cause mortality in adults [4]. MS prevalence in pediatrics ranges from 0.3% to 26.4%, according to the 2 of 15 large number of different pediatric definitions of MS and several population studies [6]. Dysglycemia/insulin resistance (IR), hypertension, high levels of triglycerides (TG), and low high-density lipoprotein cholesterol (HDL cholesterol) are included as inter-related components of MS [7][8][9][10][11][12][13][14].
However, not all individuals with obesity will develop obesity-related complications. According to the literature, a metabolically healthy obesity (MHO) group can be described [27]. The underlying mechanisms of metabolic regulation are unclear; adipokine levels, visceral fat contents, percentages of ectopic fat, and different biochemical profiles have been reported in MHO individuals compared with patients with metabolic complications [28]. The definition of clinical patterns related to metabolic risk in pediatrics may lead to more appropriate health preventive strategies.
Multivariate analysis (MVA) is used in medical research due to its ability to explore the structure of correlations and retrieve relevant associations within the data [29,30]. Studies on the inter-relationship of new adiposity indices with different metabolic phenotypes and cardiometabolic risk markers, according to sex, are limited in pediatrics [31].
We performed an MVA of the anthropometric and biochemical parameters, i.e., the "metabolic phenotype", in children and adolescents with obesity, in order to identify clinical patterns among highly interrelated clinical variables and biomarkers predictive of MS. The early detection of specific metabolic phenotypes may be important for the tailored treatment and care of pediatric patients with obesity.

Patients
We retrospectively analyzed 528 Caucasian children and adolescents (274 females and 258 males, aged 11 years old (interquartile range (IQR) 9-12.2 years) with obesity referred to the Vittore Buzzi Children's Hospital, Milan and the San Paolo University Hospital of Milan, by their general practitioner or primary care pediatrician. Patients were referred between May 2019 and May 2021. Exclusion criteria were known secondary obesity conditions, the use of any ongoing medical therapy, and concomitant chronic or acute illnesses.
As a control group, we considered 119 healthy Caucasian children and adolescents comparable for age and sex: 59 males, 60 females, with a mean age 11 years (9-13 IQR), who were enrolled as controls for other metabolic studies. All the parents or guardians gave their consent to retrospectively enroll the subjects in other studies for clinical research purposes, epidemiology, studies of pathologies, and training, aiming to improve knowledge, care, and prevention.
In all subjects, clinical evaluations and biochemical profiles were considered. All participants or their responsible guardians provided written consent after being informed about the nature of the study. The study (protocol numbers 2015/ST/135 MI, 2020/ST/234 MI) was approved by the institutional ethics committee, and the study was conducted in accordance with the Helsinki Declaration of 1975, as revised in 2008.

Clinical Examination
In all of the participants, their height, weight, pubertal stage, waist circumference (WC), WHtR, measurement, DBP and SBP were considered. Standing height was measured using a Harpenden Stadiometer, with the child in an upright position, without shoes, with their heels together and toes apart, their hands by their sides, and the head aligned in the Frankfort horizontal plane [32,33]. Weight was quantified with participants not wearing shoes and in light clothing, standing upright in the center of the scale platform [32,33]. Using a flexible inch tape, the waist circumference was measured at the midpoint between the lower border of the rib cage and the iliac crest [32,33]. Blood pressure was measured twice, consecutively, using a mercury sphygmomanometer, with an appropriately sized cuff on the right arm, which was slightly flexed at heart level. The second BP measurement was used for analysis [32,33].
BMI was calculated as the body weight (kilograms) divided by height (meters squared) and was transformed into BMI z-scores using WHO reference values [36].
According to the BMI z-score WHO classification [36] all subjects were classified into children with normal-weight (NW; −2 ≤ BMI z-score ≤1) and with obesity (OB; BMI z-score ≥ 2).
In addition to BMI, we also considered WHtR, because it is helpful in detecting children with a higher likelihood of presenting metabolic and cardiovascular risks [42], and it is more appropriate than WC alone to track changes in abdominal adiposity among adolescents [44].
As previously reported [31], a pathological level of fasting blood glucose (FBG) and/or IR was used as a marker of gluco-metabolic derangement, because impaired fasting glucose is rare in childhood and IR precedes glucose abnormalities [45]. The euglycemichyperinsulinemic clamp is the gold standard for measuring IR; however, this method is invasive, time-consuming, and is difficult to apply with pediatric patients.

Statistical Analysis
Univariate statistical analyses were performed to study the anthropometric and metabolic characteristics of children with NW and OB included in the analysis. We estimated the density function of the analyzed variables by applying a nonparametric method of kernel density estimation, representing the function by means of violin graphs, including a bar graph showing the mean and standard deviation values. Thus, we obtained a complete distribution view. The graphs were grouped according to obesity status (NW, OB).
To investigate the potential association between anthropometric and metabolic variables characterizing the children, principal components' analysis was performed on the group of the children with obesity. This method allowed us to reduce the dimensionality of the dataset by projecting each data point onto the first principal components (up to three), preserving as much variation in the data as possible. We explored the metabolic and anthropometric profiles of normal weight observations by passively projecting them onto the spaces defined by the three first PCs.
To explore possible differences and similarities among observations of OB children that might define more characterizing but hidden subgroups, a cluster analysis technique was applied to all observations in the dataset, considering all anthropometric and metabolic variables analyzed in previous analyses, for construction of the distance matrix. First, hierarchical cluster analysis using Ward's method produced a dendrogram for estimating the number of likely clusters within the sample. We made cuts at the points of change between successive fusion levels to define likely cluster boundaries. The derived number of clusters was sub-specified using k-means cluster analysis, the main clustering technique. To achieve repeatability and stability in each model, the k-means algorithm was run 50 times with random starting points. Clusters defined by the k-means algorithm were subsequently projected into subspaces defined by principal component 1 (PC1) 1, PC2, and PC3, calculated by performing PCA only on OB children.
Further univariate statistical analysis was performed to study the different characteristics of the clusters defined in the analysis. We produced a table in which all anthropometric and metabolic variables are summarized as the mean (sd) and stratified by cluster. In addition, violin plots were produced to visually capture the differences between clusters. All the statistical analyses were performed using R software (version 4.1.2). The packages used for multivariate analysis were 'FactoMineR' (version 2.4), 'factoextra' (version 1.0.7), and 'stats' (version 3.6.2).

Univariate Analysis
For the OB group, 528 observations were available, whereas for the NW children group, 119 observations were available. The main statistics of the variables considered in the univariate analysis for both groups of children are presented in Table 1. OB and NW were comparable for age and sex. As expected, significant differences were noted for all the anthropometric and metabolic variables, between the two groups.

Principal Component Analysis
The first three principal components explained 56% of the variance in the data (PC1 = 30%, PC2 = 13%, PC3 = 13%). The projections of the original variables onto the three planes are shown in Figure 1. The first axis (PC1) was characterized primarily by variables such as WC, eFM, BMI, and weight (which were highly correlated according to the BMI formula), but also influenced to a lesser degree by the variables of insulin and HOMA-IR (also highly correlated according to the HOMA-IR formula). The second axis (PC2) was mainly characterized by the variables WC/Ht, BMI z-score, height, and age. The third axis (PC3) was characterized to a great degree by the variables of fasting TG and TyG index, and to a smaller degree by the variable VAI.
BMI z-score (median (IQR)) 2. 74  The continuous variables are reported as the median and IQR, whereas categorical variable are reported as frequencies and percentages.

Principal Component Analysis
The first three principal components explained 56% of the variance in the data (PC1 = 30%, PC2 = 13%, PC3 = 13%). The projections of the original variables onto the three planes are shown in Figure 1. The first axis (PC1) was characterized primarily by variables such as WC, eFM, BMI, and weight (which were highly correlated according to the BMI formula), but also influenced to a lesser degree by the variables of insulin and HOMA-IR (also highly correlated according to the HOMA-IR formula). The second axis (PC2) was mainly characterized by the variables WC/Ht, BMI z-score, height, and age. The third axis (PC3) was characterized to a great degree by the variables of fasting TG and TyG index, and to a smaller degree by the variable VAI. It appeared that variables such as SBP and DBP were not well represented and did not contribute to the definition of the first three principal components. It appeared that variables such as SBP and DBP were not well represented and did not contribute to the definition of the first three principal components.
In all the planes defined by the PCs, it appeared that the variables WC, BMI, and eFM were clustered and positively correlated with PC1. In the plane defined by PC1 and PC2 and the plane defined by PC1 and PC3, the variables of fasting TG, TyG index, and VAI were positively correlated with each other and with PC3, although neither were correlated with the variables of BMI z-score, WC/Ht, BMI, WC, eFM, and weight.
In the next step, we analyzed the three planes defined by the PCs for the distribution of all observations in the dataset, by passively projecting the NW with the OB children. As shown in Figure 2, in the plane defined by PC1 with either of the other two dimensions, it was clear that observations related to NW children (absence of obesity) clustered in the region of the plane where PC1 is negative, in the opposite direction to where the vectors of the projections of the variables were related to the obesity point. Observations related to the group of OB children are observed much more frequently in the positive values of PC1. Overall, PC1, resulting from the principal component analysis conducted on the OB dataset and characterized by anthropometric variables, was critical in describing the differences between the group of OB children and the group of NW children, as would be expected.
vectors of the projections of the variables were related to the obesity point. Observations related to the group of OB children are observed much more frequently in the positive values of PC1. Overall, PC1, resulting from the principal component analysis conducted on the OB dataset and characterized by anthropometric variables, was critical in describing the differences between the group of OB children and the group of NW children, as would be expected. Interestingly, four of the NW observations, when projected with the OB observations, were in the same region of space defined by PC1 and PC2, where the OB points were present. In fact, they showed greater values of mean insulin, HOMA-IR, fasting TG, TyG-index, and VAI than the overall NW children.

Cluster Analysis
The hierarchical cluster analysis was conducted in concordance with the Ward method, which resulted in the definition of three clusters. We pre-specified the number of groups computed with the hierarchical algorithm in the k-means algorithm: 50 random Figure 2. In this panels, the obesity (OB) observations (orange points) are projected in the subspaces defined by the first three principal components that explained the most variability in the dataset. The NW (light blue points) are passively projected in the subspaces. As can be observed in figure (a-c), the NW observations clustered in a region of the space where PC1 is negative.
Interestingly, four of the NW observations, when projected with the OB observations, were in the same region of space defined by PC1 and PC2, where the OB points were present. In fact, they showed greater values of mean insulin, HOMA-IR, fasting TG, TyG-index, and VAI than the overall NW children.

Cluster Analysis
The hierarchical cluster analysis was conducted in concordance with the Ward method, which resulted in the definition of three clusters. We pre-specified the number of groups computed with the hierarchical algorithm in the k-means algorithm: 50 random repetitions were used. We confirmed the best number of groups found by the k-means algorithm by plotting the sum of squares of the groups in function of the number of clusters to be considered; the best trade-off was in three clusters which we called subgroup 1 (S1), subgroup 2 (S2), and subgroup 3 (S3).
By projecting the observation grouped by the different clusters onto the subspaces defined by the first three principal components, it was clear that S2 and S3 were more clustered and centered, whereas S1 was more disperse, particularly in the direction of the third principal components, as shown in Figure 3. It also appears that the three subgroups differentiated in the direction of the first principal component. Moreover, if we projected the observations labeled by the condition of MS on the subspaces and compared them with the subgroups defined by the k-means algorithm, the group of MS = 1 overlapped with S1, whereas MS = 0 overlapped with S2 and S3.
defined by the first three principal components, it was clear that S2 and S3 were more clustered and centered, whereas S1 was more disperse, particularly in the direction of the third principal components, as shown in Figure 3. It also appears that the three subgroups differentiated in the direction of the first principal component. Moreover, if we projected the observations labeled by the condition of MS on the subspaces and compared them with the subgroups defined by the k-means algorithm, the group of MS = 1 overlapped with S1, whereas MS = 0 overlapped with S2 and S3. The violin plots produced to explore the distribution of the anthropometric and metabolic variables in each cluster clearly showed that cluster S2 and cluster S3 had a similar profile with respect to S1, where most of the cases instead showed greater means in all the variables studied. Interestingly, we found evidence of a difference between the groups for all the anthropometric variables, except for the variables of ConI and ABSI. From Table 2, which summarizes the distributions of the three clusters, it is clear that S1 has the greatest values for the anthropometric and metabolic variables. The violin plots produced to explore the distribution of the anthropometric and metabolic variables in each cluster clearly showed that cluster S2 and cluster S3 had a similar profile with respect to S1, where most of the cases instead showed greater means in all the variables studied. Interestingly, we found evidence of a difference between the groups for all the anthropometric variables, except for the variables of ConI and ABSI. From Table 2, which summarizes the distributions of the three clusters, it is clear that S1 has the greatest values for the anthropometric and metabolic variables.
From examining the relative proportions of children with metabolic syndrome in the different subgroups, it appeared that some of the observations of the S1 which, if projected to the subspace defined by the PCs, should be characterized by elevated values of anthropometric and metabolic variables, still did not have MS. In addition, some children of cluster S2, the one characterized by the lowest values of PC1, were also characterized as having MS. Finally, a high proportion of the observations in S3 (intermediate) still did not have MS. We summarized the distribution of the variables for the observations of cluster S2 with MS in Table 2.  Table 2 summarizes the descriptive statistics of the OB observations stratified by the three clusters obtained with the k-means algorithm. Here, the continuous variables are reported as the median and IQR, whereas categorical variables are reported as frequencies and percentage. S1 has the greatest prevalence of MS, followed by S3 and S2. In addition, the table reports the descriptive statistics of a particular sub-cluster of S2 children which contains observations that even though are characterized by lower values of the variables positively associated with PC1, still they are classified as having MS. In fact, they show higher values in the insulin, HOMA-IR, TyG index, VAI, and fasting TG variables compared with the median values of the S2 cluster.

Level
Children in Subgroup 1 (S1) The observations of S2 with MS were characterized by lower values of eFM, waist, WHtR, and BMI z-scores, but by surprisingly higher values of TyG index, TG, VAI, HOMA-IR, and insulin. In contrast, the observations of S1 without MS were characterized by lower values of TG, insulin, HOMA-IR, and VAI, but with the highest values for eFM, WC, and WHtR. Finally, the observation of S3 without MS presented lower values for TG, insulin, HOMA-IR, and VAI, and intermediate values for eFM, waist, and WHtR.

Children in
By considering the graphs of the observations projected onto the principal components labeled for their subgroups and for the condition of MS, it seemed that the presentation of MS generally follows the first principal component, although in several cases, this pattern is not respected.

Discussion
We have presented a multivariate MVA. We noted that the metabolic risk is not only related to body weight. The variables related to IR and dyslipidemia, such as high TG levels and fat accumulation, were crucial for the early stratification of patients at risk of MS.
MS is characterized by interconnected risk factors of metabolic origin, leading to CVD and T2D and an increasing mortality risk in adult age [7][8][9]46]. One recent review reinforced the evidence that obesity (and related IR) is a consistent single risk factor for T2D; therefore, it is crucial for limiting the progression of obesity [47]. There are different dysregulated metabolic pathways (increased free fatty acid flux from adipose tissue, increased hepatic de novo lipogenesis from excessive carbohydrate consumption, and hypertriglyceridemia) that accelerate the progression from normal to impaired glucose tolerance test and diabetes [48]. Adipose tissue expansion leads to the overproduction of adipocytokines, further exacerbating metabolic dysfunction. Excesses or deficiencies of adipocytokines such as leptin or adiponectin further affect insulin sensitivity [49]. The complex interactions between glucotoxic and lipotoxic effects which ensue during IR in children with obesity can worsen IR, and consequently, exacerbate dyslipidemia and hyperglycemia due to positive feedback loops; this clearly indicates that more appropriate and complex statistical modeling should be employed [47,50].
A revision of the classification could provide a powerful tool to individualize treatment regimens and identify individuals with an increased risk of complications at diagnosis. We stratified patients into three subgroups with differing disease progression and diabetic complication risks. This new sub-stratification might represent a first step towards precision medicine in diabetes.
Due to different definitions, the MS prevalence in pediatrics remains unclear, and varies widely depending on the different definitions utilized [15]. It is reported more frequently in subjects who are overweight (11.9%) and with obesity (29.2%) [15,51,52]; however, unhealthy normal-weight patients and metabolically healthy obesity (MHO) has also been reported [28]. We noted that some of the NW observations projected with the OB observations were in the same space where the OB values were present, as defined by PC1 and PC2. On the other hand, some OB observations were far away from the mean center of the OB group and were closer to the mean center of the NW group. These data confirmed that the pathogenic mechanism of the metabolic regulation may be multifactorial, and excessive fat accumulation in tandem with a normal BMI does not protect against metabolic risks [16].
In adults, higher levels of adiponectin, lower visceral fat contents, and lower percentages of ectopic fat, such as in the muscles and liver, have been reported in MHO individuals compared with patients with metabolic complications [53][54][55]. In unhealthy normal-weight subjects, metabolic risks such as elevated glucose, insulin resistance, dyslipidemia, and hypertension are more strongly associated with high subcutaneous abdominal fat mass, visceral obesity, or fatty liver [56]. Cota et al. reported that the presence of normal-weight obesity in adolescents was associated with the accumulation of abdominal fat and an unfavorable lipid profile [16].
Our analysis supports the role of IR and dyslipidemia, such as high TG levels, and fat distribution as principal players interested in the risk stratification for MS. We noted that both the HOMA-IR and TyG Index are involved; these two indices have been proposed as simple surrogates with high sensitivity in recognizing IR and are considered as optimal predictors of metabolic and cardiovascular diseases in normoglycemic [57] and pre-diabetic patients [58,59]. Compared with other IR indicators, the TyG index mainly quantifies IR in muscle and represents a better indicator for peripheral IR, which could also be of interest in NW or underweight [60] patients in which a relatively low leg fat mass and high subcutaneous abdominal fat mass are present. Studies have shown that TyG, independent of body weight, is related to a risk of diabetes [61], hypertension [62], and non-alcoholic fatty liver disease [63], and it can predict the development of cardiovascular events [64].
We confirmed that, in addition to BMI, other anthropometric measures, such as WC and WHtR, may be useful to identify abdominal obesity. Additionally, the evaluation of new adiposity indices, particularly VAI and eFM, may be helpful in clinical practice to estimate the role of body fatness and fat distribution, when instrumentation to estimate the body composition is not available, and for the early detection of at-risk children [17][18][19][20][21][22][23][24][25][26]. In contrast, compared with the reported data in adults, blood pressure is not included in the parameters that contribute to definitions of the principal components of risk. The roles of IR and hyperinsulinemia in the pathogenesis of essential hypertension have been extensively reported [65,66]; the instauration of metabolic alterations may precede hypertension, and the time of exposure may trigger the manifestation and evolution [67,68].
In the literature, debates remain as to whether MHO represents a unique subset of people with obesity, or is simply a group which is in transition to the later development of metabolically unhealthy obesity (MUO) [69]. As reported by Blüher [70], the principal factors associated with the conversion of MHO to MUO are a decline in insulin sensitivity and an increase in fasting blood glucose [71]. In our population, the distribution of the clinical and biochemical variables stratified by the clusters clearly represented a condition in which S2 and S3 were more similar to each other with respect to S1, suggesting that a transitional phase could exist and that the IR, earlier than hyperglycemia, initiates different pathogenic pathways which increase metabolic risk and result in the full expression of the MS [72,73].
The concordant profiles of IR and triglycerides in subgroups support the notion that hyperinsulinemia could induce an increased transcription of genes for lipogenic hepatic, leading to increased TG production and influencing the severity of metabolic involvement [15]. The hypothesis that triglycerides may be an earlier dismetabolic marker could also be considered [74]. As reported elsewhere [75], hypertriglyceridemic waist phenotypes, characterized by increased TG levels and WC, can be used to predict cardiovascular risk in adult men. Despite high triglycerides levels being a modifiable cardiovascular risk factor, they contribute to oxidative damage [33] which may play a causative role in the synergistic effects of the MS components.
The MHO concept may be a model to better understand the mechanisms linking obesity with cardiometabolic diseases [76]. In pediatrics, this aspect is particularly interesting to explore, in order to define preventive measures and to adopt a tailored monitoring.
We recognize that there are some limitations to this study, starting with its retrospective cross-sectional nature. Additionally, we considered indirect indices of body fatness and fat distribution; a comparison with body compositions estimated from bio-impedance could improve the validity of our findings. Finally, we only considered anthropometric and biochemical parameters to stratify the metabolic risk; moreover, additional factors such as genetic contribution, dietary intake, physical activity level, composition, and the diversity of the microbiota could also be considered. Despite the limitations, this type of analysis is interesting to consider for the health assessments of pediatric patients with obesity.

Conclusions
Here, we considered different anthropometric measures and biochemical variables to improve the early stratification of pediatric patients at risk of MS, independently of body weight. A transitional phase from MHO to MUO may not be excluded in pediatrics.
The early detection of the interrelated variables and interventions on these modifiable risk factors is useful to protect pediatric health. Cluster analysis may help in identifying specific patient characteristics related to disease and therapeutic responses leading to a personalized medicine.  Informed Consent Statement: Informed consent was obtained from all subjects or their responsible guardians involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.