Socioeconomic Status, Health and Lifestyle Settings as Psychosocial Risk Factors for Road Crashes in Young People: Assessing the Colombian Case

The social determinants of health influence both psychosocial risks and protective factors, especially in highly demanding contexts, such as the mobility of drivers and non-drivers. Recent evidence suggests that exploring socioeconomic status (SES), health and lifestyle-related factors might contribute to a better understanding of road traffic crashes (RTCs). Thus, the aim of this study was to construct indices for the assessment of crash rates and mobility patterns among young Colombians who live in the central region of the country. The specific objectives were developing SES, health and lifestyle indices, and assessing the self-reported RTCs and mobility features depending on these indices. A sample of 561 subjects participated in this cross-sectional study. Through a reduction approach of Principal Component Analysis (PCA), three indices were constructed. Mean and frequency differences were contrasted for the self-reported mobility, crash rates, age, and gender. As a result, the SES, health and lifestyle indices explained between 56.3% and 67.9% of the total variance. Drivers and pedestrians who suffered crashes had higher SES. A healthier lifestyle is associated with cycling, but also with suffering more bike crashes; drivers and those reporting traffic crashes showed greater psychosocial and lifestyle-related risk factors. Regarding gender differences, men are more likely to engage in road activities, as well as to suffer more RTCs. On the other hand, women present lower healthy lifestyle-related indices and a less active involvement in mobility. Protective factors such as a high SES and a healthier lifestyle are associated with RTCs suffered by young Colombian road users. Given the differences found in this regard, a gender perspective for understanding RTCs and mobility is strongly advisable, considering that socio-economic gaps seem to differentially affect mobility and crash-related patterns.


Introduction
Much research has been conducted in the field of traffic safety, since the consequences of Road Traffic Crashes (RTCs) and Road Traffic Injuries (RTIs) have been recognized as a major concern for public health [1,2]. The numbers show that, worldwide, around 1.35 million people die and 50 million are injured as a consequence of road traffic crash-related events [3]. Unfortunately, despite the efforts made by different researchers, governments and institutions, these events remain part of everyday life.
An important case study is the one corresponding to developing countries. These countries are especially impacted, since, in addition to being economically affected by low economic growth, they also suffer a huge number of crashes. The worrisome part of this driving behavior [41]. Regarding the physical aspect, some argue that people with a high body mass may be at higher risk of suffering an RTC [42,43].
To sum up, SES, health and lifestyle influence, and even determine, psychological and social risk or protective factors: better SES, health and lifestyle indices are associated with better psychological health [44], and the poorer the psychological health, the higher the probability of suffering RTCs [45,46]. Now, in order to study the relation of these topics with traffic and road safety, all the above should be considered, starting from the following questions: which country are we talking about? What are the characteristics of people at risk, and of those who suffer these crashes? In the case of Colombia, it has been reported that a driver can be four times more likely to die in a crash than a driver in Spain [47], in addition to a rate of 18.5 RTC deaths per 100,000 inhabitants [3]. Moreover, young Colombians are a risk group, and they are vulnerable to RTCs [48].
Taking into account that the consideration of SES, health and lifestyle allows us to understand why traffic crashes seem to be present and possibly increasing in developing countries, the objective of this work is to construct indices related to these major topics in order to explore their relationship with RTCs and mobility in a sample of young Colombian participants. The null hypothesis that there are no significant differences between groups is going to be tested for each index, expecting that the groups with the most vulnerable SES, unfavorable health and worse lifestyle will present more crashes and will have patterns of more active mobility. As specific objectives, the study aims at: (1) developing SES, health and lifestyle indices for this country, that will take into account sociodemographic variables, SEP indicators and health-related information; (2) assessing self-reported RTCs and mobility features depending on the SES, health and lifestyle indices.

Participants
Colombia is a country with 44.164 million inhabitants [49]. Several studies have pointed out that, taking into account a confidence level of at least 95% and a 5% margin of error, a minimum sample size of n = 385 is required in order to conduct meaningful analyses [50–52]. We will take this number as our sample reference, assuming that a population group adequately represents the population from which such group is extracted [53,54]. According to the Statutory Law, from 1855 and from 2018, as a modification of the Young Citizens Status, in Colombia young people are those with an age ranging from 14 to 28 years old, and youth is considered the stage during which one's intellectual, moral, physical, economic, social and cultural autonomy is being built [55]. It is reported that young people represent at least 21.8% of the country's total population [56].
Following a cross-sectional design, the sample was collected through convenience sampling, and participants who were older than 17 were included. Since young people were the study's target, the research relied on the cooperation of university lecturers, who emailed their contacts an invitation to participate. Overall, 20 professors were invited, and 15 of them accepted, which corresponds to a 75% acceptance rate. A total of 731 interviews were completed, and after a cleaning and refining process that applied the age filter (older than 17 and younger than 29), a final sample of n = 561 was selected, therefore reducing the margin of error to 4.14%. Most of the respondents (65.95%) were from Bogotá, the most populated city in the country, and from municipalities of Cundinamarca surrounding the capital (30.12%).
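The two sample-size figures quoted in this section can be checked with the usual formula for estimating a proportion, n = z²p(1 − p)/e². The following is only an illustrative sketch: it assumes maximum variability (p = 0.5), z = 1.96 for a 95% confidence level, and a population large enough to ignore the finite-population correction. The function names are hypothetical and are not taken from the study's own code.

```python
import math

Z_95 = 1.96  # assumed critical value for a 95% confidence level


def required_sample_size(margin, p=0.5, z=Z_95):
    """Minimum n needed to estimate a proportion within +/- margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def margin_of_error(n, p=0.5, z=Z_95):
    """Half-width of the confidence interval for a proportion, given n."""
    return z * math.sqrt(p * (1 - p) / n)


print(required_sample_size(0.05))            # 385
print(round(100 * margin_of_error(561), 2))  # 4.14
```

Under these assumptions, a 5% margin yields the reference value of 385, and the final sample of 561 yields the reported 4.14%.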

Procedure and Data Analysis
Facing the limitations of web-based surveys but highlighting their economic advantages, their efficiency in collecting data, their reduction of interviewer biases [57], and the fact that, through a rigorous design and development, "results from an online survey may be no different than paper based survey results" [58], this study gathered the data using an online survey named "Encuesta de Salud y Seguridad Vial" ("Survey on Road Safety and Health"), whose average completion time was 40 min. This survey collected data on sociodemographic and crash records information, as well as on some specific scales. It was reviewed by two experts: a psychologist with traffic safety experience, and a civil engineer with experience in the assessment of human factor in transportation. After their recommendations, the instrument was tested in a pilot study including 50 participants, which allowed for the elimination of ambiguous items.
To achieve the general and specific objectives, Principal Component Analysis (PCA) was used to construct the indices. The chi-square test of independence and Student's t-test for independent samples were performed to compare group frequencies and means, respectively, both at a 95% confidence level, testing the null hypothesis that there are no significant differences between groups. The p values were adjusted through the False Discovery Rate (FDR), which is thought to be the best approach, "as it not only reduces false positives, but also minimizes false negatives" [59]. Finally, a violin plot was charted to show the full distribution of the data. All the previous steps were performed using R, the free software environment for statistical computing and graphics [60].
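The text does not state which FDR procedure was applied. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, which is what R's `p.adjust(method = "fdr")` computes. The function name and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p values, only to show the call.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A contrast would then be treated as significant when its adjusted p value is below 0.05, in line with the 95% confidence level used in the study.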

Index Construction
Broadly speaking, an index is a measure composed of other variables that allows for the representation of a construct or result [61]; it can be used as a quantitative indicator of the researched idea. Indices can be developed in different ways; however, in the case of SES constructs and health-related indices, PCA is a variable reduction approach that remains widely used and is thought to be useful in epidemiological studies, despite its limitations. Howe, Hargreaves and Huttly [62] consider that a PCA "involves replacing a set of correlated variables with a set of uncorrelated 'principal components' which represent unobserved characteristics of the population." Additionally, beyond the method that is used, what seems to weigh on the results of the model is the categorization of the variables [62]. This perspective was taken into account to construct three indices, considering that every time an item is categorized differently, the PCA results change; thus, a total of 76 items, comprising each original item and its different forms of categorization, were considered (see Appendix A).

•
Regarding SES: socio-economic stratification, which in Colombia is a way to classify the residential properties that must receive public services and subsidies according to their social stratum, is established in Law 142 of 1994 [63]. SEP indicators include the wage, reported in Minimum Legal Wages for the year 2020 in Colombia, the occupational status and the educational level. Wealth assets were also evaluated: residing in one's own house (belonging to the individual or to the nucleus of co-habitation, where no rent is to be paid); access to a computer; money for leisure; savings; debts; permanent access to the internet; and covered month (which means the feeling of being able to manage with the available monthly income). Finally, the number of people who inhabit the home was considered. The average number for Colombian homes is 3.3 in urban zones and 3.9 in rural zones. Furthermore, 52.7% of homes with 5 or more people reported incomes below 2 minimum wages [64]. This type of family structure, or cultures that foster familistic societies, can be economically disadvantageous. This is because, regardless of the possible social support that these networks provide, economic resources seem to be more associated with living alone [65].

•
Regarding health: the perception of having a good health, the use of medicines and the body mass index (BMI) were evaluated. In addition, some of the main causes of death and non-communicable diseases were considered as well: cancer, diabetes, hypertension/high blood pressure, dyslipidemia (evaluated through the vector: HDL-LDL cholesterol, triglycerides) and cardiovascular diseases. Additionally, diagnosis of a mental/psychological disorder, general self-reported stress and fatigue were taken into account.

•
For lifestyle: having a sedentary life; doing sports at least 3 times a week; doing sports for at least 30 min each time; smoking; drinking alcohol; self-assessment of one's eating habits; walking; and using a bike were considered. Sleeping hours per day (24 h) were also included. Regularly sleeping less than 7 h per night can lead to adverse health conditions, such as weight gain and obesity, hypertension, depression, diabetes, heart disease and stroke, and increased risk of death; between 7 and 9 h could be considered a normal range for young adults and adults, while more than 9 h could be appropriate for young adults and for people recovering from sleep debt or suffering from illnesses. Nevertheless, it is still unknown whether sleeping more than 9 h per night could imply health risks [66].
Additionally, for contrasting the utility of the indices, the following variables were taken into account too:

•
RTCs, coded dichotomously as No/Yes (0-1): have you ever suffered a traffic crash? Suffering a crash as a road actor was coded as present when the participant matched any element of the vector: having a traffic crash, or a crash as a passenger, on a bike, as a pedestrian or as a driver. The variables that compose this vector were also used to study the contrasts. [67], by men and women [67], as well as various concerns in the economic and health-related fields [31].

Compliance with Ethical Standards
The present study obtained its ethical approval from the Research Ethics Committee of the University Research Institute on Traffic and Road Safety at the University of Valencia (IRB: E0002080419). Additionally, it complied with the guidelines established by the Code of Ethics and Bioethics of Psychologists [68]. Following this code, participants completed the survey only if they had previously agreed with an informed consent form that emphasized confidentiality and data protection rights, with special attention to the fact that the data would be used only for research purposes, thus encouraging participants to provide sincere answers.

Results
With these data, the descriptive analyses used to understand the participants' profiles were performed according to sex and income, and they can be consulted in Table 1. In total, 413 women (73.88%) and 146 men (26.12%) participated, and their mean (SD) age was 20.83 (2.49) years. In total, 59.3% of the sample reported having finished their high school studies. Regarding socio-economic stratification, Status 3-middle (40.4%), Status 2-low (41.1%) and Status 1-low-low (7.5%) together represent 89.05% of the total sample.

PCA Indices Construction
Variables in which a single answer category accounted for 95% or more of the responses were discarded. To construct the PCA indices, the subset of variables was scaled, allowing for the use of covariance matrices. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was verified to be higher than 0.5, which is considered acceptable for employing the selected method. Several models were tested, considering 70 possible variables. These were reduced according to their contribution to the final models and to the cluster explaining the possible components, in addition to the related theory. The final components explain between 56.3% and 67.9% of the total variance, and they were used to generate three indices: SES, Health and Lifestyle. The respective loadings with an absolute cutoff of |0.34| for components with eigenvalues ≥ 1 are displayed in Table 2. Missing data were omitted in the final model in order not to affect its predictive value.
The indices were constructed as the sum of the components with eigenvalues ≥ 1, each weighted by the variance it explains, and were then re-scaled to a 0-1 range. For the SES and Lifestyle indices, a value equal to 1 corresponds to the most favorable socioeconomic status and to the best lifestyle conditions, respectively. For the health index, 0 represents a lack of unfavorable health conditions and 1 represents the presence of illness. Some works suggest considering only the first component of the PCA (Comp.1) to construct the indices, and, therefore, the Comp.1 of each model was tested in contrast with another index equivalent to the weighted sum of all components with eigenvalues ≥ 1. However, the relations explained only by the Comp.1 were not found to provide better or worse contrast results, which is why we chose, as final indices, those that weight all retained components in order to increase the variance explained by the model. The indices were also categorized in terciles that were Low (<0. 43
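The index construction described in this subsection (scaling, retaining components with eigenvalues ≥ 1, weighting by explained variance, and re-scaling to 0-1) can be sketched as follows. This is a minimal illustration, not the authors' R code: the function name `pca_index` and the synthetic data are assumptions, and the orientation of the index (which end is "favorable") must be checked by hand against the loadings, because the sign of each eigenvector is arbitrary.

```python
import numpy as np


def pca_index(X):
    """Build a 0-1 index from the PCA components with eigenvalue >= 1."""
    X = np.asarray(X, dtype=float)
    # Scale each variable so that the covariance matrix equals the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= 1.0
    scores = Z @ eigvecs[:, keep]
    # Weight each retained component by its share of the total variance.
    weights = eigvals[keep] / eigvals.sum()
    raw = scores @ weights
    # Re-scale to the 0-1 range.
    return (raw - raw.min()) / (raw.max() - raw.min())


# Synthetic data (not from the survey), only to show the call.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
X_demo[:, 1] += X_demo[:, 0]  # introduce some correlation
index_demo = pca_index(X_demo)
print(index_demo.min(), index_demo.max())
```

Using only the first component, as some works suggest, would amount to keeping a single column of `scores`.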

Means and Frequency Contrast
To explore the behavior of the indices categorized in terciles, the chi-square test of independence was employed and reported, together with the adjusted standardized residuals, where values higher than 1.96 indicate more cases than expected, while values lower than −1.96 indicate fewer cases than expected. The effect size is reported through the contingency coefficient (see Table 3). To begin with the SES index, statistically significant differences were found in the driving task. Notably, there are more cases than expected presenting a high SES among those who drive. On the other hand, suffering a crash as a pedestrian presents differences as well; specifically, people with a higher SES report more crashes of this kind, while an average SES implies fewer people who have suffered a crash as pedestrians. Regarding the Health index, no significant differences were found. On the other hand, the Lifestyle index shows differences when contrasted with the Health index, since the adjusted standardized residuals show more cases of poor health than expected in the unhealthy lifestyle group; also, there were fewer cases of poor health in the healthy lifestyle group. Differences in the use of bikes show that there are fewer cases of people not using bikes in the healthy lifestyle group. Regarding bike crashes, more cases than expected were also found in the healthy lifestyle group among those who were involved in this type of crash. The self-reported crashes also showed significant differences: there were more cases than expected when considering unhealthy lifestyles. The lifestyle index also presented differences with the crashes suffered as a driver, with more cases than expected in the unhealthy lifestyle category among those who suffered the crash, and fewer cases in the healthy lifestyle group.
Finally, differences were found in the sex variable, too: there are fewer women with a healthy lifestyle in comparison with the group of men (see Table 3).
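The quantities reported for the frequency contrasts (chi-square statistic, adjusted standardized residuals, and contingency coefficient) can be computed from a contingency table as in the sketch below. It is offered only as an illustration under standard textbook definitions; the function name and the example counts are hypothetical and do not come from Table 3.

```python
import numpy as np


def chi_square_contrast(table):
    """Chi-square statistic, adjusted standardized residuals, contingency coefficient."""
    observed = np.asarray(table, dtype=float)
    n = observed.sum()
    row = observed.sum(axis=1, keepdims=True)  # row totals, shape (r, 1)
    col = observed.sum(axis=0, keepdims=True)  # column totals, shape (1, c)
    expected = row @ col / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    # Adjusted residuals: beyond +/-1.96 marks more/fewer cases than expected.
    adj_resid = (observed - expected) / np.sqrt(
        expected * (1 - row / n) * (1 - col / n)
    )
    contingency_c = np.sqrt(chi2 / (chi2 + n))
    return chi2, adj_resid, contingency_c


# Made-up 2x2 counts, only to show the call.
chi2_demo, resid_demo, c_demo = chi_square_contrast([[10, 20], [20, 10]])
print(round(chi2_demo, 3), round(c_demo, 3))
print(np.round(resid_demo, 2))
```

In this made-up table, the top-left cell has an adjusted residual below −1.96, which would be read as fewer cases than expected.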
For the continuous variables, Student's t-test for independent samples was also applied (see Table 4), considering crash rates and mobility as contrasting variables. Regarding the SES index, it was found that those who drive presented an average SES higher than those who do not. There are no mean differences related to the Health index. Regarding the lifestyle index, it was found that those who reported suffering a crash had a lower lifestyle mean; those who suffered crashes as drivers also presented a lower mean; and, finally, those who rode a bike had a lifestyle mean that was higher than those who did not. Finally, Figure 1 shows a violin plot for the indices that displays the variables' distribution depending on the reported crashes and on the sex variables (as an example of the possible distributions the indices could have across the participants' features). The figure allows us to visualize the predictive power of the indices, observing that the lifestyle index is the one adjusting to the data curve in the most adequate way.
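For reference, the test statistic behind the mean contrasts can be sketched as below. This assumes the classical equal-variance (pooled) form of Student's t-test; the text does not say whether a correction for unequal variances was used. The function name and the index values are hypothetical, and the p value is left out because it would require the t distribution from an additional library.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Made-up index values for two groups, only to show the call.
t_demo, df_demo = student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_demo, df_demo)
```

A negative statistic here means the first group has the lower mean, as when the lifestyle mean of those reporting a crash is compared with that of those who do not.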

Discussion
Understanding that developing countries are severely affected by RTCs and that this issue must be approached from a multidimensional and interdisciplinary perspective, this work has proposed the need to study crash rates and mobility patterns in young Colombians through SES, health and lifestyle as predictors of psychosocial risk factors. To our knowledge, this is the first study of this type performed in Colombia. By means of a reductive approach that explained between 56.3% and 67.9% of the variance, three indices were constructed: SES, Health and Lifestyle, since the evidence points to them as determinant elements to be considered when comprehending who suffers RTCs and why.
To begin with, the variable reduction led us to discard a total of 54 variables, leaving three models composed of 10 variables (SES), 7 variables (Health) and 5 variables (Lifestyle). This reduction also showed that some variables highlighted in other countries, which we expected to be valuable in these models too (such as the number of people in the home, or the hours of sleep), did not apply to the population of young Colombians, emphasizing the idea that it is necessary to perform studies focused on the specific issues of each country [69].

Mobility and RTCs Patterns of Young Colombians
To begin with, this study points out some interesting mobility patterns. The majority of young people report walking in their city (93.76%), but their participation as road actors starts to decrease with the use of vehicles; only 26.38% of them use a bike, and even fewer drive a motor vehicle (12.3%), mostly male drivers. Overall, 17.29% of them report that they have been involved in a traffic crash at least once in their life. However, the proportion of those who have been involved in a crash, regardless of their road role, increased up to 39.9%: this leads us to acknowledge that, as other authors have already pointed out [48], young people are indeed at risk for dangerous situations on the road. Additionally, in both cases men reported a higher number of crashes, following the gender-related tendencies associated with RTCs [70,71].

Socioeconomic status (SES) and Young Colombians
SES is a determinant of health, as well as of the risky and/or protective actions that a person performs in daily life. Vulnerable SES and health imply severe detriments for the individual's quality of life, and these inequalities are highly prevalent in developing countries. The problem is that, as some studies have pointed out, the more crashes happen, the bigger the social and economic burden becomes for a country [9,72]. A heavier burden probably corresponds to a lower investment in the development of laws and in the work on road safety, which is a reason why, in addition to the deaths associated with this phenomenon, we are facing a political and economic issue that negatively feeds back on itself. It is not a surprise that vulnerable subjects could be more involved in crashes in countries with poor or still-developing policies, as we were able to verify with this work.
It was found that the indicators of young Colombians were associated with detrimental social conditions. Following the Colombian socio-economic classification, around 48.65% of participants are below the 3rd (middle-low) status, and the debt variable had a considerable weight on the third PCA component. However, the educational factor, among others, was slightly higher than expected in this population, counterbalancing the model so that the index's terciles point out groups that are more or less similar. This is probably due to the participants mostly living in the country's capital, and to them being financially supported by their families [73], a support that could also have an influence on the health status through the reduction of psychological stressors [74]. However, this variable did not have any weight on the SES model.
Generally speaking, this index highlighted interesting relations (though fewer than expected) contrasting with variables related to young drivers. To begin with, it was found that those who drive are more associated with high SES, and the index mean is higher for them than it is for those who do not drive. This could be explained by the fact that driving allows the person to move more easily in the city, or even to work more easily, and that, of course, having access to a vehicle is linked to an economy that accumulates capital [75]. As we have said before, the driving task is different depending on sex: men drive more, and those who drive report higher salaries.
On the other hand, those who have experienced crashes as a pedestrian are in the high SES tercile. This provides evidence to reject our initial hypothesis, in which we considered that high SES would present fewer relations with crashes, which is a source of concern not only considering that pedestrians are the most vulnerable road actors [76], but also because, according to the theory, high SES should correspond to a protective factor. In this case, beyond the SES the road safety conditions of Colombia should be taken into account, in addition to the alarming death and injury rates of RTCs and the walkability perception [77].

Health, Lifestyle and Young Colombians
Moving on to the health index, it did not show significant contrasts in the present analysis. However, we can notice in Figure 1, in the part addressing the contrast with crashes, how the proposed model includes the majority of the data cloud within its distribution. The absence of significant relations is not necessarily a reason for discarding the construct from the study of RTCs: on the contrary, we believe that the results are caused by the population being young, and by the prevalence of illnesses being quite low, as can be seen in the index's terciles. In addition, research on young drivers' health is more related to their tendency to drink and consume substances [78], which corresponds rather to the field of healthy lifestyle habits (without being excluded from the health sphere).
Actually, it was found that the lifestyle index presents differences from the health index, and there are more cases than expected presenting a poor health in the high/good lifestyle category. However, the relations between this index and mobility seem to be more important (assuming that young people do not get sick so often). To begin with, those who use the bike have a healthy lifestyle and represent a higher proportion of the highest index. This result is important in terms of sustainable mobility, but it also represents benefits for physical [79] and mental health, and it can even have therapeutic effects on some specific populations [80,81]. Nevertheless, it was found that people with a healthier lifestyle suffer more crashes when riding a bike. This is quite concerning, since the message of mobility in the country's context would then be against the promotion of health. As Evans states: "many people say they would cycle more if the roads were safer-the biggest deterrent to more cycling is high traffic speeds and volumes. There is obviously a vicious circle to be reversed here" [79]. Even so, the study of cyclists' behavior must be deepened, since they are road actors too, and they could contribute greatly to the occurrence of crashes.
On the other hand, and complementarily, it was found that those reporting that they have suffered RTCs have an unhealthier lifestyle, and, in addition, drivers also have one of the unhealthiest lifestyles. This result is consistent with the findings of other countries and age groups, where it was concluded that driving can even be considered a sedentary activity: driving versus walking [82]. As a sedentary activity, driving can lead to unhealthy habits that are then quite difficult to change [83], with undesirable effects in the short and long term.
Finally, the sex and age variables showed important differences that, as we have seen, mark some of the patterns of SES and mobility. It was also found that men are those who keep the healthiest life habits in comparison to women, as other works have shown [84], in addition to having a higher mean of bike crashes. Clearly, we have some risky dynamics at play for young males in Colombia, associated with crash rates. However, this applies to women too, especially in terms of their lower participation in road life and their less healthy life habits. Clearly, a gender perspective must be taken into account in order for women to become more active mobility agents, and for men to be less prone to suffer RTCs. For what concerns age, groups older than 21 engage with the road more, they drive more, they use bikes more, but they also suffer more crashes.
As our results suggest, the work that must be carried out in the country is deep. Joining the same call for action as other authors in what concerns youth [85,86], protecting young Colombians from RTCs must be a priority. It is essential to ensure that they have favorable socioeconomic and psychosocial conditions for their development as well, always following a gender perspective. On the other hand, being active in mobility cannot be a synonym of suffering crashes. If a country aims at enhancing mobility and fostering the use of alternative transport means, such as bikes, it must protect its road actors and provide them with safe contexts so that people will take on an active mobility role through the care of health [87].

Conclusions
As a result of this research, we now know that SES, Health and Lifestyle as constructs follow a special cluster in the case of young Colombians, and variables highlighted in other countries were not significant in the case of this population. Moreover, sensitive socioeconomic conditions are quite common in this country, and there is a situation of social and economic vulnerability for young people, who, interestingly enough, present high levels of education.
One of the achievements of this study was the construction of three models that allow for the generation of SES, Health and Lifestyle indices in the population of young Colombians, which provide information on the crashes and mobility patterns they have, as well as on differences between groups. The main findings were: (1) driving is associated with higher SES and with higher incomes. High SES is not necessarily associated with protection, since pedestrians belonging to this group report higher crash rates; (2) the prevalence of illnesses is low, and it does not affect mobility or crash rates in this population; (3) people with better lifestyles use bikes more and report more crashes when using them. Unhealthier lifestyles are associated with more RTCs, and with the driving task; (4) sex and age do establish SES, lifestyle and mobility patterns. Men keep healthier life habits than women, they drive more, they use the bike more, but they also report more crashes than women. Women participate less in road life, and they have less healthy habits. Finally, the results allow us to draw the conclusion that protective factors such as a high SES and a healthier lifestyle are associated with RTCs in this population, and the age group over 21 engages with the road more, drives more and uses the bike more, but also suffers more crashes.
Finally, even though some results may seem obvious, they had not been reported yet, and this is a payoff when working on the prevention of RTCs among young Colombians. Additionally, we hope that this work will leave the readers with more questions than answers, and, thinking of the results, we would like to draw attention to the following questions: is encouraging people to have a better lifestyle through exercise not the objective of health prevention policies? Is leading us to more sustainable and equal cities not the objective of mobility? Then why should taking care of one's health and cities end up being a risk for young people? The work that is left now consists of further researching the population of drivers and non-drivers in order to answer these questions.

Limitations and Future Research
Despite the efforts that were made, the analyses we performed could present limitations, and the existence of confounding variables must be evaluated through other methods. Moreover, the size of the sample must be increased for future applications, not only to widen the number of participants, but also to include people from other geographical places in the country, which would diminish the limitations when researching a developing country [69]. In addition to this, future studies will need to obtain more funding, with the aim of performing samplings that are proportional to age, gender and road users (especially drivers).
Even though the constructed indices account for an acceptable percentage of the explained variance, the construction and proposal of models that explain the SES, health and lifestyle of young Colombians with greater explanatory power must be fostered. In addition, other conceptual models for the construction of indices should be considered, for instance, a cumulative approach instead of a reductive one [88].
We hope that the results of this work will be useful for understanding the dynamics associated with RTCs in a developing country and, moreover, in a population group that is at risk. The variable reduction that we have performed can be useful for future studies, since it will reduce the application time devoted to sociodemographic variables and allow them to focus on deepening the researched topics. Additionally, these indices can be extended to the research of other issues, since their construction does not depend on mobility or crash variables, which were used only for contrast. Regarding future research associated with this study, it is worth highlighting the need to improve the health index and its predictive value. In future works, we hope to collect data that go beyond self-reports, especially concerning health factors, through quick check-ups and reviews of the participants' medical history, together with visits to the participants' homes in order to contrast the socioeconomic information. We hope to do this with at least one subsample, considering the economic and ethical implications. Finally, it would be useful to consider the severity and nature of RTCs for implementing specific prevention strategies.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The datasets and code used and/or analyzed in the present study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A
This appendix presents the items considered for the elaboration of the Socioeconomic Status (SES), Health and Lifestyle indices. Since the categorization of items affects the Principal Component Analysis (PCA) results, the different categorization forms assessed in this research are presented here. Because the PCA results change every time an item is categorized differently, this yields a total of 76 possible items. Table A1 shows the items related to SES; Table A2 shows those related to health; and Table A3 shows those related to lifestyle.
If the objective is to reproduce the methodology, please consider the following aspects: (1) consult the theory related to the population and ground the items in categories that others have already built, but also propose your own categories, based on the researcher's intuition about the data, and compare the results using each categorized item; (2) when a single category represents 95% of the answers, the item must be rejected from the analysis, since it presents practically no variance in the population; (3) be careful with the items' directionality, which must coincide across all items; (4) remember to standardize the items, so that they all share the same scale and can be compared when performing the principal components analysis (PCA).
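As an illustration of aspects (2) to (4), the following minimal sketch shows one possible way of screening, aligning and standardizing items before extracting a first principal component. It is not the code used in this study: the function names (`dominant_share`, `build_index`), the item names, the synthetic data and the use of "at least 95%" as the rejection rule are assumptions made only for this example, and the first component is taken as the index purely for illustration.

```python
import numpy as np

def dominant_share(column):
    """Share of answers that fall in the most frequent category."""
    _, counts = np.unique(column, return_counts=True)
    return counts.max() / counts.sum()

def build_index(items, reverse=(), max_share=0.95):
    """Hypothetical helper: items maps an item name to a 1-D array of numeric codes."""
    kept = {}
    for name, values in items.items():
        values = np.asarray(values, dtype=float)
        # Aspect (2): reject items in which one category dominates the answers.
        if dominant_share(values) >= max_share:
            continue
        # Aspect (3): reverse-code items that point in the opposite direction.
        if name in reverse:
            values = values.max() + values.min() - values
        # Aspect (4): standardize (z-scores) so that all items share one scale.
        kept[name] = (values - values.mean()) / values.std()
    names = list(kept)
    z = np.column_stack([kept[n] for n in names])
    # PCA through the singular value decomposition of the standardized matrix.
    _, singular_values, components = np.linalg.svd(z, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    first = components[0]
    # The sign of a component is arbitrary; orient it so that loadings sum to >= 0.
    if first.sum() < 0:
        first = -first
    scores = z @ first
    return names, scores, explained

# Synthetic demonstration data (assumed, not taken from the study).
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
demo_items = {
    "income": latent + 0.5 * rng.normal(size=200),
    "education": latent + 0.5 * rng.normal(size=200),
    "debt": -latent + 0.5 * rng.normal(size=200),   # opposite direction
    "rare_answer": np.array([1] * 196 + [0] * 4),   # 98% in one category
}
names, scores, explained = build_index(demo_items, reverse={"debt"})
print(names, explained.round(3))
```

Re-running such a routine with each alternative categorization of an item, and comparing the explained variance, mirrors the comparison described in aspect (1).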

<4 >4
All EBC characteristics are added, and a cut-off point is placed in the middle.
Low-low low Intermediate High
All EBC characteristics are added and classified in terciles.
* Item used in the model in order to produce the best results through principal components analysis (PCA), following the rules needed to carry out the method. These were standardized in the presented PCA.
Sleeping less than 7 h per night can lead to adverse health conditions; between 7 and 9 h could be considered a normal range for young adults, while more than 9 h could be enough for young adults and for people recovering from sleep debt or suffering from illnesses [66].
On a scale from 0 to 10, how good is your diet?
Continuous (0 bad diet-10 good diet): Likert scale assumed as continuous.
Bad Average Good: Likert 0-10 categorized in terciles.

Do you walk in your city? No Yes
Do you use a bike in your city? No Yes
* Item used in the model in order to produce the best results through principal components analysis (PCA), following the rules needed to carry out the method. These were standardized in the presented PCA.