Sociobehavioral, Biological, and Health Characteristics of Riverside People in the Xingu Region, Pará, Brazil

This study aimed to evaluate the sociodemographic, behavioral, and biological profile of riverside populations in the Xingu region, Pará, Brazil, and its relationship with the emergence of chronic non-communicable diseases. Characteristics related to health indicators were analyzed, along with the risk factors considered most important. This is a cross-sectional, exploratory, and descriptive study. The sample consisted of riverside people over 18 years of age, of both sexes. The sample size (n = 86) was calculated with a confidence level of 95% and a sampling error of 5%. The unsupervised K-means clustering algorithm was used to divide the sample into groups, and the values were expressed as medians. For continuous and categorical data, the Mann-Whitney and chi-square tests were used, respectively, and the significance level was set at p < 0.05. The multilayer perceptron algorithm was applied to rank the importance of each variable. Based on this information, the sample was divided into two groups: a group with low or no education, bad habits, and worse health conditions, and a group with opposite characteristics. The risk factors considered for cardiovascular diseases and diabetes in the groups were low education (p < 0.001), sedentary lifestyle (p < 0.01), smoking, alcoholism, body mass index (p < 0.05), and waist–hip ratio, with values above the expected being observed in both groups. The educational and social conditions of these communities were the factors that most determined whether health conditions were good or not, and one part of the riverside population was considered healthier than the other.


Introduction
Environmental determinants may be related to lifestyle, and the onset of diseases can affect a given population's health and quality of life. In addition, populations change their epidemiological profile due to the influence of chronic noncommunicable diseases (CNCDs) that are related to multiple causes [1].
According to the Global Burden of Disease Study, in Brazil, CNCDs, such as dyslipidemia, systemic arterial hypertension, and diabetes mellitus, account for about 75% of the causes of death [2], making them a major problem for health services. Thus, surveillance of risk factors, such as sociodemographic, behavioral, and biological profiles, is an effective way to establish primary prevention measures and early detection of cardiovascular diseases [3]. Therefore, it is essential to adopt these practices, since it is with less schooling, bad habits, and worse health conditions, as well as the second profile of people with opposite characteristics.
In this sense, this study used machine learning techniques to evaluate, in an exploratory way, the sociodemographic, behavioral, and biological profile of riverside populations in the middle Xingu region and its relationship to the emergence of CNCDs. It also analyzed the characteristics of these patterns regarding health indicators and which risk factors are considered most important.

Participants
This is a cross-sectional, exploratory, and descriptive study carried out from March to September of 2019 in a riverside community on the middle Xingu River, Altamira, Pará State. The study was developed in the Espelho community, consisting of ~60 families distributed in small communities (Cajueiro, Chicote, Espelho, Jabuti, Itapuâma, Jabota, Transassurini, Espanhol, and Firma) along the banks of the middle Xingu River (Figure 1). Two expeditions were carried out to the Chicote Island community, which is geographically well placed relative to the other communities and was therefore used as the data collection base. Participants were included by convenience: the profile of the interviewees reflects the riverine people present at each field campaign, each of which was widely publicized a month in advance. The sample size resulted from the total number of riverine people present in each field survey. Thus, 40 families agreed to take part in this research, with an average of 4.3 ± 1.59 people per family, totaling 172 individuals. Within this group, an average of 2.2 ± 0.79 people per family were over 18 years old, totaling 86 individuals of both sexes who signed informed consent and answered the questionnaire.
People with cognitive problems and those who refused to participate in the research at any point, even after signing the consent form, were excluded.
To assess the representativeness of the number of individuals who agreed to participate in the research, Slovin's formula was used: n = N/(1 + Ne^2), where n = number of samples, N = total population, and e = error tolerance (level) [25,26], based on a statistical power of 90%. After applying the equation, it was verified that this study would need at least 84 individuals. Therefore, the sample obtained is representative. Furthermore, the sampling effort of this study is similar to that of others that analyzed small riverside communities [27][28][29].
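As a minimal sketch of the representativeness check, Slovin's formula can be computed as below. The function name and the example inputs (a population of 100 adults and an error tolerance of 0.05) are hypothetical and chosen only for illustration; the total population N used by the authors is not stated in this section.

```python
import math


def slovin_sample_size(population: int, error: float) -> int:
    """Minimum sample size by Slovin's formula, n = N / (1 + N * e^2), rounded up."""
    if population <= 0 or not 0 < error < 1:
        raise ValueError("population must be positive and error must lie in (0, 1)")
    return math.ceil(population / (1 + population * error ** 2))


# Hypothetical inputs, not the study's values.
print(slovin_sample_size(100, 0.05))
```

A sample is taken as representative under this rule when the number of participants is at least the value returned for the chosen N and e.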

This study was approved by the Ethics Committee of the Tropical Medicine Center at the Federal University of Pará (Opinion No. 3678493), in compliance with Resolutions 441/2011/CNS e 466/2012/MS.

Procedures
Figure 2 shows the chronological flow and the operational procedures developed during the research. The consent form was read and explained to all participants who could neither read nor write; after agreeing, they put their fingerprints on the document. Data collection from the participants included: (1) sociodemographic characteristics: sex, ethnicity, reading, and education level; (2) behavioral habits: smoking, classification of smokers (light or moderate to heavy categories) [30], alcohol consumption, physical exercise, and healthy diet (including weekly consumption of fish, nuts, fruits, and vegetables) or bad diet (fried foods, soft drinks, sausages, among others); (3) biological predictors: age, blood pressure, body mass index (BMI), waist/hip ratio (WHR), and personal physiological history (dyslipidemia, systemic arterial hypertension (SAH), diabetes mellitus (DM), stroke (CVA), and cardiovascular disease (CVD)).
Blood samples were taken to evaluate the lipid profile: total cholesterol (TC), HDL, LDL, triglycerides (TG), and blood glucose, following established protocols [31,32].

Data Analysis
Assuming that the development of CNCDs is multifactorial [1], that is, that it can arise through different factors (e.g., level of physical exercise, eating habits, smoking), the sample was divided by clustering using the K-means algorithm. This unsupervised method learns how all the variables of a data set are related to each other by dividing the participants into a predetermined number of groups, which can help to discover hidden patterns in the data. The grouping occurs by minimizing the sum of squared distances between each data point and the geometric center of its group, called the centroid [33].
Based on the hypothesis of an association between social factors, health, and habits that have an impact on health, we suggest that there could be two profiles of people: a profile of low or no education, with bad health habits and worse health conditions; and a profile with opposite characteristics. Therefore, the K-means clustering algorithm was used to cluster the sample into two main groups [34].
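The two-group clustering described above can be illustrated with a short sketch of the standard K-means procedure (Lloyd's algorithm) in plain Python. This is not the Orange Data Mining implementation used in the study; the function name, the seed, and the toy rows (two variables already rescaled to the 0 to 1 interval) are assumptions made only for the example.

```python
import random


def kmeans(points, k=2, max_iterations=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k randomly chosen observations
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to the centroid with the smallest squared distance.
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical rescaled data: three participants near (0, 0) and three near (1, 1).
toy_rows = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.2), (1.0, 0.9), (0.9, 1.0), (1.0, 0.8)]
labels, centroids = kmeans(toy_rows, k=2)
print(labels, centroids)
```

The number of groups is fixed in advance (k = 2 here, matching the two hypothesized profiles); which group receives label 0 or 1 depends on the random initialization.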
For the construction of the model, 28 variables were used (two sociodemographic variables, 17 biological variables, and nine behavioral variables). Orange Data Mining 2.27 software was used to determine the groups (Supplementary Materials).
We highlight that only two sociodemographic variables were inserted because some biological variables (sex, age, and ethnicity) are already evidenced in the literature as factors associated with sociodemographic level, so they were counted among the biological variables and not among the sociodemographic variables [4][35][36][37]. The variable "fish consumption" was emphasized, since this food is rich in omega-3 fatty acids, a nutrient known to be associated with the prevention of and protection against cardiovascular diseases [38]. On the other hand, the consumption of fish by the population studied may be a risk factor for mercury contamination, since it has been reported that many species of fish in the Xingu River have concentrations of methylmercury (the organic form of the heavy metal mercury) far above what is tolerable by the human body, and methylmercury has a high power of bioaccumulation along the food chain [39][40][41][42].

Statistical Analysis
Group characteristics were expressed as median and interquartile range, mean and standard deviation, and proportions. The Mann-Whitney and chi-square tests were used for comparisons between groups, with Fisher and Yates corrections, and a p-value < 0.05 was adopted as significant. For significant categorical analyses, adjusted residuals > 2 were adopted to identify the categories responsible for the difference.
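To make the categorical comparison concrete, the sketch below computes a chi-square statistic with the Yates continuity correction for a 2 × 2 table, together with its p-value for one degree of freedom. The function names and the counts are hypothetical and are not taken from the study's tables; the analyses in the study itself were run in SPSS.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-square statistic with Yates continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)  # assumes no empty row or column
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: rows are two groups, columns are habit present / absent.
chi2 = yates_chi2_2x2(12, 8, 5, 15)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}, significant at 0.05: {p < 0.05}")
```

With these toy counts the corrected statistic is about 3.68, which falls just short of significance at the 0.05 level.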
The effect size was considered to support the importance of differences between groups. For continuous data, the interpretations followed the Cohen table (2013). For categorical data, effect sizes were observed through ϕ in 2 × 2 tables, assuming "Null effect" for ϕ < 0.10, "Small effect" for ϕ < 0.30, "Moderate effect" for ϕ < 0.50, and "Large effect" for higher values. In tables > 2 × 2, the sizes were observed by Cramer's V, whose interpretations of null, small, moderate, and large effects were performed, respecting the variations according to the increase in degrees of freedom [43,44].
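The effect-size rules for categorical data can be sketched as follows. The thresholds for ϕ are the ones given in the paragraph above; the function names and the example table are hypothetical, and the degrees-of-freedom adjustment applied to Cramer's V in larger tables is not reproduced here.

```python
import math


def pearson_chi2(table):
    """Uncorrected Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for obs, c in zip(row, col_totals)
    )


def cramers_v(table):
    """Cramer's V; for a 2 x 2 table this equals the absolute value of phi."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(pearson_chi2(table) / (n * k))


def phi_label(phi):
    """Interpretation of phi in 2 x 2 tables, using the cut-offs stated in the text."""
    if phi < 0.10:
        return "Null effect"
    if phi < 0.30:
        return "Small effect"
    if phi < 0.50:
        return "Moderate effect"
    return "Large effect"


# Hypothetical 2 x 2 table of counts (not from the study).
toy_table = [[12, 8], [5, 15]]
phi = cramers_v(toy_table)
print(round(phi, 3), phi_label(phi))
```

For this toy table ϕ is about 0.35, which the stated cut-offs classify as a moderate effect.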
The multilayer perceptron algorithm [45] was used to assess and rank the importance of each variable in determining the groups. It is a supervised machine learning algorithm that, through artificial neural networks, can find non-linear patterns among the variables of a dataset and, in response, predicts a predetermined target variable.
Only significant variables (p < 0.05) were inserted in the input layer to predict the groups defined by K-means, of which four were biological, four were behavioral, and one was sociodemographic. Numerical variables were rescaled to the interval between 0 and 1. The sample was randomly divided into two data sets, with 70.2% of the sample (n = 60 individuals) used to train the algorithm and 29.8% used for the test (n = 26 individuals). The architecture of the algorithm was automatically determined by the software, and the most adequate structure for the neural network was one hidden layer with one neuron. The activation functions used in the hidden layer and output layer were hyperbolic tangent and softmax, respectively. To minimize a possible effect of overfitting (a false response in the network), the algorithm was applied three times, and the application chosen was the one with the lowest cross-entropy error, 0.026 (training sample) and 0.013 (test sample).
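The preprocessing and network structure described above can be illustrated with a small sketch: min-max rescaling to the 0 to 1 interval, a random split into 60 training and 26 test cases, and a forward pass through one hyperbolic-tangent hidden neuron followed by a softmax output, scored with cross-entropy. All function names, weights, and input values are hypothetical; the study's network was fitted in SPSS, and no training step is shown here.

```python
import math
import random


def min_max_rescale(values):
    """Rescale a numeric column to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_indices(n, n_train, seed=0):
    """Randomly partition range(n) into training and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One tanh hidden neuron feeding a softmax layer over the output groups."""
    h = math.tanh(sum(w * xi for w, xi in zip(w_hidden, x)) + b_hidden)
    return softmax([w * h + b for w, b in zip(w_out, b_out)])


def cross_entropy(probabilities, true_index):
    """Cross-entropy error for one case whose true group is true_index."""
    return -math.log(probabilities[true_index])


ages_scaled = min_max_rescale([18, 55, 92])      # hypothetical ages
train_idx, test_idx = split_indices(86, 60)      # 60 training and 26 test cases
probs = forward([1.0, 0.5], w_hidden=[2.0, -1.0], b_hidden=0.1,
                w_out=[1.5, -1.5], b_out=[0.0, 0.0])  # hypothetical weights
print(ages_scaled, len(train_idx), len(test_idx), probs, cross_entropy(probs, 0))
```

In a fitted network, the cross-entropy averaged over the training or test cases is the quantity compared across the repeated runs.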
To calculate the importance of each variable in dividing the groups, a sensitivity analysis was performed based on the combined training and test samples, creating a table displaying each variable's importance rating.
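One generic way to carry out such a sensitivity analysis is sketched below: each input is moved from 0 to 1 while the others are held at their observed values, the mean change in the model's output is recorded, and the results are expressed relative to the most important variable (set to 100%). This is an illustrative perturbation scheme under stated assumptions, not the exact procedure implemented in SPSS; the function names, the stand-in linear model, and the rows are hypothetical.

```python
def sensitivity_importance(predict, rows):
    """Return (importance, normalized) lists from a 0-to-1 sweep of each input."""
    n_features = len(rows[0])
    raw = []
    for j in range(n_features):
        change = 0.0
        for row in rows:
            low, high = list(row), list(row)
            low[j], high[j] = 0.0, 1.0  # sweep input j across its rescaled range
            change += abs(predict(high) - predict(low))
        raw.append(change / len(rows))
    total = sum(raw)
    if total == 0:
        raise ValueError("the model output does not depend on any input")
    importance = [r / total for r in raw]             # shares that sum to 1
    top = max(importance)
    normalized = [100 * i / top for i in importance]  # most important input = 100%
    return importance, normalized


def toy_model(x):
    """Hypothetical stand-in for a fitted network's predicted probability."""
    return 0.8 * x[0] + 0.1 * x[1]


importance, normalized = sensitivity_importance(toy_model, [(0.0, 0.2), (1.0, 0.7)])
print(importance, normalized)
```

Percentages of this normalized kind are read relative to the top variable, so they do not sum to 100%.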
IBM Statistical Package for the Social Sciences (SPSS) 23.0 software was used for the evaluation procedures of the division of the groups.

Results
The K-means algorithm divided the sample of 86 people into two groups: group 1, composed of 47 people with greater sociodemographic vulnerability, poor health habits, and worse health conditions, and group 2, composed of 39 people with opposite characteristics.
Regarding the sociodemographic descriptors (Table 1), when the participants were asked whether they knew how to read, 100% of the people in group 1 said they did not, while group 2 was the opposite, accounting for approximately half of this population. As for education, 59% of people who declared they had not studied belonged to group 1. In addition, 16 people declared having attended elementary school but could not read. The people who declared themselves literate and had attended elementary school (46.8%) or high school (10.6%) were in group 2. Only one person declared having graduated; however, this did not influence the groups among the riverside people. The difference between the groups was highly significant (p < 0.001) for reading and schooling, evidencing a profile of low or no education. The other social criteria did not differ between the groups (p > 0.05).
Table footnote: the letter "a" represents the highest adjusted residual value (above 2), and the subsequent letters characterize successively lower values, representing the categories that influenced the statistical significance (p-value < 0.05) between the groups. *** p < 0.001. ++ Moderate effect, +++ Large effect.
Regarding the behavioral descriptors, it was observed that, in group 1, people smoked for a longer period (0-15 years; Table 2). Characteristics, such as the number of cigarettes smoked per day (Table 2) and the rate of smokers and former smokers (Table 3), did not influence the comparison between the groups. It is important to note that riverside dwellers who reported being smokers (7% of the population) were included in the "light" (≤10 cigarettes/day) or "moderate to heavy" (>10 cigarettes/day) categories.
As for the frequency of alcohol consumption (Table 2), people in group 2 consumed alcohol more frequently each week than those in group 1 (p < 0.05). In addition, of the people who consume alcoholic beverages, 42.6% are in group 2, double the share belonging to group 1 (Table 3).
Overall, 39.5% of the riverine people reported practicing physical exercise. The weekly frequency of physical exercise was lower in group 1 (0.0 times) than in group 2 (1.5 times) (p < 0.01; Table 2). Regarding weekly fish consumption (Table 2), the riverine people consumed fish three to four times a week, and this frequency did not differ between the groups (p > 0.05). There was no difference between the groups as to whether or not they had a healthy diet (Table 3). However, most riverside dwellers stated that they preferred a healthy diet, with 61.5% in group 1 and 70.2% in group 2 (Table 3).
Considering the lipid profile, even if the values are within the acceptable limit [46], the riverside people in group 2 tended to have lower values of total cholesterol and LDL when compared to group 1 (p < 0.05; Table 2). The other biochemical tests, blood pressure measurement, WHR, and physiological background (SAH, CVA, CVD) showed no differences between the groups (p > 0.05). In addition, SAH was within the normal range. However, it was observed that the systolic blood pressure (SBP) was at the limit (SBP = 130 mmHg) recommended by the Brazilian Society of Cardiology [46].
Although the WHR did not differ between the two groups, both presented values above the desired level (Table 2): men had WHR ≥ 0.90, and women had values ≥ 0.85. The BMI of this population differed between the groups, with group 1 showing higher values than group 2 (p < 0.05; Table 2). However, both groups were considered pre-obese according to the WHO [47]. Although SAH and DM did not differ between the groups, in the population as a whole 21.3% reported having SAH and 4.7% reported having both DM and SAH.
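The two anthropometric indicators discussed above can be computed as in the following sketch. The WHR cut-offs (0.90 for men, 0.85 for women) are those stated in the text, and the pre-obese BMI range of 25.0 to 29.9 kg/m² is the standard WHO category; the function names and the example measurements are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def is_pre_obese(bmi_value):
    """WHO pre-obese category: 25.0 <= BMI < 30.0."""
    return 25.0 <= bmi_value < 30.0


def whr(waist_cm, hip_cm):
    """Waist/hip ratio; dimensionless because both measurements share the same unit."""
    return waist_cm / hip_cm


def whr_above_cutoff(ratio, sex):
    """True when the ratio reaches the sex-specific cut-off quoted in the text."""
    cutoff = {"male": 0.90, "female": 0.85}[sex]
    return ratio >= cutoff


# Hypothetical measurements, not study data.
example_bmi = bmi(70.0, 1.60)
example_whr = whr(92.0, 100.0)
print(round(example_bmi, 1), is_pre_obese(example_bmi), whr_above_cutoff(example_whr, "male"))
```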
The biological descriptors observed in group 1 showed people of more advanced ages when compared to group 2, with a mean age of 55 years (49-62 years) (p < 0.001; Table 2). Although age was significant, the effect size was small.
Thus, the multilayer perceptron showed that social factors, such as reading (100%) and level of education (11.8%), were the most important variables in this study, followed by behavioral and biological descriptors, such as smoking time (2.7%), total cholesterol (2.6%), alcohol frequency (2.3%), and others. Although age had an influence, the effect size was small, and it had a lower importance (0.9%) than the other variables, as shown in Figure 3.
Table 3. Categorical data on the behavioral and biological profile of riverine populations in Group 1 and Group 2 in the Xingu River region.

Discussion
This study evaluated the relationship of sociodemographic, behavioral, and biological patterns in an Amazonian riverine population of the middle Xingu River to identify the factors that most influence one's health profile. We assume that this population has two profiles, and our analyses show that there are two distinct groups: a group with low education and poor health, and another group with opposite characteristics. In addition, we found that the educational and social conditions together were the most relevant aspect for having good health conditions in this population.
We observed that all people in group 1 are illiterate, regardless of having attended elementary school or not. Illiteracy may leave people unable to read or interpret, for example, a medication leaflet, the nutritional information tables on processed and industrialized foods, and documents that may provide information on health education. The population's level of schooling was low, mainly in group 1, where 59.9% of the riverside people had no education; although the other group has a better educational index, most of its members have only elementary school. These data are particularly important, since they agree with other studies carried out in the Amazon and in other regions of the world, where a low level of schooling is a risk factor that can influence the development of CNCDs and other diseases [27][48][49][50][51].
In this context, we believe that low education can make it difficult or impossible for the population to understand the importance of healthy lifestyles, such as a good diet, physical activity, and reduced consumption of industrialized products. Recently, a study [52] found that the level of education can influence the dietary profile of the population of London, United Kingdom, in three ways: (i) low education is linked to diets rich in carbohydrates and low in fiber; (ii) low education is also associated with higher consumption of sweets and red meat; and (iii) high education is associated with higher consumption of fruits, vegetables, and fish. In Latin America, it was observed that the main factors that hinder health conditions in this population are socioeconomic inequality (e.g., low education), social/geographical isolation, and cultural, linguistic, and political barriers, which mainly affect indigenous and other populations living in rural areas [53]. Considering Brazilian Amazon populations, studies in riverside populations of the Pará (Tapajós River and Tucuruí Lake) and Amazonas (Negro River and Solimões River) states demonstrate that the educational status and bad habits (food and health) of riverine people are risk factors for the development of the metabolic syndrome and cardiometabolic diseases [21,27,48,54,55].
Regarding the smoking habit, the variable "time of smoking" was the most influential in the study, especially in group 1, which was also considered to have poor health conditions. In riverside communities in the Amazon, smoking was associated with weight gain, the development of DM2, the onset of systemic arterial hypertension, metabolic syndrome, and chronic kidney disease [56,57]. Although the harmful effects of smoking on the body and the emergence of diseases are already widely known [58,59], the data reported here are particularly important, since they involve traditional populations that are historically neglected or excluded by public health policies in Brazil.
Considering alcohol consumption, the results showed that both the consumption of alcoholic beverages and literacy were characteristic of group 2, corroborating previous studies that associate higher alcohol consumption with better education [60][61][62][63]. In riverside communities of the Amazonas and Paraíba states, alcohol consumption was reported to reach 34.8% and 30.4%, respectively [56,64]. These studies corroborate our data, which pointed out that about 30% of riverside people are considered consumers of alcoholic beverages. The level of education shapes social position and opportunities in life, promoting circumstances that favor alcohol consumption [65].
Contrary to previous studies on dyslipidemia in riverine and quilombola populations in the region [4,66], this study showed that the values for cholesterol and LDL were considered desirable for both groups of this population. However, group 2 presented relatively lower values, suggesting that a part of the population has healthier lifestyle habits than the other. Perhaps the limitations to access to processed foods, due to poverty and low financial resources, are an influencing factor [67][68][69].
The effect size of age, despite being significant, was small concerning other sociodemographic and biological variables, such as education and lipid profile, respectively, thus not influencing the health of the riverine [70]. On the other hand, the level of knowledge and age may have a possible association, since the riverside people who had no education were in the same group as those who were older (group 1), as also demonstrated by Peres [71]. This may be mainly associated with the absence of educational programs in the past decades, as well as financial difficulties and difficulties with locomotion and transportation [19].
The population studied presented a typical food profile of riverside populations in the Amazon region, in which riverine people reported having a good diet (natural food) based on vegetables, Brazil nuts, fruits, and mainly fish, the latter item being consumed three to four times per week [55,72]. We highlight the "fish consumption" factor for two interesting aspects: (i) a positive aspect, since there is a high concentration of unsaturated fats and omega-3 fatty acids in this food, which can substantially contribute to the prevention of cardiovascular diseases in this population, since this nutrient has been proven to prevent this disease [38,73]; and (ii) a negative aspect, since there is a strong possibility that this food is contaminated by methylmercury, which in turn can bioaccumulate in riverine people who consume it [39][40][41][42]. Indeed, Souza-Araujo and collaborators (2022) [42] identified fish contamination and mercury biomagnification in food webs in the Belo Monte Hydroelectric Power Plant, on the middle Xingu River, Pará state. Recent studies have reported mercury contamination of traditional (e.g., riverine and indigenous) and urban communities in the lower Tapajós basin and in the Tucuruí Lake influence area, Pará state [40,41,74]. Therefore, it is possible that the population studied may be at risk of methylmercury contamination, since they may be consuming contaminated fish.
We observed that a large part of the population studied had blood pressure within the limits of what is recommended by the Brazilian Cardiology Medical Society [46]. This condition may be associated with possible mercury exposure, since concentrations of this metal in the bloodstream are positively associated with blood pressure, hypertension, and other cardiovascular diseases. In addition, the exposure dose is an important factor in determining the effects on hypertension [75][76][77].
Despite these results, the riverside population of the current survey reported having some CNCDs, such as SAH and DM. Even though these variables did not influence this study, 21.3% of the general population reported having SAH and 4.7% declared having DM and SAH. Studies carried out in riverside communities in the interior of Amazonas show a higher prevalence of SAH (30.7%) and DM (8.9%), whereas, in the interior of Pará, 20.47% reported being hypertensive and 4.13% reported having diabetes, a prevalence consistent with our findings [56,78].
Regarding the weekly frequency of physical exercise, a low frequency was evidenced in both groups, and less than half of the population exercises. Such results challenge the idea that populations residing in rural areas have higher levels of physical activity, which is most often represented by walking [79][80][81].
Anthropometric indicators showed that BMI and WHR presented values above the expected in both groups evaluated, as well as in both sexes, considered pre-obese for BMI and high risk for metabolic complications for WHR, suggesting an excess of intraabdominal adipose tissue, as also evidenced by other studies [47,82,83]. According to Pereira et al. [84], the WHR has a greater predictive capacity for hypertension, allowing greater discrimination of people at risk of chronic diseases. However, BMI and WHR were not determining predictors among the risk factors for classifying the importance of quantitative variables in the present study, thus not indicating a possible association with CNCDs.
In this sense, public policies should be worked on with these riverside populations, which lack medical care. The study carried out by Machado et al. [20], with riverside populations of the lower Madeira in Rondônia, used telemedicine as a technological resource for health promotion and prevention and developing the population's responsibility for a better quality of life in the region. The Brazilian educational system does not have public policies of national scope aimed specifically at the traditional populations (e.g., indigenous, riverine, quilombolas, etc.), so the only program that exists does not meet the breadth of distribution of these populations in the national territory. For example, the government of the state of Pará launched the program "Modular Education Organization", which is regulated by Law No. 7806 of 29 April 2014, and that aims to bring education and literacy to traditional populations and the interior of the state. However, although this program has significantly contributed to the schooling of approximately 36,000 students from 465 locations in the interior of the state [85], it is still limited and cannot serve more distant and isolated communities, such as the population of the region around the middle Xingu River, so this population remains unassisted by public education policies. Therefore, given the importance of education for the critical understanding of people regarding their social context, robust public policies for schooling and health education (in cooperation with states, municipalities, and the federal government) are essential to mitigate the impact of this factor on the quality of life and health of traditional populations.
Overall, this study could confirm the hypotheses based on the literature, that a person with no or less education is unlikely to have knowledge and information about how harmful certain behaviors are. Therefore, they will not change them and, as a result, will have worse health indicators. Only alcohol intake showed a different pattern from the other factors, since its influence could be observed within the best level of education, probably because education provides better social conditions and opportunities, favoring this consumption. Finally, healthy eating, mainly due to the high consumption of fish, was considered a positive criterion for the health of riverside dwellers.
The riverside population studied has an important peculiarity, which is its relative geographic isolation, since this population is surrounded by numerous waterfalls in the middle Xingu River, which makes navigation in this region difficult. In addition, with the installation of the Belo Monte Hydroelectric Power Plant, many families lost their land and were "relocated" to another region, so that the local riverine population was reduced. Therefore, the difficult access to the collection site, the high financial cost of the study, and the population size profile of the studied community (considered small and dispersed along the middle Xingu River) may have hindered the adherence and participation of a greater number of riverine people in the survey. However, we believe that the sampling effort did not affect our results, which are in agreement with other studies that analyzed small or large Amazonian riverside communities.
Finally, although this study brings relevant results, especially regarding the degree of importance of health indicators, the research had some limitations, such as its cross-sectional design. Longitudinal studies are needed to better understand how the population would respond to an intervention and monitoring by health professionals targeting the most important variable: education. Could it be that by encouraging and intervening in education, positively, the other indicators would also improve? Another limitation was that the procedure was not conducted in a laboratory, and the process was not systematized and controlled. Despite this, the study valued a view of the subject in a real environment.

Conclusions
There are two patterns in the riverside community. One is a pattern of people with low education, longer exposure to cigarettes, and low frequency of physical exercise, which can be considered risk factors for cardiovascular diseases and diabetes in this population and consequently can affect their health. The other pattern has opposite characteristics and is considered healthier. On the other hand, food, mainly due to the high consumption of fish, was considered a positive criterion for the health of riverside dwellers in both groups. This is the first study of this population, which has important peculiarities that distinguish it from other Amazonian riverine populations. Finally, the most important variables for determining these groups were sociodemographic (education), behavioral (smoking and weekly alcohol consumption), and health (cholesterol).
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph20085542/s1, Table S1: List of variables used in the construction of the clustering model by the K-means algorithm.