Periodontitis and Metabolic Syndrome: Statistical and Machine Learning Analytics of a Nationwide Study

This study aimed to analyze the associations between periodontitis and metabolic syndrome (MetS) components and related conditions while controlling for sociodemographics, health behaviors, and caries levels among young and middle-aged adults. We analyzed data from the Dental, Oral, and Medical Epidemiological (DOME) record-based cross-sectional study that combines comprehensive sociodemographic, medical, and dental databases of a nationally representative sample of military personnel. The research consisted of 57,496 records of patients, and the prevalence of periodontitis was 9.79% (5630/57,496). The following parameters retained a significant positive association with periodontitis in the subsequent multivariate analysis (from the highest to the lowest odds ratio (OR)): brushing teeth (OR = 2.985 (2.739–3.257)), obstructive sleep apnea (OSA) (OR = 2.188 (1.545–3.105)), cariogenic diet consumption (OR = 1.652 (1.536–1.776)), non-alcoholic fatty liver disease (NAFLD) (OR = 1.483 (1.171–1.879)), smoking (OR = 1.176 (1.047–1.322)), and age (OR = 1.040 (1.035–1.046)). The following parameters retained a significant negative association (protective effect) with periodontitis in the multivariate analysis (from the highest to the lowest OR): the mean number of decayed teeth (OR = 0.980 (0.970–0.991)); North America as the birth country compared to native Israelis (OR = 0.775 (0.608–0.988)); urban non-Jewish (OR = 0.442 (0.280–0.698)); and urban Jewish (OR = 0.395 (0.251–0.620)) compared to the rural locality of residence. Feature importance analysis using the eXtreme Gradient Boosting (XGBoost) machine learning algorithm with periodontitis as the target variable ranked obesity, OSA, and NAFLD as the most important systemic conditions in the model. We identified a profile of the “patient vulnerable to periodontitis” characterized by older age, rural residency, smoking, brushing teeth, cariogenic diet, comorbidities of obesity, OSA and NAFLD, and fewer untreated decayed teeth. North American-born individuals had a lower prevalence of periodontitis than native Israelis. This study emphasizes the holistic view of the MetS cluster and explores less-investigated MetS-related conditions in the context of periodontitis. A comprehensive assessment of disease risk factors is crucial to target high-risk populations for periodontitis and MetS.


Introduction
Periodontitis is a multifactorial chronic inflammatory disease induced by dysbiotic dental biofilm [1]. Despite advances in understanding and treatment, global periodontitis prevalence is increasing [2]. From 1990 to 2019, the age-standardized prevalence rate of severe periodontitis increased by 8.44% worldwide, and in 2019, there were 1.1 billion (95% uncertainty interval: 0.8-1.4 billion) prevalent cases of severe periodontitis globally [2]. Periodontitis is the primary cause of tooth loss in adults worldwide, adversely impacting mastication, nutrition, appearance, and life quality [3]. Azzolino et al. reviewed oral health determinants of malnutrition, describing that a diet poor in micronutrients may lead to a greater inflammatory response of periodontal tissues, and at the same time, the loss of dental elements due to periodontitis can negatively affect the nutritional status of the patient, resulting in discomfort during chewing and leading to a selection of soft and easy-to-chew foods [4]. Therefore, these processes can further exacerbate sarcopenia and frailty [4]. Moreover, with increasing age, people may experience physical and cognitive decline, which may result in poor oral hygiene, leading to an increased incidence of periodontitis [4].
Periodontitis is a focus in the "periodontal medicine" field, linked to around 50 systemic diseases, including metabolic syndrome (MetS) [5,6]. MetS represents a cluster of cardiometabolic risk factors that can co-occur in an individual, including elevated plasma glucose, central obesity, dyslipidemia, and hypertension [7]. MetS has been linked to obesity-related disorders such as non-alcoholic fatty liver disease (NAFLD) [8] and obstructive sleep apnea (OSA) [9]. MetS poses a significant public health burden, with a global prevalence of approximately 20-25% [7]. Multiple definitions of MetS exist, all agreeing on the defining components but differing in the suggested diagnostic criteria [10].
Both MetS and periodontitis result from multifactorial causes linked to immune and inflammatory responses [11], and a bidirectional relationship exists between them [12].
The underlying mechanisms linking periodontitis to MetS include inflammatory mechanisms where proinflammatory cytokines originating from the gingiva infiltrate the bloodstream and increase oxidative stress, which may facilitate insulin resistance and atherosclerotic changes, and both may lead to MetS development [13]. The connection is bidirectional, as inflammatory cytokines resulting from MetS components may increase the oxidative stress in the gingiva [13]. Elevated blood glucose induces various proinflammatory effects impacting multiple bodily systems, particularly the periodontal tissues [12]. Adipokines from adipose tissue, such as TNF-α, IL-6, and leptin, contribute to inflammation. The hyperglycemic state leads to the deposition of advanced glycation end products (AGEs) in periodontal tissues, triggering local cytokine release and altered inflammatory responses via the receptor for AGE (RAGE) [12]. Diabetic conditions also modify neutrophil function, intensifying the respiratory burst and delaying apoptosis, thereby increasing periodontal tissue destruction [12]. Local cytokine production in periodontal tissues may reciprocally influence glycemic control through systemic exposure, impacting insulin signaling [12]. These factors collectively contribute to dysregulated inflammatory responses in periodontal tissues, exacerbated by the chronic bacterial challenge in the subgingival biofilm and further compounded by smoking [12,13].
Evidence for these mechanisms was revealed, for example, by Ghorbani et al., who demonstrated that the combination of ischemic heart disease and periodontitis is associated with a lower activity of Paraoxonase-1 (PON-1), a new biomarker representing both anti-atherosclerotic and antioxidant activity [14]. Narendran et al. assessed the myocardial strain among controlled hypertensive patients with periodontitis and showed that an increase in the periodontal inflamed surface area (PISA) score may cause mild alterations in the global longitudinal strain (GLS) score, which could indicate the possible influence of periodontitis on myocardial activity [15]. Moreover, homocysteine was suggested as a marker of inflammation in patients with periodontitis, and Khudan et al. demonstrated in rats that chronic hyperhomocysteinemia enhances disturbances in bone metabolism in lipopolysaccharide (LPS)-induced periodontitis [16].
However, the literature is conflicting regarding the associations between MetS and periodontitis, with some studies reporting a positive association [6,17,18], whereas others report conflicting or null results [19,20]. A systematic review with meta-analysis concluded that the study effect size was influenced by the year of publication, study design, and MetS diagnostic criteria, contributing to inter-study variability [20]. Prior contradictory results may stem from overlooking key confounders in the common risk factor model, such as age, socioeconomic status, smoking, obesity, nutrition, and hygiene [21].
Due to the demand for comprehensive data, interest has grown in using electronic medical records (EMRs) for big data analysis to study dental-systemic interactions using a machine learning (ML) approach [22]. The use of artificial intelligence (AI) technology in clinical practice is an emerging and debated topic, both for its possible diagnostic and therapeutic implications [23]. Through ML, it is possible to exploit multiple variables using easy and rapid extraction, such as the clinical history of the patients as well as anthropometric or demographic characteristics, to facilitate the identification of otherwise complex pathologies [23]. In dentistry, several studies applied ML for periodontitis classification, producing encouraging results [24][25][26][27][28]. Limitations of these studies include using data from a single institution [25], not consistently including social determinants of health and systemic conditions or relying on patient-reported data, and not involving EMRs. Patel et al. stressed that analyzing common periodontitis risk factors (e.g., periodontal pockets) lacks clinical utility, given clinician awareness. They suggested focusing on the underlying risk factors as targets for preventive interventions [25]. Furthermore, prior studies employed either ML or statistics; combining both allows the two sets of findings to be compared and cross-checked.
The rationale of this study involves the use of a novel methodology combining statistical and ML approaches in artificial intelligence technology to study the association between periodontitis and MetS, by utilizing a big data repository of a nationally representative population of young-to-middle-aged adults. This comprehensive repository allowed us to examine parameters from different facets of life, namely sociodemographics, health-related habits, medical history, and dental history, and thus consider the presence of numerous confounding parameters that had been gathered using a strict protocol for dental and medical disease definitions. This study aimed to address the unmet needs by analyzing the associations between periodontitis and MetS components and related conditions among young-to-middle-aged adults while controlling for sociodemographics, health behaviors, and caries levels. Our hypothesis suggests a positive association between periodontitis and certain MetS-related conditions. By addressing these research objectives, our aim is to advance periodontal medicine research and illuminate potential avenues for future clinical applications.

Data Source
This cross-sectional study is a part of the record-based nationwide Dental, Oral, and Medical Epidemiological (DOME) big data study [22,29,30,31]. These earlier articles featured and detailed the DOME study, with one dedicated to its protocol and methods [29]. Briefly, to achieve the objectives, we utilized the DOME study's database, a large-scale, structured, and comprehensive repository that integrates sociodemographic, medical, and dental databases from a nationally representative population of young-to-middle-aged adult military personnel within the Israel Defense Forces (IDF). Instead of relying solely on patient-provided information, this study cross-referenced data from three electronic databases: (1) dental patient records (DPRs), (2) medical records (i.e., computerized patient records (CPRs)), and (3) sociodemographic records. Data extraction was performed by the IDF Medical Information Department and was completely anonymous [29]. As detailed in "DOME Protocol and Study Methods", the data warehouse (DWH) of the IDF Medical Corps combines information from several operational source systems into one comprehensive database. The collected data are classified in the DWH into Oracle database schemas according to the data world of the original operational sources (e.g., CPR schema and DPR schema). Data management from the DWH was performed using Statistical Analysis System (SAS) version 7.1.

Ethical Clearance
This study was approved by the Medical Corps Institutional Review Board, with approval number IDF-1281-2013, and conformed to the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines. This study was granted informed consent exemption as it involved the retrospective analysis of anonymous electronic records.

Study Eligibility Criteria
Inclusion criteria: Men and women, aged 18-50 years, who visited the IDF dental clinics between 1 January 2015 and 1 January 2016, whose information is recorded in the sociodemographic, medical, and dental military electronic records, and who had periodontal status examinations recorded in the dental records were included in this study.
Exclusion criteria: Subjects lacking this information in these databases were not included.

Variables' Definitions
The definitions of variables are detailed in "DOME Protocol and Study Methods" [29], and below, we provide a concise overview.

The Dependent Variable: Periodontitis
As described in our previous publications [29][30][31], periodontitis was defined according to the American Academy of Periodontology guidelines [32] (data collected before the new classification was published). Furthermore, due to possible pseudo pockets, the assessment of radiographic bone loss was deemed essential, as defined by crest cement junction distance exceeding 2 mm in more than one tooth, without observable causes (e.g., faulty restorations or overhangs, interproximal cavitation, etc.) [29].
Education: Educational attainment categorized as high school and below, technical college, or academic;
Locality of Residence: Classification into urban Jewish, urban non-Jewish, or rural areas;
Birth countries: North America, Eastern Europe, Western Europe, Ethiopia, Africa, Asia, South America, and Israel.
All sociodemographic variables are listed in Table 1.

Health Behaviors
The following self-reported health behaviors were included (yes/no): tooth brushing (at least once a day), current smoking status, consumption of cariogenic diet (snacks and/or sweets food intake between/instead of meals), and consumption of sweetened beverages (above one glass per day).

Definition of Medical Diagnoses and Auxiliary Test Results
The CPR database uses the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) as the basis for diagnosis. The extracted diagnoses and auxiliary test results used in the evaluation of MetS components are displayed in Table 2.

Analytical Approach
A novel integrated method using statistical and ML models was employed for data analysis.

Statistical Analyses
The IBM (International Business Machines) SPSS (Statistical Package for the Social Sciences) software version 28.0 (Chicago, Illinois, United States) was used to conduct the statistical analyses.
Descriptive statistics: Continuous variables are displayed as means and standard deviations. Categorical variables are displayed as frequencies and percentages.
Univariate analysis: The associations between periodontitis and the independent variables were analyzed with Pearson's chi-square (χ²) or likelihood ratio test for categorical parameters and a non-paired t-test for continuous variables. To calculate the odds ratio (OR), we employed binary logistic regression analysis for binary categorical dependent variables and linear regression analysis for continuous dependent variables.
False discovery rate (FDR) procedure: Controlling type I errors in multiple hypothesis testing is essential in biomedical research. Therefore, following the univariate analysis, we applied the Benjamini-Hochberg (BH) procedure, which balances control of the FDR against statistical power.
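The BH step-up rule can be sketched as follows. This is a minimal illustration only: the function name, the FDR level of 0.05, and the example p-values are assumptions and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Sort indices by ascending p-value, remembering the original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only (not from the study).
example_p = [0.005, 0.025, 0.028, 0.035, 0.20]
print(benjamini_hochberg(example_p))
```

In this example the second p-value (0.025) exceeds its own threshold (0.02) but is still rejected, because a larger p-value further down the ranking passes its threshold. This step-up behavior is what distinguishes BH from a simple per-test cut-off.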

Analysis of multicollinearity: A linear regression model that included collinearity tests was fitted with the independent variables that retained statistical significance following the BH procedure. When variables were highly correlated, only one of them was retained, selected according to context. Although collinearity is typically flagged at a variance inflation factor (VIF) > 10, we applied a stricter limit of VIF = 2.5, due to the potential issues in weaker models.
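For orientation, the VIF limit can be restated in terms of shared variance using the standard definition of the VIF, where R_j^2 is the coefficient of determination obtained by regressing predictor j on all other predictors. The notation below is ours and does not appear in the source.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\quad\Longleftrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}
% VIF_j = 2.5  corresponds to  R_j^2 = 0.60
% VIF_j = 10   corresponds to  R_j^2 = 0.90
```

Under this definition, a limit of 2.5 excludes any predictor for which the remaining predictors explain more than 60% of its variance, whereas the conventional limit of 10 tolerates up to 90%.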
Multivariate analysis: A multivariate binary logistic regression analysis was performed, including independent variables that were statistically significant following the BH procedure and were not collinear.
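The ORs and confidence intervals reported from a logistic regression are obtained by exponentiating the fitted coefficients. The sketch below assumes Wald-type 95% intervals (the source does not state the interval type); the function name and the coefficient values are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval (95% for z = 1.96)."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, chosen for illustration only.
or_value, ci_low, ci_high = odds_ratio_with_ci(beta=0.50, se=0.10)
print(f"OR = {or_value:.3f} ({ci_low:.3f}-{ci_high:.3f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level. For a continuous predictor, the OR applies per one-unit increase in that predictor.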

Machine Learning (ML) Models
The Python scikit-learn package [33] was used to run ML models.
XGBoost ML algorithm: We utilized eXtreme Gradient Boosting (XGBoost), a powerful gradient-boosting framework for supervised learning problems that can be used for both regression and classification applications [34,35]. XGBoost iteratively trains decision trees on the residuals of a previous iteration, where the residuals are the differences between the actual and predicted values [34,35]. The algorithm also employs regularization techniques to prevent overfitting and improve model generalization [34,35]. We used the XGBoost algorithm to generate a list of prioritized variables according to their importance in the task of periodontitis classification [35]. We varied the ratios of training and testing (e.g., 70-30% and 80-20%) and conducted five-fold cross-validation [36].
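The residual-fitting idea behind gradient boosting can be illustrated with a deliberately simplified sketch. It is not XGBoost and not the study's pipeline: it uses a single feature, depth-one trees (stumps), squared-error loss, and no regularization. All function names, parameter values, and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold on one feature that best splits the residuals
    (least squared error). Assumes xs holds at least two distinct values.
    Returns (threshold, left_mean, right_mean)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean outcome, then repeatedly fit a stump to the
    current residuals and add a damped version of it to the predictions."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, predictions


def mse(ys, predictions):
    """Mean squared error between outcomes and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Toy data for illustration only (not from the study).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
base, stumps, fitted = boost(xs, ys)
print(mse(ys, [base] * len(ys)), mse(ys, fitted))
```

Each round reduces the training error relative to the previous one, which is the behavior the paragraph above describes. XGBoost additionally penalizes tree complexity and, for classification, works with gradients of a logistic loss rather than raw differences.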
Sensitivity analysis: Two additional ML algorithms were used to determine the features' importance, to confirm the validity of the XGBoost ML model: Gini importance [37] and information gain [38].
Gini Importance ML algorithm: Gini importance is a technique for determining the importance of input features in a random forest model [37,39]. It calculates the overall reduction in the Gini impurity index to which each feature contributes across all decision trees in the forest. The Gini impurity index measures how mixed the classes of the target variable are within a decision tree node, with a value of 0 indicating a perfectly homogeneous node [37,39].
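As a small worked example of the quantity being accumulated, the sketch below computes the Gini impurity of a node and the impurity reduction produced by one split. The function names and toy labels are hypothetical; a random forest sums such reductions, weighted by node size, over every node that splits on a given feature.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = ((len(left) / n) * gini_impurity(left)
                         + (len(right) / n) * gini_impurity(right))
    return gini_impurity(parent) - weighted_children


# Toy labels (1 = periodontitis, 0 = no periodontitis); illustrative only.
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left, right = [1, 1, 1, 0], [0, 0, 0, 0]
print(gini_impurity(parent), gini_decrease(parent, left, right))
```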
Information gain ML algorithm: Information gain (using entropy) [38] is a feature selection method commonly used in ML that measures the reduction in entropy, or uncertainty, of a dataset when a feature is used to partition it. This algorithm calculates the information gain for each feature by comparing the entropy of the original dataset with the weighted entropy of the subsets obtained after splitting on that feature. Features with high information gain are considered more informative and are selected for use in the final model [38].
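The comparison of entropies can be made concrete with a short sketch for a categorical feature. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy of the labels minus the weighted entropy of the label
    subsets obtained by grouping the records on the feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data, illustrative only: outcome labels and two candidate features.
outcome = [1, 1, 0, 0]
informative = ["yes", "yes", "no", "no"]    # separates the classes perfectly
uninformative = ["yes", "no", "yes", "no"]  # carries no information
print(information_gain(outcome, informative),
      information_gain(outcome, uninformative))
```

A feature that separates the classes perfectly recovers the full entropy of the outcome, while a feature unrelated to the outcome yields a gain of zero.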

Compliance with Reporting Guidelines in Machine Learning Research:
The adherence of the study to reporting guidelines was assessed using the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) checklist (www.tripod-statement.org, accessed on 1 November 2023). This checklist comprises 20 main items and 31 subitems, covering various aspects of a prediction model's validation, such as title, abstract, methods, results, and funding disclosure. Each item received a binary rating of "1" for adherence or "0" for non-adherence. The assessment showed that this research adheres to all applicable TRIPOD items, with three items deemed irrelevant. Detailed documentation of compliance with each TRIPOD item is provided.

Results
This research consisted of 57,496 records of patients, and the prevalence of periodontitis in the study population was 9.79% (5630/57,496). Table 1 presents the associations of periodontitis with sociodemographics, health behaviors, and the mean number of untreated decayed teeth. The following parameters were positively associated with periodontitis: 1.
Periodontitis was negatively associated with the mean number of untreated decayed teeth (OR = 0.972 (0.961-0.982)) and with North America as the birth country (OR = 0.715 (0.596-0.898)). There were no statistically significant associations between periodontitis and sex (OR = 1.028 (0.966-1.095)) (Table 1). Patients with periodontitis had statistically significantly higher values in most auxiliary tests, but the associations were weak, with ORs close to 1 (Table 2).
We applied the BH procedure to control the FDR (see Table 3). Only the independent variables that were statistically significant following the BH procedure entered the next step of collinearity statistics. The results of collinearity statistics shown in Table 4 ruled out collinearity (VIF < 2.5). Subsequently, independent variables that were statistically significant following the BH procedure and were not collinear were used for the multivariate binary logistic regression analysis (Table 4).
The following parameters retained a significant positive association with periodontitis in the multivariate analysis (from the highest to the lowest OR): brushing teeth (OR = 2.985 (2.739-3.257)); OSA (OR = 2.188 (1.545-3.105)); consumption of cariogenic diet (OR = 1.652 (1.536-1.776)); NAFLD (OR = 1.483 (1.171-1.879)); smoking (OR = 1.176 (1.047-1.322)); and age (OR = 1.040 (1.035-1.046)) (Table 4). The following parameters retained a significant negative association with periodontitis in the multivariate analysis (from the highest to the lowest OR): the mean number of untreated decayed teeth (OR = 0.980 (0.970-0.991)); North America as the birth country (OR = 0.775 (0.608-0.988)) compared to native Israelis; urban non-Jewish (OR = 0.442 (0.280-0.698)) and urban Jewish (OR = 0.395 (0.251-0.620)) compared to the rural locality of residence (Table 4).
In the next stage, we employed ML algorithms for the task of periodontitis classification. The Gini importance and information gain algorithms yielded outcomes similar to those of the XGBoost model. Consequently, we report the results attained using the XGBoost algorithm in Figure 2. The model yielded an area under the curve (AUC) of 0.63, a precision of 0.19, an F1 score of 0.275, and a recall of 0.506. An AUC between 0.6 and 0.7 is considered acceptable discrimination [40]. With a precision nearly double the periodontitis prevalence (19% vs. 9.79%), a positive prediction from the XGBoost model is about twice as likely to be a true case as a randomly selected patient. The model ranked the significance of the features in relation to the target variable (periodontitis) as follows: age was ranked first, followed by cariogenic diet (second), and smoking (third). The top-ranked MetS-related conditions were obesity (fifth), followed by OSA (twelfth), and NAFLD (thirteenth).
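The reported metrics can be cross-checked against one another, because the F1 score is the harmonic mean of precision and recall. The sketch below solves that relation for precision using the F1 and recall values given above and compares the result with the study prevalence. The function name is hypothetical.

```python
def precision_from_f1_and_recall(f1, recall):
    """Solve F1 = 2 * P * R / (P + R) for the precision P."""
    return f1 * recall / (2 * recall - f1)


# F1 and recall as reported for the XGBoost model.
precision = precision_from_f1_and_recall(f1=0.275, recall=0.506)

# Prevalence as reported for the study population (5630 of 57,496 records).
prevalence = 5630 / 57496

# Ratio of precision to prevalence: how much more often a positive
# prediction is a true case than a randomly selected record.
lift = precision / prevalence
print(round(precision, 3), round(prevalence, 4), round(lift, 2))
```

The implied precision is approximately 0.19, which agrees with the 19% figure used in the comparison with prevalence.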

Discussion
In this research, we analyzed the associations of periodontitis with MetS among a nationwide population of 57,496 young and middle-aged adults, utilizing the DOME comprehensive repository. This provided a unique opportunity to cross-check sociodemographic, dental, and medical parameters against periodontitis diagnosis and simultaneously analyze important confounders and mediators on an unmatched scale. In line with the current periodontitis classification, the collected data encompassed age, smoking habits, and medical comorbidities. The overall results obtained using our novel method combining statistical and ML approaches allowed us to establish a profile of the "patient vulnerable to periodontitis" that includes older age, rural locality, smoking, brushing teeth, cariogenic diet consumption, obesity, OSA, NAFLD, and having fewer untreated decayed teeth. Individuals born in North America had a lower prevalence of periodontitis than native Israelis.
Periodontitis prevalence: Among the nationwide sample of the Israeli population aged 18-50 years, periodontitis prevalence was 9.79%. According to the Global Burden Study, the prevalence of periodontitis increased from 1990 to 2019 [2], affecting almost 50%, and its severe form affects 9.8% (1990-2017) [41]. Our lower prevalence compared to the literature may result from our focus on young-to-middle-aged adults and the inclusion of radiographic bone loss in the periodontitis definition.
Sociodemographic parameters: For the task of periodontitis classification, age was ranked first in the ML model (Figure 2) and retained a significant association in the statistical multivariate analysis (Table 4). In agreement, previous studies demonstrated that the risk of periodontitis rises with age globally, with a notable increase between the third and fourth decades of life [2]. Greater periodontal destruction in the elderly reflects lifetime disease accumulation [11], and thus age is recognized as a risk factor for the future progression of alveolar bone loss [42]. A geroscience approach links periodontitis and MetS as phenotypic expressions of accelerated biological aging [43].
Our study, like prior research [44], revealed a positive association between higher education and periodontitis (Table 1), attributed to multicollinearity between age and education, which prompted the exclusion of education from the multivariate analysis.
Sex was ranked fourth in the ML feature importance (Figure 2), although the association between sex and periodontitis did not reach statistical significance (Table 1). While systematic reviews provide evidence for a lower prevalence of periodontitis among women, men do not appear to have a higher risk for rapid periodontal destruction than women [45]. Previous studies related sex-based periodontal differences to oral care behaviors of men rather than genetics [46], underscoring our holistic analysis.
Consistent with prior investigations [2], periodontitis was more prevalent among those with low and medium SES (Table 1). However, SES did not retain a statistically significant association with periodontitis in a multivariate analysis (Table 4) and ranked tenth in the ML feature importance model (Figure 2), highlighting the significance of other factors.
Rural locality retained statistical significance even following multivariate analysis (Table 4) and ranked eleventh in the ML feature importance model (Figure 2), consistent with prior research [47]. Since military dental clinics are spread throughout the country, we attribute our findings to the consequences of living in a rural area, not clinic proximity.
An important socioeconomic determinant is the birth country, particularly in Israel, which is known as an immigrant state. North American-born individuals had a lower prevalence of periodontitis than native Israelis (Table 4), and the birth country parameter (Israeli natives vs. immigrants) was ranked sixth in the ML feature importance model (Figure 2). This is in line with the Global Burden of Severe Periodontitis Study, 1990-2019 [2], which demonstrated less prevalent cases of periodontitis in high-income North America and Canada compared to Israel [2].
Supporting our holistic approach to identifying novel sociodemographic parameters as predictors of periodontitis, Alqahtani et al. recently published a cross-sectional study of the 2013-2014 National Health and Nutrition Examination Survey (n = 4555) and identified age and education level as the two most important predictors for the presence and severity of periodontitis using ML models [26]. Other significant factors included alcohol use, type of medical insurance, sex, and non-white race [26].
Health behaviors: In the ML model, cariogenic diet consumption ranked second, smoking third, sweetened beverage consumption ranked eighth, and teeth brushing ranked ninth (Figure 2). These parameters were also positively associated with periodontitis in the multivariate analysis (Table 4). Interestingly, patients with periodontitis exhibited better teeth-brushing habits and fewer decayed teeth. This may be due to the instructions provided to them or because they were being observed as part of the study (i.e., the Hawthorne effect). However, information bias is unlikely as patients with periodontitis also reported higher rates of smoking and consumption of cariogenic diet and sweetened beverages, all of which are known as unhealthy habits.
Decayed teeth: Periodontitis patients had fewer decayed teeth (Table 4), and decayed teeth were ranked seventh in the ML feature importance model (Figure 2). In agreement, Sewon et al. observed a higher prevalence of caries-free teeth and molars in individuals with periodontitis [48]. Conversely, other studies found that periodontitis is more prevalent in the presence of caries [49].
MetS-related conditions: Our primary objective was to analyze the association between periodontitis and MetS-related conditions using statistical and ML models. Interestingly, the only MetS-related conditions that retained statistical significance with periodontitis following the multivariate statistical analysis were NAFLD and OSA (Table 4), and the highest-ranked MetS-related conditions in the ML model were obesity (ranked fifth), followed by OSA (twelfth) and NAFLD (thirteenth) (Figure 2). Periodontitis associations with obesity, NAFLD, and OSA are underexplored compared to diabetes and cardiovascular links, potentially due to the inadequate consideration of concurrent MetS components. NAFLD, the most prevalent chronic liver disease worldwide, involves hepatic fat accumulation and is strongly associated with obesity [8]. Likewise, OSA is marked by recurrent airway obstruction during sleep, and patients with OSA may benefit from treatments such as palate surgery, which reduce not only the apnea and hypopnea indices and daytime sleepiness but also associated mood comorbidities [50]. OSA is closely tied to obesity [9]. Obesity appears to be a critical factor driving the pathogenesis of both NAFLD and OSA [51]. Indeed, an international expert panel from 22 countries redefined NAFLD as a metabolic dysfunction-associated fatty liver disease (MAFLD) [8]. In our recent publications, we separately studied OSA [31] and NAFLD [30] and found a positive association with periodontitis. Our findings are also in line with other publications linking obesity and periodontitis, including a Mendelian randomization study suggesting a potential causal association between obesity and periodontitis [52]. Wang et al. conducted a systematic comparison of six machine learning algorithms to develop and validate a prediction model to predict heart failure risk in middle-aged and elderly patients with periodontitis, and the variables in the final model were ranked in the descending order of importance as myocardial infarction, age, diabetes, and race. While their research question is different, the study by Wang et al. and the current study both highlight the importance of demographic factors such as age and race, as well as MetS-related conditions, in the context of periodontitis [53]. Overall, our study highlights the importance of considering the whole MetS cluster and sheds light on less investigated MetS-related conditions in the context of periodontitis.

Strengths and limitations
The present study's strengths include a large sample size and rigorous protocol incorporating dental, medical, and sociodemographic databases, with consistent and strict definitions for all patients. Clinical and radiographic assessments were utilized for dental parameters. Dental and medical indexes were derived from records rather than relying on patient-reported data, except for health behaviors. Furthermore, this study employed a novel analytical approach combining both statistical methods and ML algorithms.
The limitations of this study include its cross-sectional design, preventing the establishment of causality. Another limitation pertains to the new classification for periodontitis, which considers staging including furcation involvement, tooth hypermobility, and the presence of infra-bony defects, parameters that were not measured and analyzed in this study. Although multiple confounding factors were taken into account, there are residual confounding factors that were not analyzed such as genetics, microbiome, and childhood and past exposures. Future research should involve long-term longitudinal population-based epidemiological surveys conducted in different settings and populations that will incorporate multiomics data to increase generalizability, support causal inference, and address these limitations.

Conclusions
The prevalence of periodontitis among a nationwide sample of the Israeli population aged 18-50 years was 9.79%. We identified a profile of the "patient vulnerable to

Figure 2. The ranking chart for clinical features' importance generated using the XGBoost machine learning algorithm for periodontitis set as the target variable. Five-fold cross-validation; train/test 80%/20%.


Table 1. The associations of periodontitis with sociodemographic parameters, health behaviors, and the mean number of untreated decayed teeth; Pearson's chi-square *; likelihood ratio ˆ; non-paired t-test **; generalized linear models ˆˆ; binary logistic regression #.

Figure 1 presents the prevalence of periodontitis per 100,000 by age among the study population, demonstrating that periodontitis prevalence increases with age.

Bioengineering 2023, 10.

Table 4. Collinearity statistics and multivariate analysis of periodontitis as dependent variable. VIF: variance inflation factor. Statistically significant p values are in bold.