Statistical Methods and Machine Learning Algorithms for Investigating Metabolic Syndrome in Temporomandibular Disorders: A Nationwide Study

The objective of this study was to analyze the associations between temporomandibular disorders (TMDs) and metabolic syndrome (MetS) components, consequences, and related conditions. This research analyzed data from the Dental, Oral, Medical Epidemiological (DOME) records-based study, which integrated comprehensive socio-demographic, medical, and dental databases from a nationwide sample of dental attendees aged 18–50 years at military dental clinics for 1 year. Statistical and machine learning models were fitted with TMDs as the dependent variable. The independent variables included age, sex, smoking, each of the MetS components, and consequences and related conditions, including hypertension, hyperlipidemia, diabetes, impaired glucose tolerance (IGT), obesity, cardiac disease, obstructive sleep apnea (OSA), nonalcoholic fatty liver disease (NAFLD), transient ischemic attack (TIA), stroke, deep venous thrombosis (DVT), and anemia. The study included 132,529 subjects, of whom 1899 (1.43%) had been diagnosed with TMDs. The following parameters retained a statistically significant positive association with TMDs in the multivariable binary logistic regression analysis: female sex [OR = 2.65 (2.41–2.93)], anemia [OR = 1.69 (1.48–1.93)], and age [OR = 1.07 (1.06–1.08)]. Feature importance generated by the XGBoost machine learning algorithm ranked the features for TMDs (the target variable) as follows: sex was ranked first, followed by age (second), anemia (third), hypertension (fourth), and smoking (fifth). Metabolic morbidity and anemia should be included in the systemic evaluation of TMD patients.


Introduction
Temporomandibular disorders (TMDs) represent a comprehensive classification encompassing diverse clinical manifestations that result in anomalous, deficient, or compromised functioning of the temporomandibular joint(s) (TMJ) and the associated masticatory muscles [1].
TMDs constitute the predominant source of non-dental chronic painful conditions within the orofacial domain, and stand second in prevalence among musculoskeletal ailments that cause pain and impairment, following chronic lower back pain, impacting up to 12% of the populace [2].
The suffering of patients is manifested in different intensities of pain and discomfort during function, to the point of difficulty in eating or speaking. Thus, TMDs may negatively influence routine tasks, social conduct, psycho-emotional conditions, and quality of life [3]. These consequences lead to increased healthcare use and social costs, with yearly expenditure estimated at USD 4 billion [2].
The etiology and progression of TMDs are complex and poorly understood, although numerous etiologic factors that can initiate or contribute to TMDs have been identified. Among the primary factors identified as contributing to the development of TMDs are trauma, psychological distress, increased pain sensitivity, and parafunctional activities, including bruxism and clenching [4,5]. Furthermore, recent genetic research suggests that variations in the genetic profile of people could exert a significant impact on the perception of pain, thus affecting the susceptibility to develop TMDs [6].
Associations with inflammatory conditions, as well as insufficient nutrient levels among individuals diagnosed with TMDs, have been documented [7].
Prior research analyzed the correlation between TMDs and individual components of metabolic syndrome (MetS) [8,9]. MetS, colloquially referred to as 'Syndrome X', is a cluster of interconnected conditions, including central obesity, dyslipidemia, insulin resistance, and hypertension [10]. These conditions collectively amplify susceptibility to cardiovascular diseases and the onset of type 2 diabetes mellitus [10]. Additionally, there is documented evidence linking the syndrome with nonalcoholic fatty liver disease (NAFLD) [11] and obstructive sleep apnea (OSA) [12]. MetS is a highly prevalent global health concern, affecting a substantial proportion of the adult population (estimated to be around 20-25% worldwide), meaning that roughly every fourth adult suffers from MetS [10]. Over the past two decades, various definitions of MetS have emerged, characterized by consensus regarding its key components but discrepancies in the recommended diagnostic criteria [13,14]. It is widely acknowledged that the presence of any MetS component serves as a pivotal signal for the comprehensive assessment of other associated risk factors [13].
While previous studies investigated the association between TMDs and individual components of MetS, to the best of our knowledge, there are no big data studies in the English literature that use statistical and machine learning (ML) models to study the associations between TMDs, MetS components, their consequences, and related conditions, including biochemistry test results, among individuals ranging from youth to middle adulthood.
In recent times, ML has assumed a pivotal role across diverse domains, with its notable impact extending into the medical arena [15,16]. This influence stems from ML's ability to discern intricate patterns and extract insights from complex datasets. Notably, graph-based deep learning has been employed for medical diagnostic purposes [17], while inverse reinforcement learning (IRL) algorithms have demonstrated efficacy in optimizing performance within intricate systems [18]. The progress in ML shows promise in a myriad of medical applications [19][20][21]. The potential of ML is particularly salient in advancing the comprehension and treatment of intricate medical conditions such as TMDs.
The predominant consensus asserts that inflammatory processes play a role in MetS pathogenesis [22], and thus the integration of biochemistry test results is important. This multifaceted strategy deepens our understanding of biological systems, particularly in the evaluation of hyperglycemia (glycated hemoglobin, fasting glucose), lipid profiles (cholesterol, triglycerides, lipoproteins), and markers of inflammation such as C-reactive protein (CRP).
Considering the above-mentioned unmet needs, our primary aim was to investigate the associations between TMDs and the following parameters: (a) diagnoses related to MetS, and (b) ancillary diagnostic tests, including biochemistry blood assessments utilized in the evaluation of MetS components. The hypothesis of this study was that there is a discernible association between TMDs and certain MetS components. To explore these associations, this study employs a novel combination of statistical and machine learning (ML) models to enhance comparisons between the models and to validate the findings. By addressing these aims, our objective is to advance TMD and MetS research and shape new avenues for future research and clinical applications that are relevant for policymakers in updating protocols and prevailing guidelines.

Research Population
This investigation forms a part of the Dental, Oral, Medical Epidemiological (DOME) big data record-based research project [23][24][25][26][27][28][29][30]. Previous publications have extensively utilized and elucidated the DOME initiative, with a singular paper devoted to outlining the procedural framework and research methodologies of the DOME project [23]. The DOME project is a large-scale, systematized, and comprehensive repository that integrates demographic, dental, and medical records within a nationwide population of individuals ranging from youth to middle adulthood from the military who sought routine medical and dental examinations at the Israel Defense Forces (IDF) general and dental clinics [23]. Cross-referencing this demographic, dental, and medical information affords us a unique opportunity to discern associations between TMD diagnosis and MetS-related conditions on an extensive and unparalleled scale.

Research Ethics Clearance
This research adheres to the STROBE guidelines and was approved by the Institutional Review Board (IRB) of the Medical Corps (approval number: IDF-1281-2013). The IRB authorized this study as exempt from the necessity to obtain informed consent due to the study being retrospective and only involving the review of medical information.

Enrollment Criteria
Inclusion Criteria: This cross-sectional study considered male and female individuals aged 18 to 50 years who were affiliated with the Israel Defense Forces (IDF) and sought dental care at IDF dental clinics during the period from 1 January 2015 to 1 January 2016, and for whom comprehensive socio-demographic, medical, and dental records were available.
Criteria for exclusion: Participants with incomplete data records within the specified data sources were not included in the research.

Information Acquisition
The IDF Medical Information Department furnished the data sourced simultaneously from three IDF electronic systems, specifically, dental patient records (DPRs), medical patient records (computerized patient records (CPRs)), and the socio-demographic electronic systems housing the socio-demographic characteristics of individuals from the military, as elaborated upon in our previous publication [28]. The process of data mining was executed in an anonymized manner by the Medical Corps' Department of Medical Information, as described previously [23].

Definitions of Study Variables
Demographic and smoking status variables: Sex: male/female; age in years; current smoker: yes/no.
Medical diagnoses: The computerized patient record (CPR) repository utilizes the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM), as the foundation for diagnostic purposes.

Systemic comorbidities linked to metabolic syndrome (MetS) were incorporated within the study as independent variables defined according to the ICD-9-CM diagnostic criteria. These diagnoses are depicted in Table 1 in the results section.
Ancillary test findings including biochemistry blood test results: The supplementary test outcomes, also sourced from the CPR, encompassed an array of assessments utilized for the evaluation of MetS components, including biochemistry blood laboratory tests. These assessments have been delineated previously [23,29]. The ancillary tests are depicted in Table 2 in the results section.

Analysis Strategy
An innovative integration of statistical and machine learning (ML) models was utilized for analysis of the data.

Statistical Analysis
Statistical procedures were conducted utilizing SPSS software version 28.0 (IBM, Chicago, IL, USA). Descriptive statistics entailed representing continuous variables through means and standard deviations (SDs), while categorical variables were depicted by frequencies and corresponding percentages.
Bivariate analysis: For the bivariate analysis, we examined the association between temporomandibular disorders (TMDs) as the dependent variable and the independent variables. Categorical parameters were assessed through Pearson's chi-square test or the likelihood ratio test, and continuous variables were analyzed using non-paired t-tests. Odds ratios (ORs) were computed, employing linear regression for continuous variables and binary logistic regression for categorical variables.
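As an illustration of the bivariate step for a single binary predictor, the following minimal Python sketch computes the odds ratio with a Wald 95% confidence interval and the Pearson chi-square statistic from a 2x2 table. For one binary predictor, the OR from a binary logistic regression equals this cross-product ratio. The analysis in this study was run in SPSS; the function names and the counts below are hypothetical and are not taken from the study's tables.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Return (OR, lower, upper) using a Wald interval on the log-odds scale."""
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def pearson_chi_square(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts (NOT study data): 30 of 1000 exposed subjects and
# 10 of 1000 unexposed subjects have the outcome.
or_value, ci_lower, ci_upper = odds_ratio_with_ci(30, 970, 10, 990)
chi2 = pearson_chi_square(30, 970, 10, 990)
print(f"OR = {or_value:.2f} ({ci_lower:.2f}-{ci_upper:.2f}), chi-square = {chi2:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen confidence level.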
Multicollinearity analysis: After the bivariate analyses, multicollinearity assessments were performed using linear regression to evaluate the interrelationships among the independent variables. In cases where substantial collinearity was detected between two variables, only one was incorporated into the model, with the specific variable chosen according to context. Variance inflation factors (VIFs), calculated as 1 divided by the tolerance, were computed. While VIF values above 10 are typically taken to denote collinearity, this study applied a stricter VIF threshold of 2.5, because weaker collinearity can already make models less robust.
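The relationship between tolerance and VIF can be sketched in a few lines of Python. The tolerance of a predictor is 1 minus the R² obtained when that predictor is regressed on the other predictors, so a VIF of 2.5 corresponds to R² = 0.6 and a VIF of 10 to R² = 0.9. The sketch below covers only the two-predictor case, where R² is the squared Pearson correlation; the variable names and values are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def vif_from_r_squared(r_squared):
    """VIF = 1 / tolerance, where tolerance = 1 - R^2."""
    tolerance = 1.0 - r_squared
    return 1.0 / tolerance


# Hypothetical, strongly correlated predictors (NOT study data).
age = [20, 25, 30, 35, 40, 45]
bmi = [22.0, 23.5, 23.0, 26.0, 25.5, 28.0]

# With only two predictors, regressing one on the other gives R^2 = r^2.
vif = vif_from_r_squared(pearson_r(age, bmi) ** 2)
exceeds_threshold = vif > 2.5  # the cut-off applied in this study
print(f"VIF = {vif:.2f}, exceeds 2.5: {exceeds_threshold}")
```

With more than two predictors, each predictor would instead be regressed on all of the others to obtain its R².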
Multivariable analysis: Subsequent to the bivariate analysis and collinearity evaluation, a multivariable binary logistic regression analysis was executed with TMD as the dependent variable. Independent variables that were statistically significant in the bivariate analysis and not marked by high collinearity were incorporated. All associations described were statistically significant at the p < 0.01 level.
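To make the reporting of logistic regression results concrete, the sketch below shows how an OR and its 95% confidence interval relate to a regression coefficient and its standard error, using the OR for age reported in the abstract [1.07 (1.06–1.08)]. The coefficient and standard error are back-derived from those rounded published values and are therefore approximations, and the function names are hypothetical. The last step assumes that the effect of age is linear on the log-odds scale.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate the standard error of a log-odds coefficient from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def or_from_coefficient(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def or_for_increment(or_per_unit, units):
    """OR for a multi-unit change, assuming linearity on the log-odds scale."""
    return or_per_unit ** units


# Values back-derived from the rounded OR and CI reported for age.
beta_age = math.log(1.07)
se_age = se_from_ci(1.06, 1.08)
or_age, lower_age, upper_age = or_from_coefficient(beta_age, se_age)

# An OR close to 1 per year of age compounds over a decade.
or_decade = or_for_increment(1.07, 10)
print(f"per year: {or_age:.2f} ({lower_age:.2f}-{upper_age:.2f}); per decade: {or_decade:.2f}")
```

Under that linearity assumption, an OR of 1.07 per year corresponds to roughly a two-fold difference in odds across ten years, which is why a per-unit OR close to 1 for a continuous variable should not be read as a negligible effect.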

Machine Learning (ML) Models
In the execution of machine learning (ML) models, we used the Python scikit-learn package [31]. We employed XGBoost, a highly efficient gradient-boosting framework that is particularly suited to supervised machine learning tasks in the domains of regression and classification [32].
The goal of the ML model was to explore the relative feature importance and generate a prioritized list of variables according to their importance in the task of classifying TMDs as the target variable. The model was evaluated with a five-fold cross-validation approach [33] and with distinct train-test partitions, such as 70-30% and 80-20%. To affirm the robustness of the XGBoost ML model, we additionally applied two alternative approaches to assess feature importance: Gini Importance [34] and Information Gain based on Entropy [35].
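The two alternative importance measures can be illustrated with a minimal Python sketch, assuming they refer to the impurity criteria used by tree-based classifiers: the importance credited to a feature is built from the decrease in Gini impurity, or in entropy (information gain), produced by splits on that feature. This is not the study's pipeline; the function names and the counts are hypothetical.

```python
import math


def gini(pos, neg):
    """Gini impurity of a node with pos positive and neg negative samples."""
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2 * p * (1 - p)  # equals 1 - p**2 - (1 - p)**2 for two classes


def entropy(pos, neg):
    """Shannon entropy (in bits) of a node's class distribution."""
    n = pos + neg
    if n == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            q = count / n
            h -= q * math.log2(q)
    return h


def impurity_decrease(impurity, left, right):
    """Parent impurity minus the size-weighted impurity of the two children.

    left and right are (pos, neg) tuples. With impurity=entropy this is the
    information gain of the split; with impurity=gini it is the Gini decrease.
    """
    pos, neg = left[0] + right[0], left[1] + right[1]
    n = pos + neg
    n_left, n_right = sum(left), sum(right)
    return (impurity(pos, neg)
            - (n_left / n) * impurity(*left)
            - (n_right / n) * impurity(*right))


# Hypothetical split of 1000 subjects on a binary feature (NOT study data):
# 30 of 500 cases in one branch versus 10 of 500 in the other.
gini_gain = impurity_decrease(gini, (30, 470), (10, 490))
info_gain = impurity_decrease(entropy, (30, 470), (10, 490))
print(f"Gini decrease = {gini_gain:.4f}, information gain = {info_gain:.4f} bits")
```

In a fitted tree ensemble, such decreases are weighted by the number of samples reaching each split and accumulated per feature to give a ranking.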
Evaluating Adherence to Reporting Standards in Machine Learning Research: To assess the adherence of this research to the standards of reporting research in the field of machine learning, we utilized the TRIPOD checklist (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis; www.tripodstatement.org, accessed on 5 November 2023) for the validation of the prediction models.
The checklist comprises 20 main items, with a total of 31 sub-items, that address different aspects of the validation of prediction models, including title, abstract, introduction, methodology, results, discussion, and funding disclosures.
Each item was scored as adherent (1) or not adherent (0). Subsequent analysis demonstrated adherence of the research to all TRIPOD elements, with 3 elements identified as non-relevant. The findings related to the TRIPOD elements were precisely articulated, with in-depth documentation of the adherence to the specified TRIPOD elements.

The Associations of Temporomandibular Disorders (TMDs) with Demographics, Smoking Status, and Systemic Conditions
The prevalence of TMDs in the study population was 1.43% (1899/132,529). Table 1 presents the demographics, smoking status, MetS components, their consequences, and related conditions of patients with TMDs compared to those without TMDs. A statistically significant positive association was demonstrated between TMDs and the following parameters: S/P (status post) transient ischemic attack (TIA) was associated with over 5-fold odds of having TMDs; obstructive sleep apnea (OSA) and deep venous thrombosis (DVT) with over 4-fold odds; nonalcoholic fatty liver disease (NAFLD), impaired glucose tolerance (IGT), and anemia with over 3-fold odds; and female sex, smoking, hypertension, hyperlipidemia, type 2 diabetes, obesity, and cardiac disease with over 2-fold odds of having TMDs (Table 1). The associations of TMDs with ancillary test findings, including laboratory biochemistry assays employed in the work-up of MetS components, are depicted in Table 2. There was a statistically significant positive association between TMD and body mass index (BMI), cholesterol, and high-density lipoprotein (HDL). Nevertheless, the associations were weak, with odds ratios (ORs) close to 1 (for this reason, three decimal places are displayed in Table 2). The rest of the ancillary tests presented in Table 2 had no statistically significant associations with TMDs. After conducting bivariate analyses, a linear regression analysis was carried out to determine the collinearity among the independent variables that exhibited statistical significance. The collinearity statistics results are presented in Table 3 and rule out collinearity, since all VIF values are below 2.5.
Following that, we carried out a multivariable binary logistic regression analysis with TMD diagnosis as the dependent variable, which is also presented in Table 3. The multivariable analysis included the independent variables that were statistically significant in the bivariate analysis and did not exhibit collinearity. All independent variables were entered together in a single step. Female sex was associated with over 2.5-fold odds of having TMD, anemia was associated with over 1.5-fold odds, and the ORs for age were close to 1 (Table 3). Following the multivariable analysis, we performed a second multivariable logistic regression analysis stratified according to age, which is presented in Table 4. We split the total sample into two groups: 18-30 years (119,579 patients, 90.2%) and 31-50 years (12,949, 9.8%). In the younger age group (18-30 years), the significant parameters were sex, smoking, hyperlipidemia, obesity, cardiac disease, OSA, and anemia. In the older age group (31-50 years), the significant parameters were sex, NAFLD, and anemia. Sex and anemia were the only parameters that retained a statistically significant association in both age groups.
Both the Gini Importance and Information Gain based on Entropy methods yielded model fitness measurements, including the area under the curve (AUC) and accuracy, that were highly similar to those obtained with the XGBoost model. Therefore, we present the results of the XGBoost ML algorithm in Figure 1. The AUC was 0.748, the recall score was 0.703, the precision was 0.027, and the accuracy was 0.660. AUC values in the range of 0.7 to 0.8 are considered to indicate excellent discrimination [36]. Furthermore, the XGBoost model demonstrates a nearly two-fold increase in precision (2.7%) compared to the TMD prevalence in the study population (1.43%), showcasing its proficiency in precise disease detection while effectively reducing false positive results. The feature importance scores derived from the XGBoost algorithm, as illustrated in Figure 1, reveal the model's ranking of features for TMDs (the target variable) as follows: sex holds the top position, followed by age (second), anemia (third), hypertension (fourth), and smoking (fifth).
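The classification metrics reported above are all functions of the confusion matrix, and the comparison of precision with prevalence can be read as a lift over flagging subjects at random. The following Python sketch shows the definitions with hypothetical counts (not the study's confusion matrix, which is not reported here) and then computes the lift from the reported precision (0.027) and prevalence (0.0143); the function names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, and accuracy from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),  # share of flagged subjects who have the condition
        "recall": tp / (tp + fn),     # share of subjects with the condition who are flagged
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def precision_lift(precision, prevalence):
    """Ratio of model precision to the base rate of the condition."""
    return precision / prevalence


# Hypothetical counts for a rare outcome (NOT the study's confusion matrix):
# 100 of 10,000 subjects have the condition.
metrics = classification_metrics(tp=70, fp=2500, tn=7400, fn=30)

# Lift computed from the values reported in this section.
lift = precision_lift(0.027, 0.0143)
print(metrics, f"lift = {lift:.2f}")
```

For a rare outcome, precision remains low in absolute terms even when recall is high, because false positives drawn from the large unaffected group outnumber true positives; this is why precision is compared against the prevalence rather than judged on its own.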

Discussion
To the best of our knowledge, this is the first study in the English literature that employs novel methods of statistical and ML analytics in a big data context to investigate the association between TMDs and MetS among 132,529 individuals ranging from youth to middle adulthood, using a holistic approach that cross-referenced demographic and medical data, including laboratory biochemistry tests, at an unmatched scale.
The fusion of clinical data with biochemistry test results empowers investigators to unveil intricate associations between molecular events and systemic responses.
Artificial intelligence (AI) algorithms have been employed in the diagnosis of TMDs. Nonetheless, investigations have employed disparate criteria for patient selection, diverse categorizations of disease subtypes, distinct input data, and varied outcome measures, and consequently, the efficacy of AI models varies across these studies [37]. For this reason, this study employed a hybrid analytical framework combining both statistical and ML approaches.
Regarding demographic parameters, the outcomes of the investigation align with the literature. Our study population included subjects aged 18-50 years, and similarly, Yap et al. found that TMD signs and symptoms are more predominant among adults between 20 and 40 years of age [38]. Coinciding with our findings of a positive association between TMDs and age in both statistical and ML models, a large prospective clinical trial study representing 2737 TMD subjects also showed an increased prevalence with age. The annual increment varies, ranging from 2.5% for individuals aged 18 to 24 to 4.5% for those within the age bracket of 35 to 44 [39].
In the present study, we addressed the confounding effect of age by performing a multivariable analysis with age as a continuous parameter, and also stratified our data according to age (18-30 vs. 31-50 years). The younger age group comprised most of the sample (90.2%), and therefore our conclusions should be based on careful interpretation of all analyses performed. Sex and anemia were significant in both age groups, as well as in the multivariable analysis that used age as a continuous variable, and in the ML algorithm, highlighting these factors as the most significant.
In accordance with our observations of a positive association between TMDs and female sex in both statistical and ML models, there is consensus in the existing medical literature that TMDs predominantly affect women [40][41][42]. The results of a US National Health Interview Survey showed that women reported pain in the jaw joint and/or facial pain 2.1 times more often than men [43]. TMD pain was additionally more prevalent among women compared to men in studies from Sweden [44] and Finland [45].
In this study, while a significant positive association was found between TMDs and smoking habits in the bivariate analysis, this association was lost following the multivariable analysis, and smoking was ranked only in fifth place by the ML model. The results in the literature are contradictory. Some studies demonstrated that smoking cigarettes was related to both significantly greater TMD pain intensity [46][47][48] and TMJ sounds [46], and to a poorer response to treatment compared with nonsmokers [48]. Contrary to these results, Wänman et al. found that the manifestation or progression of signs and symptoms of TMDs is not associated with smoking [49]. Yekkalam et al. also found no correlation between smoking and craniomandibular disorders in an adult population [50].
This study aimed to perform an analysis of the association between TMDs and MetS-associated disorders utilizing statistical and ML analytics. The study demonstrated that TMDs are positively associated with systemic conditions related to MetS, and in particular with anemia and hypertension. Anemia maintained a statistically significant association with TMDs after multivariable analysis and emerged as the most highly ranked systemic condition in the ML model (ranked third after sex and age). Corresponding to our results, other studies also found an association between TMDs and anemia. For example, Ohrbach and colleagues identified a noteworthy association between TMDs and different hematologic disorders including anemia, disorders of bleeding, and leukemia, using data from the OPPERA case-control study [8].
Mehra et al. found serum nutrient deficiencies, including anemia, in patients with complex TMDs who underwent surgical joint reconstruction [51]. Orhan et al. found that individuals experiencing persistent anemia exhibited reduced signal intensity in the mandibular condyle bone marrow and posterior band compared to their healthy counterparts, suggesting that anemia might induce modifications in bone marrow without any concurrent internal derangement [52].
A separate investigation exploring tissue oxygen saturation and alterations in oxygenated hemoglobin, deoxygenated hemoglobin, and total hemoglobin within the masseter muscle revealed that subjects with a predisposition to TMDs exhibit irregularities in the deoxygenation of the masseter [53]. Conversely, Staniszewski et al., in a controlled cross-sectional study of 60 TMD patients and 60 healthy controls, failed to establish a correlation between severe systemic illness, malnutrition, and systemic inflammation with TMDs [7].
The association between TMDs and anemia may reflect variations in hemoglobin levels related to age, sex, and smoking habits. For this reason, we used statistical and ML multivariable models that adjusted for these parameters and demonstrated an association between TMDs and anemia, independent of age, sex, and smoking. Another explanation for a positive association between TMDs and anemia was suggested by Mehra et al., who attributed anemia to a deficiency state due to inadequate nutritional intake and utilization dysfunction in TMD patients [51]. An additional compelling rationale for the observed association between TMDs and anemia pertains to the presence of anemia of inflammation, also recognized as anemia of chronic disease. This form of anemia is common in patients with illnesses causing protracted immune activation, such as infections, autoimmune disorders, and malignancies [54]. This category has expanded over the years to encompass the effects of aging, obesity, type 2 diabetes, pulmonary arterial hypertension, chronic liver disease, and advanced atherosclerosis, with ramifications of stroke and coronary artery disease [54]. Indeed, in the current study, TMD patients exhibited a higher prevalence of systemic conditions related to MetS compared to those without TMDs.
The highest-ranked MetS-related condition was hypertension, which was ranked fourth by the ML model, although it did not retain a statistically significant association with TMDs in the multivariable statistical analysis. Thus, the present research emphasizes the importance of using ML models in addition to classical statistical models, as they can add value to feature importance identification. While previous studies found no correlation between TMDs and hypertension [55,56], other studies reported findings similar to ours, such as Sanders et al., who demonstrated an association between the incidence of first-onset TMD and increased mean baseline arterial pressure, as well as OSA [57]. Maixner et al. found that patients with painful TMDs exhibit heightened sensitivity to painful stimuli, which may result from dysfunction in the central pain modulatory mechanisms which, in turn, can be influenced by baseline arterial blood pressure [58].
Following hypertension, the next ranked MetS-related condition by the ML algorithm was NAFLD, which was ranked sixth in the feature importance for the task of TMD classification. In recent years, NAFLD has been recognized to be the hepatic manifestation of MetS, and is also termed "metabolic dysfunction-associated fatty liver disease" [11]. While other MetS-related conditions were studied in the context of TMDs, our literature review found no studies regarding the association between TMDs and NAFLD, highlighting the importance of the current study's holistic approach in analyzing MetS-related conditions in the context of TMDs.
Moreover, the holistic approach employed in this investigation included the assessment of hyperlipidemia and serum lipid profile parameters (cholesterol, HDL, LDL, non-HDL, triglycerides). Hyperlipidemia was only ranked seventh by the ML algorithm, and while serum cholesterol and HDL were positively associated with TMD, these parameters exhibited ORs close to 1, indicating a weak association. In accordance with our observations, a long-term cohort study found no correlation between TMDs and hyperlipidemia/dyslipidemia [56,59], and another study found no significant association between levels of total cholesterol and TMDs [55].
Furthermore, the multifaceted approach included the evaluation of IGT, diabetes, and hyperglycemia (serum glycated hemoglobin, fasting glucose). The ML algorithm ranked IGT and diabetes only in the 9th and 13th places, respectively, and serum glycated hemoglobin and fasting glucose had no significant association with TMDs. In line with our findings, previous studies did not demonstrate a correlation between diabetes and painful TMDs [9,59,60], and while we found no differences in fasting glucose levels, Byun et al. even reported that TMD patients have lower mean levels of fasting blood glucose [55].
Another important biochemical marker incorporated in the analysis was serum CRP. CRP is acknowledged as a significant indicator of persistent inflammatory processes and as one of the major proteins of the acute phase reaction [61]. In the present study, CRP levels were not significantly different between those with and without TMDs, consistent with reports in previous studies [7,62].
The study also analyzed the associations between weight, BMI data, and obesity. A sequence of cross-sectional surveys concluded that BMI does not mirror equivalent adjustments of weight relative to height across different genders or across age cohorts [63].
For this reason, we decided to examine both weight and BMI separately. Obesity did not maintain a statistically significant association with TMDs following multivariable analysis and was ranked 12th in feature importance for TMD classification by the ML algorithm. Our findings demonstrate that weight had no statistically significant association with TMDs, and BMI, although positively associated with TMDs, exhibited ORs close to 1, indicating a weak association. Coinciding with our findings, Jordani et al. demonstrated that painful TMDs exhibited a notable correlation with total body fat percentage, but in the multivariable analysis, obesity did not maintain its significance [64]. Two recent systematic reviews and meta-analyses did not show a clear association between obesity and TMDs and concluded that obesity is not a risk factor for TMDs [65,66].
Consequences of MetS such as cardiac disease, TIA, stroke, and DVT were also taken into consideration in our holistic approach using statistical and ML analyses. These parameters did not retain a statistically significant positive association in the multivariable analysis and were ranked relatively low by the ML algorithm in terms of feature importance during the task of TMD classification. While we were looking at TMD patients, other studies focused on the prevalence of TMD dysfunction among stroke patients and found it was higher compared with the healthy group [60]. However, corresponding to our study, in a nationwide population-based cohort study, Lee et al. also found no association between TMDs and stroke [56]. Another study, researching potential risk factors for chronic TMDs, found no correlation of TMDs to different cardiovascular conditions, including mitral valve prolapse, high blood pressure, angina, heart attack, heart failure, and stroke [8].
The principal strengths of the current investigation include a substantial sample size and meticulous adherence to a rigorous protocol that incorporated demographic and medical databases. This enabled us to cross-reference TMD diagnosis with demographic and medical data on an unprecedented scale. Definitions were uniform for all patients. To reduce recall bias, the study used demographic data, medical diagnoses, and medical indexes that were extracted from records without dependence on patient self-reports, except for smoking (which was derived from the records but relied on the reports of the patients). Because of the large dataset, a concern is finding statistically significant but clinically meaningless associations. Therefore, the study employed a rigorous multi-step analytical approach by setting the cut-off for statistical significance at p = 0.01, performing collinearity statistics with a VIF cut-off of 2.5 to address the potential pitfalls of variable intercorrelation, and including in the final multivariable model only parameters that exhibited statistical significance in the bivariate analysis, while also demonstrating low collinearity. This meticulous approach accounts for confounding effects, reducing the inflation of type I error rates and enhancing the validity of the results. Moreover, the study utilized a novel approach that combined statistical and ML models to enhance the validity of the findings.
The main limitation of the current research is the cross-sectional study design, which prevents the establishment of causality. Although we used a nationwide military population, further investigations, encompassing extended longitudinal population-based epidemiological surveys conducted in diverse settings and among varied populations, would improve generalizability and address these limitations. Future studies should be multi-centered and should include advanced statistical and artificial intelligence approaches. Federated learning can be employed, which will enable model training across servers holding local data samples without the need to exchange them, thus mitigating concerns related to data security and privacy. Furthermore, alternative forms of data, including textual electronic medical data, vocal information, and auditory data, should be employed in future TMD research.

Conclusions
This study utilized novel methods of statistical and ML analytics to identify the association between TMDs and MetS among 132,529 individuals ranging from youth to middle adulthood, employing a holistic approach that analyzed demographic, medical, and laboratory data on an unmatched scale. We established that the profile of a patient vulnerable to TMD includes the following: female sex, older age, and the presence of anemia and hypertension. In the systemic assessment of patients with TMDs, it is imperative to incorporate examinations for metabolic morbidity and anemia. Evaluating risk factors associated with these conditions is essential for the targeted identification of high-risk populations susceptible to TMDs, MetS, and anemia. Health authorities should be cognizant of these co-morbidities in individuals with TMDs, and facilitate appropriate referrals to both dentists and physicians for comprehensive evaluation.

Figure 1. Feature importance scores produced by XGBoost algorithm for TMD diagnosis as a target variable.

Table 2. The associations of temporomandibular disorders (TMDs) with ancillary test findings including biochemistry blood test results used in the workup of MetS components. *: non-paired t-test, #: linear regression; OR: odds ratio, CI: confidence interval. Statistically significant results are in bold.

Table 3. Multivariable analysis and collinearity statistics with temporomandibular disorders (TMDs) as a dependent variable with statistically significant parameters in the bivariate analysis. SE: standard error; VIF: variance inflation factor; statistically significant values are in bold.

Table 4. Multivariable analysis stratified according to age (18-30 and 31-50 years) with temporomandibular disorders (TMDs) as a dependent variable with statistically significant parameters in the bivariate analysis. SE: standard error; statistically significant values are in bold.