The Discovery of Data-Driven Temporal Dietary Patterns and a Validation of Their Description Using Energy and Time Cut-Offs

Data-driven temporal dietary patterning (TDP) methods were previously developed. The objectives were to create data-driven temporal dietary patterns and assess concurrent validity of energy and time cut-offs describing the data-driven TDPs by determining their relationships to BMI and waist circumference (WC). The first day 24-h dietary recall timing and amounts of energy for 17,915 U.S. adults of the National Health and Nutrition Examination Survey 2007–2016 were used to create clusters representing four TDPs using dynamic time warping and the kernel k-means clustering algorithm. Energy and time cut-offs were extracted from visualization of the data-derived TDPs and then applied to the data to find cut-off-derived TDPs. The strength of TDP relationships with BMI and WC was assessed using adjusted multivariate regression and compared between the two methods. Both methods showed a cluster, representing a TDP with proportionally equivalent average energy consumed during three eating events/day, associated with significantly lower BMI and WC compared to the other three clusters that had one energy intake peak/day at 13:00, 18:00, and 19:00 (all p < 0.0001). Participant clusters of the two methods overlapped highly (>83%) and showed similar relationships with obesity. Data-driven TDPs were validated using descriptive cut-offs and hold promise for obesity interventions and translation to dietary guidance.


Introduction
The Dietary Guidelines for Americans, 2020-2025 [1] emphasizes the importance of a healthy dietary pattern rather than focusing on nutrients or foods in isolation. Dietary patterns are defined as "the quantities, proportions, variety, or combination of different foods, drinks and nutrients in diets, and the frequency with which they are habitually consumed" [2]. Therefore, dietary patterns include not only foods and their components such as energy, but also the behaviors inherent to dietary intake such as when eating and drinking occur. Yet, little attention has been given to the specific timing of dietary intake, perhaps due to the methodological difficulty of patterning temporal data. Only a few studies [3][4][5][6] have investigated the frequency and timing of eating and even fewer studies [7,8] have incorporated multiple aspects of dietary patterns, such as energy, frequency, and the specific timing of dietary intake, despite evidence of a link to dietary quality and, ultimately, health. Data-driven methods were recently applied to create temporal dietary patterns (TDPs) incorporating timing and amount of energy intake over 24 h [3-6]. These studies showed that TDPs were significantly associated with obesity-related health indicators including body mass index (BMI) and waist circumference (WC) [3,4]. For example, participants with energy-equivalent and evenly distributed eating occasions throughout the day had higher diet quality [5], lower mean BMI and odds of obesity, and smaller WC [3] than those with other temporal dietary patterns exhibiting one energy intake peak sometime in the day. However, since data-driven methods that are based on the true nature of the behaviors were used, the patterns that emerged had no guiding description or indication of adherence to recommendations to explain the resulting patterns or the constraints of inclusion. The patterns emerging from such clusters may be difficult to describe and may capture latent characteristics.
Therefore, an interpretation describing these data-derived TDPs may not have similar relationships with obesity and should be validated to ensure accuracy.
The purpose of this study was (1) to create data-driven TDPs and determine their relationships to BMI and WC; and (2) to extract the pattern characteristics using energy and time cut-offs based on visualizing the patterns, and to assess the concurrent validity of the cut-off-derived TDPs by determining the percentage of overlap in cluster membership and the cut-off TDPs' relationships with BMI and WC. The hypothesis is that the strength of the relationship of TDPs based on energy intake and time cut-offs with BMI and WC is similar to the relationship of TDPs created using data-driven methods with BMI and WC, and that participant membership in the various pattern clusters is highly overlapping between the similar cut-off and data-driven TDP clusters.

Participants and Data Set
Participants of the study were drawn from the National Health and Nutrition Examination Survey (NHANES) 2007-2016. NHANES is a survey conducted by the National Center for Health Statistics (NCHS) that includes interviews and a physical health examination to quantify the health and nutritional status of U.S. adults and children. Voluntary participation is invited after selection based on location, characteristics, and randomness. Participants' sociodemographic characteristics, including age, sex, race, ethnicity, and poverty-to-income ratio (PIR), were collected during the in-person household interview. Anthropometric measurements, including height, weight, and WC, and the first dietary recall interview were collected during the physical health examination. The NCHS Research Ethics Review Board approved this survey and all the participants consented to completing the survey [9].
NHANES 2007-2016 data were used because they were the most recent available when the study was initiated. The sample included non-pregnant U.S. adults aged 20-65 years with reliable first-day 24-h dietary recall data and complete anthropometric measurements. The temporal dietary behaviors of pregnant women and participants outside of the age range are expected to exhibit unique life stage patterns, so these participants were excluded. Participants with missing sociodemographic and anthropometric data were also excluded. Therefore, the analytic sample included 17,915 participants (Figure 1).

Dietary Data Assessment
The USDA Automated Multiple-Pass Method was included in NHANES to collect the 24-h dietary recall data [10], including the time, amount, and type of foods and beverages consumed, and detailed food descriptions [11]. Valid 24-h dietary recalls that met the NHANES criteria [11] with non-zero energy intake were used in this study. Each participant's energy intake for all reported foods and beverages was determined using the  [12]. The time duration of the eating occasions was not available in NHANES, thus 15 min/occasion was applied at each time of reported intake based on a previous study [13] where reported energy at a time was divided by 15 min to determine the energy per minute for each minute within the 15-min eating occasion.
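The 15-min spreading rule described above can be illustrated with a minimal sketch. The function name `to_minute_series`, the input layout (start minute and kcal per occasion), and the handling of occasions that run past midnight are assumptions made for illustration, not details taken from the study.

```python
MINUTES_PER_DAY = 24 * 60   # 1440 one-minute slots in a recall day
OCCASION_MINUTES = 15       # duration applied to every eating occasion


def to_minute_series(occasions):
    """Convert (start_minute, kcal) occasions into kcal per minute for each minute of the day."""
    series = [0.0] * MINUTES_PER_DAY
    for start, kcal in occasions:
        per_minute = kcal / OCCASION_MINUTES  # energy divided evenly over 15 min
        for minute in range(start, start + OCCASION_MINUTES):
            # Assumption: minutes past 24:00 are dropped rather than wrapped.
            if minute < MINUTES_PER_DAY:
                series[minute] += per_minute
    return series


# Hypothetical recall: 07:30 (450 kcal), 12:00 (600 kcal), 18:45 (750 kcal).
recall = [(7 * 60 + 30, 450.0), (12 * 60, 600.0), (18 * 60 + 45, 750.0)]
series = to_minute_series(recall)
```

Overlapping occasions simply add together in this sketch; how overlaps were handled in the original analysis is not stated here.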

Anthropometric Measurement
Standing height and WC were measured in centimeters using a stadiometer and measuring tape, respectively. Weight was measured in kilograms using a digital weight scale [14][15][16]. BMI was calculated as a person's weight in kilograms divided by the square of their height in meters [17].

Measures for Covariates
Survey year, sex, age group, race, ethnicity, PIR, and energy misreporting were used as covariates to adjust the regression models that evaluated the relationship of the TDPs to BMI and WC. Survey year included years 2007-2008, 2009-2010, 2011-2012, 2013-2014, and 2015-2016. Sex was classified as male and female. Race and ethnicity were classified as Mexican American, other Hispanic, non-Hispanic white, non-Hispanic black, and other including multi-race. PIR is the ratio of family income-to-poverty and was classified as 0-0.99 (under poverty threshold), 1-1.99, 2-2.99, 3-3.99, 4-4.99, and ≥5 [18]. Energy misreporting was considered a potential confounder to the relationships evaluated and was determined by calculating total energy intake divided by estimated energy requirement (EER) [19][20][21], where EER was derived based on dietary reference intake equations for adults according to the Institute of Medicine [22]. The NCHS assigns weights to participants in the NHANES based on their selection. Weights were constructed when combining survey cycles 2007-2016 and used in the models, thus the results are representative of the US civilian, noninstitutionalized population at the midpoint of the 10 years of data included in the study [23]. The survey design of NHANES included stratification and clustering, which were both accounted for in the regression models according to NCHS guidelines to improve the precision of survey estimates [24].
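As a rough illustration of two of the quantities above, the sketch below computes an energy reporting ratio (reported intake divided by EER) and a combined-cycle sample weight. NCHS analytic guidance describes dividing two-year weights by the number of cycles combined; whether the study's weights were constructed in exactly this way is an assumption, and the function names are hypothetical.

```python
N_CYCLES = 5  # 2007-2008 through 2015-2016


def combined_weight(two_year_weight, n_cycles=N_CYCLES):
    """Rescale a two-year sample weight for an analysis pooling n_cycles cycles."""
    return two_year_weight / n_cycles


def energy_reporting_ratio(energy_intake_kcal, eer_kcal):
    """Ratio of reported energy intake to estimated energy requirement (EER)."""
    return energy_intake_kcal / eer_kcal


# Hypothetical participant: two-year weight of 50,000, 2000 kcal reported, EER of 2500 kcal.
weight = combined_weight(50000.0)
ratio = energy_reporting_ratio(2000.0, 2500.0)
```

The EER itself would come from the dietary reference intake equations cited in the text, which are not reproduced in this sketch.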

Creating TDPs through Data-Driven Method
A detailed description of the methods for creating the data-derived TDPs has been published previously [3,6]. Briefly, participants' first 24-h dietary recall was considered as a time series of 24 h × 60 min = 1440 min with each entry representing the absolute amount of energy intake during that minute. Distance-based clustering analysis with a dynamic time warping (DTW)-type distance measure was used to create the TDPs. DTW optimally matches the eating events for each pair of participants in the sample by minimizing a weighted sum of the squared differences between the time and energy intakes of the respective participant's eating events. A weight parameter is used to control the matching by penalizing the time differences relative to the energy uptake differences to avoid pathological matchings (such as matching morning to late night dietary intake), and this variation of DTW is denoted as modified DTW (MDTW) [25]. Then, the distance measure of diet was coupled with the kernel k-means algorithm [26] to partition the ensemble of time series into different clusters to develop TDPs without predetermined standards or cut-offs. The purpose of using the kernel k-means algorithm is to generate TDPs where the dietary intakes are similar within a cluster and more dissimilar between clusters. The number of clusters which partition the participants into mutually exclusive clusters was based first on internal criteria related to the variance and consistency of clusters including silhouette index and Dunn index [27,28] where k = 3 and k = 4 yielded the best results (Table 1). Using these criteria, a high value indicates that the participant's temporal dietary behavior is well matched to its own cluster and poorly matched to neighboring clusters. Next, the number of clusters was evaluated by external criteria associated with the visualization, time and energy differences among the clusters, and health outcome analysis as described in Section 2.9 where k = 4 was optimal. 
External criteria were also used to optimize the weight parameter in MDTW.
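The matching idea behind MDTW can be sketched with a standard dynamic time warping recursion whose local cost adds a weighted squared time difference to the squared energy difference. This is only an illustration: the exact cost function, units, and normalization of the published MDTW [25] may differ, and the event layout (hour of day, energy) and the name `mdtw_distance` are assumptions.

```python
def mdtw_distance(events_a, events_b, beta=40.0):
    """DTW-style distance between two time-ordered lists of (hour, energy) eating events.

    Assumed local cost: (energy difference)^2 + beta * (time difference)^2.
    """
    n, m = len(events_a), len(events_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of matching the first i and first j events.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        time_a, energy_a = events_a[i - 1]
        for j in range(1, m + 1):
            time_b, energy_b = events_b[j - 1]
            local = (energy_a - energy_b) ** 2 + beta * (time_a - time_b) ** 2
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical participants; energy is expressed in thousands of kcal (an assumption).
person_a = [(8.0, 0.5), (13.0, 0.6), (19.0, 0.7)]
person_b = [(9.0, 0.4), (18.5, 1.2)]
d_ab = mdtw_distance(person_a, person_b)
```

A larger `beta` makes matches between events far apart in time more expensive, which is how a weight parameter discourages pairing a morning event with a late-night one. The resulting pairwise distances would then feed a clustering step such as kernel k-means, which is not shown here.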

Visualization of TDPs through Data-Driven Method
Based on the criteria above, the weight parameter β = 40 generated the best TDPs. The distribution of dietary intake in each of the four clusters is visualized using heat maps in Figure 2. The x-axis indicates time ranging from 00:00 to 24:00, and the y-axis shows absolute energy intake ranging from 0 to 4000 kcal. The proportion of individuals in each cluster reporting dietary intake at a certain time and amount of energy is represented through shading and ranges from 0.0% to 12.8% in the 4 TDP clusters. Darker shading indicates that a greater percentage of participants in the cluster reported the same amount of energy intake at that time.
Figure 2. Absolute energy intake ranges from 0 to 4000 kcal (y-axis) while timing of intake ranges from 00:00 to 24:00 at hourly increments (x-axis) for non-pregnant U.S. adults 20-65 years. The proportion of participants in each cluster reporting energy intake is shown through shading ranging from 0.0% to 12.8% of participants in the 4 TDP clusters. Darker shading represents a greater percentage of participants in the cluster reporting the same amount of energy intake at that time. C1, cluster 1; C2, cluster 2; C3, cluster 3; C4, cluster 4; TDP, temporal dietary pattern.

Creating TDPs through Cut-Off Method
The heat map visualizations of the data-derived TDPs were used to describe the data and create the energy and time cut-offs. Specifically, the shading indicating the proportion of each cluster with energy intake at the various hourly times was examined to find cut-points where the majority of energy intake and eating events occurred for each cluster. Furthermore, cut-offs were drawn both to describe each cluster independently and to work together so that mutually exclusive clusters could be created. Based on Figure 2, no more than 800 kcal at any one eating event was used as the energy cut-off to distinguish cluster 1 from the other 3 clusters, meaning that participants whose energy intake was less than 800 kcal at every eating event during the day were included in cluster 1. Next, the remaining participants were assigned according to the timing of their highest energy intake, using cut-points taken from the visualization in Figure 2: if the highest energy intake occurred between 5:00 and 15:00, the participant was assigned to cluster 4; if it occurred between 15:00 and 19:00, the participant was assigned to cluster 2; and if it occurred after 19:00, the participant was assigned to cluster 3. If a participant had more than one eating event tied for the highest energy intake during the day, the participant was assigned to cluster 1.
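The assignment rules above can be written as a short decision function. This is a sketch under stated assumptions: the text does not specify how boundary times (exactly 15:00 or 19:00) or peaks before 5:00 were handled, so here 15:00 to 19:00 inclusive maps to cluster 2 and any peak before 15:00 maps to cluster 4. The name `assign_cutoff_cluster` and the event layout are hypothetical.

```python
ENERGY_CUTOFF_KCAL = 800.0


def assign_cutoff_cluster(events):
    """Assign a cluster (1-4) from a list of (hour_of_day, kcal) eating events."""
    peak = max(kcal for _, kcal in events)
    # Every eating event below the energy cut-off: cluster 1.
    if peak < ENERGY_CUTOFF_KCAL:
        return 1
    peak_times = [hour for hour, kcal in events if kcal == peak]
    # More than one event tied for the highest intake: cluster 1.
    if len(peak_times) > 1:
        return 1
    hour = peak_times[0]
    if hour < 15.0:      # assumption: includes peaks before 5:00
        return 4
    if hour <= 19.0:     # assumption: both boundaries treated as inclusive
        return 2
    return 3


# Hypothetical participant with a 1200-kcal midday peak.
example_cluster = assign_cutoff_cluster([(8.0, 300.0), (13.0, 1200.0), (19.5, 600.0)])
```

Applying the function to every participant yields mutually exclusive cut-off-derived clusters, because each set of events reaches exactly one return statement.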

Visualization of TDPs through Cut-Off Method
Based on the data-driven TDP visualizations, the chosen cut-offs were used to generate cut-off-derived TDPs. The new cut-off-derived clusters were also visualized using heat maps, and these TDPs are shown in Figure 3. Similar to Figure 2, the x-axis indicates time ranging from 00:00 to 24:00, and the y-axis shows absolute energy intake ranging from 0 to 4000 kcal. The proportion of individuals in each cluster reporting dietary intake at a certain time and amount of energy is represented through shading and ranges from 0.0% to 13.5% in the 4 TDP clusters. Darker shading indicates that a greater percentage of participants in the cluster reported the same amount of energy intake at that time.

Figure 3. Absolute energy intake ranges from 0 to 4000 kcal (y-axis) while timing of intake ranges from 00:00 to 24:00 at hourly increments (x-axis) for non-pregnant U.S. adults 20-65 years. The proportion of participants in each cluster reporting energy intake is shown through shading ranging from 0.0% to 13.5% of participants in the 4 TDP clusters. Darker shading represents a greater percentage of participants in the cluster reporting the same amount of energy intake at that time. C1, cluster 1; C2, cluster 2; C3, cluster 3; C4, cluster 4; TDP, temporal dietary pattern.

Statistical Analysis
The Rao-Scott modified chi-square test was used to determine significant differences among clusters by characteristics including survey year, age group, sex, race/ethnicity, PIR, and BMI. The percentage of participant overlap between the data-driven and cut-off-derived clusters representing a similar pattern was calculated. TDPs' relationships with health indicators (BMI and WC) were assessed using adjusted multivariate linear regression. Residual plots and outliers were checked. Models using BMI and WC as health status indicators were adjusted for survey year, age group, sex, race/ethnicity, PIR, and energy misreporting. The Tukey-Kramer adjustment was made for multiple comparisons. Adjusted p < 0.05 for comparisons among clusters was considered statistically significant. SAS version 9.4 (SAS Institute Inc., Cary, NC, USA) and R version 4.1.1 (RStudio, Inc., Boston, MA, USA) were used to complete the analysis.
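The overlap calculation can be sketched as follows, assuming that cluster labels from the two methods are already aligned (cluster k means the same pattern in both) and that the denominator is the size of each cut-off-derived cluster. Both points, the unweighted counting, and the name `percent_overlap` are assumptions for illustration.

```python
from collections import Counter


def percent_overlap(data_driven_labels, cutoff_labels):
    """Percent of each cut-off-derived cluster sharing the same data-driven label.

    Both inputs are per-participant cluster labels in the same participant order.
    """
    totals = Counter(cutoff_labels)
    matches = Counter(
        cut for data, cut in zip(data_driven_labels, cutoff_labels) if data == cut
    )
    return {k: 100.0 * matches[k] / totals[k] for k in sorted(totals)}


# Hypothetical labels for eight participants.
data_driven = [1, 1, 2, 2, 3, 4, 4, 1]
cutoff = [1, 1, 2, 3, 3, 4, 4, 4]
overlap = percent_overlap(data_driven, cutoff)
```

Survey weights are not applied in this sketch; a weighted version would sum participant weights instead of counting participants.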

Characteristics of Participants in the TDPs Clusters
The characteristics of the participants in the four TDP clusters generated through the two methods are shown in Table 2.
Table 2 notes. Abbreviations: NHANES, National Health and Nutrition Examination Survey; PIR, poverty to income ratio; TDPs, temporal dietary patterns. 1 Values are n (%). 2 Rao-Scott F adjusted χ² p-value is a goodness-of-fit, one-sided test; statistical significance is indicated when p < 0.05. Analyses were adjusted for clustering and stratification. Sample weights were constructed and applied to the analysis as directed by the NCHS. Weights were rescaled in order that the sum of the weights matched the survey population at the midpoint of 2007-2016. Significance level: * adjusted p < 0.05.

Overlap between the Data-Driven Method and Cut-Off Method
The data-driven and cut-off methods each generated four clusters with similar patterns. About 83.3%, 87.1%, 92.0%, and 89.1% of the participants in the cut-off-derived TDP clusters overlapped with participant membership in the data-driven TDP clusters. Compared with the other three clusters, the energy intake in cluster 1 was moderate; specifically, each of the main energy intake events included less than 800 kcal. In contrast, clusters 2, 3, and 4 each had an energy intake peak (reaching 4000 kcal) at a different time during the day. Cluster 2's energy intake peak was at 15:00-19:00, cluster 3's energy intake peak was after 19:00, and cluster 4's energy intake peak was before 15:00.

Associations of TDPs with BMI and WC
The TDPs that were generated by both methods were significantly associated with BMI and WC. Participants in cluster 1 derived from both methods had significantly lower mean BMI and smaller mean WC compared to participants of clusters 2, 3, and 4 (Tables 3 and 4). Using the data-driven method, the greatest significant difference in mean BMI was present between clusters 1 and 4 (β = −3.3 ± 0.2, R² = 0.12) and the greatest significant difference in mean WC was present between clusters 1 and 3 (β = −8.2 ± 0.5 cm, R² = 0.17). Using the cut-off method, the greatest significant differences in both measures were present between clusters 1 and 3 (β = −3.1 ± 0.2, R² = 0.12 and β = −7.9 ± 0.4 cm, R² = 0.17).
Table notes. Abbreviations: NHANES, National Health and Nutrition Examination Survey; TDPs, temporal dietary patterns; SE, standard error; WC, waist circumference. 1 Models were adjusted for survey year, age group, sex, race/ethnicity, poverty to income ratio, and energy misreporting. 2 Values are mean (standard error of the mean). 3 β represents the difference of mean WC between two compared clusters. Least square means were used to calculate the differences in mean WC. Significance level: * adjusted p < 0.05.

Discussion
This study identified TDPs linked to BMI and WC through a data-driven method. Visualizations of these data-driven TDPs were then used to extract a time- and energy-based interpretation of the patterns, which was validated by showing a similar significant relationship with obesity and a high percentage of overlap in cluster membership between the two methods. To the best of our knowledge, this is the first study that created temporal lifestyle patterns using a machine learning method and then extracted a practical interpretation of the patterns that was also validated against U.S. weight outcomes, which will add to the evidence of the link between multidimensional dietary patterns and health. The mean differences in BMI and WC associated with TDPs were not only statistically significant but also clinically meaningful [29,30], which indicates that the timing and amount of dietary intake can be a potentially important health exposure to predict and prevent obesity. Since the practical interpretation extracted from the visualization of the data-derived TDPs was similarly linked to the obesity-related indicators and a high rate of overlapping cluster membership was also shown, the description of the patterns is a validated interpretation of the TDPs. The evidence provides a basis that data-driven methods may be used to find and extract practically translatable TDPs, a topic that is relevant to the timing of dietary intake and highlighted as a question under consideration for the 2025 Dietary Guidelines for Americans' scientific committee to address [31].
Findings from this study show that three evenly spaced, energy-balanced eating occasions throughout the day are significantly associated with lower BMI and smaller WC compared to the TDPs that have one energy intake peak at different times throughout the day, which is supported by previous studies [3,4]. In addition, this study also showed that participants in cluster 4, who have an energy intake peak around 12:00, have significantly higher BMI and larger WC compared to those in cluster 1. This finding is similar to a previous study where overweight or obese adults reported approximately four eating occasions a day, with the peak number of eating occasions occurring around 12:30 [32]. The pattern of cluster 2, with an energy intake peak after 15:00 and significantly higher BMI and larger WC compared to cluster 1, may have similarities with other findings. A previous study [33] showed that late lunch eaters lost significantly less weight and had slower rates of weight loss compared to early eaters after a 20-week intervention, even though both groups had similar habitual energy intakes and total energy expenditure.
Furthermore, participants in cluster 3, whose pattern had the highest energy intake peak at night (after 19:00), had significantly higher BMI and larger WC than those in cluster 1. This finding is aligned with a previous U.S.-based study [34], Japanese-based studies [35,36], a Malaysian-based study [37], and a Swedish-based study [38], all showing that late-night eating is associated with higher risk of obesity. High energy intake in the evening may be related to night eating syndrome [39], included in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders and identified as an eating disorder related to dysfunction of the circadian system. One reason that night eating syndrome is significantly associated with obesity may be decreased diet-induced thermogenesis after dinner, which may lead to less energy expenditure and potential weight gain. Another may be that the circadian system increases glucagon production and reduces insulin production in anticipation of midnight fasting. Melatonin, a hormone signaling night in the circadian system, may further decrease the release of insulin at night. Thus, late-night eating may cause a higher blood glucose rise and pose a risk for type 2 diabetes [40,41]. Late-night eating is also significantly associated with increased energy intake [42] and may be a risk factor for obesity [43].
One of the biggest challenges in this study was to interpret the visualization of different clusters generated by the data-driven method. Unlike traditional clustering methods, DTW was used to develop TDPs in this study. DTW uses an elastic distance measure that can find the optimal matching paths among eating events of every pair of participants and quantify the pairwise distances between participants where the matched path is minimized. In addition, the kernel k-means algorithm was used to objectively divide participants into different TDPs based on the distances calculated from DTW. Since this objective method did not include predetermined standards or criteria for the temporal dietary patterns, the characteristics of the patterns of each of the TDPs that are generated by the data-driven method are not apparent. Visualization can be used to observe what the time, energy, and proportion of the group's distributions look like and extract the temporal dietary behavior characteristics from each TDP. Previous studies also used visualizations to capture each TDP's characteristics [3,4,44] in either a DTW method coupled with the kernel k-means algorithm or a latent class analysis approach. Both methods were also previously used to identify unique and unknown patterns according to different observed indicators from multiple layers of data. Other studies used principal components to describe the dietary pattern [45,46]. In these principal component analyses, the major contributing food items or groups were used to describe the sentinel characteristics of the dietary patterns. However, the interpretation of the dietary patterns from these various methods only extracted certain factors including food, time, or amount of energy. Yet, other factors or characteristics of the patterns may be important.
It is difficult to know whether other unobserved or observed factors should be prioritized as sentinel characteristics to describe the commonalities of the patterns, representing further needs to be addressed in future studies. Yet, validation of the pattern interpretation is critical to determine whether the selected factors to describe the patterns do indeed yield a similar cluster of participants and relationship with health.
The need for validation of data-derived patterns is in contrast with more traditional index-derived patterns or model-based patterns that need no such validation as their interpretation is already apparent and dependent on preconceived criteria. For example, the Healthy Eating Index was created based on scoring linked to proportions of food groups as recommended in the Dietary Guidelines for Americans. Data-driven clusters have no similar criteria from which diets are judged or ranked and leave the interpretation of the patterns created through data-driven methods open to investigators to subjectively describe. The results of this study show that this limitation can be overcome by extracting descriptions based on visualizations and then validating these interpretations. To the authors' knowledge, this is the first study that evaluates and validates a data-driven patterning interpretation through membership overlap and associations with obesity-related health indicators. The results showed that cut-off-derived clusters highly overlapped with data-driven clusters and demonstrated no differences in strength or pattern relationships with obesity-related indicators between the two methods. Therefore, although interpretation of the patterns has been a limitation for data-derived methods, it can be addressed and removed.
Considering the cross-sectional study design, the results cannot be used to infer causation. In addition, dietary data are from one weekday dietary recall, and data may not represent participants' regular patterns. However, a single 24-h dietary recall may be considered to be representative to estimate the general dietary pattern if days of the week of dietary recalls are evenly selected [47]. Moreover, smaller and specific TDPs, such as night shift patterns or intermittent fasting, may exist but are not observed since these patterns may be combined with other patterns preventing observation of their unique temporal characteristics.
The results provide evidence that data-driven methods have a high potential to discover patterns for which practical interpretations can be extracted and validated and easily translated to practical temporal dietary guidance to prevent obesity such as in the Dietary Guidelines for Americans. In addition, the development of time-based dietary intake translation may also be useful in detecting and prompting interventions to modify daily temporal patterns, potentially integrating other lifestyle behaviors including physical activity and sleep, and informing individualized, precision nutrition.

Conclusions
Four cut-off-derived clusters based on the visualization of data-driven clusters highly overlapped with data-driven clusters and showed no differences in strength or pattern relationships with obesity. The results provide evidence that data-driven methods have a high potential to discover patterns that are easily translatable to practical temporal dietary guidance to prevent obesity such as in the Dietary Guidelines for Americans. The developed time-based dietary intake translation may also be useful in detecting and prompting interventions to modify daily temporal patterns, potentially integrating other lifestyle behaviors including physical activity and sleep, and informing individualized, precision nutrition.

Institutional Review Board Statement: This study was not human subjects research as participants were drawn from publicly available deidentified NHANES data.

Informed Consent Statement: Not applicable.
Data Availability Statement: Data described in the manuscript are made publicly and freely available without restriction at https://www.cdc.gov/nchs/nhanes/index.htm (accessed on 2 June 2020). Analytic code is available upon request.