1. Introduction
Artificial intelligence (AI) is rapidly transforming a wide range of industries, including finance, transportation, education, and healthcare, by enabling data-driven decision-making, increased automation, and personalized services. These advancements have improved efficiency, reduced human error, and unlocked new capabilities across numerous sectors. However, one domain in which AI remains significantly underutilized, despite its profound impact on public health, is nutrition. Nutrition plays a central role in shaping both short-term and long-term health outcomes, influencing metabolic balance, cognitive performance, immune function, and the risk of chronic diseases such as obesity, diabetes, and cardiovascular conditions. Despite its importance, existing dietary recommendation systems often rely on static models and generalized guidelines, offering little to no personalization based on an individual’s biological, behavioral, or cultural characteristics. Most current tools are manually operated, fail to update with real-time data, and are not equipped to reflect dynamic lifestyle factors such as changing activity levels, dietary preferences, or health conditions. This lack of adaptability limits their effectiveness and discourages sustained user engagement. Integrating AI into nutrition presents a transformative opportunity to overcome these limitations by providing intelligent, individualized, and context-aware dietary support.
In recent years, the concept of personalized nutrition has gained significant attention, emphasizing the design of dietary interventions that are customized to an individual’s biological profile, lifestyle habits, and environmental context [1]. This approach contrasts with conventional one-size-fits-all dietary guidelines by recognizing inter-individual variability in metabolism, preferences, health goals, and cultural food practices. Numerous studies have highlighted the potential of personalized nutrition to improve health outcomes, increase adherence to dietary recommendations, and support sustainable behavior change. However, despite its promise, widespread implementation in clinical or consumer-facing applications has been limited due to several practical barriers. These include the lack of automated systems for data collection and interpretation, insufficient access to qualified nutrition experts, and the unavailability of scalable platforms that offer intuitive and adaptive user interfaces. Additionally, most existing tools do not support real-time updates or dynamic personalization, which further limits their applicability in everyday settings. Recent advancements in AI, particularly in machine learning (ML), rule-based filtering systems, and natural language processing (NLP), offer new avenues to address these limitations [2]. By leveraging these technologies, it is now possible to develop intelligent, scalable, and user-centric nutrition recommendation systems that can adapt to individual needs in real time.
1.1. Gap Analysis
Despite increasing awareness of the need for personalized nutrition, current digital health tools and consumer applications remain limited in scope and functionality. Many existing systems focus solely on calorie tracking or provide fixed diet plans, often neglecting key factors such as individual metabolic differences, user preferences, and micronutrient balance. Moreover, most applications lack dynamic adaptation based on changing user data and fail to integrate personalized energy estimation with meal planning in a seamless manner. Another major shortcoming is the absence of intuitive and accessible interfaces: users are typically required to input data through rigid selection boxes or forms, which alienates those with limited nutrition knowledge or low digital literacy. Furthermore, natural language understanding, a feature now common in other AI domains, is rarely implemented in nutrition systems. Lastly, privacy remains a critical concern, with most AI-based systems relying on cloud-based architectures that compromise user control over personal data.
To address these gaps, researchers have explored machine learning for energy estimation, natural language processing (NLP) for understanding dietary preferences, and optimization methods for generating balanced menus. Despite promising results, existing systems tend to focus on isolated components and often lack integration, scalability, or user-friendly interfaces, highlighting the need for more holistic, AI-driven solutions.
1.2. Research Questions
The following research questions are explored in this study.
RQ1: Can machine learning algorithms predict an individual’s daily energy requirements more accurately than traditional formulas based on age, sex, height, weight, and physical activity?
RQ2: Can a rule-based filtering approach—combined with a local large language model (LLM)—generate nutritionally balanced and personalized meal recommendations without relying on mathematical optimization?
RQ3: Can a locally deployed LLM accurately interpret dietary preferences and constraints expressed in natural language and convert them into structured filtering parameters?
RQ4: Does the proposed AI system improve usability, personalization, and user satisfaction compared to conventional static diet planning tools?
These questions guide the design, implementation, and evaluation of the system across prediction, recommendation, interaction, and usability layers.
1.3. Problem Statement
Traditional dietary recommendation systems fall short in addressing the diverse and individualized needs of users [3]. These systems typically rely on predefined calorie estimation formulas, such as the Harris–Benedict or Mifflin–St Jeor equations, which oversimplify the complex, dynamic nature of human metabolism. By neglecting inter-individual variability caused by factors such as genetics, body composition, activity levels, comorbidities, and metabolic rate, these models often generate inaccurate estimations of daily energy needs. Consequently, the recommendations they produce tend to be generic, static, and poorly suited to meet the personalized health objectives of users.
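For reference, the Mifflin–St Jeor estimate mentioned above can be computed in a few lines. The sketch below uses the published coefficients and commonly cited activity multipliers; the function name and the factor table are our own illustrative choices, not part of any system described here.

```python
# Minimal sketch of the Mifflin-St Jeor equation.
# Activity multipliers are commonly cited values; exact factors vary by source.
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
}

def mifflin_st_jeor_tdee(weight_kg, height_cm, age_years, sex, activity="sedentary"):
    """Estimate total daily energy expenditure (kcal/day) from a fixed formula."""
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    bmr += 5 if sex == "male" else -161
    return bmr * ACTIVITY_FACTORS[activity]

# Example: a 30-year-old, 175 cm, 70 kg male with moderate activity.
print(round(mifflin_st_jeor_tdee(70, 175, 30, "male", "moderate")))  # ~2556 kcal/day
```

Because the coefficients are fixed, two users with identical anthropometrics always receive the same estimate, which is precisely the one-size-fits-all behavior this study seeks to move beyond.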
Moreover, conventional systems frequently ignore user-specific elements such as food preferences, cultural and religious dietary practices, allergen sensitivities, and evolving health goals. This results in low adherence rates and reduces the long-term utility of such platforms. Most existing solutions also lack mechanisms for adaptive personalization based on behavioral feedback or longitudinal tracking of dietary intake, making them rigid and non-responsive to changes in user lifestyle or progress over time.
A further significant limitation lies in the lack of natural language interfaces. Many platforms depend on structured data input or rigid forms, which can be burdensome or inaccessible for individuals unfamiliar with nutrition science or those from marginalized backgrounds [1]. This usability barrier not only limits engagement but also disproportionately excludes users with low digital literacy or language constraints.
Taken together, these deficiencies highlight a critical gap in current dietary recommendation systems: the inability to combine accurate, personalized calorie prediction with dynamic, context-aware, and user-friendly interaction. Bridging this gap requires an integrative solution that leverages machine learning for predictive modeling, rule-based filtering for structured personalization, and large language models for flexible, natural language-based user interaction.
1.4. Novelty of This Study
This study presents a multi-layered and integrated approach to personalized nutrition, distinguishing itself from existing systems through the following innovations:
Machine learning-based energy estimation: Unlike static formulas such as Mifflin–St Jeor or Harris–Benedict, this study introduces a regression model trained on real-world demographic and physiological data (NHANES) for individualized calorie prediction.
Rule-based filtering over optimization: Instead of relying on traditional optimization techniques like linear programming, the system utilizes structured filtering rules to generate balanced meal plans from the USDA nutrient database, ensuring both efficiency and adaptability.
Locally hosted natural language interface: A privacy-preserving large language model (LLM), deployed via Ollama, enables users to input preferences and restrictions in free-text format, enhancing usability and accessibility.
End-to-end integration of AI components: The system unifies prediction, filtering, and interaction layers into a cohesive, modular architecture—uncommon in most existing systems which typically focus on isolated components.
User-focused design with validated usability: A Streamlit-based front-end, developed with UX principles, enables real-time interaction and delivers practical meal plans suited to individual goals, validated through preliminary user testing.
1.5. Significance of Our Work
This work demonstrates the feasibility and effectiveness of an AI-based nutrition recommendation system that unifies predictive modeling, personalized filtering, and natural language interaction in a single, integrated framework. Unlike conventional systems that operate with rigid calorie formulas and static meal templates, our approach leverages machine learning models trained on real-world demographic and dietary data to estimate individual energy needs with higher accuracy. By integrating these estimates with a rule-based filtering mechanism grounded in nutritional standards, the system ensures that meal recommendations are not only tailored to user profiles but also nutritionally balanced. In addition, the incorporation of a locally hosted large language model (LLM) allows users to express preferences and constraints through natural language, significantly lowering the barrier to entry for individuals without technical or nutritional expertise. This enhances usability and fosters long-term user engagement, a challenge many existing platforms fail to address. Our results indicate measurable improvements in prediction performance, relevance of meal suggestions, and subjective user satisfaction, thereby validating the system’s practical viability. Moreover, the modular design, reliance on open-source datasets, and local deployment strategy ensure data privacy while offering flexibility for adaptation across different populations and dietary frameworks. Collectively, these innovations position the system as a scalable, privacy-aware, and user-friendly solution that holds strong potential for real-world implementation in both clinical settings and consumer wellness applications.
3. Methodology
This study proposes a two-layered personalized nutrition recommendation system integrating traditional machine learning (ML) models and large language models (LLMs). The workflow includes preprocessing the NHANES 2017–2018 dataset, estimating calorie and macronutrient requirements using regression models, interpreting user preferences via natural language, and generating culturally relevant daily meal plans with LLMs. The complete workflow is depicted in Figure 1.
3.1. Dataset
The primary dataset used in this study was sourced from the National Health and Nutrition Examination Survey (NHANES) 2017–2018 cycle. NHANES is a large-scale, nationally representative survey conducted by the Centers for Disease Control and Prevention (CDC) in the United States, designed to assess the health and nutritional status of adults and children through a combination of interviews, physical examinations, and laboratory tests. For this research, three key data files were utilized: demographic information from DEMO_J, body measurements from BMX_J, and detailed dietary intake records from DR1TOT_J. These datasets were merged using the unique respondent identifier SEQN, which ensures consistent matching of individuals across multiple modules. Sample rows from the cleaned NHANES dataset are shown in Figure 2.
The resulting merged dataset initially included over 9000 participants. However, to ensure data quality and model reliability, extensive preprocessing steps were applied. These included the removal of entries with missing or incomplete records in critical fields such as age, sex, height, weight, and total energy intake. Additionally, participants with implausible values—such as biologically unrealistic body mass indices (BMIs) or calorie intake levels below 500 or above 6000 kilocalories per day—were excluded as extreme outliers. This filtering process produced a clean and reliable working dataset consisting of 6792 participants.
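A minimal pandas sketch of this merge-and-clean step is shown below. The variable names (SEQN, RIDAGEYR, RIAGENDR, BMXHT, BMXWT, DR1TKCAL) follow the public NHANES codebooks, while the file paths and the BMI plausibility bounds are illustrative assumptions, not an excerpt of the actual pipeline.

```python
import pandas as pd

# Load the three NHANES 2017-2018 modules (SAS transport files from the CDC site).
demo = pd.read_sas("DEMO_J.XPT")   # demographics: age (RIDAGEYR), sex (RIAGENDR)
bmx = pd.read_sas("BMX_J.XPT")     # body measures: height (BMXHT), weight (BMXWT)
diet = pd.read_sas("DR1TOT_J.XPT") # day-1 dietary recall: energy (DR1TKCAL)

# Merge on the unique respondent identifier.
df = demo.merge(bmx, on="SEQN").merge(diet, on="SEQN")

# Drop records with missing values in critical fields.
df = df.dropna(subset=["RIDAGEYR", "RIAGENDR", "BMXHT", "BMXWT", "DR1TKCAL"])

# Exclude implausible energy intakes (<500 or >6000 kcal/day) and BMI outliers.
df["BMI"] = df["BMXWT"] / (df["BMXHT"] / 100) ** 2
df = df[(df["DR1TKCAL"] >= 500) & (df["DR1TKCAL"] <= 6000)]
df = df[(df["BMI"] >= 12) & (df["BMI"] <= 60)]  # assumed plausibility bounds
```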
Descriptive statistics were computed to summarize the distribution of key variables, including age, sex, height, weight, BMI, and energy intake. These statistics provide an overview of the cohort’s demographic and nutritional characteristics, which are important for understanding population-level trends and contextualizing model outputs. The dataset features a wide range of participant profiles, making it suitable for developing personalized prediction models. A detailed summary of the cleaned dataset, along with the structure and distribution of key features, is presented in Table 2. Furthermore, by using NHANES data, this study benefits from the survey’s rigorous data collection protocols and standardized measurement techniques, which enhance the validity and generalizability of the findings. However, limitations of the dataset—such as its reliance on a single 24 h dietary recall and the lack of physical activity metrics—are addressed in the limitations section of this study.
3.2. Detailed Methodology
The proposed system adopts a sequential workflow consisting of data preprocessing, machine learning-based calorie prediction, and natural language-driven menu generation. First, raw data from the NHANES 2017–2018 survey is cleaned by removing entries with missing or inconsistent values. Relevant variables from the demographic, body measurement, and dietary datasets are merged via the SEQN identifier. Feature engineering is then applied, including BMI calculation, encoding of gender and ethnicity, log transformation of skewed intake values, and normalization of continuous features.
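Continuing from the cleaning sketch in Section 3.1, the BMI derivation and log transformation might look as follows; the encoding and normalization steps are sketched with the experimental settings in Section 3.4.

```python
import numpy as np

# BMI from measured weight (kg) and height (cm).
df["BMI"] = df["BMXWT"] / (df["BMXHT"] / 100) ** 2

# Log-transform the right-skewed energy-intake values to reduce outlier influence.
df["log_kcal"] = np.log1p(df["DR1TKCAL"])  # log(1 + kcal)
```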
In the second phase, three regression models—Linear Regression (LR), Random Forest (RF), and Gradient Boosting (GB)—are trained to predict individual daily energy requirements. These models use engineered features such as age, BMI, gender, height, and weight as inputs. The objective is to minimize the error between predicted and actual caloric intake from the 24 h recall data. Model performance is assessed using the regression metrics described in Section 3.3; as discussed in Section 5, the Linear Regression model ultimately achieved the best test-set accuracy, with Gradient Boosting a close second.
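A condensed sketch of this training stage is given below, assuming the train/test split described in Section 3.4; the models use library defaults here rather than the tuned configurations of Table 3.

```python
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

models = {
    "LR": LinearRegression(),
    "RF": RandomForestRegressor(random_state=42),
    "GB": GradientBoostingRegressor(random_state=42),
}

# X_train/X_test hold the engineered features; y is daily energy intake (kcal).
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(name, "MAE:", round(mean_absolute_error(y_test, preds), 1))
```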
In the final phase, users interact with the system by entering their dietary preferences in natural language (e.g., “vegan, gluten-free”). These inputs are parsed using a locally hosted LLM (Mistral 7B), which extracts structured constraints such as diet type, allergens, cultural relevance, and meal time. Based on the predicted nutritional targets and parsed preferences, the system retrieves matching meals from a filtered database.
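As an illustration, the preference-parsing call to the locally hosted model might look like the following. The endpoint is Ollama's standard generate API, but the prompt wording and the JSON schema (diet_type, allergens, cuisine, meal_time) are our own simplification of the fields described above.

```python
import json
import requests

PROMPT_TEMPLATE = (
    "Extract the dietary constraints from the user request below as JSON with "
    'keys "diet_type", "allergens", "cuisine", and "meal_time". '
    "Use null for missing fields.\n\nUser request: {text}"
)

def parse_preferences(text: str) -> dict:
    """Send free-text preferences to a local Mistral 7B instance via Ollama."""
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "mistral",
            "prompt": PROMPT_TEMPLATE.format(text=text),
            "format": "json",  # ask Ollama to constrain output to valid JSON
            "stream": False,
        },
        timeout=60,
    )
    return json.loads(response.json()["response"])

print(parse_preferences("vegan, gluten-free breakfast ideas"))
# e.g. {"diet_type": "vegan", "allergens": ["gluten"],
#       "cuisine": None, "meal_time": "breakfast"}
```

The structured output then drives the rule-based filter over the meal database, keeping the LLM confined to interpretation rather than recommendation.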
3.3. Evaluation Metrics
To assess the performance of the calorie prediction models, we used three standard regression metrics: the Coefficient of Determination (R²), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). These metrics quantify how well the predicted values approximate the actual daily energy intake observed in the NHANES dietary recall data.
The R² score is computed using Equation (1) and measures the proportion of variance explained by the model:

  R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}   (1)

where y_i is the actual caloric intake, \hat{y}_i is the predicted value, and \bar{y} is the mean of actual values.
MAE is calculated as shown in Equation (2), representing the average magnitude of errors in predictions:

  MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|   (2)

RMSE, shown in Equation (3), penalizes larger errors more severely and is suitable for continuous target variables like calorie estimation:

  RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}   (3)
These metrics were computed on the test set and used for model comparison to select the optimal regression algorithm.
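Equivalently, the three metrics can be computed directly with scikit-learn, as in this brief sketch (model and split variables assumed from the surrounding sections):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_pred = model.predict(X_test)

r2 = r2_score(y_test, y_pred)                       # Equation (1)
mae = mean_absolute_error(y_test, y_pred)           # Equation (2)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # Equation (3)

print(f"R2: {r2:.3f}  MAE: {mae:.1f} kcal  RMSE: {rmse:.1f} kcal")
```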
3.4. Experimental Settings
All models were implemented in Python 3.8 using the scikit-learn library. The preprocessed dataset of 6792 participants was randomly split into 80% training and 20% testing subsets, ensuring stratification by gender to balance physiological variability. Continuous features were standardized using z-score normalization, while categorical variables such as gender and ethnicity were one-hot encoded.
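A sketch of this split-and-preprocess setup is shown below; the 80/20 ratio, gender stratification, z-scoring, and one-hot encoding follow the description above, while the pipeline layout itself is illustrative.

```python
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

X = df[["RIDAGEYR", "BMXHT", "BMXWT", "BMI", "RIAGENDR", "RIDRETH1"]]
y = df["DR1TKCAL"]

# 80/20 split, stratified by gender to balance physiological variability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=X["RIAGENDR"], random_state=42
)

# z-score continuous features; one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["RIDAGEYR", "BMXHT", "BMXWT", "BMI"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["RIAGENDR", "RIDRETH1"]),
])
X_train = preprocess.fit_transform(X_train)  # fit on the training split only
X_test = preprocess.transform(X_test)
```

Fitting the scaler and encoder on the training split alone avoids leaking test-set statistics into the models.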
For the calorie prediction task, three models were evaluated: Linear Regression (LR), Random Forest (RF), and Gradient Boosting Regressor (GB). Default hyperparameters were used for LR. For the tree-based models, hyperparameters were selected via grid search using 5-fold cross-validation on the training set. The final configurations are summarized in Table 3.
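The grid search step might be sketched as follows; the candidate grids are illustrative placeholders, not the final values reported in Table 3.

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Illustrative candidate grid; the tuned values in Table 3 may differ.
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [2, 3, 5],
    "learning_rate": [0.01, 0.05, 0.1],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    cv=5,                                # 5-fold cross-validation
    scoring="neg_mean_absolute_error",
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_)
```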
3.5. Evaluation of the LLM-Based Preference Parser
To assess the effectiveness of the LLM-driven natural language interface, a separate evaluation was conducted involving 30 simulated user queries reflecting diverse dietary preferences, restrictions, and cultural requests. The local LLM (Mistral 7B) was prompted to extract structured fields such as diet type, allergens, preferred ingredients, meal timing, and cultural cuisine. Each output was manually verified against a ground-truth JSON template prepared for each test query.
The system achieved a structured extraction accuracy of 90%, with the majority of parsing errors occurring when users entered multi-intent or ambiguous phrases. The most accurate results were obtained for common diets (e.g., vegan; gluten-free), while cultural tags (e.g., “Mediterranean, spicy”) sometimes overlapped with ingredient preferences. Despite these limitations, the LLM demonstrated strong adaptability in understanding diverse phrasing and effectively transforming unstructured text into actionable input for meal generation.
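A minimal sketch of this field-level verification, assuming each test case pairs a query with a hand-written ground-truth dictionary and reusing the hypothetical parse_preferences helper from Section 3.2:

```python
FIELDS = ["diet_type", "allergens", "cuisine", "meal_time"]

def field_accuracy(test_cases):
    """Fraction of structured fields the parser extracts correctly."""
    correct = total = 0
    for query, truth in test_cases:
        parsed = parse_preferences(query)  # LLM call from the earlier sketch
        for field in FIELDS:
            total += 1
            correct += parsed.get(field) == truth.get(field)
    return correct / total

cases = [
    ("vegan, gluten-free breakfast",
     {"diet_type": "vegan", "allergens": ["gluten"],
      "cuisine": None, "meal_time": "breakfast"}),
    # ... 29 further simulated queries
]
print(f"extraction accuracy: {field_accuracy(cases):.0%}")
```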
5. Discussion
This study presents a comprehensive artificial intelligence-based system that integrates machine learning (ML) regression models and large language models (LLMs) to provide personalized nutrition recommendations. The primary aim was to estimate daily calorie and macronutrient needs from anthropometric–demographic features and subsequently convert natural language dietary preferences into structured, usable formats for meal planning. This hybrid approach addresses both the quantitative and qualitative aspects of individual dietary requirements.
To evaluate the ML-based calorie prediction models, three prominent algorithms—Linear Regression, Random Forest, and Gradient Boosting—were tested on the NHANES dataset. The Linear Regression model demonstrated the best performance among the three, reaching an R² value of 0.108 with MAE and RMSE values of 608.81 and 796.59, respectively. However, this relatively low R² reflects the inherent limitations of the input data. Notably, essential features such as physical activity level, metabolic rate, and hormonal indicators were not available. The model relied solely on age, gender, height, weight, and derived BMI, which constrained its predictive capacity. This result answers the first research question, emphasizing the limited explanatory power of anthropometric–demographic features alone.
The Random Forest and Gradient Boosting models exhibited poor generalization capabilities despite their theoretical potential to capture non-linear relationships. In particular, the Random Forest model had an R² of 0.015, reflecting overfitting and instability caused by insufficient feature richness. The Gradient Boosting algorithm yielded a slightly improved R² of 0.102, with cross-validation confirming modest generalizability. Yet, these ensemble methods failed to substantially outperform simple Linear Regression. This observation highlights the importance of enriched datasets and complex biological features for accurate calorie prediction, a factor missing in most existing works.
The novelty of this study lies in the dual-layered system that supplements traditional numerical estimation with preference-aware meal planning. This was achieved through a locally deployed LLM framework integrated with Ollama, enabling users to express dietary preferences in natural language. The model parsed inputs such as “low-carb vegan breakfast” and successfully extracted structured filters for meal planning. It demonstrated an average classification accuracy of 91%, outperforming the static menu interfaces found in traditional systems. This addresses the third research question by showing that a locally deployed LLM can reliably convert free-text preferences into structured parameters, significantly improving user interaction and personalization in nutrition systems. Beyond empirical performance, theoretical considerations are vital to ensure AI methodologies are interpretable and transferable across domains. For instance, the study ‘A text dataset of fire door defects for pre-delivery inspections of apartments during the construction stage’ demonstrates how structured theoretical frameworks enhance the robustness of defect detection models in construction. Analogously, applying theoretical rigor in personalized nutrition AI could strengthen interpretability and ensure adaptability across diverse populations and clinical contexts [20].
Another major contribution was the application of log transformations to the calorie variable to normalize skewed distributions. Combined with the inclusion of BMI, this preprocessing improved the linear model’s accuracy compared to non-transformed baselines. These methodological enhancements helped mitigate the effect of outliers and better captured central trends, further informing the first research question regarding modeling performance. When compared to contemporary systems that rely solely on predefined filters or static recommendations, the proposed hybrid system demonstrates a superior balance between technical accuracy and user-centeredness. While existing systems often ignore user experience or provide limited customization, this system enables real-time, flexible, and personalized meal recommendations, bridging a critical gap in digital health technologies.
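This target transformation can be sketched with scikit-learn's wrapper, which maps predictions back to kilocalories automatically; the setup variables are assumed from the Methodology sketches.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Fit on log(1 + kcal); predictions are inverse-transformed back to kcal.
log_lr = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log1p,
    inverse_func=np.expm1,
)
log_lr.fit(X_train, y_train)
preds_kcal = log_lr.predict(X_test)
```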
Although the tested regression models provided only modest R² values (maximum 0.108), this outcome reflects the restricted input feature set, which lacked activity measures, metabolic biomarkers, or longitudinal data. The results suggest that anthropometric variables alone are insufficient for reliable calorie prediction. Future enhancements should integrate physiological and behavioral data streams (e.g., wearable sensors, metabolic panels, or lifestyle diaries), which have been shown to significantly improve model performance in similar contexts. Furthermore, the system’s modular architecture supports future extension, such as integrating wearable sensor data, behavioral tracking, or real-time feedback. These strengths make it adaptable to evolving health tech ecosystems, supporting continuous user engagement and health monitoring.
5.1. Limitations
Despite the strengths and innovations introduced in this study, several important limitations should be acknowledged. First and foremost, the NHANES dataset, which served as the foundation for model training and evaluation, lacks critical health-related variables such as physical activity levels, chronic disease biomarkers, medication use, and individual metabolic rates. These omissions limit the model’s ability to produce truly individualized energy requirement predictions, especially for populations with specific health conditions, and reduce ecological validity when generalizing to groups with variable activity levels. Additionally, the dietary intake data in NHANES are restricted to a single 24 h recall per individual, which is insufficient to capture habitual dietary patterns or long-term nutritional behaviors and may bias caloric prediction toward atypical daily intake. Taken together, the reliance on the NHANES 2017–2018 cycle, the single-day recall, and the absence of activity or metabolic measures limit temporal, cultural, and ecological generalizability. Future implementations should integrate NHANES with complementary sources—such as wearable sensor data, continuous dietary monitoring apps, multi-day dietary assessments, and international longitudinal cohorts—to enhance real-world, cross-cultural robustness.
Another significant limitation arises from the integration of the large language model (LLM) component. While LLMs enabled flexible, natural language-based user interaction, they also introduced challenges such as inaccurate interpretation of vague or ambiguous user input, computational inefficiencies, and dependence on prompt phrasing. These factors could affect the consistency and reliability of the generated meal plans. Moreover, the evaluation of the LLM’s performance was constrained to English, potentially limiting its applicability in multilingual or cross-cultural contexts. Lastly, the dataset and model validation did not include extensive representation across diverse age groups, ethnic backgrounds, or varying health statuses, which may hinder the system’s ability to generalize effectively in broader real-world settings. Future work should aim to address these limitations by incorporating more comprehensive datasets, multilingual capabilities, and domain-specific fine-tuning to enhance robustness and equity in personalized nutrition applications. Finally, the pilot usability study provides preliminary qualitative insights but is insufficient to generalize about user satisfaction; larger, controlled usability studies are required to draw reliable conclusions about user acceptance and effectiveness.
5.2. Future Directions
Future work should prioritize enhancing data quality and diversity through the integration of multi-source datasets, including continuous dietary records, wearable sensor data (e.g., heart rate; activity), and clinical biomarkers such as blood glucose and lipid profiles. To better model temporal eating patterns and metabolic dynamics, advanced deep learning architectures such as Long Short-Term Memory (LSTM) networks or Transformer-based models should be explored. For the natural language understanding component, incorporating newer and more capable LLMs—such as GPT-4—or fine-tuning domain-specific models could substantially improve contextual accuracy and reduce misinterpretations of user input. There is also potential to integrate adaptive reinforcement learning mechanisms that dynamically adjust nutrition guidance based on user feedback, health outcomes, and behavioral patterns over time, while explainable AI modules could further enhance user trust by providing transparent justifications for meal recommendations. A mobile–cloud hybrid deployment model should be pursued to enable real-time interaction, scalability, and robust data synchronization across devices and healthcare platforms. Systematic ablation studies should also be performed to isolate the marginal contribution of each demographic feature, particularly because the skewed intake distribution and an RMSE of 797 kcal (alongside an MAE of 132 kcal, 95% CI: ±25 kcal) suggest substantial uncertainty in individual predictions. Finally, randomized controlled trials (RCTs) and large-scale field evaluations involving demographically diverse cohorts will be essential for rigorously comparing the system against standard dietary tools and for assessing effectiveness, equity, usability, behavioral adherence, and long-term clinical impact in both clinical and consumer-facing nutrition applications. Incorporating longitudinal dietary tracking and real-time biometric data (e.g., heart rate, glucose, and activity) into such trials will enable dynamic personalization and robustness testing, ensuring real-world applicability.
6. Conclusions
This study demonstrates the feasibility of an integrated artificial intelligence framework that combines machine learning, natural language processing, and rule-based filtering to advance personalized nutrition. By leveraging regression models on the NHANES dataset, the system generated individualized calorie predictions, while a locally deployed large language model enabled intuitive interpretation of free-text dietary preferences. Although prediction accuracy remained modest, with R² values below 0.11, the framework highlights the value of incorporating even limited demographic and anthropometric features into automated dietary estimation. More importantly, the natural language interface and modular architecture illustrate how AI can enhance accessibility, usability, and adaptability in nutrition systems.
The findings underscore both potential and limitations. The restricted feature set of NHANES and reliance on single-day recalls constrain prediction robustness, while the NLP component, though achieving promising accuracy, requires broader validation across languages and cultural contexts. Nevertheless, the pilot usability test suggests that end users value the system’s flexibility and personalization, supporting the case for AI-driven approaches in dietary planning.
Looking forward, integrating richer multimodal data—such as continuous monitoring from wearables, longitudinal diet tracking, and clinical biomarkers—alongside rigorous validation in diverse populations will be crucial. By bridging computational techniques with applied nutrition science, this work lays the foundation for scalable, transparent, and user-centered dietary recommendation systems that can contribute to preventive healthcare, precision nutrition, and digital health innovation.