1. Introduction
Artificial intelligence (AI) is rapidly transforming a wide range of industries, including finance, transportation, education, and healthcare, by enabling data-driven decision-making, increased automation, and personalized services. These advancements have improved efficiency, reduced human error, and unlocked new capabilities across numerous sectors. However, one domain in which AI remains significantly underutilized, despite its profound impact on public health, is nutrition. Nutrition plays a central role in shaping both short-term and long-term health outcomes, influencing metabolic balance, cognitive performance, immune function, and the risk of chronic diseases such as obesity, diabetes, and cardiovascular conditions. Despite its importance, existing dietary recommendation systems often rely on static models and generalized guidelines, offering little to no personalization based on an individual’s biological, behavioral, or cultural characteristics. Most current tools are manually operated, fail to update with real-time data, and are not equipped to reflect dynamic lifestyle factors such as changing activity levels, dietary preferences, or health conditions. This lack of adaptability limits their effectiveness and discourages sustained user engagement. Integrating AI into nutrition presents a transformative opportunity to overcome these limitations by providing intelligent, individualized, and context-aware dietary support.
In recent years, the concept of personalized nutrition has gained significant attention, emphasizing the design of dietary interventions that are customized to an individual’s biological profile, lifestyle habits, and environmental context [1]. This approach contrasts with conventional one-size-fits-all dietary guidelines by recognizing inter-individual variability in metabolism, preferences, health goals, and cultural food practices. Numerous studies have highlighted the potential of personalized nutrition to improve health outcomes, increase adherence to dietary recommendations, and support sustainable behavior change. However, despite its promise, widespread implementation in clinical or consumer-facing applications has been limited due to several practical barriers. These include the lack of automated systems for data collection and interpretation, insufficient access to qualified nutrition experts, and the unavailability of scalable platforms that offer intuitive and adaptive user interfaces. Additionally, most existing tools do not support real-time updates or dynamic personalization, which further limits their applicability in everyday settings. Recent advancements in AI, particularly in machine learning (ML), rule-based filtering systems, and natural language processing (NLP), offer new avenues to address these limitations [2]. By leveraging these technologies, it is now possible to develop intelligent, scalable, and user-centric nutrition recommendation systems that can adapt to individual needs in real time.
1.1. Gap Analysis
Despite increasing awareness of the need for personalized nutrition, current digital health tools and consumer applications remain limited in scope and functionality. Many existing systems focus solely on calorie tracking or provide fixed diet plans, often neglecting key factors such as individual metabolic differences, user preferences, and micronutrient balance. Moreover, most applications lack dynamic adaptation based on changing user data and fail to integrate personalized energy estimation with meal planning in a seamless manner. Another major shortcoming is the absence of intuitive and accessible interfaces: users are typically required to input data through rigid selection boxes or forms, which alienates those with limited nutrition knowledge or low digital literacy. Furthermore, natural language understanding, a feature now common in other AI domains, is rarely implemented in nutrition systems. Lastly, privacy remains a critical concern, with most AI-based systems relying on cloud-based architectures that compromise user control over personal data.
To address these gaps, researchers have explored machine learning for energy estimation, natural language processing (NLP) for understanding dietary preferences, and optimization methods for generating balanced menus. Despite promising results, existing systems tend to focus on isolated components and often lack integration, scalability, or user-friendly interfaces, highlighting the need for more holistic, AI-driven solutions.
1.2. Research Questions
The following research questions are explored in this study.
RQ1: Can machine learning algorithms predict an individual’s daily energy requirements more accurately than traditional formulas based on age, sex, height, weight, and physical activity?
RQ2: Can a rule-based filtering approach—combined with a local large language model (LLM)—generate nutritionally balanced and personalized meal recommendations without relying on mathematical optimization?
RQ3: Can a locally deployed LLM accurately interpret dietary preferences and constraints expressed in natural language and convert them into structured filtering parameters?
RQ4: Does the proposed AI system improve usability, personalization, and user satisfaction compared to conventional static diet planning tools?
These questions guide the design, implementation, and evaluation of the system across prediction, recommendation, interaction, and usability layers.
1.3. Problem Statement
Traditional dietary recommendation systems fall short in addressing the diverse and individualized needs of users [3]. These systems typically rely on predefined calorie estimation formulas, such as the Harris–Benedict or Mifflin–St Jeor equations, which oversimplify the complex, dynamic nature of human metabolism. By neglecting inter-individual variability caused by factors such as genetics, body composition, activity levels, comorbidities, and metabolic rate, these models often generate inaccurate estimations of daily energy needs. Consequently, the recommendations they produce tend to be generic, static, and poorly suited to meet the personalized health objectives of users.
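For reference, the Mifflin–St Jeor estimate mentioned above can be computed in a few lines. The sketch below uses the published coefficients and commonly cited activity multipliers; the function name and the factor table are our own illustrative choices, not part of any system described here.

```python
# Minimal sketch of the Mifflin-St Jeor equation.
# Activity multipliers are commonly cited values; exact factors vary by source.
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
}

def mifflin_st_jeor_tdee(weight_kg, height_cm, age_years, sex, activity="sedentary"):
    """Estimate total daily energy expenditure (kcal/day) from a fixed formula."""
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    bmr += 5 if sex == "male" else -161
    return bmr * ACTIVITY_FACTORS[activity]

# Example: a 30-year-old, 175 cm, 70 kg male with moderate activity.
print(round(mifflin_st_jeor_tdee(70, 175, 30, "male", "moderate")))  # ~2556 kcal/day
```

Because the coefficients are fixed, two users with identical anthropometrics always receive the same estimate, which is precisely the one-size-fits-all behavior this study seeks to move beyond.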
Moreover, conventional systems frequently ignore user-specific elements such as food preferences, cultural and religious dietary practices, allergen sensitivities, and evolving health goals. This results in low adherence rates and reduces the long-term utility of such platforms. Most existing solutions also lack mechanisms for adaptive personalization based on behavioral feedback or longitudinal tracking of dietary intake, making them rigid and non-responsive to changes in user lifestyle or progress over time.
A further significant limitation lies in the lack of natural language interfaces. Many platforms depend on structured data input or rigid forms, which can be burdensome or inaccessible for individuals unfamiliar with nutrition science or those from marginalized backgrounds [1]. This usability barrier not only limits engagement but also disproportionately excludes users with low digital literacy or language constraints.
Taken together, these deficiencies highlight a critical gap in current dietary recommendation systems: the inability to combine accurate, personalized calorie prediction with dynamic, context-aware, and user-friendly interaction. Bridging this gap requires an integrative solution that leverages machine learning for predictive modeling, rule-based filtering for structured personalization, and large language models for flexible, natural language-based user interaction.
1.4. Novelty of This Study
This study presents a multi-layered and integrated approach to personalized nutrition, distinguishing itself from existing systems through the following innovations:
Machine learning-based energy estimation: Unlike static formulas such as Mifflin–St Jeor or Harris–Benedict, this study introduces a regression model trained on real-world demographic and physiological data (NHANES) for individualized calorie prediction.
Rule-based filtering over optimization: Instead of relying on traditional optimization techniques like linear programming, the system utilizes structured filtering rules to generate balanced meal plans from the USDA nutrient database, ensuring both efficiency and adaptability.
Locally hosted natural language interface: A privacy-preserving large language model (LLM), deployed via Ollama, enables users to input preferences and restrictions in free-text format, enhancing usability and accessibility.
End-to-end integration of AI components: The system unifies prediction, filtering, and interaction layers into a cohesive, modular architecture—uncommon in most existing systems which typically focus on isolated components.
User-focused design with validated usability: A Streamlit-based front-end, developed with UX principles, enables real-time interaction and delivers practical meal plans suited to individual goals, validated through preliminary user testing.
1.5. Significance of Our Work
This work demonstrates the feasibility and effectiveness of an AI-based nutrition recommendation system that unifies predictive modeling, personalized filtering, and natural language interaction in a single, integrated framework. Unlike conventional systems that operate with rigid calorie formulas and static meal templates, our approach leverages machine learning models trained on real-world demographic and dietary data to estimate individual energy needs with higher accuracy. By integrating these estimates with a rule-based filtering mechanism grounded in nutritional standards, the system ensures that meal recommendations are not only tailored to user profiles but also nutritionally balanced. In addition, the incorporation of a locally hosted large language model (LLM) allows users to express preferences and constraints through natural language, significantly lowering the barrier to entry for individuals without technical or nutritional expertise. This enhances usability and fosters long-term user engagement, a challenge many existing platforms fail to address. Our results indicate measurable improvements in prediction performance, relevance of meal suggestions, and subjective user satisfaction, thereby validating the system’s practical viability. Moreover, the modular design, reliance on open-source datasets, and local deployment strategy ensure data privacy while offering flexibility for adaptation across different populations and dietary frameworks. Collectively, these innovations position the system as a scalable, privacy-aware, and user-friendly solution that holds strong potential for real-world implementation in both clinical settings and consumer wellness applications.
3. Methodology
This study proposes a two-layered personalized nutrition recommendation system integrating traditional machine learning (ML) models and large language models (LLMs). The workflow includes preprocessing the NHANES 2017–2018 dataset, estimating calorie and macronutrient requirements using regression models, interpreting user preferences via natural language, and generating culturally relevant daily meal plans with LLMs. The complete workflow is depicted in Figure 1.
3.1. Dataset
The primary dataset used in this study was sourced from the National Health and Nutrition Examination Survey (NHANES) 2017–2018 cycle. NHANES is a large-scale, nationally representative survey conducted by the Centers for Disease Control and Prevention (CDC) in the United States, designed to assess the health and nutritional status of adults and children through a combination of interviews, physical examinations, and laboratory tests. For this research, three key data files were utilized: demographic information from DEMO_J, body measurements from BMX_J, and detailed dietary intake records from DR1TOT_J. These datasets were merged using the unique respondent identifier SEQN, which ensures consistent matching of individuals across multiple modules. Sample rows from the cleaned NHANES dataset are shown in Figure 2.
The resulting merged dataset initially included over 9000 participants. However, to ensure data quality and model reliability, extensive preprocessing steps were applied. These included the removal of entries with missing or incomplete records in critical fields such as age, sex, height, weight, and total energy intake. Additionally, participants with implausible values—such as biologically unrealistic body mass indices (BMIs) or calorie intake levels below 500 or above 6000 kilocalories per day—were excluded as extreme outliers. This filtering process produced a clean and reliable working dataset consisting of 6792 participants.
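A minimal pandas sketch of this merge-and-clean step is shown below. The variable names (SEQN, RIDAGEYR, RIAGENDR, BMXHT, BMXWT, DR1TKCAL) follow the public NHANES codebooks, while the file paths and the BMI plausibility bounds are illustrative assumptions, not an excerpt of the actual pipeline.

```python
import pandas as pd

# Load the three NHANES 2017-2018 modules (SAS transport files from the CDC site).
demo = pd.read_sas("DEMO_J.XPT")   # demographics: age (RIDAGEYR), sex (RIAGENDR)
bmx = pd.read_sas("BMX_J.XPT")     # body measures: height (BMXHT), weight (BMXWT)
diet = pd.read_sas("DR1TOT_J.XPT") # day-1 dietary recall: energy (DR1TKCAL)

# Merge on the unique respondent identifier.
df = demo.merge(bmx, on="SEQN").merge(diet, on="SEQN")

# Drop records with missing values in critical fields.
df = df.dropna(subset=["RIDAGEYR", "RIAGENDR", "BMXHT", "BMXWT", "DR1TKCAL"])

# Exclude implausible energy intakes (<500 or >6000 kcal/day) and BMI outliers.
df["BMI"] = df["BMXWT"] / (df["BMXHT"] / 100) ** 2
df = df[(df["DR1TKCAL"] >= 500) & (df["DR1TKCAL"] <= 6000)]
df = df[(df["BMI"] >= 12) & (df["BMI"] <= 60)]  # assumed plausibility bounds
```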
Descriptive statistics were computed to summarize the distribution of key variables, including age, sex, height, weight, BMI, and energy intake. These statistics provide an overview of the cohort’s demographic and nutritional characteristics, which are important for understanding population-level trends and contextualizing model outputs. The dataset features a wide range of participant profiles, making it suitable for developing personalized prediction models. A detailed summary of the cleaned dataset, along with the structure and distribution of key features, is presented in Table 2. Furthermore, by using NHANES data, this study benefits from the survey’s rigorous data collection protocols and standardized measurement techniques, which enhance the validity and generalizability of the findings. However, limitations of the dataset—such as its reliance on a single 24 h dietary recall and the lack of physical activity metrics—are addressed in the limitations section of this study.
3.2. Detailed Methodology
The proposed system adopts a sequential workflow consisting of data preprocessing, machine learning-based calorie prediction, and natural language-driven menu generation. First, raw data from the NHANES 2017–2018 survey is cleaned by removing entries with missing or inconsistent values. Relevant variables from the demographic, body measurement, and dietary datasets are merged via the SEQN identifier. Feature engineering is then applied, including BMI calculation, encoding of gender and ethnicity, log transformation of skewed intake values, and normalization of continuous features.
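Continuing from the cleaning sketch in Section 3.1, the BMI derivation and log transformation might look as follows; the encoding and normalization steps are sketched with the experimental settings in Section 3.4.

```python
import numpy as np

# BMI from measured weight (kg) and height (cm).
df["BMI"] = df["BMXWT"] / (df["BMXHT"] / 100) ** 2

# Log-transform the right-skewed energy-intake values to reduce outlier influence.
df["log_kcal"] = np.log1p(df["DR1TKCAL"])  # log(1 + kcal)
```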
In the second phase, three regression models—Linear Regression (LR), Random Forest (RF), and Gradient Boosting (GB)—are trained to predict individual daily energy requirements. These models use engineered features such as age, BMI, gender, height, and weight as inputs. The objective is to minimize the error between predicted and actual caloric intake from the 24 h recall data. Model performance is assessed using the regression metrics described in Section 3.3; as discussed in Section 5, the Linear Regression model ultimately achieved the best test-set accuracy, with Gradient Boosting a close second.
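A condensed sketch of this training stage is given below, assuming the train/test split described in Section 3.4; the models use library defaults here rather than the tuned configurations of Table 3.

```python
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

models = {
    "LR": LinearRegression(),
    "RF": RandomForestRegressor(random_state=42),
    "GB": GradientBoostingRegressor(random_state=42),
}

# X_train/X_test hold the engineered features; y is daily energy intake (kcal).
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(name, "MAE:", round(mean_absolute_error(y_test, preds), 1))
```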
In the final phase, users interact with the system by entering their dietary preferences in natural language (e.g., “vegan, gluten-free”). These inputs are parsed using a locally hosted LLM (Mistral 7B), which extracts structured constraints such as diet type, allergens, cultural relevance, and meal time. Based on the predicted nutritional targets and parsed preferences, the system retrieves matching meals from a filtered database.
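As an illustration, the preference-parsing call to the locally hosted model might look like the following. The endpoint is Ollama's standard generate API, but the prompt wording and the JSON schema (diet_type, allergens, cuisine, meal_time) are our own simplification of the fields described above.

```python
import json
import requests

PROMPT_TEMPLATE = (
    "Extract the dietary constraints from the user request below as JSON with "
    'keys "diet_type", "allergens", "cuisine", and "meal_time". '
    "Use null for missing fields.\n\nUser request: {text}"
)

def parse_preferences(text: str) -> dict:
    """Send free-text preferences to a local Mistral 7B instance via Ollama."""
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "mistral",
            "prompt": PROMPT_TEMPLATE.format(text=text),
            "format": "json",  # ask Ollama to constrain output to valid JSON
            "stream": False,
        },
        timeout=60,
    )
    return json.loads(response.json()["response"])

print(parse_preferences("vegan, gluten-free breakfast ideas"))
# e.g. {"diet_type": "vegan", "allergens": ["gluten"],
#       "cuisine": None, "meal_time": "breakfast"}
```

The structured output then drives the rule-based filter over the meal database, keeping the LLM confined to interpretation rather than recommendation.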
3.3. Evaluation Metrics
To assess the performance of the calorie prediction models, we used three standard regression metrics: the Coefficient of Determination (R²), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). These metrics quantify how well the predicted values approximate the actual daily energy intake observed in the NHANES dietary recall data.
The R² score is computed using Equation (1) and measures the proportion of variance explained by the model:

  R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}   (1)

where y_i is the actual caloric intake, \hat{y}_i is the predicted value, and \bar{y} is the mean of actual values.
MAE is calculated as shown in Equation (2), representing the average magnitude of errors in predictions:

  MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|   (2)

RMSE, shown in Equation (3), penalizes larger errors more severely and is suitable for continuous target variables like calorie estimation:

  RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}   (3)
These metrics were computed on the test set and used for model comparison to select the optimal regression algorithm.
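Equivalently, the three metrics can be computed directly with scikit-learn, as in this brief sketch (model and split variables assumed from the surrounding sections):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_pred = model.predict(X_test)

r2 = r2_score(y_test, y_pred)                       # Equation (1)
mae = mean_absolute_error(y_test, y_pred)           # Equation (2)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # Equation (3)

print(f"R2: {r2:.3f}  MAE: {mae:.1f} kcal  RMSE: {rmse:.1f} kcal")
```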
3.4. Experimental Settings
All models were implemented in Python 3.8 using the scikit-learn library. The preprocessed dataset of 6792 participants was randomly split into 80% training and 20% testing subsets, ensuring stratification by gender to balance physiological variability. Continuous features were standardized using z-score normalization, while categorical variables such as gender and ethnicity were one-hot encoded.
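A sketch of this split-and-preprocess setup is shown below; the 80/20 ratio, gender stratification, z-scoring, and one-hot encoding follow the description above, while the pipeline layout itself is illustrative.

```python
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

X = df[["RIDAGEYR", "BMXHT", "BMXWT", "BMI", "RIAGENDR", "RIDRETH1"]]
y = df["DR1TKCAL"]

# 80/20 split, stratified by gender to balance physiological variability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=X["RIAGENDR"], random_state=42
)

# z-score continuous features; one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["RIDAGEYR", "BMXHT", "BMXWT", "BMI"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["RIAGENDR", "RIDRETH1"]),
])
X_train = preprocess.fit_transform(X_train)  # fit on the training split only
X_test = preprocess.transform(X_test)
```

Fitting the scaler and encoder on the training split alone avoids leaking test-set statistics into the models.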
For the calorie prediction task, three models were evaluated: Linear Regression (LR), Random Forest (RF), and Gradient Boosting Regressor (GB). Default hyperparameters were used for LR. For the tree-based models, hyperparameters were selected via grid search using 5-fold cross-validation on the training set. The final configurations are summarized in Table 3.
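The grid search step might be sketched as follows; the candidate grids are illustrative placeholders, not the final values reported in Table 3.

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Illustrative candidate grid; the tuned values in Table 3 may differ.
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [2, 3, 5],
    "learning_rate": [0.01, 0.05, 0.1],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    cv=5,                                # 5-fold cross-validation
    scoring="neg_mean_absolute_error",
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_)
```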
3.5. Evaluation of the LLM-Based Preference Parser
To assess the effectiveness of the LLM-driven natural language interface, a separate evaluation was conducted involving 30 simulated user queries reflecting diverse dietary preferences, restrictions, and cultural requests. The local LLM (Mistral 7B) was prompted to extract structured fields such as diet type, allergens, preferred ingredients, meal timing, and cultural cuisine. Each output was manually verified against a ground-truth JSON template prepared for each test query.
The system achieved a structured extraction accuracy of 90%, with the majority of parsing errors occurring when users entered multi-intent or ambiguous phrases. The most accurate results were obtained for common diets (e.g., vegan; gluten-free), while cultural tags (e.g., “Mediterranean, spicy”) sometimes overlapped with ingredient preferences. Despite these limitations, the LLM demonstrated strong adaptability in understanding diverse phrasing and effectively transforming unstructured text into actionable input for meal generation.
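A minimal sketch of this field-level verification, assuming each test case pairs a query with a hand-written ground-truth dictionary and reusing the hypothetical parse_preferences helper from Section 3.2:

```python
FIELDS = ["diet_type", "allergens", "cuisine", "meal_time"]

def field_accuracy(test_cases):
    """Fraction of structured fields the parser extracts correctly."""
    correct = total = 0
    for query, truth in test_cases:
        parsed = parse_preferences(query)  # LLM call from the earlier sketch
        for field in FIELDS:
            total += 1
            correct += parsed.get(field) == truth.get(field)
    return correct / total

cases = [
    ("vegan, gluten-free breakfast",
     {"diet_type": "vegan", "allergens": ["gluten"],
      "cuisine": None, "meal_time": "breakfast"}),
    # ... 29 further simulated queries
]
print(f"extraction accuracy: {field_accuracy(cases):.0%}")
```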
5. Discussion
This study presents a comprehensive artificial intelligence-based system that integrates machine learning (ML) regression models and large language models (LLMs) to provide personalized nutrition recommendations. The primary aim was to estimate daily calorie and macronutrient needs from anthropometric–demographic features and subsequently convert natural language dietary preferences into structured, usable formats for meal planning. This hybrid approach addresses both the quantitative and qualitative aspects of individual dietary requirements.
To evaluate the ML-based calorie prediction models, three prominent algorithms—Linear Regression, Random Forest, and Gradient Boosting—were tested on the NHANES dataset. The Linear Regression model demonstrated the best performance among the three, reaching an R² value of 0.108 with MAE and RMSE values of 608.81 and 796.59, respectively. However, this relatively low R² reflects the inherent limitations of the input data. Notably, essential features such as physical activity level, metabolic rate, and hormonal indicators were not available. The model relied solely on age, gender, height, weight, and derived BMI, which constrained its predictive capacity. This result answers the first research question, emphasizing the limited explanatory power of anthropometric–demographic features alone.
The Random Forest and Gradient Boosting models exhibited poor generalization capabilities despite their theoretical potential to capture non-linear relationships. In particular, the Random Forest model had an R² of 0.015, reflecting overfitting and instability caused by insufficient feature richness. The Gradient Boosting algorithm yielded a slightly improved R² of 0.102, with cross-validation confirming modest generalizability. Yet, these ensemble methods failed to substantially outperform simple Linear Regression. This observation highlights the importance of enriched datasets and complex biological features for accurate calorie prediction, a factor missing in most existing works.
The novelty of this study lies in the dual-layered system that supplements traditional numerical estimation with preference-aware meal planning. This was achieved through a locally deployed LLM framework integrated with Ollama, enabling users to express dietary preferences in natural language. The model parsed inputs such as “low-carb vegan breakfast” and successfully extracted structured filters for meal planning. It demonstrated an average classification accuracy of 91%, outperforming the static menu interfaces found in traditional systems. This addresses the third research question by showing that a locally deployed LLM can reliably convert free-text preferences into structured parameters, significantly improving user interaction and personalization in nutrition systems. Beyond empirical performance, theoretical considerations are vital to ensure AI methodologies are interpretable and transferable across domains. For instance, the study ‘A text dataset of fire door defects for pre-delivery inspections of apartments during the construction stage’ demonstrates how structured theoretical frameworks enhance the robustness of defect detection models in construction. Analogously, applying theoretical rigor in personalized nutrition AI could strengthen interpretability and ensure adaptability across diverse populations and clinical contexts [20].
Another major contribution was the application of log transformations to the calorie variable to normalize skewed distributions. Combined with the inclusion of BMI, this preprocessing improved the linear model’s accuracy compared to non-transformed baselines. These methodological enhancements helped mitigate the effect of outliers and better captured central trends, further informing the first research question regarding modeling performance. When compared to contemporary systems that rely solely on predefined filters or static recommendations, the proposed hybrid system demonstrates a superior balance between technical accuracy and user-centeredness. While existing systems often ignore user experience or provide limited customization, this system enables real-time, flexible, and personalized meal recommendations, bridging a critical gap in digital health technologies.
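This target transformation can be sketched with scikit-learn's wrapper, which maps predictions back to kilocalories automatically; the setup variables are assumed from the Methodology sketches.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Fit on log(1 + kcal); predictions are inverse-transformed back to kcal.
log_lr = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log1p,
    inverse_func=np.expm1,
)
log_lr.fit(X_train, y_train)
preds_kcal = log_lr.predict(X_test)
```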
Although the tested regression models provided only modest R² values (maximum 0.108), this outcome reflects the restricted input feature set, which lacked activity measures, metabolic biomarkers, or longitudinal data. The results suggest that anthropometric variables alone are insufficient for reliable calorie prediction. Future enhancements should integrate physiological and behavioral data streams (e.g., wearable sensors, metabolic panels, or lifestyle diaries), which have been shown to significantly improve model performance in similar contexts. Furthermore, the system’s modular architecture supports future extension, such as integrating wearable sensor data, behavioral tracking, or real-time feedback. These strengths make it adaptable to evolving health tech ecosystems, supporting continuous user engagement and health monitoring.
5.1. Limitations
Despite the strengths and innovations introduced in this study, several important limitations should be acknowledged. First and foremost, the NHANES dataset, which served as the foundation for model training and evaluation, lacks critical health-related variables such as physical activity levels, chronic disease biomarkers, medication use, and individual metabolic rates. These omissions limit the model’s ability to produce truly individualized energy requirement predictions, especially for populations with specific health conditions, and reduce ecological validity when generalizing to groups with variable activity levels. Additionally, the dietary intake data in NHANES are restricted to a single 24 h recall per individual, which is insufficient to capture habitual dietary patterns or long-term nutritional behaviors and may bias caloric prediction toward atypical daily intake. Taken together, the reliance on the NHANES 2017–2018 cycle, the single-day recall, and the absence of activity or metabolic measures limit temporal, cultural, and ecological generalizability. Future implementations should integrate NHANES with complementary sources—such as wearable sensor data, continuous dietary monitoring apps, multi-day dietary assessments, and international longitudinal cohorts—to enhance real-world, cross-cultural robustness.
Another significant limitation arises from the integration of the large language model (LLM) component. While LLMs enabled flexible, natural language-based user interaction, they also introduced challenges such as inaccurate interpretation of vague or ambiguous user input, computational inefficiencies, and dependence on prompt phrasing. These factors could affect the consistency and reliability of the generated meal plans. Moreover, the evaluation of the LLM’s performance was constrained to English, potentially limiting its applicability in multilingual or cross-cultural contexts. Lastly, the dataset and model validation did not include extensive representation across diverse age groups, ethnic backgrounds, or varying health statuses, which may hinder the system’s ability to generalize effectively in broader real-world settings. Future work should aim to address these limitations by incorporating more comprehensive datasets, multilingual capabilities, and domain-specific fine-tuning to enhance robustness and equity in personalized nutrition applications. Finally, the pilot usability study provides preliminary qualitative insights but is insufficient to generalize about user satisfaction; larger, controlled usability studies are required to draw reliable conclusions about user acceptance and effectiveness.
5.2. Future Directions
Future work should prioritize enhancing data quality and diversity through the integration of multi-source datasets, including continuous dietary records, wearable sensor data (e.g., heart rate; activity), and clinical biomarkers such as blood glucose and lipid profiles. To better model temporal eating patterns and metabolic dynamics, advanced deep learning architectures such as Long Short-Term Memory (LSTM) networks or Transformer-based models should be explored. For the natural language understanding component, incorporating newer and more capable LLMs—such as GPT-4—or fine-tuning domain-specific models could substantially improve contextual accuracy and reduce misinterpretations of user input. There is also potential to integrate adaptive reinforcement learning mechanisms that dynamically adjust nutrition guidance based on user feedback, health outcomes, and behavioral patterns over time, while explainable AI modules could further enhance user trust by providing transparent justifications for meal recommendations. A mobile–cloud hybrid deployment model should be pursued to enable real-time interaction, scalability, and robust data synchronization across devices and healthcare platforms. Systematic ablation studies should also be performed to isolate the marginal contribution of each demographic feature, particularly because the skewed intake distribution and an RMSE of 797 kcal (alongside an MAE of 132 kcal, 95% CI: ±25 kcal) suggest substantial uncertainty in individual predictions. Finally, randomized controlled trials (RCTs) and large-scale field evaluations involving demographically diverse cohorts will be essential for rigorously comparing the system against standard dietary tools and for assessing effectiveness, equity, usability, behavioral adherence, and long-term clinical impact in both clinical and consumer-facing nutrition applications. Incorporating longitudinal dietary tracking and real-time biometric data (e.g., heart rate, glucose, and activity) into such trials will enable dynamic personalization and robustness testing, ensuring real-world applicability.
6. Conclusions
This study demonstrates the feasibility of an integrated artificial intelligence framework that combines machine learning, natural language processing, and rule-based filtering to advance personalized nutrition. By leveraging regression models on the NHANES dataset, the system generated individualized calorie predictions, while a locally deployed large language model enabled intuitive interpretation of free-text dietary preferences. Although prediction accuracy remained modest, with R² values below 0.11, the framework highlights the value of incorporating even limited demographic and anthropometric features into automated dietary estimation. More importantly, the natural language interface and modular architecture illustrate how AI can enhance accessibility, usability, and adaptability in nutrition systems.
The findings underscore both potential and limitations. The restricted feature set of NHANES and reliance on single-day recalls constrain prediction robustness, while the NLP component, though achieving promising accuracy, requires broader validation across languages and cultural contexts. Nevertheless, the pilot usability test suggests that end users value the system’s flexibility and personalization, supporting the case for AI-driven approaches in dietary planning.
Looking forward, integrating richer multimodal data—such as continuous monitoring from wearables, longitudinal diet tracking, and clinical biomarkers—alongside rigorous validation in diverse populations will be crucial. By bridging computational techniques with applied nutrition science, this work lays the foundation for scalable, transparent, and user-centered dietary recommendation systems that can contribute to preventive healthcare, precision nutrition, and digital health innovation.