Prompt Engineering and Model Selection for LLM-Based Nutritional Estimation from Food Images: A Multi-Dataset Investigation
Abstract
1. Introduction
2. Materials and Methods
2.1. Datasets
2.1.1. NutriImage
2.1.2. SNAPMe
2.1.3. Japan Branded Food Database (JBFD)
2.2. Nutritional Estimation Models
2.3. Prompt Conditions
2.3.1. Default Prompt
2.3.2. Visual Estimation Prompt
2.4. Statistical Analysis
3. Results
3.1. Effect of Prompt Design and Model Selection
3.2. Model Comparison Across Datasets
3.3. Comparison with Prior Work on SNAPMe
3.4. Model Behavior: Basis Statement Analysis
4. Discussion
4.1. Prompt Design and Model Selection
4.2. Cost-Performance Tradeoff
4.3. Dataset- and Nutrient-Specific Estimation Patterns
4.4. Comparison with Prior Work
4.5. Data Availability and Future Directions
4.6. Clinical Considerations and Responsible Use
4.7. Limitations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Prompt Designs and Additional Experiments
Appendix A.1. Prompt Texts
Appendix A.2. Prompt and Model Comparison (JBFD Co-Op Subset, N = 300)
| Nutrient | Haiku Default | Haiku Visual | Sonnet Default | Sonnet Visual | Delta (Sonnet Visual − Haiku Default) |
|---|---|---|---|---|---|
| Energy (kcal) | 0.235 | 0.170 | — | 0.844 | +0.609 |
| Protein (g) | −0.109 | 0.130 | — | 0.894 | +1.003 |
| Lipids (g) | 0.286 | 0.162 | — | 0.844 | +0.558 |
| Carb (g) | 0.149 | 0.070 | — | 0.842 | +0.693 |
| Salt (g) | 0.237 | 0.079 | — | 0.495 | +0.258 |
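
The Delta column is consistent with the Sonnet Visual value minus the Haiku Default value. The following is a minimal sketch of that arithmetic, assuming the entries are R² values (the Sonnet Visual column matches the Visual R2 column in Appendix A.4). The dictionary and function names are illustrative and do not come from the study's code.

```python
# Values transcribed from the table above (JBFD Co-Op subset, N = 300).
# Variable and function names are illustrative assumptions.
haiku_default = {"Energy": 0.235, "Protein": -0.109, "Lipids": 0.286,
                 "Carb": 0.149, "Salt": 0.237}
sonnet_visual = {"Energy": 0.844, "Protein": 0.894, "Lipids": 0.844,
                 "Carb": 0.842, "Salt": 0.495}


def delta(baseline, improved):
    """Per-nutrient difference (improved - baseline), rounded to 3 decimals."""
    return {k: round(improved[k] - baseline[k], 3) for k in baseline}


deltas = delta(haiku_default, sonnet_visual)
for nutrient, d in deltas.items():
    print(f"{nutrient}: {d:+.3f}")
```

For protein the difference exceeds 1 because the Haiku Default value is negative.
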
Appendix A.3. Representative Basis Statements by Model
| Food Item | Haiku | Sonnet | Opus |
|---|---|---|---|
| Natto (fermented soybean) | Soy product, soft texture—low fat, moderate protein | Natto package—high protein fermented soy, moderate fat, low salt | Typical pack shape of fermented soybean—protein and fat moderately high |
| Milk (carton) | White liquid in carton—low-fat dairy equivalent | White liquid in milk carton—typical whole milk composition | Milk carton shape, whitish color—estimated as regular whole milk |
| Curry rice | Brown sauce with rice—a medium-calorie mixed dish | Curry sauce over white rice—carbohydrate-dominant, moderate fat from roux | Japanese curry rice with dark brown roux—carbohydrate-rich base with moderate fat from sauce |
Appendix A.4. Chain-of-Thought Prompt Experiment (JBFD Co-Op Subset, N = 300)
| Nutrient | Visual R² | Visual MAE | CoT R² | CoT MAE |
|---|---|---|---|---|
| Energy (kcal) | 0.844 | 35.97 | 0.816 | 37.66 |
| Protein (g) | 0.894 | 1.98 | 0.877 | 2.08 |
| Lipids (g) | 0.844 | 2.65 | 0.838 | 2.71 |
| Carb (g) | 0.842 | 5.12 | 0.795 | 5.55 |
| Salt (g) | 0.495 | 0.76 | 0.383 | 0.85 |
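
The exact metric definitions are not shown in this section. The sketch below assumes that R² is the coefficient of determination, 1 − SS_res/SS_tot, and that MAE is the mean absolute error. This reading is consistent with the negative R² values reported for some model and nutrient combinations. The toy data and function names are hypothetical.

```python
# Sketch of the two metrics under the stated assumptions; not the study's code.
def mae(y_true, y_pred):
    """Mean absolute error between reference and estimated values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (can be negative)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical energy values (kcal), purely for illustration.
truth = [100.0, 200.0, 300.0, 400.0]
close = [110.0, 190.0, 310.0, 390.0]      # small errors
reversed_est = [400.0, 300.0, 200.0, 100.0]  # worse than predicting the mean

print(mae(truth, close), r2(truth, close))
print(mae(truth, reversed_est), r2(truth, reversed_est))
```

Under this definition, an estimator that does worse than always predicting the mean of the reference values yields a negative R².
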
References
- Shim, J.S.; Oh, K.; Kim, H.C. Dietary assessment methods in epidemiologic studies. Epidemiol. Health 2014, 36, e2014009.
- Dhurandhar, N.V.; Thomas, D.; Sorkin, J.D. The known unknowns of BMI. Int. J. Obes. 2015, 39, 1165–1167.
- Rollo, M.E.; Ash, S.; Lyons-Wall, P.; Russell, A. Trial of a mobile application for dietary assessment. J. Hum. Nutr. Diet. 2011, 24, 510–516.
- Boushey, C.J.; Spoden, M.; Zhu, F.M.; Delp, E.J.; Kerr, D.A. New mobile methods for dietary assessment. Proc. Nutr. Soc. 2017, 76, 283–294.
- Aizawa, K.; Ogawa, M. FoodLog: Multimedia Tool for Healthcare Applications. IEEE Multimed. 2015, 22, 4–9.
- Rodriguez-Jimenez, M.; Martin-del-Campo-Becerra, G.D.; Sumalla-Cano, S.; Crespo-Alvarez, J.; Elio, I. Image-Based Dietary Energy and Macronutrients Estimation with ChatGPT-5: Cross-Source Evaluation Across Escalating Context Scenarios. Nutrients 2025, 17, 3613.
- Fridolfsson, J.; Sjoberg, E.; Thiwang, M.; Pettersson, S. Performance Evaluation of 3 Large Language Models for Nutritional Content Estimation from Food Images. Curr. Dev. Nutr. 2025, 9, 107556.
- White, J.; Fu, Q.; Hays, S.; Sandborn, M.; Olea, C.; Gilbert, H.; Elnashar, A.; Spencer-Smith, J.; Schmidt, D.C. A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv 2023, arXiv:2302.11382.
- Kimura, M.; Takeda, T.; Yuyama, Y.; Fujita, Y.; Kawanishi, T.; Nakamura, M.; Nakagawa, S.; Kimura, T.; Nagata, H.; Ito, A.; et al. Development of an internet-based image-input simplified nutritional calculation system. J. Jpn. Hosp. Assoc. 2005, 52, 393–398.
- Larke, J.A.; Chin, E.L.; Bouzid, Y.Y.; Nguyen, T.; Vainberg, Y.; Smilowitz, J.T.; Lemay, D.G. SNAPMe: A Benchmark Dataset of Food Photos for Dietary Assessment. Nutrients 2023, 15, 4972.
- Nakagawa, S.; Yamamoto, A. Establishing a Bidirectional Correspondence Table between the 8th Edition Japanese Food Composition Table and USDA FoodData Central Using LLM-Based Matching. J. Food Compos. Anal. 2026; under review.
- Nakayama, Y. Research and Development of Nutritional Intake Estimation and Dietary Guidance System for Smartphone Meal Photos Using Artificial Intelligence; SCOPE Research Report; Ministry of Internal Affairs and Communications: Tokyo, Japan, 2019.
- Gemming, L.; Utter, J.; Mhurchu, C.N. Use of a wearable camera to assess population sugars intake. Eur. J. Clin. Nutr. 2015, 69, 1226–1230.
- FAO. Food Energy—Methods of Analysis and Conversion Factors; FAO Food and Nutrition Paper 77; FAO: Rome, Italy, 2003.


| Nutrient | NutriImage Haiku R² | NutriImage Haiku MAE | NutriImage Sonnet R² | NutriImage Sonnet MAE | NutriImage Opus R² | NutriImage Opus MAE | SNAPMe Haiku R² | SNAPMe Haiku MAE | SNAPMe Sonnet R² | SNAPMe Sonnet MAE | SNAPMe Opus R² | SNAPMe Opus MAE | JBFD Haiku R² | JBFD Haiku MAE | JBFD Sonnet R² | JBFD Sonnet MAE | JBFD Opus R² | JBFD Opus MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Energy (kcal) | 0.131 | 121.1 | 0.480 | 82.0 | 0.506 | 81.6 | 0.331 | 132.3 | 0.470 | 116.9 | 0.530 | 111.0 | 0.230 | 106.3 | 0.603 | 62.3 | 0.586 | 57.6 |
| Protein (g) | −0.133 | 6.1 | 0.411 | 4.0 | 0.476 | 3.8 | 0.457 | 7.1 | 0.606 | 5.9 | 0.601 | 5.7 | −0.100 | 4.6 | 0.275 | 2.9 | 0.236 | 2.8 |
| Lipids (g) | −0.250 | 6.6 | 0.446 | 4.4 | 0.409 | 4.5 | 0.070 | 8.0 | 0.418 | 6.6 | 0.464 | 6.3 | 0.267 | 7.9 | 0.625 | 4.1 | 0.681 | 3.7 |
| Carb (g) | 0.209 | 16.4 | 0.460 | 11.4 | 0.524 | 11.0 | 0.254 | 15.8 | 0.266 | 14.4 | 0.410 | 13.2 | 0.152 | 16.5 | 0.699 | 8.1 | 0.702 | 7.4 |
| Salt (g) | 0.054 | 0.8 | 0.175 | 0.7 | 0.231 | 0.7 | 0.168 | 0.9 | 0.299 | 0.8 | 0.410 | 0.7 | 0.203 | 1.8 | 0.598 | 1.3 | 0.659 | 1.1 |

| Nutrient | NutriImage Haiku | NutriImage Sonnet | NutriImage Opus | SNAPMe Haiku | SNAPMe Sonnet | SNAPMe Opus | JBFD Haiku | JBFD Sonnet | JBFD Opus |
|---|---|---|---|---|---|---|---|---|---|
| Energy (kcal) | 51.5% | 35.9% | 34.9% | 38.6% | 33.5% | 32.8% | 37.7% | 19.3% | 14.2% |
| Protein (g) | 61.6% | 38.8% | 38.1% | 48.7% | 38.5% | 38.3% | 47.1% | 25.0% | 21.9% |
| Lipids (g) | 70.1% | 52.4% | 55.1% | 52.2% | 46.9% | 44.2% | 69.8% | 39.1% | 33.3% |
| Carb (g) | 76.7% | 50.4% | 46.9% | 41.0% | 38.1% | 35.6% | 51.4% | 23.4% | 19.8% |
| Salt (g) | 49.6% | 44.6% | 45.4% | 63.8% | 55.8% | 52.2% | 54.0% | 38.3% | 33.9% |

| Nutrient | ChatGPT-5 MAE (N = 195) | ChatGPT-5 RMSE | This Study MAE (N = 1463) | This Study RMSE | Delta MAE (%) |
|---|---|---|---|---|---|
| Energy (kcal) | 123.03 | 163.80 | 116.85 | 184.38 | −4.9% |
| Protein (g) | 7.89 | 11.32 | 5.86 | 10.18 | −25.7% |
| Lipids (g) | 8.70 | 12.48 | 6.57 | 11.29 | −24.5% |
| Carb (g) | 11.72 | 16.84 | 14.43 | 23.21 | +23.1% |
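
The last column appears to be the relative difference in MAE, taking the ChatGPT-5 value as the reference. The sketch below assumes the formula 100 × (this study − ChatGPT-5) / ChatGPT-5, which is not stated in this section, and uses illustrative names. From the rounded MAE values shown, it reproduces the protein, lipid, and carbohydrate entries to one decimal place. For energy it gives about −5.0% rather than the tabulated −4.9%, possibly because the published figure was computed from unrounded inputs.

```python
# MAE values transcribed from the table above; names are illustrative assumptions.
chatgpt5_mae = {"Energy": 123.03, "Protein": 7.89, "Lipids": 8.70, "Carb": 11.72}
this_study_mae = {"Energy": 116.85, "Protein": 5.86, "Lipids": 6.57, "Carb": 14.43}


def relative_change_pct(reference, value):
    """Signed percentage change of value relative to reference."""
    return 100.0 * (value - reference) / reference


rel = {k: relative_change_pct(chatgpt5_mae[k], this_study_mae[k])
       for k in chatgpt5_mae}
for nutrient, pct in rel.items():
    print(f"{nutrient}: {pct:+.1f}%")
```

A negative value indicates a lower MAE in this study than in the ChatGPT-5 evaluation. The sample sizes differ (N = 1463 versus N = 195).
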
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Nakagawa, S.; Yamamoto, A. Prompt Engineering and Model Selection for LLM-Based Nutritional Estimation from Food Images: A Multi-Dataset Investigation. Nutrients 2026, 18, 2017. https://doi.org/10.3390/nu18122017

