Mapping Keywords in Granny Josie’s Culinary Heritage Using Large Language Models
Abstract
1. Introduction
- What lexemes and themes related to food dominate Granny Josie’s Notebooks? This question concerns identifying primary theme areas, such as ingredients, cooking techniques, and types of dishes emerging from the content analysis of Granny Josie’s Notebooks;
- How can today’s textual analysis tools, such as LLM-based AI plugins, support research on historical corpora? This question concerns the usefulness of AI tools for processing and interpreting large datasets, such as Granny Josie’s Notebooks.
2. Background
2.1. Granny Josie’s Culinary Heritage
2.2. Related Work
3. Materials and Methods
3.1. Research Object
3.2. Research Tools and Design
Declared Technical and Performance Specifications of Linguistic Insight
3.3. Visualisation of Results
3.4. Results Validation: Manual Visualisation of Data
4. Results
4.1. Selected Statistics on BaJa and Thematic Analysis Results
4.2. Word Cloud Analysis
4.2.1. Thematic Word Cloud Focused on Ingredients
4.2.2. Thematic Word Cloud Focused on Food Preparation Techniques
4.3. Manual Visualisation of Data
5. Discussion
5.1. Content of BaJa and the Socioeconomic Context of Post-War Poland and Changes in Poles’ Culinary Habits
5.2. Culinary Traditions and Eating Customs Depicted in BaJa
5.3. Experimental Observations and Comparison of the Analytical Tools
5.4. Strengths and Weaknesses of Word Clouds in the Context of the Literature
6. Conclusions
Research Limitations and Future Directions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | artificial intelligence |
CSIs | culture-specific items |
CTA | call-to-action |
CUI | conversational user interface |
GPT | Generative Pre-Trained Transformer |
GUI | graphical user interface |
LI-AI | Linguistic Insight |
LLM | large language model |
NLP | natural language processing |
PDF | Portable Document Format |
SaaS | Software as a Service |
SVG | Scalable Vector Graphics |
WCs | word clouds |
WCG | word cloud generator |
References
- Basinyi, S.; Sagiya, M.E. World Heritage Communities, Anchors and Values for the Safeguarding of Intangible Cultural Heritage in Southern Africa. In Safeguarding Intangible Heritage; Akagawa, N., Smith, L., Eds.; Routledge: London, UK, 2018; pp. 174–186.
- Partarakis, N.; Kaplanidi, D.; Doulgeraki, P.; Karuzaki, E.; Petraki, A.; Metilli, D.; Bartalesi, V.; Adami, I.; Meghini, C.; Zabulis, X. Representation and Presentation of Culinary Tradition as Cultural Heritage. Heritage 2021, 4, 612–640.
- Twiss, K. The Archaeology of Food and Social Diversity. J. Archaeol. Res. 2012, 20, 357–395.
- Jönsson, H. A Food Nation Without Culinary Heritage? Gastronationalism in Sweden. J. Gastron. Tour. 2020, 4, 223–237.
- Knapik, W.; Król, K. Inclusion of Vanishing Cultural Heritage in a Sustainable Rural Development Strategy–Prospects, Opportunities, Recommendations. Sustainability 2023, 15, 3656.
- Šlehoferová, T.; Fatková, G. Manuscript Cookbooks on the Move: Social Functions, Heritage Management and Women’s Narratives in the Czech-Bavarian Borderland. Food Cult. Soc. 2023, 1–22.
- Cashman, D. An Investigation of Irish Culinary History through Manuscript Cookbooks, with Particular Reference to the Gentry of County Kilkenny (1714–1830). Ph.D. Thesis, Dublin Institute of Technology, Dublin, Ireland, 2016.
- Ghatora, P.S.; Hosseini, S.E.; Pervez, S.; Iqbal, M.J.; Shaukat, N. Sentiment Analysis of Product Reviews Using Machine Learning and Pre-Trained LLM. BDCC 2024, 8, 199.
- Król, K. Between Truth and Hallucinations: Evaluation of the Performance of Large Language Model-Based AI Plugins in Website Quality Analysis. Appl. Sci. 2025, 15, 2292.
- Mącior, B.; Mądel, A.; Podraza, S.; Wójtowicz, E.; Zięba, P. Szynwałd—Tak Było. Historia Szynwałdu na Starych Fotografiach i Dokumentach; Stowarzyszenie Mój Szynwałd: Szynwałd, Poland, 2017.
- Magomere, J.; Ishida, S.; Afonja, T.; Salama, A.; Kochin, D.; Yuehgoh, F.; Hamzaoui, I.; Sefala, R.; Alaagib, A.; Dalal, S.; et al. You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes. arXiv 2024, arXiv:2406.09496.
- Yang, H.; Zhao, Y.; Wu, Y.; Wang, S.; Zheng, T.; Zhang, H.; Ma, Z.; Che, W.; Qin, B. Large language models meet text-centric multimodal sentiment analysis: A survey. arXiv 2024, arXiv:2406.08068.
- Gagliardi, I.; Artese, M.T. Exploring and Visualizing Multilingual Cultural Heritage Data Using Multi-Layer Semantic Graphs and Transformers. Electronics 2024, 13, 3741.
- Goel, M.; Bagler, G. Computational Gastronomy: A Data Science Approach to Food. J. Biosci. 2022, 47, 12.
- Guidotti, D.; Pandolfo, L.; Pulina, L. Discovering Sentiment Insights: Streamlining Tourism Review Analysis with Large Language Models. Inf. Technol. Tour. 2025, 27, 227–261.
- Liberato, P.; Mendes, T.; Liberato, D. Culinary Tourism and Food Trends. In Advances in Tourism, Technology and Smart Systems; Rocha, Á., Abreu, A., De Carvalho, J.V., Liberato, D., González, E.A., Liberato, P., Eds.; Springer: Singapore, 2020; Volume 171, pp. 517–526.
- Loureiro, S.M.C.; Guerreiro, J.; Friedmann, E.; Lee, M.J.; Han, H. Tourists and Artificial Intelligence-LLM Interaction: The Power of Forgiveness. Curr. Issues Tour. 2025, 28, 1172–1190.
- Carvalho, I.; Ivanov, S. ChatGPT for Tourism: Applications, Benefits and Risks. Tour. Rev. 2024, 79, 290–303.
- Almansouri, M.; Verkerk, R.; Fogliano, V.; Luning, P.A. The Heritage Food Concept and Its Authenticity Risk Factors—Validation by Culinary Professionals. Int. J. Gastron. Food Sci. 2022, 28, 100523.
- Digitized Granny Josie’s Notebooks. Main Library, University of Agriculture in Krakow [Zdigitalizowane (Zeskanowane) Zeszyty Babci Józi Umieszczone w Repozytorium Biblioteki Uniwersytetu Rolniczego w Krakowie]. Available online: http://ruralstrateg.pl/zeszyty-babci-jozi-w-bibliotece-cyfrowej-urk/ (accessed on 25 April 2025).
- SpeechTexter. Available online: https://www.speechtexter.com (accessed on 25 April 2025).
- Hsiao, J.-C.; Chang, J.S. Enhancing EFL Reading and Writing through AI-Powered Tools: Design, Implementation, and Evaluation of an Online Course. Interact. Learn. Environ. 2024, 32, 4934–4949.
- Linguistic Insight. LLM-Based AI, ChatGPT Plugin. Available online: https://chatgpt.com/g/g-6783566ee4b08191a33700a5286f7942-linguistic-insight (accessed on 25 April 2025).
- Kulkarni, A.; Shivananda, A.; Kulkarni, A.; Gudivada, D. The ChatGPT Architecture: An In-Depth Exploration of OpenAI’s Conversational Language Model. In Applied Generative AI for Beginners; Apress: Berkeley, CA, USA, 2023; pp. 55–77.
- Lozić, E.; Štular, B. Fluent but Not Factual: A Comparative Analysis of ChatGPT and Other AI Chatbots’ Proficiency and Originality in Scientific Writing for Humanities. Future Internet 2023, 15, 336.
- Rudolph, J.; Tan, S.; Tan, S. War of the chatbots: Bard, Bing Chat, ChatGPT, Ernie and beyond. The new AI gold rush and its impact on higher education. J. Appl. Learn. Teach. 2023, 6, 364–389.
- Lister, K.; Coughlan, T.; Iniesto, F.; Freear, N.; Devine, P. Accessible Conversational User Interfaces: Considerations for Design. In Proceedings of the 17th International Web for All Conference, Taipei, Taiwan, 20–21 April 2020; pp. 1–11.
- Cutler, K. ChatGPT and Search Engine Optimisation: The Future Is Here. Appl. Mark. Anal. 2023, 9, 8.
- Zhang, W.; Deng, Y.; Liu, B.; Pan, S.J.; Bing, L. Sentiment analysis in the era of large language models: A reality check. arXiv 2023, arXiv:2305.15005.
- Yang, Z.; Zhou, Z.; Wang, S.; Cong, X.; Han, X.; Yan, Y.; Liu, Z.; Tan, Z.; Liu, P.; Yu, D.; et al. MatPlotAgent: Method and evaluation for LLM-based agentic scientific data visualization. arXiv 2024, arXiv:2402.11453.
- Falatouri, T.; Hrušecká, D.; Fischer, T. Harnessing the Power of LLMs for Service Quality Assessment From User-Generated Content. IEEE Access 2024, 12, 99755–99767.
- Huang, A.H.; Wang, H.; Yang, Y. FinBERT: A Large Language Model for Extracting Information from Financial Text. Contemp. Account. Res. 2023, 40, 806–841.
- Delnevo, G.; Andruccioli, M.; Mirri, S. On the Interaction with Large Language Models for Web Accessibility: Implications and Challenges. In Proceedings of the 2024 IEEE 21st Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 6–9 January 2024; pp. 1–6.
- Vázquez, P.-P. Are LLMs Ready for Visualization? In Proceedings of the 2024 IEEE 17th Pacific Visualization Conference (PacificVis), Tokyo, Japan, 23–26 April 2024; pp. 343–352.
- Emsley, R. ChatGPT: These Are Not Hallucinations—They’re Fabrications and Falsifications. Schizophr 2023, 9, 52.
- Martino, A.; Iannelli, M.; Truong, C. Knowledge Injection to Counter Large Language Model (LLM) Hallucination. In The Semantic Web: ESWC 2023 Satellite Events; Pesquita, C., Skaf-Molli, H., Efthymiou, V., Kirrane, S., Ngonga, A., Collarana, D., Cerqueira, R., Alam, M., Trojahn, C., Hertling, S., Eds.; Springer Nature: Cham, Switzerland, 2023; Volume 13998, pp. 182–185.
- WordClouds (WC). Zygomatic. Available online: https://www.wordclouds.com/ (accessed on 25 April 2025).
- Word Cloud Generator (WCG). Available online: https://www.jasondavies.com/wordcloud/ (accessed on 25 April 2025).
- Czarniecka-Skubina, E.; Kowalczuk, I. Eating out in Poland: History, Status, Perspectives and Trends. Zeszyty Naukowe Uniwersytetu Szczecińskiego. Serv. Manag. 2015, 16, 75–83.
- Stańczak-Wiślicz, K. Eating Healthy, Eating Modern. The “Urbanization” of Food Tastes in Communist Poland (1945–1989). Ethnol. Pol. 2020, 41.
- Janowski, M. Food in Traumatic Times: Women, Foodways and ‘Polishness’ During a Wartime ‘Odyssey’. Food Foodways 2012, 20, 326–349.
- Łukasiewicz, M.; Zięć, G.; Topolska, K.; Berski, W.; Florkiewicz, A. Ruthenian Culinary Traditions of Lemkivshchyna. In Cultural Heritage—Possibilities for Land-Centered Societal Development; Hernik, J., Walczycka, M., Sankowski, E., Harris, B.J., Eds.; Springer International Publishing: Cham, Switzerland, 2022; Volume 13, pp. 113–125.
- Henson, S.; Sekuła, W. Market Reform in the Polish Food Sector: Impact upon Food Consumption and Nutrition. Food Policy 1994, 19, 419–442.
- Cui, W.; Wu, Y.; Liu, S.; Wei, F.; Zhou, M.X.; Qu, H. Context preserving dynamic word cloud visualization. In Proceedings of the 2010 IEEE Pacific Visualization Symposium (PacificVis), Taipei, Taiwan, 2–5 March 2010; pp. 121–128.
- Padmanandam, K.; Bheri, S.P.V.D.S.; Vegesna, L.; Sruthi, K. A Speech Recognized Dynamic Word Cloud Visualization for Text Summarization. In Proceedings of the 2021 6th International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 20–22 January 2021; pp. 609–613.
- Roberts, J.; Baker, M.; Andrew, J. Artificial Intelligence and Qualitative Research: The Promise and Perils of Large Language Model (LLM) ‘Assistance’. Crit. Perspect. Account. 2024, 99, 102722.
Reference | Keywords | Research Domain | Research Methods |
---|---|---|---|
[13] | Archives, cultural heritage, semantic graphs, language models, data visualisation | Performance of digital archives regarding their availability and ease of navigation | Data visualisation, algorithmic analysis of multi-lingual text datasets |
[15] | Tourist reviews, language models, tourism industry, keyword extraction, zero-shot classification | LLM effectiveness in classifying tourist reviews and extracting keywords | Experimental analysis, sentiment analysis, zero-shot learning, comparative method, quality analysis |
[11] | Regional cuisine, language models, cultural differences, worldwide recipes, culturally-aware data collection, World Wide Dishes dataset, cultural misrepresentation | Proposal of a conceptual framework for culturally-aware and participative data collection, development of the World Wide Dishes dataset | Language model experiments, qualitative observations, automated quantitative analysis, comparative analysis, impact assessment |
[12] | Translation accuracy, culture-specific items, large language models | Methods for improving the translation of names of dishes from Chinese to English | Analysis of culinary datasets, translation method performance testing, comparative analysis, quality evaluation |
[14] | Gastronomy, culinary fingerprints, data-driven food innovations, culinary creativity, AI in gastronomy, recipe analysis | Computational gastronomy, exploration of traditional recipes, identification of ‘culinary fingerprints’ | Culinary data analysis, culinary pattern investigation, experimental research |
Attribute | Characteristic | Conformity with Literature |
---|---|---|
User interaction | Graphical user interface (GUI) and conversational user interface (CUI). Dynamic interaction through GPT-based prompts. | [25,27] |
Time and functional details | Real-time analysis with results available immediately in the browser window. Functions available to logged-in users (access control). | [24,28] |
Scope of analysis | Semantic analysis, keyword detection, sentiment analysis, language style evaluation, grammatical structure recognition, text classification, and others. | [29,30,31] |
Results and reporting | Results are presented as summaries and detailed text analyses. Textual and/or graphic reporting. | [32,33] |
Visualisation of results | Results can be presented as text, charts, tables, images, and infographics. | [30,34] |
Personalisation | The scope of the analysis can be adjusted with GPT-based prompts. | [31] |
Limitations and weaknesses | Analyses can be less precise for expert and archaic texts. Risk of hallucinations. | [35,36] |
Category | The Most Common Words |
---|---|
Cooking techniques | ‘Duszenie’ (braising), ‘pieczenie’ (roasting/baking), ‘gotowanie’ (boiling/cooking), ‘smażenie’ (frying), ‘wyrabianie’ (kneading), ‘ucieranie’ (creaming), ‘studzenie’ (cooling), ‘fermentacja’ (fermentation) |
Ingredients | ‘Ziemniaki’ (potatoes), ‘cukier’ (sugar), ‘mąka’ (flour), ‘masło’ (butter), ‘jaja’ (eggs), ‘mleko’ (milk), ‘kapusta’ (cabbage), ‘grzyby’ (mushrooms), ‘buraki’ (beetroots), ‘ryż’ (rice), ‘suszone owoce’ (dried fruit), ‘przyprawy’ (spices): ‘sól’ (salt), ‘pieprz’ (pepper), ‘cynamon’ (cinnamon), and ‘wanilia’ (vanilla) |
Main courses | ‘Gołąbki’ (stuffed cabbage rolls), ‘pierogi ruskie’ (dumplings with potato and cheese filling), ‘kapuśniak’ (cabbage soup), ‘barszcz’ (borscht), ‘ryż zapiekany z jabłkami’ (baked rice with apples), ‘makaron zapiekany z serem’ (baked macaroni and cheese), ‘paszteciki z kapustą’ (cabbage-filled pastries) |
Serving and eating aesthetics | Table set with care, using a clean tablecloth and napkins, plates and cutlery arranged at regular intervals, and decorations. Clean table, decorative napkins, nicely placed food, and food positioned on the guest’s left-hand side |
Eating practices | Families eating together, adherence to religious fasting regulations and holiday customs, and table decorated for the occasion |
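The grouping of frequent words into thematic categories, as in the table above, can be illustrated with a short sketch. It is not the procedure used by Linguistic Insight. It is a minimal dictionary lookup, assuming a hand-built lexicon: `THEME_LEXICON`, `tokenize`, `count_by_theme`, and the sample sentence are all hypothetical, and only the Polish base forms come from the table.

```python
from collections import Counter
import re

# Hypothetical lexicon: a few base forms copied from the table above.
THEME_LEXICON = {
    "cooking techniques": {"duszenie", "pieczenie", "gotowanie", "smażenie"},
    "ingredients": {"ziemniaki", "cukier", "mąka", "masło", "jaja", "mleko"},
    "main courses": {"gołąbki", "kapuśniak", "barszcz"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def count_by_theme(text, lexicon=THEME_LEXICON):
    """Count exact matches of lexicon words, grouped by theme."""
    counts = {theme: Counter() for theme in lexicon}
    for token in tokenize(text):
        for theme, words in lexicon.items():
            if token in words:
                counts[theme][token] += 1
    return counts


# Invented sample sentence, not a quotation from the notebooks.
sample = "Mąka, cukier i masło. Gotowanie i pieczenie; mąka do ciasta, barszcz na obiad."
themes = count_by_theme(sample)
for theme, counter in themes.items():
    print(theme, dict(counter))
```

Exact matching misses inflected Polish forms (for example ‘mąki’ or ‘mąką’), so a real analysis would need lemmatisation or a contextual tool before counting.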
Attribute | AI Plugins | Manual Tools |
---|---|---|
Interaction, user interface | Conversational user interface, prompts, partial GUI | GUI, dashboard, drop-down lists, checkboxes, text boxes, CTA |
Input data | Loaded from a file or included directly in the prompt | Loaded from a file or entered directly into the GUI text box |
Input data processing capabilities | The operator can specify parts of the corpus to be excluded with a prompt | All words in the input file are analysed, and unwanted parts can be removed manually |
Scope of analysis | Open, flexible, and modifiable scope of analysis | Restricted, predefined scope of analysis limited by the GUI |
Scope of visualisation | Diverse visualisation designs, including word clouds, bar graphs, pie charts, etc. Visualisation can be altered in the GUI | Word cloud visualisation, which can be modified in terms of shape, colours, and font (format modifications) in the GUI |
International visualisation | The language of the visualisation can be changed automatically. Input data can be automatically translated | No automated translation functions. Input data cannot be automatically translated |
Overview | A tool for descriptive analysis of corpus content capable of visualising the results | A visualisation tool incapable of descriptive analysis (no interpretation, only visualisation) |
Attribute | AI Plugin (LI-AI) | Manual Tools (WCG, WC) |
---|---|---|
Automation | Complete automation, no need for manual intervention in the input data (the corpus) | May require manual data preprocessing |
Stop word filtering | Automated stop word filtering | Requires manual stop word filtering |
Contextual analysis | Can group synonyms and inflected words | No contextual interpretation |
Functionality | Sentiment analysis, emotional classification, linguistic structure detection | Simplified statistical analysis of word frequency |
Visualisation | Advanced formatting and optimisation options | Standard editing (font, colour) is predefined in the GUI |
Results precision | High, as linguistic meaning is considered, but prone to hallucinations (false AI output), embellishment, and overinterpretation | Based on pure word frequency |
Analytical flexibility | The scope of the analysis can be adapted with prompt engineering (CUI) | Restricted to basic data visualisation, predefined, and limited by the GUI |
Processing time | Swift processing of large datasets | Time-consuming, particularly for long texts |
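The ‘pure word frequency’ approach attributed to the manual tools in the table above can be sketched as follows. This is an illustration under stated assumptions, not the code of WCG or WC: the stop word list, the function name `word_frequencies`, and the sample text are hypothetical.

```python
from collections import Counter
import re

# Illustrative stop word list only; a usable Polish list would be much longer.
STOP_WORDS = {"i", "w", "na", "do", "z", "się", "oraz"}


def word_frequencies(text, stop_words=STOP_WORDS, top_n=5):
    """Return the top_n most frequent tokens after stop word removal."""
    tokens = re.findall(r"\w+", text.lower())
    kept = [t for t in tokens if t not in stop_words and not t.isdigit()]
    return Counter(kept).most_common(top_n)


# Invented sample text, not a quotation from the notebooks.
sample = (
    "Ziemniaki obrać i ugotować. Ziemniaki posolić, dodać masło i mleko. "
    "Masło roztopić."
)
top = word_frequencies(sample, top_n=2)
print(top)
```

A word cloud generator would then scale each word's font size by its count. Because the count is purely lexical, inflected variants and synonyms are treated as separate words, which is the gap in contextual analysis noted in the table.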
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Król, K. Mapping Keywords in Granny Josie’s Culinary Heritage Using Large Language Models. Heritage 2025, 8, 159. https://doi.org/10.3390/heritage8050159