Article

Combining User and Venue Personality Proxies with Customers’ Preferences and Opinions to Enhance Restaurant Recommendation Performance

by Andreas Gregoriades 1,*, Herodotos Herodotou 2, Maria Pampaka 3 and Evripides Christodoulou 4

1 Department of Communication and Marketing, Cyprus University of Technology, Limassol 3036, Cyprus
2 Department of Electrical Engineering, and Computer Science and Engineering, Cyprus University of Technology, Limassol 3036, Cyprus
3 Department of Social Statistics, School of Social Sciences, The University of Manchester, Manchester M13 9PL, UK
4 Department of Management, Entrepreneurship and Digital Business, Cyprus University of Technology, Limassol 3036, Cyprus
* Author to whom correspondence should be addressed.
Submission received: 8 November 2025 / Revised: 28 December 2025 / Accepted: 4 January 2026 / Published: 9 January 2026

Abstract

Recommendation systems are popular information systems that help consumers manage information overload. Whilst personality has been recognised as an important factor influencing consumers’ choice, it has not yet been fully exploited in recommendation systems. This study proposes a restaurant recommendation approach that integrates customer personality traits, opinions and preferences, extracted either directly from online review platforms or derived from electronic word of mouth (eWOM) text using information extraction techniques. The proposed method leverages the concept of venue personality grounded in personality–brand congruence theory, which posits that customers are more satisfied with brands whose personalities align with their own. A novel model is introduced that combines fine-tuned BERT embeddings with linguistic features to infer users’ personality traits from the text of their reviews. Customers’ preferences are identified using a custom named-entity recogniser, while their opinions are extracted through structural topic modelling. The overall framework integrates neural collaborative filtering (NCF) features with both directly observed and derived information from eWOM to train an extreme gradient boosting (XGBoost) regression model. The proposed approach is compared to baseline collaborative filtering methods and state-of-the-art neural network techniques commonly used in industry. Results across multiple performance metrics demonstrate that incorporating personality, preferences and opinions significantly improves recommendation performance.

1. Introduction

Recommender systems extract patterns from user behaviour and preference data to reduce information overload in product and service selection [1]. Beyond traditional user behaviour data, such as clicks, ratings, and browsing history, recent approaches incorporate psychological user traits, including personality. By incorporating information from personality models, recommenders can capture more stable user traits that shape preferences, interaction patterns, and engagement behaviours [2]. When extracted from user-generated text (e.g., reviews), personality has been shown to improve recommendation performance [3], particularly in addressing the cold start problem and recommendation novelty. However, the sparsity of textual data for each user limits the reliability of personality inference, and the application of personality in restaurant recommender systems remains underexplored.
Recommenders, particularly restaurant recommenders, aim to enhance customer experience and satisfaction through personalisation. Food choice is a crucial activity during vacations that shapes the travel experience. The problem of overchoice—stemming from an abundance of available options—has been widely studied across psychology, marketing and decision science, which has motivated the use of recommender systems. Recently, interest in food experience analysis has emerged [4], focusing on a deeper understanding of customers’ food preferences to improve recommendations [5] and restaurant selection [6]. Traditional restaurant recommendation approaches extract user preferences from structured information in users’ historical records (e.g., review ratings, purchases) or collect them explicitly from users via questionnaires. Such techniques rely heavily on rating data and basic demographic information, which do not fully capture the nuanced preferences that influence customers’ choices. More recent work leverages electronic word of mouth (eWOM) to extract emotion and personality [7,8], which influence motivation, purchasing decisions, preferences, perception of a service [9,10], satisfaction, and consumer behaviour [7]. Personality has been shown to improve recommenders’ performance [11], since people with similar personalities have similar preferences and needs [12]. Additionally, the automated extraction of personality scores from user-generated textual content addresses the bias and limitations inherent in traditional self-reported surveys (e.g., response bias arising when respondents provide socially desirable answers). Despite these advances, the integration of personality into restaurant recommendation systems remains limited.
In the context of consumer behaviour, businesses are often described using human characteristics [13], a concept referred to as brand personality [14,15,16]. This enables businesses to better engage with their customers, since people tend to interact with brands as if they were human entities. Brand personality represents the set of human traits that consumers associate with a brand, enabling the formation of an emotional connection that can lead to increased loyalty and marketplace success. The relationship between a business and its customers is, therefore, fundamental and can be explained by personality–brand congruence (PBC) theory, which suggests that consumers are more likely to be attracted to brands whose projected personalities are congruent with their own [15,17,18]. Seminal work on brand personality by Aaker [16] explains how consumers perceive brands along five dimensions: sincerity, excitement, competence, sophistication, and ruggedness. Among these, sincerity, excitement, and competence were found to correlate directly with three of the Big Five human personality traits [17], specifically agreeableness, extraversion, and conscientiousness [16,17]. In this work, we utilise these three traits to express brand personality and assess it in a manner consistent with prior work [18], which derives brand personality from social media text. However, unlike previous work [18], we utilise a large language model (LLM) and linguistic style analysis to assess personality, rather than a dictionary-based approach. In the context of the restaurant business, PBC explains why customers with personalities similar to a restaurant’s personality are more likely to visit or revisit such venues. Such PBC insights are utilised by businesses to build relationships with consumers and differentiate themselves from competitors [14]. However, such strategies are mainly manual (i.e., they rely on questionnaires).
To our knowledge, current recommendation approaches do not utilise the PBC to improve recommendations, and very few studies have investigated the use of personality for restaurant recommendations [5]. This work aims to address this research gap.
Users’ food preferences and opinions about venues constitute another critical factor in restaurant recommendations. Such information is typically collected through explicit questionnaires, which impose a high user burden and are prone to bias. Explicit elicitation has also been criticised because users do not always know what they want, with the recency and frequency of their experiences usually dominating their answers. More advanced methods for extracting consumer preferences use online behaviour data, such as browsing paths on websites and idle times. However, obtaining such information can be challenging when building recommender systems for various businesses. By contrast, eWOM in the form of online reviews provides a rich source of customer perceptions from different service engagements, expressed in both textual and quantitative forms, and has been utilised to externalise preferences and opinions. Nevertheless, research on preference analysis from online reviews remains scarce.
We hypothesise that review text provides valuable information about customers’ preferences (e.g., food) that cannot be inferred from users’ scale evaluations (e.g., food rating, environment rating) alone, since these scales evaluate generic factors assumed by the designer of the social media platform (e.g., TripAdvisor). Customers might have preferences and opinions that are outside the scope of these scales and thus cannot be captured through such approaches. In addition, past reviews of customers provide a history of previous user decisions and opinions that can be extracted from text; thus, frequent opinions about certain foods (positive or negative) or consumption patterns of foods, irrespective of user ratings, provide information about food preference (they ordered that food because they like it) [19]. Accordingly, we extract fine-grained food preferences directly from review text.
This study extends previous work that introduced personality to enhance restaurant recommendations [20] in four ways: it incorporates the concept of venue personality grounded in PBC; it automates user preference extraction from eWOM text; it adds an optimised topic modelling component for extracting relevant user opinions (topics) about venues; and it introduces a novel personality classifier that improves personality recognition on out-of-distribution data [21] by utilising linguistic information in review text, alongside neural collaborative filtering (NCF) features [22]. The proposed method combines two types of information from eWOM, namely derived and direct. The former refers to information extracted from eWOM text, such as venue and consumer personalities, customer food preferences, and opinions about venues, using text classification, topic modelling and named entity recognition. Direct information refers to restaurant properties that are explicitly rated by users, such as price, value for money, service quality, atmosphere, and the offered cuisine, as provided on the review platform. Derived and direct information are jointly used to train and test an XGBoost regression model [23] that predicts restaurant ratings. We hypothesise that integrating derived and direct information enriches the algorithm’s input, enabling it to learn more generalisable patterns and generate more accurate recommendations.
The guiding research question is whether the combination of derived and direct information from eWOM improves restaurant recommendations when compared to baseline recommender methods (such as neural collaborative filtering and matrix factorisation). In responding to this question, this work makes four methodological contributions:
  • A novel personality classifier for deriving customer personality from reviews that outperforms baseline machine learning (ML) methods (trained on secondary data);
  • The introduction and evaluation of the concept of venue personality based on PBC [15];
  • Automated extraction of food preferences using a custom named-entity recogniser;
  • Opinion inference via topic modelling to assess its impact on recommendation performance.
Overall, the proposed approach is among the first to integrate personality traits and personality–brand congruence into restaurant recommendations, combining heterogeneous eWOM-derived and explicit information sources to achieve superior performance over traditional models.
The paper is organised as follows. The next section reviews the literature on recommender systems and techniques for extracting user preferences and personality from text. This is followed by a description of the method and the empirical results obtained from applying it to a custom dataset and comparing its performance with that of mainstream recommender methods. The paper concludes with a discussion of findings, implications for management, and future directions.

2. Background

This section reviews background concepts related to recommendation systems and techniques for deriving information from textual data.

2.1. Recommendation Systems

Recently, increasing attention has been paid to food experience research in domains such as marketing [6] and recommender systems [5], where recommendation techniques aim to alleviate the cognitive burden of decision-making. In restaurant decision-making, the more decision alternatives available, the greater the user effort to analyse the available options and the lower the probability that a decision is made. This is also known as the choice overload challenge [24]. Recommender systems address this challenge by predicting consumers’ satisfaction with items (products/services) they have not yet experienced and presenting the most suitable options [25]. This process can be viewed as personalised marketing, which contrasts with mass marketing approaches that target broad consumer segments [26]. Common personalisation methods rely on consumers’ past experiences (e.g., ratings) to construct a user–item matrix, which is then used by Collaborative Filtering (CF) algorithms such as KNN (k-nearest neighbours) to predict the most appropriate product/service for a user depending on users’ or items’ similarity [25]. CF has been successfully applied in tourism contexts such as restaurant, hotel, and point-of-interest recommendations, and remains one of the most popular recommendation techniques [27]. Another widely used approach, particularly in tourism, is content-based filtering, which makes recommendations by matching users’ preferences with characteristics of items rather than the interactions of the users with items [25]. Hybrid approaches that combine collaborative and content-based filtering are increasingly adopted in tourism applications, as they mitigate the respective limitations of these methods.
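For illustration, the neighbourhood-based (KNN) prediction described above can be sketched as follows. This is a simplified example with a toy user–item matrix and hypothetical ratings, not the configuration used in this study:

```python
import numpy as np

def predict_rating(ratings, user, item, k=2):
    """Predict ratings[user, item] from the k most similar users (0 = unrated)."""
    mask = ratings[:, item] > 0                  # users who rated this item
    mask[user] = False
    candidates = np.where(mask)[0]
    if candidates.size == 0:
        return float(ratings[ratings > 0].mean())  # fall back to global mean
    # cosine similarity between the target user and each candidate
    u = ratings[user]
    sims = np.array([
        ratings[c] @ u / (np.linalg.norm(ratings[c]) * np.linalg.norm(u) + 1e-9)
        for c in candidates
    ])
    order = np.argsort(sims)[-k:]                # k nearest neighbours
    top, w = candidates[order], sims[order]
    return float(w @ ratings[top, item] / (w.sum() + 1e-9))

# toy user-item matrix: rows = users, columns = restaurants
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]], dtype=float)
pred = predict_rating(R, user=1, item=1, k=2)
```

The prediction is a similarity-weighted average of the neighbours' ratings, which is the core idea the more elaborate CF variants in this section build on.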
Despite their commercial success, memory-based CF techniques suffer from the cold start problem when limited (or no) information is available for new users [28]. This challenge is exacerbated in the presence of multiple unrated items in the user–item matrix (known as the sparsity problem), which reduces the system’s ability to make reliable inferences [25]. In tourism recommender systems, data sparsity is common due to the limited time tourists spend at holiday destinations. The cold start problem occurs when tourists visit a destination for the first time, and no records of their activity (such as restaurant visits) at the specific destination exist. To address these issues, techniques such as matrix factorisation (model-based collaborative filtering) [27] have been applied to approximate the content of a user–item matrix using latent variables that represent users and items in a lower-dimensional space, derived from the initial sparse data. Matrix factorisation (MF) essentially discovers both the linear and nonlinear relationships between users and items through their interactions. Variations of the MF concept, such as non-negative matrix factorisation (NMF), singular value decomposition (SVD), and optimised SVD (SVD++) [29] models, factorise the sparse user–item matrix and generate a fully populated matrix, which can predict users’ satisfaction for products they have not yet engaged with [29]. In restaurant recommendations, MF is often regarded as state-of-the-art [30], yet it remains vulnerable to data sparsity and cold start issues. Different methods have been applied to address the data sparsity problem, including clustering to group consumers and item characteristics and then finding correlations among clusters [31,32], and clustering with SVD [33]. The cold start problem can be addressed with content-based approaches that exploit items’ metadata.
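The low-rank idea behind matrix factorisation can be illustrated with a truncated SVD of a mean-imputed toy matrix. This is a deliberately simplified sketch: production MF systems factorise the sparse matrix directly with learned latent factors rather than imputing it first:

```python
import numpy as np

# toy sparse user-item matrix (0 = unrated)
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)

# mean-impute missing entries, then keep only the top-k singular values
filled = np.where(R > 0, R, R[R > 0].mean())
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
R_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]   # rank-k reconstruction

# R_hat now contains a prediction for every user-item pair,
# including the originally unrated ones
pred = R_hat[1, 2]    # user 1's predicted rating for item 2
```

The rank-k reconstruction expresses every user and item as a k-dimensional latent vector, which is the "lower-dimensional space" referred to above.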
eWOM provides a valuable source of such metadata through user-generated text, which can be analysed using natural language processing (NLP) techniques. For example, sentiment analysis has been used to transform eWOM text into numerical features, thereby enhancing CF performance [34]. Similarly, topic modelling techniques have been used along with CF to vectorise text and estimate the similarity between consumers or items [35]. Neural approaches extend matrix factorisation by leveraging neural networks to capture nonlinear user–item interactions [36,37]. Neural matrix factorisation [22,36] uses an embedding layer that maps each input (user/item) to a dense vector, or leverages models such as variational autoencoders [38]. Most recommender algorithms adopt this neural architecture since it has shown improved performance over MF. For example, one study [39] utilises this architecture to incorporate additional information about items and users into the model. A different architecture [40] combines a multilayer perceptron with a residual network to model user–item interactions. Cheng et al. [41] introduced the wide and deep architecture, which combines a linear model (“wide”) with a “deep” model to learn hierarchical feature representations. A recent study employs deep networks to extract user preferences from text [42]; however, it does not consider the personality of the user or the venue. In addition, deep learning–based recommenders still suffer from sparsity and cold start issues and are often criticised for limited interpretability.
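A minimal forward pass of the embedding-plus-MLP design described above can be sketched as follows. The weights are random and untrained, purely for illustration; a real NCF model learns all of these parameters from observed user–item interactions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 100, 50, 8

# embedding tables map each user/item id to a dense vector
user_emb = rng.normal(size=(n_users, dim))
item_emb = rng.normal(size=(n_items, dim))

# MLP weights (illustrative; in practice learned by backpropagation)
W1 = rng.normal(size=(2 * dim, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1));       b2 = np.zeros(1)

def ncf_forward(user_id, item_id):
    """Score a user-item pair: embed, concatenate, pass through the MLP."""
    x = np.concatenate([user_emb[user_id], item_emb[item_id]])
    h = np.maximum(0.0, x @ W1 + b1)        # ReLU hidden layer
    return float(h @ W2 + b2)               # raw score (would feed a loss)

score = ncf_forward(user_id=3, item_id=7)
```

The nonlinearity in the hidden layer is what lets such models capture user–item interactions that plain inner-product MF cannot.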
Recent deep hybrid models in restaurant recommendations integrate item properties and eWOM to learn latent user–item interactions [43]. Like our study, they utilise eWOM text to create additional features using a common text vectorisation method, namely “term frequency inverse document frequency” (TF-IDF), to enhance recommendation performance. However, TF-IDF does not capture semantic relationships, whereas contextual embeddings, such as BERT, encode contextual and semantic information.
Recent work by [44] employed BERT-based aspect sentiment analysis within a deep matrix factorisation framework on consumers’ eWOM to address restaurant recommendations. Their approach analyses online review text to identify the sentiment of five prespecified aspect keywords and was found to improve recommendations against benchmark CF and NCF models. However, reliance on predefined aspects limits the flexibility of preference extraction. The application of topic modelling, as utilised in our work, makes the preference extraction more dynamic. Moreover, these methods (e.g., [43,44]) do not incorporate psychological user traits such as personality, nor do they explicitly extract food preferences from eWOM. To our knowledge, this combination of psychological, preference, and opinion features has not been explored in restaurant recommendations to enhance user understanding and improve the quality of recommendations.

2.2. User Preferences Extraction

An important aspect of restaurant recommendations is the accurate identification of user preferences. According to Chua et al. [45], key factors driving customers’ restaurant selection include price, food type, and food variety. These factors constitute direct information about the services available at restaurants on platforms such as TripAdvisor. However, a single rating for an experience or a predefined restaurant aspect is insufficient to capture nuanced user preferences. Consequently, derived information can be extracted from consumers’ eWOM, such as opinions about food or venue aspects (e.g., scenery) that are not explicitly assessed on review platforms. Several techniques have been proposed for extracting user preferences from text, including topic modelling, aspect-based opinion mining, named entity recognition, sentiment analysis, and opinion mining [46]. Topic modelling is an unsupervised machine learning technique used in this study to identify latent themes in a collection of reviews by discovering combinations of words that frequently occur together. It is a generative probabilistic approach based on the word distributional hypothesis (i.e., words are characterised by their context). Topic modelling uses observed variables to infer the hidden topic structure that most likely generated the corpus. Topics are commonly used as proxies for user opinions about venues, products, or services across application domains.
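As a concrete illustration of topic modelling, a minimal model can be fitted with scikit-learn's LDA implementation on a toy corpus of hypothetical reviews. Note this is a stand-in: the study itself uses structural topic modelling, which additionally incorporates document covariates:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# toy review corpus (hypothetical sentences for illustration)
reviews = [
    "great seafood and fresh fish by the harbour",
    "fresh fish, lovely seafood platter",
    "slow service and rude staff",
    "staff were rude, service very slow",
]

# bag-of-words counts, then a 2-topic LDA model
X = CountVectorizer(stop_words="english").fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)   # one topic distribution per review
```

Each row of `doc_topics` is a probability distribution over the latent topics, which is what the framework uses as a proxy for a user's opinions about a venue.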
The most straightforward approach to identifying users’ food/restaurant preferences is to explicitly ask users to state them. However, this approach suffers from several limitations. According to the theory of preference construction [47], users often do not possess stable preferences but construct them dynamically as situations evolve; thus, the questions asked might not be efficient in capturing their food preferences. Additionally, user preferences can change, and thus, these need to be respecified. Another method for extracting preferences is through user opinion analysis from eWOM text using NLP [48]. Popular NLP-based approaches other than topic modelling employ food vocabularies and frequencies of foods in reviews [49] and utilise generic ontologies such as WordNet [50]. However, these approaches struggle to capture local or cuisine-specific foods that may be absent from predefined dictionaries. To address these limitations, ML techniques, such as named entity recognition (NER), can be employed to identify preference-related aspects directly from text. When combined with user sentiment analysis, NER-based methods can reveal fine-grained user preferences.
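A trained NER model is beyond a short example, but the dictionary-based matching that NER generalises (and that the paragraph above contrasts it with) can be sketched as follows. The food terms and review text are hypothetical; the study trains a custom named-entity recogniser precisely because fixed lists miss local dishes:

```python
import re

# tiny illustrative gazetteer; a learned NER model recognises such spans
# without needing them listed in advance
FOOD_TERMS = {"halloumi", "souvlaki", "moussaka", "kleftiko", "sea bass"}

def extract_food_mentions(review: str):
    """Return food terms mentioned in a review (simple gazetteer matching)."""
    text = review.lower()
    return sorted(t for t in FOOD_TERMS
                  if re.search(r"\b" + re.escape(t) + r"\b", text))

mentions = extract_food_mentions(
    "The halloumi was superb and the sea bass perfectly grilled.")
```

Pairing such extracted mentions with the sentiment of the surrounding sentence is what turns raw review text into a fine-grained preference signal.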

2.3. User and Venue Personality Extraction

Personality refers to a set of relatively stable values, beliefs, and behavioural tendencies that characterise an individual over time. Several theories posit different traits of personality. In automated personality recognition, the Myers–Briggs Type Indicator (MBTI) [51] and the Big 5 [52] are commonly used due to the availability of labelled data. The MBTI focuses on four dimensions referring to eight key individual characteristics/behaviours: extraversion or introversion, thinking or feeling, sensing or intuition, and judging or perceiving. The Big 5 model describes personality in terms of five dimensions: agreeableness, extraversion, openness to experience, conscientiousness, and neuroticism.
Some studies argue that the Big 5 model provides a more comprehensive perspective of human personality than MBTI [53]. However, MBTI offers a more specific explanation of an individual’s personality and understanding of their preferences [54]. It is also important to note that each model has its strengths and weaknesses, and no model is 100% accurate. Despite this, recognising personality with such models offers key benefits in many domains, for instance, using personality as a proxy to understand consumer needs and purchasing decisions. Such insights have recently been used by recommender systems to improve recommendation performance, due to the identified link between personality and individuals’ preferences, perceptions, feelings and motivations [55]. Integrating user personality has improved the performance of recommendations in tourism-related applications, such as points of interest [56], compared to traditional methods. Such integration has also reduced the cold start and data sparsity problems, enhancing the performance of recommenders in areas such as online advertising, books and music, and social media [57]. However, the application of personality in restaurant recommendations remains limited.
Moreover, brand personality remains largely unexplored in recommender systems. Brand personality is closely related to self-congruity, which describes the alignment between a consumer’s self-concept and their perception of a brand [15]. Previous work used brand personality matching by computing the personality of users and some personality features of the item, such as a product or an actor’s personality in movie recommenders [58]. However, these representations do not capture the holistic personality of a brand. In the context of restaurant recommendations, the concept of venue personality remains largely unexplored.
Personality information has recently been incorporated into prominent CF methods, such as probabilistic matrix factorisation, to infer latent factors that describe users’ preferences [59]; however, these factors do not relate to generic psychological features (personality traits) that are based on a validated psychological model (such as MBTI). Such personality traits are known to affect preference and lead to models that generalise better. Examples of techniques that enhance CF with personality include [55,59], which use personality-boosted probabilistic matrix factorisation. However, these methods either assume that personality is already available or extract it from questionnaires.
Similarly, early personality-based recommendation techniques relied on specialised questionnaires (e.g., [60]) rather than eWOM. Such approaches are time-consuming and hinder continuous updating of preferences. Automated personality extraction can instead be performed using textual data or online behavioural traces [58]. Behavioural approaches infer personality from clicks, likes, and interaction patterns, but require access to proprietary platform data. However, such data are difficult to obtain for problems such as restaurant recommendations since they require access to restaurants’ websites or social media platforms. Therefore, text-based personality extraction from secondary data sources (e.g., TripAdvisor, Yelp) is becoming increasingly popular in recommender systems. Automated personality extraction has been applied to Facebook and Twitter data to identify links between personalities, purchasing behaviours, and preferences [18,61].
Text-based personality prediction relies on the assumption that words can reveal the author’s psychological states and, thus, their personality. Text-based techniques are classified into two main categories: feature-based and deep learning approaches. Feature-based techniques use either a closed vocabulary of words related to emotion/personality, as in [62], or an open vocabulary of features that can be extracted using techniques such as TF-IDF. However, both suffer from the out-of-vocabulary problem (i.e., words in the text that are not in the vocabulary) [63]. Deep learning techniques utilise transfer learning of knowledge extracted from large language models (LLMs) trained on a corpus of text in an unsupervised manner. In such models, similar words have similar embeddings, and words with the same spelling but different meanings can be distinguished due to context (e.g., “river bank” versus “commercial bank”). Transformer models, such as BERT (bidirectional encoder representations from transformers), utilise the text context through a multi-head attention mechanism [64] that weights words’ importance based on how they are used in a text, enabling models to capture semantic content [65]. Such models are based on the concept of knowledge transfer, where general knowledge from large pre-trained language models (trained on massive unlabelled datasets) is utilised on a target problem by fine-tuning such models on a smaller set of labelled data representing the target problem. This approach is essential when it is difficult to access enough labelled data to train classification models. In transfer learning, fine-tuning alters the embeddings of words in a text based on their context and, thus, provides a better numerical representation of text, which is subsequently used by a feedforward layer to predict the output [66].
Previous work showed that pre-training and fine-tuning outperform traditional text classification approaches and overcome the labelled data bottleneck [63]. Specifically, BERT has been applied to personality prediction with significant improvements over traditional classification techniques [67].
However, transfer learning is susceptible to domain shift, where source and target data distributions differ [21]. Domain shift is addressed through feature expansion by incorporating linguistic cues. Specifically, part-of-speech (POS) embeddings capturing syntactic roles and positional information are combined with BERT representations to enrich linguistic modelling [68].
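The feature-expansion idea described above, concatenating contextual embeddings with POS-based linguistic features, can be sketched as follows. Both the encoder and the tagger are mocked here for illustration: a real pipeline would use a fine-tuned BERT model and an actual POS tagger, and the tag set and histogram features are hypothetical choices:

```python
import numpy as np

# hypothetical stand-in for a fine-tuned BERT encoder (768-dim pooled output)
def bert_embed(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=768)

POS_TAGS = ["NOUN", "VERB", "ADJ", "ADV", "PRON", "OTHER"]

def pos_histogram(tags):
    """Normalised counts of coarse POS tags -> fixed-length linguistic feature."""
    h = np.array([tags.count(t) for t in POS_TAGS], dtype=float)
    return h / max(h.sum(), 1.0)

def personality_features(text, tags):
    """Concatenate the contextual embedding with POS features (feature expansion)."""
    return np.concatenate([bert_embed(text), pos_histogram(tags)])

feats = personality_features("The food was amazing",
                             ["DET", "NOUN", "VERB", "ADJ"])
```

The appended POS features are domain-independent, which is why this kind of expansion can help when the source (training) and target (review) text distributions differ.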

3. Methodology

The proposed methodology aims to improve restaurant recommendations by combining direct information (quantitative information extracted from review websites, such as ratings for food, price, and cuisine offered) and derived information from eWOM text. Derived information includes users’ personalities and opinions extracted through a novel personality classifier and a topic model built from user reviews. Users’ preferences are extracted using NCF, similar to [69], and food preferences are identified using a custom NER. The performance of the proposed approach is evaluated using various recommender system metrics and compared against traditional and neural network-based techniques, such as the two-tower model. Figure 1 illustrates the overall workflow of the proposed methodology, which is implemented through the steps detailed below. An ablation study is conducted by progressively adding feature groups to the model, allowing for the analysis of the individual contribution of each component to recommendation performance.

3.1. Data Collection [Step 1]

Restaurant reviews were collected from TripAdvisor using a dedicated web crawler (primary dataset). Consumers’ eWOM and additional direct information on restaurant features, such as price range, atmosphere, service, and value for money, were also collected. The dataset consists of approximately 255 thousand English-language reviews posted by consumers who visited restaurants in Cyprus between 2010 and 2021. It includes 52 thousand unique users and 2615 restaurants. For this study, only users with at least 6 reviews and restaurants with at least 50 reviews are considered, to reduce user–item matrix sparsity and ensure sufficient textual data for personality extraction. These users typically correspond to tourists who stay for extended periods or revisit the destination regularly. This filtering resulted in 774 restaurants and 5595 users. Reviewers who systematically posted identical positive messages across venues were removed, as these were indicative of fake reviews. Consequently, four users were eliminated along with their corresponding reviews. In addition, reviewers who systematically rated particular restaurants negatively and others positively in the same location were eliminated, as these were also considered fake reviews that aimed to benefit certain venues while diminishing the reputation of others (e.g., nearby competitors). To identify such users, reviews were grouped by user and restaurant location. Restaurant locations were determined using TripAdvisor’s location ID metadata. For each user who rated multiple restaurants in a particular location, a rating matrix was constructed. The matrix’s columns represent restaurants in that location, and the cells correspond to the user’s ratings. Rows represent individual reviews of that user for that restaurant and location. For each user–location matrix, a paired sample t-test was performed between all combinations of the matrix’s columns (i.e., ratings of restaurants).
The process is repeated for each user and location in the dataset. A t-test with a mean difference between two restaurants’ ratings greater than 3 (e.g., a user rating one restaurant with 5 and another with less than or equal to 2) and a p-value < 0.01 indicates biased behaviour favouring one restaurant and diminishing another in the same location. The test was applied only when at least three negative reviews were present in any matrix column, to account for cases where users revisited a venue following an initial negative experience. As a result of this process, one additional user account and its reviews were eliminated. The final dataset contains 31,597 reviews from 5591 unique users across 774 restaurants.
Secondary data used in this study include the stream-of-consciousness essays (BIG 5) dataset [70] and the MBTI dataset [71,72], which are employed for personality classification. The BIG 5 dataset comprises 2468 essays written by individuals and annotated with personality labels [70]. The MBTI dataset consists of social media posts labelled by personality type, as determined using the MBTI questionnaire. The dataset is publicly available on Kaggle [72] and contains 8675 rows corresponding to users’ posts on the social network personalitycafe.com, annotated with personality labels. The dataset was constructed by first asking users to complete an MBTI questionnaire, after which they engaged in discussions with other users on the platform. Each of the 8675 users contributed 50 posts, which were concatenated per user and separated with the delimiter “|||”.
Preprocessing was performed prior to model training to ensure that the secondary data did not include tags (e.g., personality type indicators) or the sentence separator “|||” used by the dataset’s curators, since such markers could cause the classifier to overfit on features that never occur in our primary dataset (i.e., a domain shift problem). In contrast to previous work [73], eliminating the curators’ tags degraded the classifiers’ performance; however, this was necessary since these tags were not present in our primary data. To further improve the model’s generalisation to our sample data, we enhanced the preprocessing so that the training data includes generic features relevant to personality, such as POS tags and POS sequences, in addition to text embeddings from BERT. Furthermore, emoticons (e.g., “:D”, “:P”) were converted into text. Words written entirely in uppercase were supplemented with additional information in the text indicating that the author feels strongly about something, since this behaviour differs among personalities. Repeated punctuation, such as exclamation marks or periods, was likewise converted into textual form reflecting the author’s writing style (for example, the word “emphasis” is inserted when multiple exclamation marks are encountered in the text, and “etc.” is used for multiple consecutive periods). Finally, contractions and text abbreviations were expanded, joined words were split into separate words, and repeated characters in words were eliminated.
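The style-preserving normalisations described above can be sketched as follows. This is a minimal illustration: the emoticon and contraction mappings and the “strongemotion” marker token are hypothetical stand-ins for the fuller lists used in the study.

```python
import re

# Illustrative mappings; the study's full lists are larger.
EMOTICONS = {":D": "laughing", ":P": "playful", ":(": "sad"}
CONTRACTIONS = {"don't": "do not", "dnt": "do not", "can't": "cannot"}

def normalise(text: str) -> str:
    for emo, word in EMOTICONS.items():
        text = text.replace(emo, f" {word} ")
    for short, full in CONTRACTIONS.items():
        text = re.sub(rf"\b{re.escape(short)}\b", full, text, flags=re.IGNORECASE)
    # Annotate words written entirely in uppercase as emotionally loaded.
    text = re.sub(r"\b[A-Z]{2,}\b", lambda m: m.group(0).lower() + " strongemotion", text)
    # Replace runs of exclamation marks with the token "emphasis".
    text = re.sub(r"!{2,}", " emphasis ", text)
    # Collapse characters repeated three or more times ("yeaaahhhh" -> "yeah").
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    return re.sub(r"\s+", " ", text).strip()
```

For example, “The food was GREAT!!! yeaaahhhh, don’t miss it :D” (with a straight apostrophe) would become “The food was great strongemotion emphasis yeah, do not miss it laughing”.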

3.2. Text Preprocessing of Primary and Secondary Datasets [Step 2]

Data preprocessing and preparation are common and necessary steps for the subsequent analyses (i.e., topic modelling, named entity recognition, and personality classification). These procedures include the elimination of punctuation, URLs, numbers, and stop-words; lowercasing the text; inserting spaces in long joined words and breaking them into separate words; removing repeated characters in words (e.g., “yeaaahhhh” becomes “yeah”); and expanding contractions and text abbreviations (e.g., “don’t” or “dnt” to “do not”). Additionally, annotations (tags) within the text specified by the MBTI dataset curators, such as tags of certain words, are removed to generalise the data prior to personality model training.

3.3. Topic Modelling [Step 3]

Topic modelling, and in particular the structural topic model (STM) technique [74], is employed in this step to infer the themes consumers discussed in eWOM. In general, topic models employ statistical methods to identify topics arising in a collection of documents [75]. Each topic represents a set of words that frequently occur together in a corpus, and each document is associated with a probability distribution over the topics that appear in it. Restaurant and user opinions are produced by averaging the topics’ theta values (representing the distribution of topics over documents) associated with each restaurant/user. These represent common consumer opinions per restaurant and common topics that characterise users (preferences). The optimal number of topics (K) is identified through an iterative process that examines different values of K and inspects the semantic coherence and held-out likelihood until a satisfactory model is found [74]. Coherence measures the semantic consistency of high-scoring words within a given topic and serves as an indication of the interpretability and meaningfulness of that topic. Held-out likelihood tests a trained topic model against a test set of unseen documents, with higher values indicating a statistically stronger topic model. Exclusivity measures the extent to which top words for each topic do not appear as top words in other topics. The topics are named by domain experts using the most prevalent words that characterise each topic.
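The averaging of theta values per restaurant (or user) can be sketched as follows. This is a minimal sketch: aggregate_theta is an illustrative helper, not part of the STM package, and assumes the theta matrix has already been exported from the fitted model.

```python
import numpy as np

def aggregate_theta(theta: np.ndarray, group_ids: list) -> dict:
    """Average document-topic proportions (theta) over all reviews
    belonging to the same restaurant (or user).

    theta     : (n_reviews, K) matrix; each row sums to 1.
    group_ids : restaurant (or user) id for each review row.
    """
    groups = {}
    for row, gid in zip(theta, group_ids):
        groups.setdefault(gid, []).append(row)
    return {gid: np.mean(rows, axis=0) for gid, rows in groups.items()}
```

The resulting per-restaurant (or per-user) vectors serve as opinion/preference features in the later recommendation model.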

3.4. Food Preference Extraction [Step 4]

The extraction of food preferences assumes that consumers who visit multiple restaurants and write numerous reviews about the foods they consume indirectly indicate their food preferences. Named-entity recognition (NER) is used to extract customers’ food preferences. An NER entity can refer to any concept of interest (e.g., food types, locations, products). Existing NER models (from NLTK, SpaCy, and Stanford NER) were evaluated and deemed inappropriate for our analysis, as none were trained on labelled data for Cypriot dishes [76]. Domain-specific applications require different types of entities to be identified by NER models; thus, there was a need to customise an existing NER for the task. To create or fine-tune an NER, text labelled with the entities of interest must be provided, and a rule-based approach can be used to annotate the text using grammatical rules and linguistic terms. The SpaCy library was utilised, as it demonstrated superior performance when customised compared to alternative libraries [77]. SpaCy comes with a pretrained NER model that can be fine-tuned to different tasks using labelled data. This was an essential step since the SpaCy NER did not recognise Cypriot foods. Customising the SpaCy NER to identify food entities in reviews required training the model with additional cases containing custom food words. Thus, the rule-based technique was used to extract sentences that refer to the consumption of food from the local cuisine. For this task, only reviews from traditional restaurants (tavernas) were utilised. The identified cases were used to fine-tune the original NER. Identified foods were added to a food dictionary and used as the vocabulary during TF-IDF vectorisation of customers’ reviews. The cumulative TF-IDF scores for each food entity across all reviews per user serve as a proxy for food preferences.
This assumes that when customers write comments about the food they consume in different restaurants, irrespective of their ratings, they provide information about their food preferences.
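The rule-based annotation step can be illustrated with plain regular expressions. The study uses SpaCy’s pattern language with POS-tag combinations; the patterns and the annotate_food helper below are simplified stand-ins that produce SpaCy-style character-offset annotations of the kind used to fine-tune the NER.

```python
import re

# Illustrative rule patterns; the study specifies richer patterns
# (e.g., "I ate {}", "We had {} for dinner") with generic POS tags.
PATTERNS = [
    re.compile(r"\b[Ww]e (?:ate|had) (?:a |an |some )?(?:nice |lovely )?(\w+)"),
    re.compile(r"\b[Ii] ate (?:a |an |some )?(\w+)"),
]

def annotate_food(sentence: str):
    """Return SpaCy-style training data: (text, {'entities': [(start, end, 'FOOD')]})."""
    for pattern in PATTERNS:
        match = pattern.search(sentence)
        if match:
            start, end = match.span(1)  # character offsets of the food mention
            return sentence, {"entities": [(start, end, "FOOD")]}
    return sentence, {"entities": []}
```

For “we ate steak at the xxx restaurant”, the food entity “steak” is located at character offsets 7–12, mirroring the offset-based annotation shown in Figure 6.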

3.5. Optimising Personality Classification [Step 5]

To identify customers’ personalities from eWOM, several binary text classification techniques are evaluated, utilising knowledge transfer from BERT embeddings with several machine learning techniques as well as deep learning classification (BERT). The models were trained and tested using two labelled personality datasets, the MBTI posts dataset [71,72] and the well-known BIG-5 stream-of-consciousness essays [70]. Each dimension of the two personality models was used to train a binary classifier, resulting in 5 binary classifiers for the BIG-5 and 4 for the MBTI. For instance, a classifier for the extraversion–introversion dimension of the BIG-5 model assigns a probability that the author of a given text is extroverted or introverted. The classification process begins with the vectorisation of the text into a form suitable for ML/deep learning algorithms. This is feasible either by using open/closed lexicons or through text embeddings learned from large corpora in an unsupervised manner, as in the case of language models such as BERT. The vectorised text is used to train logistic regression (LR), XGBoost, naive Bayes (NB), and support vector machine (SVM) classifiers, as they constitute mainstream models in personality recognition [11]. The second group of techniques evaluated employed large language models and transfer learning by fine-tuning a pre-trained BERT model. BERT comes in two versions: BERT-base and BERT-large. The former uses 12 transformer blocks with 12 self-attention heads each, a hidden layer size of 768 (which defines the size of the text embeddings), and 110 M trained parameters in total; BERT-large has 24 transformer blocks and 340 M parameters. The BERT-base model is utilised to generate embeddings, as it requires fewer computational resources, and evidence suggests that BERT-large provides minimal to no benefit when the datasets used are relatively small [78].
The most popular architecture used for assessing personality with BERT involves adding a dense layer on top of BERT’s output, followed by a binary output layer (sigmoid) for classification. In the case of multi-class text classification, a softmax layer is used instead.
This approach uses only the final hidden state vector of the [CLS] token from BERT, as it represents an aggregate embedding of the entire text and is generally regarded as the most informative feature for text classification tasks [79]. To enhance the performance of this standard BERT architecture, we combine the [CLS] output with a convolutional neural network (CNN) and a long short-term memory (LSTM) network to capture both local and sequential dependencies within the text. The former is used to extract additional features from the embedding of the last dense layer, and the latter is used to find patterns in the linguistic features of the text, namely, part-of-speech (POS) sequences. LSTM and CNN have been used in personality classification to improve accuracy [80] and in combination with BERT to enhance text classification [81]. However, they have not been used to find patterns within POS sequences. For the CNN, BERT’s last dense layer is used to extract local features by sliding a 1D kernel across contextualised embeddings to capture additional local relationships between tokens (the max-pooling method is adopted in this step). The LSTM layer utilises linguistic features that have been proven to contribute to language complexity prediction, which is linked to personality. SpaCy’s POS tagger is used to extract linguistic features [82], such as the number of pronouns, verbs, adjectives, and nouns, within reviews. The most influential POS features for the personality class are selected using various feature selection techniques, including model-based and statistical approaches. To leverage additional linguistic features from the text, the order of POS tags in reviews is also used as input to the classifier (Figure 2). Tag sequences are expressed using unique IDs, specified for each POS, and expressed as a sequence of numbers. An LSTM layer is used to find patterns in the POS sequences. 
The CNN, LSTM and linguistic input are concatenated and fed to a linear layer and a sigmoid activation function that predicts the probability for each personality class. The above layers have been used independently and in combination, and the classifier structure that yielded the best results was selected. This is presented in Figure 2.
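The POS-sequence input to the LSTM described above can be sketched as follows. The tag set, the ID assignment, and the maximum sequence length are illustrative assumptions; the study derives tags with SpaCy’s POS tagger.

```python
# Universal POS tags mapped to integer IDs (0 reserved for padding);
# the tag set and ordering here are illustrative.
POS2ID = {tag: i for i, tag in enumerate(
    ["PAD", "NOUN", "VERB", "ADJ", "ADV", "PRON", "DET", "ADP", "PUNCT"])}

def encode_pos_sequence(pos_tags, max_len=10):
    """Convert a review's POS-tag sequence into a fixed-length ID vector
    suitable as LSTM input (unknown tags map to 0, i.e., padding)."""
    ids = [POS2ID.get(tag, 0) for tag in pos_tags[:max_len]]
    return ids + [0] * (max_len - len(ids))
```

Each review thus yields a fixed-length sequence of numbers that the LSTM layer scans for personality-related ordering patterns.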
The second issue addressed while searching for the best personality classifier is data imbalance. When the number of training examples is skewed toward one class, ML models struggle to correctly predict minority classes; in our case, for example, extrovert cases outnumber introvert cases in the training data. Due to such imbalance, additional techniques are required to balance the data prior to training the classifiers with text embeddings. Deep learning models, however, especially those using pre-trained architectures (such as BERT), can be more resilient to moderate data imbalance. During this step, we considered prominent imbalance treatment techniques for the ML models, such as resampling, cost-sensitive algorithms, ensemble methods [83], and class weighting. Resampling involves under-sampling the majority class or over-sampling the minority class (in binary classification) and thus balances the data by altering the number of sample units per class. Oversampling is a proven technique for treating class imbalance in text classification [84] that generates new synthetic cases (instead of replications) based on data from the minority class. Two of the most popular over-sampling techniques are the synthetic minority over-sampling technique (SMOTE) [83] and adaptive synthetic (ADASYN) sampling; the latter is an extension of SMOTE that adaptively generates minority instances based on their distribution [84]. Both SMOTE and ADASYN are evaluated in this step. In the case of BERT, imbalance was treated using class weighting and different loss functions (i.e., focal loss, binary cross-entropy loss), since this performed better than over- and under-sampling.
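The class weighting used for the BERT models can be computed with the standard inverse-frequency heuristic, sketched below; the study does not specify its exact weighting scheme, so this is one common choice (scikit-learn’s “balanced” formula).

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency class weights (the 'balanced' heuristic):
    w_c = n_samples / (n_classes * n_c), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}
```

With 8 extrovert and 2 introvert examples, the introvert class receives a fourfold larger weight, which scales its contribution to the loss during training.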
The third issue addressed regarding BERT-based classification concerns its tendency to perform best with short texts, typically those containing fewer than 128 tokens. Since the review texts used are longer than this limit, different long-text BERT classification techniques were considered that use different parts of the text: the naive head-only approach uses the first X words (tokens) and ignores the remaining text; the naive tail-only approach uses the last X words and ignores the rest; and the semi-naive approach combines the top X words with the bottom X words, or combines these with important words in the text, ignoring the rest. Even though such approaches lose information, they have minimal computational cost and achieve good results [81], with head-only and tail-only truncation often achieving the best classification performance. Recent work aiming to alleviate the computational cost of processing long text utilises more sophisticated models that fragment the text into chunks and combine the embeddings of those chunks [85]. The benefits of such models, however, were not sufficiently different from the aforementioned techniques to justify the extra processing, which is key in recommender systems that aim to serve a large number of users simultaneously. During this step, different long-text treatments were evaluated.
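The naive and semi-naive long-text treatments amount to simple token-list slicing, sketched below; the function and strategy names are ours.

```python
def truncate(tokens, limit=512, strategy="head"):
    """Naive long-text treatments for BERT-style encoders."""
    if len(tokens) <= limit:
        return tokens
    if strategy == "head":        # first `limit` tokens, rest ignored
        return tokens[:limit]
    if strategy == "tail":        # last `limit` tokens, rest ignored
        return tokens[-limit:]
    if strategy == "head_tail":   # semi-naive: combine head and tail
        half = limit // 2
        return tokens[:half] + tokens[-(limit - half):]
    raise ValueError(strategy)
```

Whichever slice is produced is then tokenised and fed to BERT in place of the full review text.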

3.6. User and Venue Personality Extraction [Step 6]

The best personality classifier from Step 5 is used to label the personality of each consumer and restaurant. Consumer personality is estimated by first aggregating all text generated by each user. Where the text length exceeds 512 tokens, the text is divided into 512-token chunks; any trailing fragment shorter than this length is discarded. Each chunk of the aggregated user text is fed to the personality classifiers, and the predictions for the chunks are averaged to produce the user’s overall personality. This is repeated for each user and personality dimension. Similarly, venue personality is estimated by aggregating the reviews of users who visited the venue and liked it (rating greater than 4), then chunking the text and averaging the personality scores.
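The chunk-and-average procedure can be sketched as follows. Here, predict stands in for any of the trained personality classifiers (it maps a chunk to a trait probability), and the names are illustrative.

```python
def user_personality(tokens, predict, chunk_size=512):
    """Average per-chunk trait probabilities over full 512-token chunks;
    a trailing chunk shorter than chunk_size is discarded, as in Step 6.
    `predict` is any callable returning a trait probability for a chunk."""
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        return None
    scores = [predict(chunk) for chunk in chunks]
    return sum(scores) / len(scores)
```

The same routine applies to venues, with the input being the concatenated positive reviews of the venue rather than one user’s reviews.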

3.7. Extracting Latent User Information Through Neural Collaborative Filtering (NCF) [Step 7]

A neural collaborative filtering (NCF) component is used to extract latent user/item features. A deep neural network (NCF) is trained using embeddings of customers and restaurants as input and user ratings as output. The NCF converts the sparse user–item matrix into low-dimensional user–item embeddings (dense layer), thereby extracting latent customer preferences [36]. Embeddings from the NCF model are extracted and combined with features from the previous steps of the method (personality, topics, food preferences). Inputs referring to both derived and direct information are used collectively to train and test an XGBoost regression model. The rationale for using XGBoost lies in its better interpretability and its popularity with tabular data compared to deep neural networks; its prediction logic can be explained with techniques such as Shapley additive explanations (SHAP) [86]. This hybrid approach is similar to the wide and deep architecture [41], which leverages a wide linear model for memorisation and a deep neural network for generalisation, allowing it to capture both specific feature interactions and broader patterns in data; however, instead of using a dense layer to combine the wide and deep components in a multi-layer perceptron [41], an XGBoost model is used, since it can be trained faster and is easier to explain. Another related approach is the two-tower model [87], which evaluates user–item rankings through the inner product of the respective user and item embeddings. Two-tower models are capable of learning complex relationships between users and items and can scale to large datasets; thus, they are popular in industrial settings.

3.8. Recommendation Generation [Step 8]

An XGBoost regressor model is trained (80%) and tested (20%) to predict user ratings (i.e., customer satisfaction) for restaurants that users have not yet visited. XGBoost is an ensemble method: multiple trees are generated, with each tree learning from the errors of previously generated trees [23]. XGBoost was selected due to its good results in similar problems and its faster training and prediction speeds compared to neural networks [88]. To address overfitting, several hyperparameters were optimised using GridSearch, Bayesian optimisation, and random search, with GridSearch producing the best results. XGBoost predictions were used to rank recommendations, which were evaluated using Recall@k (the proportion of all relevant items successfully retrieved in the first K results) and Precision@k (the proportion of recommended items that are relevant in the first K results).
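The two ranking metrics can be computed directly from their definitions above; the helper below is a minimal sketch.

```python
def precision_recall_at_k(ranked_items, relevant, k):
    """Precision@k: fraction of the top-k recommendations that are relevant.
    Recall@k: fraction of all relevant items retrieved in the top k."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if two of the top-3 recommended restaurants are among a user’s three relevant ones, both Precision@3 and Recall@3 equal 2/3.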

3.9. Comparative Analysis and Stepwise Ablation Study [Step 9–10]

During this step, alternative techniques are used to stress test the results of the proposed method. Different matrix factorisation techniques, such as NMF, SVD, SVD++, and NCF models, are used as alternatives to the proposed method to predict cells in the user–item matrix with unknown values. The user–item matrix is generated with rows corresponding to consumers, columns to restaurants, and cells containing user–item interactions. These are popular collaborative filtering techniques considered state-of-the-art in industry [30,33] and are thus used as baseline approaches against which the proposed method is evaluated. Hyperparameters such as the number of factors (K) and the regularisation options for SVD, SVD++, and NMF were tuned using GridSearch (SVD best params: {‘n_factors’: 100, ‘reg_all’: 0.005}; SVD++ best params: {‘n_factors’: 20, ‘reg_all’: 0.01}, where reg_all applies L2 regularisation to all of the model’s learned parameters; NMF best params: {‘n_factors’: 20, ‘reg_pu’: 0.05, ‘reg_qi’: 0.05}, where reg_pu and reg_qi are the regularisation penalties on the users’ and items’ latent factors). Additionally, a widely used deep learning architecture, namely the two-tower model, is employed to further evaluate the proposed method, as its rationale is similar to ours. Here, the user tower receives user features such as user_id and the preferences derived from the topic model (tower 1), while the item tower receives restaurant features such as item_id, food, price, and cuisine offered (tower 2). The two-tower model was optimised over the hyperparameters embedding dimension, user_units (the number of neurons in the dense layer of the user tower), joint_units (the sizes of the dense layers after combining the user and item towers), dropout_rate, and learning_rate. The best hyperparameters based on MAE are {‘emb_dim’: 16, ‘user_units’: 128, ‘joint_units’: (128, 64), ‘dropout_rate’: 0.4, ‘learning_rate’: 0.001}.
The NCF model was optimised over {‘emb_dim’, ‘hidden_units’, ‘dropout’, ‘lr’, ‘optimiser’: [adam or sgd], ‘batch_size’, and ‘epochs’}. The best hyperparameters, based on validation RMSE, were {‘emb_dim’: 64, ‘hidden_units’: 256, ‘dropout’: 0.2, ‘lr’: 0.01, ‘optimiser’: ‘sgd’, ‘batch_size’: 64, ‘epochs’: 15}. The performance of the proposed method is evaluated using offline recommender system evaluation metrics, namely the mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE). Additional ranked evaluation metrics were used, namely Recall@k and Precision@k. Table 1 lists the evaluation metrics along with their mathematical formulas.

4. Results

The primary dataset contains restaurant reviews collected from TripAdvisor, as discussed in Section 3.1. Figure 3 illustrates the descriptive statistics of review ratings by year. From this, it is evident that customer satisfaction degraded in 2021, possibly due to the COVID-19 pandemic.

4.1. Topic Modelling

To extract the topics discussed by consumers in eWOM, an STM topic model was developed. STM is preferred over traditional latent Dirichlet allocation (LDA) [89] since it produced a higher-quality model while providing insights into how metadata, such as the review rating or sentiment, links to documents in the corpus, which helped in naming the extracted topics. The STM model was trained using the STM package in R (1.3). During text preprocessing, words with fewer than three characters were eliminated, as they provided little contextual information. Custom stop words referring to names of people, towns, countries, cities, etc., were also eliminated, as they offer no information about user preferences. Based on the model’s performance metrics in Figure 4, with a focus on high coherence and exclusivity, high held-out likelihood, low residuals, and high lower-bound scores, the optimal number of topics is 18 (K = 18).
The topic names in Table 2 are derived by domain experts using the words with the highest probability per topic, together with high Lift scores (words that appear less frequently in other topics receive higher weight) and high FREX (frequency and exclusivity) scores, which summarise words by the harmonic mean of their probability of appearance under a topic and their exclusivity to that topic [90]. These provide more semantically intuitive representations of topics and are thus well suited for distinguishing topics.
The probability distribution of topics per review represents the topics discussed in each review, and the sum of the probabilities of all topics in each review is 1. The STM model’s theta values were used as review embeddings, and in combination with the other features that were extracted from eWOM, were used to train the XGBoost model. Figure 5 shows the average theta value per topic, indicating the prevalence of each topic in the corpus. Among these topics, only topics that were informative for the recommendation task were considered. In particular, the topic “Intention to revisit” was not included in the list of features since it did not provide information about customer preference. The remaining topics were utilised.

4.2. Food Preference Extraction

Users’ food preferences are extracted from the eWOM text using a custom named entity recognition (NER) model, trained on annotated data generated through a rule-based approach with the SpaCy library. Initially, SpaCy was used to extract sentences with food mentions, and these annotated sentences were then used as training data to update the existing SpaCy NER. For the annotation task, several rules were specified using the SpaCy pattern language to extract sentences that mention food consumption in reviews; examples of such patterns include “I ate {}” and “We had {} for dinner”. Patterns were designed using combinations of generic part-of-speech tags to relax the constraints of sentence filtering and address variations in sentences that refer to food consumption, such as “we ate steak at the xxx restaurant” and “we had a nice steak at this lovely restaurant”. Extracted sentences were annotated automatically based on the position of the food entity in the sentence, which was identified from the string length of the pattern that was satisfied when the sentence was selected. Figure 6 depicts an example sentence annotated with the position of the food entity in the text: the number 20 refers to the position in the text where the entity name starts, and five refers to the number of characters that comprise the entity name. This process was necessary to create a training dataset with which to fine-tune the generic SpaCy NER. During NER training, the dataset was split into training (70%) and testing (30%) sets. Generalisation of the model was achieved through regularisation techniques such as dropout (preventing complex co-adaptations on the training data by randomly shutting down neurons). The trained NER model achieved an accuracy of 94%. In addition, the NER was evaluated qualitatively to verify the correctness of the labels using a sample of 50 reviews from a dataset different from the one used to fine-tune the model.
The manual process involved counting true positives, false positives, and false negatives (TP/FP/FN), yielding a recall of 65% and a precision of 70%.
The fine-tuned NER was applied to restaurant reviews to extract the foods associated with each review. Many food entities were generated, with numerous repetitions due to spelling variations. To reduce the number of features, a feature selection process was performed using a random forest classifier to identify the most important food names (features), with the cuisine offered by restaurants as the target variable. The restaurant’s cuisine is sourced from its page on the review platform. During this process, restaurants were initially clustered based on the cuisine they offered, and the reviews in each cluster were used to extract the dominant foods, using the foods identified in reviews as features and cuisine as the target variable. Several binary classifiers were generated, each using one cuisine as the class variable versus the rest. For the identification of local cuisine foods, traditional restaurants were used as one cluster. The most important features from the binary classifiers were combined into a collection of 220 international and local foods that formed our food vocabulary. TF-IDF vectorisation with the compiled food names as the vocabulary was applied to the eWOM text. Users’ food preferences were specified as the foods with the highest cumulative TF-IDF scores across all reviews by the user. These scores refer to foods that users ordered or consumed and discussed in their reviews; the assumption is that, regardless of whether they liked a dish, these are foods they prefer to consume. The same approach is used to identify the foods for which each restaurant is known; in the case of restaurants, only positive reviews were used.

4.3. Selecting the Best Personality Classifier

The stream-of-consciousness essays dataset [70] is used to train the BIG 5 models, while the MBTI dataset [72] is used to train the MBTI models (recall Section 3.1). To identify the best-performing BERT long-text handling method for personality classification, we compared the naive approach, which uses the head (i.e., the beginning) of the text with lengths of 256 and 512 tokens, to the semi-naive approach, which divides the text into 128-token chunks and combines their embeddings. The results of this analysis showed that the BERT classifier using the head-only 512-token preprocessing strategy [85] outperformed the semi-naive approach; thus, this approach was adopted in the methodology. Aggregated reviews from each user were therefore first chunked into 512 tokens; the personality of each chunk was then assessed and averaged to determine the user’s overall personality. This chunked long-text approach also proved best when compared against different ML classifiers, trained on two datasets (MBTI and BIG 5) with BERT embeddings as features. The results in Table 3 show that the MBTI BERT-512 approach outperformed the ML models and the BIG 5 BERT-512 model. During the second evaluation phase, the MBTI BERT-512 was compared with four ML models that utilised data balancing. Oversampling with SMOTE and ADASYN did not improve the performance of the MBTI ML classifiers, as seen in Table 4. This could be because SMOTE may amplify noise in the minority class by creating synthetic samples from noisy instances, or because it focuses solely on the minority class, potentially overlooking important characteristics of the majority class. SMOTE can also create synthetic samples that cross class boundaries, potentially confusing the classifier. ADASYN, on the other hand, focuses on generating more synthetic samples for minority class instances that are harder to learn (i.e., those closer to the majority class).
While this can be beneficial in some cases, it can also significantly amplify noise. Also, if there are outliers or mislabelled samples in the minority class, ADASYN may generate more synthetic samples around these problematic instances. Therefore, based on the results in Table 4 and after statistically evaluating the significance of one algorithm over the other using the McNemar–Bowker test, the MBTI BERT 512 classifier was chosen to label the primary data.
Having identified the best personality classifier, it was used to label each user and venue. Reviews were first aggregated per user and venue, and each user’s text was divided into 512-token chunks, the maximum text length our classifier can handle. The text was then vectorised using BERT and, through the model in Figure 2, the personality of each chunk was predicted; a user’s personality is the average over all of that user’s chunks. For each review, four binary classifiers were used to predict the probabilities for each of the four MBTI dimensions. Figure 7 shows descriptive statistics in the form of the distributions of user personalities resulting from the MBTI BERT classifier. The distributions indicate that each personality trait varies around a mean probability of 0.5, representing an approximately balanced likelihood of belonging to either class. This observation is consistent with personality theory, which suggests that, on average, most individuals tend to lie near the midpoint of personality continua rather than at the extremes.
Venue personality is assessed by averaging the personality profiles of users who reviewed each restaurant and expressed positive evaluations (ratings greater than 4). Of the five brand dimensions proposed in [14,16] (sincerity, excitement, competence, sophistication, and ruggedness), which describe how consumers perceive brands, three (sincerity, excitement, and competence) closely correspond to three human personality traits in the BIG 5 model (agreeableness, extraversion, and conscientiousness). Since the BIG 5 dimensions are known to map onto the four MBTI dimensions [91], three MBTI dimensions (introversion–extroversion, thinking–feeling, and judging–perceiving) were utilised in the proposed recommendation approach (Figure 7) to evaluate brand personality; the intuition–sensing dimension is not directly linked to any dimension of the venue personality model based on [14,16].

4.4. Training and Evaluating the Proposed Model: A Stepwise Ablation Study

The features derived from eWOM, along with direct information regarding restaurants’ service dimensions, were combined with embeddings from the NCF component and used collectively to train an XGBoost regression model, with restaurant rating as the output variable. The following XGBoost hyperparameters were tuned using GridSearch: learning rate, max depth, number of estimators, and scale_pos_weight (for data balancing), along with the regularisation parameters alpha, lambda, and gamma.
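The tuning step can be sketched with scikit-learn’s GridSearchCV, here using GradientBoostingRegressor as a stand-in for XGBoost (parameter names differ slightly, and XGBoost’s alpha, lambda, gamma, and scale_pos_weight have no direct equivalents in this estimator); the data and grid values are illustrative placeholders, not the study’s.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
# Placeholder feature matrix (in the study: NCF embeddings, personality,
# topic thetas, food preferences, direct restaurant information).
X = rng.normal(size=(200, 5))
# Synthetic 1-5 ratings as the regression target.
y = np.clip(3 + X[:, 0] + rng.normal(scale=0.5, size=200), 1, 5)

param_grid = {
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
    "n_estimators": [50, 100],
}
search = GridSearchCV(GradientBoostingRegressor(random_state=0), param_grid,
                      scoring="neg_mean_absolute_error", cv=3)
search.fit(X, y)
best_model = search.best_estimator_  # refit on the full training data
```

The best_estimator_ would then be used to score unvisited restaurants, with the predicted ratings ranked per user for the Recall@k/Precision@k evaluation.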
The model’s performance was evaluated using ranking and accuracy metrics, including the mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), Precision@k, and Recall@k. Precision and recall were computed by measuring the proportion of relevant restaurants retrieved among the top-K recommendations. The comparison of the proposed XGBoost approach against NMF, SVD, SVD++, NCF, and the two-tower model revealed improved performance over these models.
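Precision@k and Recall@k as described above can be computed per user in a few lines; the item ids and the notion of "relevant" (e.g., a rating above 4) are illustrative assumptions:

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute Precision@k and Recall@k for one user.

    recommended: item ids ranked by predicted rating (best first).
    relevant:    set of item ids the user actually found relevant.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if two of a user's top-3 recommendations appear among three relevant restaurants, both Precision@3 and Recall@3 equal 2/3.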
In the analysis conducted with the extracted restaurant reviews, the data were split into training and test sets (80/20) using stratified sampling by user ratings, to ensure sufficient representation of all rating categories in both sets. The model's hyperparameters were tuned, and the model was trained and tested using the same samples (Table 5).
An incremental, stepwise evaluation was conducted using the MBTI dataset to assess the impact of each component of the proposed model on recommendation performance. The evaluation metrics summarised in Table 5 show that the proposed model achieves its best performance when all components—personality, topics, and food preferences—are included along with direct information (ratings, cuisine, etc.). The baseline models, which represent current industry best practice, do not incorporate personality information.
The results reveal several key findings. (1) Introducing venue and user personality into the initial model (XGBoost trained on direct information) leads to a clear improvement in recommendation performance. This supports our hypothesis that consumers prefer restaurants whose personalities align with their own, as reflected in the enhanced accuracy of recommendations. (2) When food preferences and restaurant cuisines are incorporated in addition to personality features, performance improves further. This suggests that recommendations are more effective when users’ food preferences align with the cuisines in which restaurants excel, increasing the likelihood of customer satisfaction. (3) Finally, incorporating user opinions and venue themes, extracted from topic modelling, alongside personality and food preferences, yields the highest gains in performance. This indicates that matching user opinions with the thematic attributes of restaurants (i.e., topics that characterise positively reviewed venues) further enhances the relevance of recommendations.
A final observation is that when all features are combined, the proposed model outperforms both collaborative filtering (CF) and neural network-based (NCF) approaches. This demonstrates that information derived from eWOM substantially improves recommendation quality, providing strong empirical support for our initial hypothesis. In the proposed approach, recommendations are generated by ranking the XGBoost model’s predictions for each user. The number of recommendations to be produced (e.g., the top five restaurants) is specified by the user.
An additional evaluation was conducted by first selecting the best hyperparameters for each model configuration (corresponding to different component combinations) using a fixed 80/20 tuning split of the training data, ensuring that hyperparameter selection is not influenced by the test set. After tuning, each model was retrained and evaluated 30 times with different random seeds, which affect both the data split (different samples end up in the training and test sets) and the stochastic elements of model training. Each run produces evaluation metrics, and the results are aggregated across runs to compute the standard deviation of each metric. This procedure quantifies the sensitivity of each model configuration to randomness in training and data sampling. The results in Table 6 show that all model variants exhibit low variability across the 30 runs, indicating robust and consistent performance.

4.5. Explanation of XGBoost Using SHAP

The SHAP (SHapley Additive exPlanations) summary plot (Figure 8) [86] shows that the most influential implicit feature category is topics, followed by personality and then food. This result is consistent with the RMSE, MAE, and MSE performance metrics in Table 5, which show the sharpest improvement when topics are introduced into the model. The summary plot indicates that the topics “Disappointment”, “Long wait”, and “Bad food” negatively influence review ratings; the suffixes “u_avg” and “r_avg” in the topic names denote user and restaurant averages, respectively. Higher values of these topics (red points on the left) are associated with a decrease in the predicted outcome, indicating that disappointment and long waits strongly drive the model towards lower scores. In contrast, topics such as service and atmosphere (red points on the right) increase the prediction, suggesting a positive impact. The user's personality also emerges as a significant feature: high levels of the extraversion (IE) and thinking–feeling (TF) traits positively influence the model's output. The thinking–feeling dimension refers to how a person makes decisions and evaluates information, with thinking associated with logic and objective analysis, and feeling with emotion and empathy. Alongside these two feature categories, the food category also influences the model's output, with foods such as chicken showing a positive effect on the rating; food features appear with lower-case names followed by the suffixes “u_avg” and “r_avg”.

5. Discussion

This work combines direct and derived information from electronic word-of-mouth (eWOM) to enhance the performance of recommendations. The importance of personality in recommender systems has been acknowledged in previous studies [58]; however, most of these rely on self-reported personality assessments obtained through questionnaires, which are often impractical and hinder consumer adoption. Automated personality assessment from eWOM text has been explored using machine learning (ML) and deep learning techniques; however, such approaches have not yet been effectively applied to the restaurant recommendation domain.
In addition to user personality, we operationalised and incorporated the concept of brand personality within recommender systems, demonstrating that the joint consideration of user and brand (in this case, restaurant) personality positively contributes to recommendation performance. This finding aligns with personality–brand congruence theory, which posits that individuals tend to prefer brands that reflect their own personality traits [14].
Combining direct information (e.g., ratings, cuisine type, and restaurant metadata) with derived information (e.g., personality, topics, and food preferences) enriches the model’s feature space. Each component contributes unique, complementary information. Direct features reflect explicit user behaviour. Personality captures latent psychological tendencies influencing preferences. Food preferences and topics encompass nuanced, contextual, and sentiment-based cues. The integration of these heterogeneous data sources reduces feature sparsity, enabling the model to learn more robust and generalizable patterns, thereby improving predictive accuracy. Fusing diverse feature types enhances the model’s representational capacity and mitigates bias toward purely behavioural data. Models such as XGBoost can leverage interactions between structured and unstructured features to more accurately approximate user–item relevance. This multimodal fusion increases the model’s ability to generalise beyond observed ratings, which explains the consistent improvement over baseline methods that rely solely on direct features.
The work presented evaluates various personality classification techniques to determine the most effective one. Prominent ML techniques are evaluated using two methods for addressing class imbalance. The paper extends our previous work [20], which examines deep learning classifiers such as BERT, by optimising performance through different strategies for handling long text. Similar work that uses transfer learning through language models for text classification [92] does not adequately address the challenges of long text, resulting in inferior classification performance. This systematic evaluation of personality classifiers prior to labelling the data contributes towards improved labelling, which is of paramount importance, as also highlighted in [70]. Our proposed personality classifier outperforms ML classifiers and other baseline personality classification techniques, such as [62], giving us greater confidence in the labelling of the data.
Moreover, the method introduces an automated approach for identifying user preferences from eWOM text. Similarly, ref. [50] utilises sentiment for food preference extraction and clustering to identify topics from online reviews; however, it does not jointly address the concept of personality, nor does it utilise a topic modelling technique such as STM that enables the association of metadata, such as sentiment, with topics to improve interpretability.
Our results also show that a combination of direct and derived information from eWOM (i.e., personality, food preference, and opinions from topic modelling) enhances recommendation performance.

6. Conclusions

This study proposes the combined use of personality (at both user and venue levels) with food preferences, and is one of the first to utilise customer and venue personality in the restaurant recommendation problem. It evaluates two popular personality models, namely the MBTI and the Big 5, and various classification techniques to inform restaurant recommendations; the best classifier is used to label users' personality from eWOM text. Regarding food preferences, a custom NER is used within the proposed approach to extract the food preferences of users, and additional derived features of users and restaurants are extracted from eWOM text through topic modelling. The method combines the above with latent features that emerge from an NCF model, and derived and direct features are used collectively to train an XGBoost regressor that predicts consumers' satisfaction with restaurants they have not yet visited. The results show that user and venue personality, in combination with food preferences and opinions, improve recommendations and outperform model-based CF techniques, such as NCF and the two-tower model.
As with any ML model, the quality of personality quantification, as well as the derived food preferences, depends on the quality of the labelled training data [71]. Although the labelled personality data used is considered the gold standard in computational personality assessment, our models could be significantly improved with additional labelled data, such as the Pandora dataset. Future work should also address group dynamics and the adaptive nature of personalised recommendations by considering dynamically evolving user behaviour and contextual information. To further clarify the contribution of each model component, future work will employ a comprehensive combinatorial ablation analysis rather than an incremental one. Moreover, the recency and helpfulness of reviews could be considered when computing venue personality to improve venue classification. Fairness is another important aspect of recommender systems: certain foods are more popular than others, so restaurants serving them are favoured over more specialised cuisines. Similarly, restaurants that target extroverts would be favoured, since extroverts are more sociable and visit restaurants more frequently. We did not address this aspect in this work, but it would be an interesting topic for future research. Furthermore, correlation analysis among restaurants with the same personality could cross-test whether the styles of restaurants with similar personalities align. Although information about restaurant style is not readily available, this aspect also warrants investigation in future work.

Author Contributions

Conceptualization, A.G.; methodology, A.G., H.H. and M.P.; software, A.G. and E.C.; validation, A.G.; formal analysis, A.G.; investigation, A.G. and E.C.; data curation, E.C.; writing—original draft preparation, A.G., H.H. and M.P.; writing—review and editing, A.G., H.H. and M.P.; visualization, A.G.; supervision, A.G.; project administration, A.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. del Carmen Rodríguez-Hernández, M.; Ilarri, S. AI-based mobile context-aware recommender systems from an information management perspective: Progress and directions. Knowl.-Based Syst. 2021, 215, 106740. [Google Scholar] [CrossRef]
  2. Alves, P.; Martins, H.; Saraiva, P.; Carneiro, J.; Novais, P.; Marreiros, G. Group recommender systems for tourism: How does personality predict preferences for attractions, travel motivations, preferences and concerns? User Model. User-Adapt. Interact. 2023, 33, 1141–1210. [Google Scholar] [CrossRef] [PubMed]
  3. Lu, X.; Kan, M.-Y. Improving Recommendation Systems with User Personality Inferred from Product Reviews. arXiv 2023, arXiv:2303.05039. [Google Scholar] [CrossRef]
  4. Kim, Y.; Eves, A.; Scarles, C. Building a model of local food consumption on trips and holidays: A grounded theory approach. Int. J. Hosp. Manag. 2009, 28, 423–431. [Google Scholar] [CrossRef]
  5. Anderson, C. A survey of food recommenders. arXiv 2018, arXiv:1809.02862. [Google Scholar] [CrossRef]
  6. Min, K.-H.; Lee, T.J. Customer Satisfaction with Korean Restaurants in Australia and Their Role as Ambassadors for Tourism Marketing. J. Travel. Tour. Mark. 2014, 31, 493–506. [Google Scholar] [CrossRef]
  7. Zhe, L.; WangJalal, M.; Jalal, M.; Donovan, B. To Buy or Not to Buy? Understanding the Role of Personality Traits in Predicting Consumer Behaviors. In Proceedings of the International Conference on Social Informatics, Bellevue, WA, USA, 11–14 November 2016; pp. 337–346. [Google Scholar]
  8. Polignano, M.; Narducci, F.; de Gemmis, M.; Semeraro, G. Towards Emotion-aware Recommender Systems: An Affective Coherence Model based on Emotion-driven Behaviors. Expert. Syst. Appl. 2021, 170, 114382. [Google Scholar] [CrossRef]
  9. Jameson, A.; Willemsen, M.C.; Felfernig, A.; de Gemmis, M.; Lops, P.; Semeraro, G.; Chen, L. Human Decision Making and Recommender Systems BT—Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: Boston, MA, USA, 2015; pp. 611–648. ISBN 978-1-4899-7637-6. [Google Scholar]
  10. Hong, E.; Ahn, J. Influence of customer personality on perceived attractiveness and similarity in a food service context. J. Hosp. Mark. Manag. 2023, 32, 745–766. [Google Scholar] [CrossRef]
  11. Hashemi Motlagh, S.M.; Rezvani, M.H.; Khounsiavash, M. AI methods for personality traits recognition: A systematic review. Neurocomputing 2025, 640, 130301. [Google Scholar] [CrossRef]
  12. Gountas, J.; Gountas, S. Personality orientations, emotional states, customer satisfaction, and intention to repurchase. J. Bus. Res. 2007, 60, 72–75. [Google Scholar] [CrossRef]
  13. Kim, D.; Magnini, V.P.; Singal, M. The effects of customers’ perceptions of brand personality in casual theme restaurants. Int. J. Hosp. Manag. 2011, 30, 448–458. [Google Scholar] [CrossRef]
  14. Geuens, M.; Weijters, B.; De Wulf, K. A new measure of brand personality. Int. J. Res. Mark. 2009, 26, 97–107. [Google Scholar] [CrossRef]
  15. Pamuksuz, U.; Yun, J.T.; Humphreys, A. A Brand-New Look at You: Predicting Brand Personality in Social Media Networks with Machine Learning. J. Interact. Mark. 2021, 56, 1–15. [Google Scholar] [CrossRef]
  16. Aaker, J.L. Dimensions of Brand Personality. J. Mark. Res. 1997, 34, 347–356. [Google Scholar] [CrossRef]
  17. Demirbag Kaplan, M.; Yurt, O.; Guneri, B.; Kurtulus, K. Branding places: Applying brand personality concept to cities. Eur. J. Mark. 2010, 44, 1286–1304. [Google Scholar] [CrossRef]
  18. Yun, J.T.; Pamuksuz, U.; Duff, B.R.L. Are we who we follow? Computationally analyzing human personality and brand following on Twitter. Int. J. Advert. 2019, 38, 776–795. [Google Scholar] [CrossRef]
  19. Timoshenko, A.; Hauser, J.R. Identifying Customer Needs from User-Generated Content. Mark. Sci. 2019, 38, 1–20. [Google Scholar] [CrossRef]
  20. Christodoulou, E.; Gregoriades, A.; Pampaka, M.; Herodotou, H. Personality-Informed Restaurant Recommendation BT—Information Systems and Technologies; Rocha, A., Adeli, H., Dzemyda, G., Moreira, F., Eds.; Springer: Cham, Switzerland, 2022; pp. 13–21. [Google Scholar]
  21. Syed, A.A.; Gaol, F.L.; Boediman, A.; Budiharto, W. Airline reviews processing: Abstractive summarization and rating-based sentiment classification using deep transfer learning. Int. J. Inf. Manag. Data Insights 2024, 4, 100238. [Google Scholar] [CrossRef]
  22. He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.-S. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2017; pp. 173–182. [Google Scholar]
  23. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  24. Atas, M.; Felfernig, A.; Polat-Erdeniz, S.; Popescu, A.; Tran, T.N.T.; Uta, M. Towards psychology-aware preference construction in recommender systems: Overview and research issues. J. Intell. Inf. Syst. 2021, 57, 467–489. [Google Scholar] [CrossRef]
  25. Malik, S.; Rana, A.; Bansal, M. A Survey of Recommendation Systems. Inf. Resour. Manag. J. 2020, 33, 53–73. [Google Scholar] [CrossRef]
  26. Ansari, A.; Essegaier, S.; Kohli, R. Internet Recommendation Systems. J. Mark. Res. 2000, 37, 363–375. [Google Scholar] [CrossRef]
  27. Nilashi, M.; bin Ibrahim, O.; Ithnin, N.; Sarmin, N.H. A multi-criteria collaborative filtering recommender system for the tourism domain using Expectation Maximization (EM) and PCA–ANFIS. Electron. Commer. Res. Appl. 2015, 14, 542–562. [Google Scholar] [CrossRef]
  28. Silva, N.; Carvalho, D.; Pereira, A.C.M.; Mourão, F.; Rocha, L. The Pure Cold-Start Problem: A deep study about how to conquer first-time users in recommendations domains. Inf. Syst. 2019, 80, 1–12. [Google Scholar] [CrossRef]
  29. Koren, Y.; Bell, R.; Volinsky, C. Matrix Factorization Techniques for Recommender Systems. Computer 2009, 42, 30–37. [Google Scholar] [CrossRef]
  30. Reham, A. Matrix Factorization Collaborative-Based Recommender System for Riyadh Restaurants: Leveraging Machine Learning to Enhance Consumer Choice. Appl. Sci. 2023, 13, 9574. [Google Scholar] [CrossRef]
  31. Zhang, C.; Zhang, H.; Wang, J. Personalized restaurant recommendation method combining group correlations and customer preferences. Inf. Sci. 2018, 454–455, 128–143. [Google Scholar] [CrossRef]
  32. Bellini, P.; Palesi, L.A.I.; Nesi, P.; Pantaleo, G. Multi Clustering Recommendation System for Fashion Retail. Multimed. Tools Appl. 2023, 82, 9989–10016. [Google Scholar] [CrossRef]
  33. Movafegh, Z.; Rezapour, A. Improving collaborative recommender system using hybrid clustering and optimized singular value decomposition. Eng. Appl. Artif. Intell. 2023, 126, 107109. [Google Scholar] [CrossRef]
  34. Sun, L.; Guo, J.; Zhu, Y. Applying uncertainty theory into the restaurant recommender system based on sentiment analysis of online Chinese reviews. World Wide Web 2019, 22, 83–100. [Google Scholar] [CrossRef]
  35. Herwanto, G.B.; Ningtyas, A.M. Recommendation system for web article based on association rules and topic modelling. Bull. Soc. Inform. Theory Appl. 2017, 1, 26–33. [Google Scholar] [CrossRef]
  36. Xue, H.-J.; Dai, X.; Zhang, J.; Huang, S.; Chen, J. Deep Matrix Factorization Models for Recommender Systems. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 3203–3209. [Google Scholar]
  37. Peng, Z.-F.; Zhang, H.-R.; Min, F. IUG-CF: Neural collaborative filtering with ideal user group labels. Expert. Syst. Appl. 2024, 238, 121887. [Google Scholar] [CrossRef]
  38. Sedhain, S.; Menon, A.K.; Sanner, S.; Xie, L. AutoRec: Autoencoders Meet Collaborative Filtering. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 111–112. [Google Scholar]
  39. Sarker, M.R.I.; Matin, A. A Hybrid Collaborative Recommendation System Based On Matrix Factorization And Deep Neural Network. In Proceedings of the 2021 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), Dhaka, Bangladesh, 27–28 February 2021; pp. 371–374. [Google Scholar]
  40. Zeng, W.; Fan, G.; Sun, S.; Geng, B.; Wang, W.; Li, J.; Liu, W. Collaborative filtering via heterogeneous neural networks. Appl. Soft Comput. 2021, 109, 107516. [Google Scholar] [CrossRef]
  41. Cheng, H.-T.; Koc, L.; Harmsen, J.; Shaked, T.; Chandra, T.; Aradhye, H.; Anderson, G.; Corrado, G.; Chai, W.; Ispir, M.; et al. Wide & Deep Learning for Recommender Systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA, 15–16 September 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 7–10. [Google Scholar]
  42. Zuheros, C.; Martínez-Cámara, E.; Herrera-Viedma, E.; Herrera, F. Sentiment Analysis based Multi-Person Multi-criteria Decision Making methodology using natural language processing and deep learning for smarter decision aid. Case study of restaurant choice using TripAdvisor reviews. Inf. Fusion. 2021, 68, 22–36. [Google Scholar] [CrossRef]
  43. Saelim, A.; Kijsirikul, B. A Deep Neural Networks Model for Restaurant Recommendation Systems in Thailand. In Proceedings of the 2022 14th International Conference on Machine Learning and Computing (ICMLC), Shenzhen, China, 18–21 February 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 103–109. [Google Scholar]
  44. Yang, S.; Li, Q.; Jang, D.; Kim, J. Deep learning mechanism and big data in hospitality and tourism: Developing personalized restaurant recommendation model to customer decision-making. Int. J. Hosp. Manag. 2024, 121, 103803. [Google Scholar] [CrossRef]
  45. Chua, B.-L.; Karim, S.; Lee, S.; Han, H. Customer Restaurant Choice: An Empirical Analysis of Restaurant Types and Eating-out Occasions. Int. J. Environ. Res. Public. Health 2020, 17, 6276. [Google Scholar] [CrossRef]
  46. Chen, L.; Chen, G.; Wang, F. Recommender systems based on user reviews: The state of the art. User Model. User-adapt. Interact. 2015, 25, 99–154. [Google Scholar] [CrossRef]
  47. Lichtenstein, S.; Slovic, P. The Construction of Preference; Cambridge University Press: Cambridge, UK, 2006. [Google Scholar]
  48. Abbasi-Moud, Z.; Vahdat-Nejad, H.; Sadri, J. Tourism recommendation system based on semantic clustering and sentiment analysis. Expert. Syst. Appl. 2021, 167, 114324. [Google Scholar] [CrossRef]
  49. Hegde, S.B.; Satyappanavar, S.; Setty, S. Sentiment based Food Classification for Restaurant Business. In Proceedings of the 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Bangalore, India, 19–22 September 2018; pp. 1455–1462. [Google Scholar]
  50. Asani, E.; Vahdat-Nejad, H.; Sadri, J. Restaurant recommender system based on sentiment analysis. Mach. Learn. Appl. 2021, 6, 100114. [Google Scholar] [CrossRef]
  51. Myers, I.B. Introduction to Type: A Description of the Theory and Applications of the Myers-Briggs Type Indicator; Consulting Psychologists Press: Palo Alto, CA, USA, 1987. [Google Scholar]
  52. McCrae, R.R.; John, O.P. An introduction to the five-factor model and its applications. J. Pers. 1992, 60, 175–215. [Google Scholar] [CrossRef]
  53. Ryan, G.; Katarina, P.; Suhartono, D. MBTI Personality Prediction Using Machine Learning and SMOTE for Balancing Data Based on Statement Sentences. Information 2023, 14, 217. [Google Scholar] [CrossRef]
  54. Amirhosseini, M.H.; Kazemian, H. Machine Learning Approach to Personality Type Prediction Based on the Myers–Briggs Type Indicator®. Multimodal Technol. Interact. 2020, 4, 9. [Google Scholar] [CrossRef]
  55. Wang, H.; Zuo, Y.; Li, H.; Wu, J. Cross-domain recommendation with user personality. Knowl.-Based Syst. 2021, 213, 106664. [Google Scholar] [CrossRef]
  56. Alves, P.; Martins, A.; Negrão, N.; Novais, P.; Almeida, A.; Marreiros, G. Are heterogeneity and conflicting preferences no longer a problem? Personality-based dynamic clustering for group recommender systems. Expert Syst. Appl. 2024, 255, 124812. [Google Scholar] [CrossRef]
  57. Wu, W.; Chen, L.; Zhao, Y. Personalizing recommendation diversity based on user personality. User Model. User-Adapt. Interact. 2018, 28, 237–276. [Google Scholar] [CrossRef]
  58. Dhelim, S.; Aung, N.; Bouras, M.A.; Ning, H.; Cambria, E. A Survey on Personality-Aware Recommendation Systems. Artif. Intell. Rev. 2022, 55, 2409–2454. [Google Scholar] [CrossRef]
  59. Fernández-Tobías, I.; Braunhofer, M.; Elahi, M.; Ricci, F.; Cantador, I. Alleviating the new user problem in collaborative filtering by exploiting personality information. User Model. User-Adapt. Interact. 2016, 26, 221–255. [Google Scholar] [CrossRef]
  60. Yusefi Hafshejani, Z.; Kaedi, M.; Fatemi, A. Improving sparsity and new user problems in collaborative filtering by clustering the personality factors. Electron. Commer. Res. 2018, 18, 813–836. [Google Scholar] [CrossRef]
  61. Karumur, R.P.; Nguyen, T.T.; Konstan, J.A. Personality, User Preferences and Behavior in Recommender systems. Inf. Syst. Front. 2018, 20, 1241–1265. [Google Scholar] [CrossRef]
  62. Mairesse, F.; Walker, M.A.; Mehl, M.R.; Moore, R.K. Using linguistic cues for the automatic recognition of personality in conversation and text. J. Artif. Intell. Res. 2007, 30, 457–500. [Google Scholar] [CrossRef]
  63. Yang, K.; Lau, R.Y.K.; Abbasi, A. Getting Personal: A Deep Learning Artifact for Text-Based Measurement of Personality. Inf. Syst. Res. 2023, 34, 194–222. [Google Scholar] [CrossRef]
  64. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Guyon, I., Luxburg, U., Von Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30. [Google Scholar]
  65. Kardakis, S.; Perikos, I.; Grivokostopoulou, F.; Hatzilygeroudis, I. Examining attention mechanisms in deep learning models for sentiment analysis. Appl. Sci. 2021, 11, 3883. [Google Scholar] [CrossRef]
  66. Sun, C.; Qiu, X.; Xu, Y.; Huang, X. How to Fine-Tune BERT for Text Classification? In Chinese Computational Linguistics; Springer Nature: Berlin/Heidelberg, Germany, 2019. [Google Scholar] [CrossRef]
  67. Jun, H.; Peng, L.; Changhui, J.; Pengzheng, L.; Shenke, W.; Kejia, Z. Personality Classification Based on Bert Model. In Proceedings of the 2021 IEEE International Conference on Emergency Science and Information Technology (ICESIT), Chongqing, China, 22–24 November 2021; pp. 150–152. [Google Scholar] [CrossRef]
  68. Liu, W.; Li, J.; Huang, F.; Che, Z.; Li, L.; Liu, Z. BERT-FEI: Enhancing BERT with POS Tagging and Adversarial Training for MOOC Sentiment Analysis. In Proceedings of the 2024 5th International Conference on Computer, Big Data and Artificial Intelligence (ICCBD+AI), Jingdezhen, China, 1–3 November 2024; pp. 313–317. [Google Scholar]
  69. Dursun, C.; Ozcan, A. Sentiment-enhanced Neural Collaborative Filtering Models Using Explicit User Preferences. In Proceedings of the 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Istanbul, Turkey, 8–10 June 2023; pp. 1–4. [Google Scholar]
  70. Pennebaker, J.W.; King, L.A. Linguistic styles: Language use as an individual difference. J. Pers. Soc. Psychol. 1999, 77, 1296–1312. [Google Scholar] [CrossRef] [PubMed]
  71. Yang, L.; Li, S.; Luo, X.; Xu, B.; Geng, Y.; Zeng, Z.; Zhang, F.; Lin, H. Computational personality: A survey. Soft Comput. 2022, 26, 9587–9605. [Google Scholar] [CrossRef]
  72. Kaggle (MBTI) Myers-Briggs Personality Type Dataset. Available online: https://www.kaggle.com/datasets/datasnaek/mbti-type (accessed on 1 February 2023).
  73. Christodoulou, E.; Gregoriades, A.; Herodotou, H.; Pampaka, M. Combination of User and Venue Personality with Topic Modelling in Restaurant Recommender Systems. Rectour Work. RecSys 2022, 3219, 21–36. [Google Scholar]
  74. Roberts, M.E.; Stewart, B.M.; Tingley, D.; Lucas, C.; Leder-Luis, J.; Gadarian, S.K.; Albertson, B.; Rand, D.G. Structural Topic Models for Open-Ended Survey Responses. Am. J. Pol. Sci. 2014, 58, 1064–1082. [Google Scholar] [CrossRef]
  75. Nikolenko, S.I.; Koltcov, S.; Koltsova, O. Topic modelling for qualitative studies. J. Inf. Sci. 2017, 43, 88–102. [Google Scholar] [CrossRef]
  76. Popovski, G.; Seljak, B.K.; Eftimov, T. A Survey of Named-Entity Recognition Methods for Food Information Extraction. IEEE Access 2020, 8, 31586–31594. [Google Scholar] [CrossRef]
  77. Shelar, H.; Kaur, G.; Heda, N.; Agrawal, P. Named Entity Recognition Approaches and Their Comparison for Custom NER Model. Sci. Technol. Libr. 2020, 39, 324–337. [Google Scholar] [CrossRef]
  78. Goldberg, Y. Assessing BERT’s Syntactic Abilities. arXiv 2019, arXiv:1901.05287. [Google Scholar]
  79. Zangari, A.; Marcuzzo, M.; Schiavinato, M.; Gasparetto, A.; Albarelli, A. Ticket automation: An insight into current research with applications to multi-level classification scenarios. Expert. Syst. Appl. 2023, 225, 119984. [Google Scholar] [CrossRef]
  80. Ahmad, H.; Asghar, M.U.; Asghar, M.Z.; Khan, A.; Mosavi, A.H. A Hybrid Deep Learning Technique for Personality Trait Classification From Text. IEEE Access 2021, 9, 146214–146232. [Google Scholar] [CrossRef]
  81. Murfi, H.; Syamsyuriani; Gowandi, T.; Ardaneswari, G.; Nurrohmah, S. BERT-based combination of convolutional and recurrent neural network for Indonesian sentiment analysis. Appl. Soft Comput. 2024, 151, 111112. [Google Scholar] [CrossRef]
  82. Ortiz-Zambrano, J.A.; Espin-Riofrio, C.; Montejo-Ráez, A. Combining Transformer Embeddings with Linguistic Features for Complex Word Identification. Electronics 2023, 12, 120. [Google Scholar] [CrossRef]
  83. Haixiang, G.; Yijing, L.; Shang, J.; Mingyun, G.; Yuanyue, H.; Bing, G. Learning from class-imbalanced data: Review of methods and applications. Expert. Syst. Appl. 2017, 73, 220–239. [Google Scholar] [CrossRef]
  84. Amaar, A.; Aljedaani, W.; Rustam, F.; Ullah, S.; Rupapara, V.; Ludi, S. Detection of Fake Job Postings by Utilizing Machine Learning and Natural Language Processing Approaches. Neural Process. Lett. 2022, 54, 2219–2247. [Google Scholar] [CrossRef]
  85. Fiok, K.; Karwowski, W.; Gutierrez-Franco, E.; Davahli, M.R.; Wilamowski, M.; Ahram, T.; Al-Juaid, A.; Zurada, J. Text Guide: Improving the Quality of Long Text Classification by a Text Selection Method Based on Feature Importance. IEEE Access 2021, 9, 105439–105450. [Google Scholar] [CrossRef]
  86. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874. [Google Scholar]
  87. Yang, J.; Yi, X.; Cheng, Z.; Hong, L.; Li, Y.; Wang, X.; Xu, T.; Chi, E.H. Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations. In Proceedings of the Companion Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 441–447. [Google Scholar]
  88. Shahbazi, Z.; Byun, Y.; Byun, Y.-C. Product Recommendation Based on Content-based Filtering Using XGBoost Classifier. Int. J. Adv. Sci. Technol. 2020, 29, 6979–6988. [Google Scholar]
  89. Blei, D.M. Probabilistic Topic Models. Commun. ACM 2012, 55, 77–84. [Google Scholar] [CrossRef]
  90. Roberts, M.E.; Stewart, B.M.; Airoldi, E.M. A Model of Text for Experimentation in the Social Sciences. J. Am. Stat. Assoc. 2016, 111, 988–1003. [Google Scholar] [CrossRef]
  91. Furnham, A. The big five versus the big four: The relationship between the Myers-Briggs Type Indicator (MBTI) and NEO-PI five factor model of personality. Pers. Individ. Dif. 1996, 21, 303–307. [Google Scholar] [CrossRef]
  92. Wang, Y.; Zheng, J.; Li, Q.; Wang, C.; Zhang, H.; Gong, J. XLNet-Caps: Personality Classification from Textual Posts. Electronics 2021, 10, 1360. [Google Scholar] [CrossRef]
Figure 1. Workflow of the methodology. Numbers refer to steps in the methodology.
Figure 2. Architecture of the proposed personality classifier combining BERT with POS features.
Figure 3. Distribution (% per year) of restaurant reviews from the primary dataset by rating category (1–5).
Figure 4. Performance measures (held-out likelihood, residuals, semantic coherence, and lower bound, i.e., how well the model explains the data) for identifying the optimal number of topics in the primary dataset. The optimal K is 18 based on both graphs, as shown by the arrows.
Figure 5. Average proportion of topics in the primary dataset corpus.
Figure 6. An example annotation of text during SpaCy NER training.
Figure 7. Distribution of the 3 MBTI personality traits used by the recommender.
Figure 8. SHAP summary plot showing the importance and effect of the XGBoost features in the combined input dataset.
Table 1. Description of performance evaluation metrics.

| Metric | Description | Formula |
|---|---|---|
| Mean Squared Error (MSE) | Measures the average of the squared errors between the predicted rating $\hat{y}$ and the observed rating $y$ | $\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2$ |
| Mean Absolute Error (MAE) | Measures the average absolute magnitude of the errors in a set of predictions | $\frac{1}{n}\sum_{i=1}^{n}\lvert y_i-\hat{y}_i\rvert$ |
| Root Mean Squared Error (RMSE) | The square root of the MSE | $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}$ |
| Precision@k | The proportion of recommended items in the top-k set that are relevant to the user | $\frac{\lvert \mathrm{Relevant} \cap \mathrm{Recommended@}k \rvert}{k}$ |
| Recall@k | The proportion of all relevant items that were successfully captured in the top-k recommended list | $\frac{\lvert \mathrm{Relevant} \cap \mathrm{Recommended@}k \rvert}{\lvert \mathrm{Total\ Relevant} \rvert}$ |
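For concreteness, the metrics in Table 1 can be computed with a short self-contained sketch (plain Python; the rating values and recommendation lists below are hypothetical illustrations, not data from the study):

```python
import math

def mae(y, y_hat):
    """Mean absolute error between observed and predicted ratings."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def mse(y, y_hat):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y, y_hat))

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return len(set(recommended[:k]) & set(relevant)) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items captured in the top-k list."""
    return len(set(recommended[:k]) & set(relevant)) / len(relevant)

# Hypothetical example: four predicted vs. observed ratings and a top-5 list.
y, y_hat = [5, 3, 4, 2], [4.5, 3.5, 4.0, 2.5]
recommended = ["r1", "r2", "r3", "r4", "r5"]
relevant = ["r1", "r3", "r7", "r9"]
print(mae(y, y_hat))                             # 0.375
print(precision_at_k(recommended, relevant, 5))  # 0.4
print(recall_at_k(recommended, relevant, 5))     # 0.5
```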
Table 2. Names associated with each of the STM learned topics from the primary dataset.

| # | Topic Name | Words with High Probability, Lift & FREX Scores |
|---|---|---|
| 1 | Value | good, value, portions, money, large, price, busy |
| 2 | Events | made, family, celebrate, welcome, lovely, feel, birthday |
| 3 | Location and view | beach, sea, right, location, harbour, lunch, views |
| 4 | Historic place | place, can, old, want, enjoy, crowded, people |
| 5 | Long wait | table, order, asked, waiter, ordered, arrived, minutes |
| 6 | Service and atmosphere | good, service, nice, staff, friendly, place, atmosphere |
| 7 | Traditional | worth, well, special, cypriot, village, taverna |
| 8 | Music and dancing | music, live, grilled, band, songs, night, dance |
| 9 | Desserts | view, fresh, ice, cream, pie, cake, chocolate |
| 10 | Smoking | smoking, last, used, just, time, still, night |
| 11 | Drinks | bar, drinks, drink, cocktails, beer, watch, pub |
| 12 | Bad food | ever, like, frozen, dont, tasted, disaster, just |
| 13 | Intention to revisit | meal, enjoyed, back, definitely, really, went, lovely |
| 14 | Variety | menu, choice, well, set, presented, variety, dishes |
| 15 | Quality | quality, dishes, high, price, taste, one, service |
| 16 | Price | pricey, bit, better, expensive, although, though, quite |
| 17 | Disappointment | always, disappoints, time, will, staff, back |
| 18 | Service | service, great, just, good, try, staff, well |
Table 3. Performance (in terms of AUC and accuracy (ACC)) of the best BERT personality classifier against ML classifiers using the MBTI and BIG 5 datasets. Average scores are shown in bold. The XGB, SVM, Naïve Bayes, and Logistic Regression models use BERT embeddings as input; the rightmost column is the BERT classifier with 512 tokens.

| Personality Dimension | XGB AUC | XGB ACC | SVM AUC | SVM ACC | Naïve Bayes AUC | Naïve Bayes ACC | Logistic Regression AUC | Logistic Regression ACC | BERT AUC | BERT ACC |
|---|---|---|---|---|---|---|---|---|---|---|
| **MBTI** | | | | | | | | | | |
| Introversion-Extroversion | 0.60 | 0.77 | 0.59 | 0.77 | 0.58 | 0.37 | 0.64 | 0.77 | 0.74 | 0.75 |
| Intuition-Sensing | 0.58 | 0.86 | 0.58 | 0.86 | 0.54 | 0.44 | 0.61 | 0.86 | 0.70 | 0.78 |
| Thinking-Feeling | 0.63 | 0.59 | 0.66 | 0.62 | 0.56 | 0.54 | 0.70 | 0.65 | 0.84 | 0.76 |
| Judging-Perceiving | 0.52 | 0.60 | 0.57 | 0.60 | 0.53 | 0.60 | 0.57 | 0.60 | 0.72 | 0.71 |
| Average | **0.58** | **0.70** | **0.60** | **0.71** | **0.55** | **0.48** | **0.63** | **0.72** | **0.75** | **0.75** |
| **BIG 5** | | | | | | | | | | |
| Introversion-Extroversion | 0.60 | 0.57 | 0.62 | 0.52 | 0.57 | 0.49 | 0.59 | 0.58 | 0.74 | 0.71 |
| Calm-Neuroticism | 0.52 | 0.49 | 0.54 | 0.52 | 0.50 | 0.49 | 0.53 | 0.53 | 0.65 | 0.72 |
| Competitive-Agreeable | 0.50 | 0.52 | 0.49 | 0.52 | 0.47 | 0.51 | 0.53 | 0.55 | 0.69 | 0.67 |
| Inattentive-Conscientious | 0.55 | 0.54 | 0.56 | 0.55 | 0.53 | 0.53 | 0.59 | 0.56 | 0.62 | 0.71 |
| Closeness-Openness | 0.62 | 0.59 | 0.60 | 0.57 | 0.54 | 0.53 | 0.62 | 0.58 | 0.75 | 0.74 |
| Average | **0.55** | **0.54** | **0.56** | **0.53** | **0.52** | **0.51** | **0.57** | **0.56** | **0.69** | **0.71** |
Table 4. Performance in terms of AUC and accuracy (ACC) of ML personality classifiers, trained with BERT embeddings, on the MBTI and BIG 5 datasets after oversampling using SMOTE and ADASYN.

| Oversampling / Dataset | XGB AUC | XGB ACC | SVM AUC | SVM ACC | Naïve Bayes AUC | Naïve Bayes ACC | Logistic Regression AUC | Logistic Regression ACC |
|---|---|---|---|---|---|---|---|---|
| **SMOTE** | | | | | | | | |
| MBTI | 0.576 | 0.519 | 0.607 | 0.551 | 0.536 | 0.461 | 0.625 | 0.612 |
| BIG5 | 0.557 | 0.530 | 0.517 | 0.540 | 0.520 | 0.512 | 0.573 | 0.556 |
| **ADASYN** | | | | | | | | |
| MBTI | 0.559 | 0.465 | 0.587 | 0.479 | 0.524 | 0.422 | 0.601 | 0.589 |
| BIG5 | 0.530 | 0.511 | 0.537 | 0.524 | 0.499 | 0.493 | 0.534 | 0.540 |
Table 5. Recommendation performance using combinations of derived and direct features, compared against traditional and neural network-based (NNB) techniques using MAE, MSE, RMSE, precision@5, recall@5, precision@10, and recall@10. The XGBoost feature combinations are added progressively from left to right; the traditional (SVD, SVD++, NFM) and NNB (NCF, Two-Tower) models use direct information only.

| Evaluation Metric | XGBoost: Personality (User, Venue) & Direct Information | XGBoost: Personality (User, Venue) & Food & Direct Information | XGBoost: Personality (User, Venue) & Food & Topics & Direct Information | SVD | SVD++ | NFM | NCF | Two-Tower Model |
|---|---|---|---|---|---|---|---|---|
| MAE | 0.45 | 0.44 | 0.41 | 0.66 | 0.67 | 0.82 | 0.57 | 0.70 |
| MSE | 0.49 | 0.48 | 0.43 | 0.83 | 0.84 | 1.26 | 0.66 | 0.90 |
| RMSE | 0.70 | 0.68 | 0.65 | 0.91 | 0.91 | 1.12 | 0.81 | 0.97 |
| Precision@5 | 0.87 | 0.88 | 0.88 | 0.74 | 0.72 | 0.51 | 0.53 | 0.66 |
| Recall@5 | 0.91 | 0.92 | 0.93 | 0.73 | 0.71 | 0.48 | 0.90 | 0.77 |
| Precision@10 | 0.90 | 0.90 | 0.90 | 0.74 | 0.72 | 0.53 | 0.59 | 0.69 |
| Recall@10 | 0.90 | 0.91 | 0.92 | 0.74 | 0.71 | 0.49 | 0.91 | 0.80 |
Table 6. Standard deviations of metrics across models’ variants.

| Model | Precision@5 | Recall@5 | Precision@10 | Recall@10 | MSE | RMSE | MAE |
|---|---|---|---|---|---|---|---|
| XGB (Personality + Foods) & direct information | 0.006768 | 0.006318 | 0.006677 | 0.006099 | 0.018407 | 0.010262 | 0.010423 |
| XGB (Personality + Topics + Foods) & direct information | 0.006812 | 0.007724 | 0.006433 | 0.007211 | 0.014527 | 0.008218 | 0.00871 |
| XGB (Personality only) & direct information | 0.007397 | 0.007622 | 0.007054 | 0.007432 | 0.018651 | 0.009778 | 0.011194 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gregoriades, A.; Herodotou, H.; Pampaka, M.; Christodoulou, E. Combining User and Venue Personality Proxies with Customers’ Preferences and Opinions to Enhance Restaurant Recommendation Performance. AI 2026, 7, 19. https://doi.org/10.3390/ai7010019
