Recommender Systems in the Real Estate Market—A Survey

Abstract: The shift to e-commerce has changed many business areas. Real estate is one of the applications that has been affected by this modern technological wave. Recommender systems are intelligent models that assist users of real estate platforms in finding the best possible properties that fulfill their needs. However, the recommendation task is substantially more challenging in the real estate domain due to the many domain-specific limitations that impair typical recommender systems. For instance, real estate recommender systems usually face the cold-start problem, where there are no historical logs for new users or new items and the recommender system must still provide recommendations for these new entities. Therefore, the recommender systems in the real estate market are different and substantially less studied than in other domains. In this article, we aim to provide a comprehensive and systematic literature review on applications of recommender systems in the real estate market. We evaluate a set of research articles (13 journal and 13 conference papers) which represent the majority of research and commercial solutions proposed in the field of real estate recommender systems. These papers have been reviewed and categorized based on their methodological approaches, the main challenges that they addressed, and their evaluation procedures. Based on these categorizations, we outline some possible directions for future research.


Introduction
In the age of digitalization, people use online platforms to find their desired items. These platforms usually have a huge catalog of items, which makes it difficult for their users to find only a short list of desired items out of many other irrelevant items. A recommender system (RS) can assist users and online platforms by inferring users' preferences and providing personalized recommendations that fulfill their needs. RSs are intelligent models that leverage data mining and machine learning methods to learn from users' historical interactions with the system and personalize the user experiences. RSs are omni-present; they are utilized by movie and music platforms, online sellers, booking agencies, marketing agencies, and the real estate market. Although we do acknowledge the importance of all the aforementioned applications, in this paper, we focus on the real estate market as we deem that this field has not been adequately explored, and its particular recommendation challenges have not been well studied in the past years.
The need for accommodation is one of the pivotal needs of every human. People purchase or rent properties infrequently throughout their lives, which makes housing selection a particularly complex decision-making procedure. When searching for a desired property, users consider several criteria and decision factors. Some of these criteria represent users' key decision factors, such as geographical location, price, and size of the property. Others represent minor factors that may have a lower effect on users' choices, such as specific facilities and proximity to schools, public transportation, and markets. Currently, there is growing interest in using online platforms to search for real estate items (properties), as users can specify their preferences and find suitable items that best match their criteria among many other irrelevant properties. RSs serve a user of a real estate platform by providing personalized recommendations based on the profile of the user. Users interact with these platforms in various ways. They usually express their needs explicitly by setting some search criteria, such as geographical location, price, and number of bedrooms. Furthermore, they may interact with some of the properties on the website by clicking on the property links, bookmarking, inquiring for more information, or requesting a visit. All these signals help an RS to better infer the user preferences and to provide more relevant recommendations.
Although RSs are effective in the real estate domain, they have received limited attention in the literature, and to the best of our knowledge, there is no survey paper related to real estate recommendation. This is due to the increased complexity of this task (e.g., temporal attributes), which limits the number of papers tackling the problem of real estate recommendation. In this paper, we first provide an overview of existing studies in the field of real estate recommendation by reviewing relevant methodological approaches. Collaborative filtering, content-based filtering, knowledge-based filtering, multi-criteria decision making, hybrid approaches, and reinforcement learning are the main methodological approaches that researchers in this domain have used to provide real estate recommendations. We then identify the specific challenges that real estate RSs face, such as the cold-start problem, the integration of rich item features, the handling of complex buying behavior, conflicting criteria, and data sparsity. Based on this overview, we finally outline some promising research directions to stimulate further research in this domain and assist researchers in positioning their contributions in the context of real estate RSs.
The structure of this paper is as follows: We provide an overview of the main types of RSs in Section 2. In Section 3, the strategy that was used to conduct the literature review is explained. In Section 4, we discuss the various methodological approaches in real estate recommendation tasks and categorize the related studies based on these approaches. Then, in Section 5, we outline the main challenges that RSs in the real estate domain encounter. Next, in Section 6, we review the datasets, the evaluation strategies and measures, and the baselines that are used for the evaluation procedures of the selected papers. Finally, we suggest some research directions to advance state-of-the-art real estate RSs in Section 7 and draw conclusions in Section 8.

Background
There are different types of RSs, and each models user preferences from a different perspective. In this section, we provide an overview of the main types of RSs before discussing the reviewed papers.

Content-Based Filtering
A content-based (CB) RS recommends items whose features match the user profile. CB RSs do not infer user preferences from collaborative information. Therefore, they suffer from the issue of over-specification and the generation of obvious recommendations. For instance, if a user of a movie streaming platform has already watched The Godfather and The Godfather Part II, then a CB RS would recommend The Godfather Part III to this user. While this kind of recommendation seems logical, it misses the surprise factor that most users are looking for. It also fails at providing even slightly more diverse suggestions, e.g., movies of different genres.
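As an illustration of the matching step described above, the following minimal sketch ranks unseen items by cosine similarity to a user profile. It is not taken from any of the reviewed papers: the function name `rank_by_content`, the choice of the mean feature vector as the user profile, and the numeric feature encoding are all assumptions made for this example.

```python
import numpy as np

def rank_by_content(item_features, liked_items):
    """Rank unseen items by cosine similarity to a user profile.

    item_features: (n_items, n_features) array of numeric item features.
    liked_items: indices of items the user interacted with (assumed non-empty).
    """
    X = np.asarray(item_features, dtype=float)
    # Assumed profile choice: the mean feature vector of the liked items.
    profile = X[liked_items].mean(axis=0)
    # Cosine similarity between the profile and every item.
    norms = np.linalg.norm(X, axis=1) * np.linalg.norm(profile)
    sims = X @ profile / np.where(norms == 0, 1.0, norms)
    # Exclude items the user has already seen.
    sims[liked_items] = -np.inf
    order = np.argsort(-sims)
    return [int(i) for i in order if np.isfinite(sims[i])]
```

Because the ranking depends only on feature similarity to what the user already liked, the sketch also makes the limitation discussed above visible: the top results are necessarily the items closest to the user's history.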
Recently, there have been more and more sources of side information and knowledge about elements of the system which are valuable for CB RSs. These information sources can be structured features, semi-structured, or unstructured. The unstructured sources of information such as user reviews and comments have become richer these days and are very helpful in generating recommendations [1]. Lops et al. [1] outlined the recent trends in the available sources of data for CB RSs as follows: linked open data, user-generated content, multimedia features, and heterogeneous information.

Collaborative Filtering
Collaborative filtering (CF) is a group of RSs which infers the preferences of a user using the preferences of other users in the system. Therefore, this type of RS assumes that users with similar history have similar tastes. The main advantage of CF-based RSs is that, unlike CB that provides obvious and over-specified recommendations, they are able to provide non-obvious and therefore more surprising recommendations. Generally there are two main categories of CF-based RSs: model-based and memory-based ones.
Model-based CF methods such as FunkSVD [2], BPR [3], and NeuMF [4] use the given users' feedback to train a model and learn parameters to provide recommendations. These methods learn two low-rank matrices for users and items that represent them in a dense latent feature space. The given users' feedback can be explicit (e.g., rating) or implicit (e.g., clicks). Model-based CF methods can be categorized based on their learning approaches to learning to predict ratings/interactions [2,4,5] and learning-to-rank methods [3,6,7]. Unlike the model-based CF, memory-based methods such as user-based KNN (UKNN) [8] and item-based KNN (IKNN) [9] are heuristic approaches that do not learn parameters but form neighborhoods based on user or item similarities to generate recommendations.
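The idea of learning two low-rank matrices can be sketched with a plain matrix factorization trained by stochastic gradient descent. This is a simplified illustration in the spirit of FunkSVD, without bias terms, and not the exact formulation of any cited method; the function names and the hyperparameter values are assumptions chosen for this example.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit user and item factors to observed (user, item, rating) triples."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))  # user latent factors
    Q = rng.normal(scale=0.1, size=(n_items, k))  # item latent factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error for this rating
            pu = P[u].copy()               # keep the old user vector for Q's update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given triples."""
    errs = [r - P[u] @ Q[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))
```

A predicted score for any user-item pair is then the dot product `P[u] @ Q[i]`, including pairs that were never observed, which is what allows the model to rank unseen items.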

Hybrid Recommender Systems
To employ the capabilities of multiple RSs, a hybrid RS can be applied. There are several hybridization methods such as weighted, switching, mixed, feature combination, feature augmentation, cascading, and meta-level [10]. Batet et al. [11] proposed an agent-based hybrid RS which uses CB and CF information to overcome the caveats of individual recommendation strategies. In some applications, such as e-tourism, user context (e.g., time, location, vicinity, seasonality) plays an important role in a hybrid recommendation system [12]. For instance, in [13], the authors proposed a context-aware hybrid travel recommender system where context such as location, time, and the user's mobility is used to re-rank the recommendation lists of a hybrid RS.
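Of the hybridization methods listed above, the weighted scheme is the simplest to illustrate. The sketch below is one possible reading of it, not a method from the cited works: it assumes each component RS returns a non-empty dictionary of item scores, rescales both to [0, 1], and blends them with a hypothetical mixing parameter `alpha`.

```python
def min_max(scores):
    """Rescale a non-empty {item: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def weighted_hybrid(cb_scores, cf_scores, alpha=0.5):
    """Blend CB and CF scores; alpha is the weight given to the CB component."""
    cb, cf = min_max(cb_scores), min_max(cf_scores)
    items = set(cb) | set(cf)
    # An item missing from one component contributes 0 for that component.
    combined = {i: alpha * cb.get(i, 0.0) + (1 - alpha) * cf.get(i, 0.0) for i in items}
    # Sort by descending score, breaking ties by item id for determinism.
    return sorted(combined, key=lambda i: (-combined[i], i))
```

Rescaling before blending matters because CB similarities and CF predictions usually live on different scales; without it, one component would dominate regardless of `alpha`.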

Survey Strategy
In this section we explain the selection procedure of relevant research papers. As opposed to narrative and traditional literature reviews, in this paper we used a systematic literature review (SLR) [14] procedure to systematically and comprehensively review relevant studies. An SLR should provide precise and understandable protocols to reduce the bias and systematic faults and, therefore, to guarantee the repeatability and trustworthiness of the review conclusions. Hence, SLR is more comprehensive, has lower risk of bias, has more formal and systematic protocols, but is relatively slower compared to narrative and traditional literature reviews. Recently, several studies used this type of review in the field of recommender systems [15][16][17].
To find the papers related to real estate RSs, we checked four main bibliography databases: ACM Digital Library, IEEE Xplore, Scopus, and Web of Science. We also used Google Scholar, which is one of the most popular scholarly search engines, to verify that we had covered all the relevant papers. We queried these sources to retrieve papers related to recommendation systems in real estate with the following query in February 2021: (("recommender system" OR "recommendation system") AND ("real estate" OR "housing")). The initial list contained 150 papers. For each paper, we reviewed the abstract based on the exclusion criteria described in Table 1 and identified 27 papers that passed the filtering. After reading the full text of these papers, we further excluded seven papers based on the exclusion criteria. Additionally, we reviewed the related work sections of the remaining 20 papers and added six more papers to our corpus. As a result, we ended up with 26 papers, consisting of 13 conference papers and 13 journal papers. The summary tables of these selected papers are reported in Appendix A.

Table 1. Exclusion criteria.
EC-1: The paper is not written in English.
EC-2: The paper is not a full scientific paper.
EC-3: The paper is not about recommender systems.
EC-4: The paper is not about the real estate market.
EC-5: The paper or its extension has already been selected.
The distribution of the publication year of selected papers is depicted in Figure 1. The first paper related to real estate recommendation emerged in 1996 [18], indicating that the need for an online tool to recommend real estate items to users is not something new. However, the number of papers grows substantially in the course of time. As can be seen in Figure 1

Methodological Approaches
There are multiple types of recommendation systems, and each has specific characteristics and advantages. We categorized the selected papers into six general methodological approaches, namely collaborative filtering (CF), content-based filtering (CB), knowledge-based RS (KB), reinforcement learning (RL), multi criteria decision making (MCDM), and hybrid approach (HB). We define one last broad category, denoted as other approaches, to encapsulate any remaining methods that do not fit in these six categories. This categorization is summarized in Table 2.

Collaborative Filtering
CF is widely used as an effective recommendation approach in various applications. It is also the most common RS in the real estate context (see Table 2). Seven papers in our corpus used CF RSs.

Model-Based Collaborative Filtering
Five papers among the selected ones used model-based CF to provide recommendation lists in a real estate context. Yu et al. [22] proposed two geographical proximity boosted real estate CF models, assuming that users' preferences are highly related to properties' geographical proximity. In this regard, they added two geographical-based regularization terms to the weighted regularized matrix factorization (WRMF) [43] and showed that the proposed geographical proximity boosted approaches perform better compared to the regular WRMF, PMF [44], SVD++ [45], UKNN, and IKNN.
Jun et al. [23] proposed the SeoulHouse2Vec RS which is an embedding-based housing RS using a neural network collaborative model. In this model, user ids and property ids are fed into a fully connected neural network to predict the user-item ratings obtained through a survey. In the trained model, the users and items are mapped to low-dimensional vectors (embeddings) which can be used to provide recommendations.
Milkovich et al. [19] used a simple deep neural network architecture that obtains the user-item interactions as input to learn the embeddings. They compared Adam and stochastic gradient descent (SGD) optimizers and L1, L2, and ElasticNet regularizers to avoid overfitting. They reported that the model with SGD and the L1 regularizer has the best performance.
Rehman et al. [20] cast the real estate recommendation problem as a session-based recommendation task where the RS should predict the next item of a session given the previous items in the session. They specifically proposed a two-step recommendation task. In the first step, they used the gated orthogonal recurrent unit (GORU) [46] with the Top1 loss function as a session-based recommender to generate an initial recommendation list that contains the most probable next items given the current items in the session. Then, the final ranking is formed based on the weighted cosine similarity of the last item in the session and the candidate items in the initial list. They showed that the proposed method performs better compared to GRU4REC [47,48], BPR, and KNN.
Knoll et al. [21] incorporated the item side-information in NeuMF and factorization machines (FM) [49] and evaluated them in two different scenarios: normal recommendation tasks and item cold-start recommendation tasks. They showed that NeuMF with side information performs better compared to FM in both scenarios.

Memory-Based Collaborative Filtering
There are two papers in our selected list of publications that used memory-based CF to provide housing recommendations. Wang et al. [24] changed the Pearson similarity measure in UKNN in order to better reflect the similarity between users with similar preferences. In this regard, they replaced the average rating over all users in the Pearson similarity measure with the average rating by users with similar preferences. While they stated that the proposed method is more accurate and more effective than UKNN, they did not compare the performance of the proposed approach with UKNN or any other baselines.
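For reference, a Pearson-style user similarity over co-rated items can be sketched as follows. The exact modification used in [24] is not reproduced here. Instead, the centering values are passed in as parameters (`center_u`, `center_v`, names chosen for this example), so that the standard choice, each user's own mean rating, or an alternative average can be substituted.

```python
import math

def pearson(ratings_u, ratings_v, center_u, center_v):
    """Pearson-style similarity between two users over their co-rated items.

    ratings_u, ratings_v: dicts mapping item id -> rating.
    center_u, center_v: values subtracted from each rating before comparing;
    the standard measure uses each user's own mean rating.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0  # no overlap, so no evidence of similarity
    du = [ratings_u[i] - center_u for i in common]
    dv = [ratings_v[i] - center_v for i in common]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0
```

In UKNN, the users with the highest similarity to the target user form the neighborhood whose ratings are aggregated into predictions.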
The authors of [25] proposed using a modified cosine similarity measure to recalculate the scores between users and property attributes such as area, price, position, pattern, and traffic and then used UKNN to find users with the same preferences. However, they did not provide a performance evaluation of their approach.

Content-Based Filtering
There are four papers in our list of selected papers which used CB RSs to provide housing recommendations. Kabir et al. [26] adapted the neural tensor network (NTN) [50] to calculate similarity scores between recommendable properties and the items that the user has seen so far and then provide a ranking of these recommendable properties for that user. In their model, the property features are converted to a word2vec representation and then fed into the NTN. They also captured the user context through a chat box, and therefore the final recommendations are based on both inferred preferences from the NTN and users' contexts from the chat box.
Zhang et al. [27] proposed a two-stage CB model for housing recommendations. In the first stage they calculate the similarity scores between the target user and items using a cosine similarity measure where users and items are represented in the same feature space. They stated that users have different levels of behavior such as clicking, checking the detailed view, bookmarking, and inquiring, and they argued that these different behaviors should receive different weights in the user profile. In the second stage, they used XGBOOST [51] to output probabilities that a user likes the items in the preliminary recommendation list generated in the first stage. Next, they ranked the items based on their relevance scores.
Badriyah et al. [28] applied the TF-IDF (term frequency inverse document frequency) method based on the words in the title, description, address, and ad description of the properties that the user has visited. Their CB model estimates whether the user is interested in an item based on the formed user profile (visited property ads) and item features. Then, the Apriori algorithm is used to find the frequent item sets to provide the final recommendations. Li et al. [29] used a simple cosine-similarity-based CB RS to provide a ranked list of properties for a user session. For a new user, who has no historical records, they proposed using the average vector of all users as the profile of the new user.
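The TF-IDF weighting mentioned above can be sketched in a few lines. This is a generic textbook variant (relative term frequency multiplied by the logarithm of the inverse document frequency), not necessarily the exact weighting used in [28], and the Apriori step is not covered. The function names and the pre-tokenized input format are assumptions for this example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    num = sum(w * b.get(t, 0.0) for t, w in a.items())
    den = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return num / den if den else 0.0
```

With this weighting, a term that appears in every ad receives weight zero, so similarity between ads is driven by the terms that distinguish them.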

Knowledge-Based
A knowledge-based (KB) RS infers users' preferences based on the knowledge it has on how a particular item meets a particular user's needs. In this type of RS, the knowledge about users and items should be represented to be used by the RS [52]. RentMe [18] is one of the earliest studies in the field of housing recommendations. It conveys three types of knowledge in order to provide recommendations: quality of neighborhoods, relative location of neighborhoods, and features of apartments along with their relative quality. In RentMe, a user first starts by selecting some criteria to limit the search space and then traverses the remaining search space to end up with a predefined number of apartments. Alrawhani et al. [30] used case-based reasoning in a housing recommendation task. Upon a user query for a desired property, the RS checks the previous cases in the database and retrieves the relevant solution.
Yuan et al. [31] used a method called methontology [53] to represent semantic relationships between nodes in an ontology, based on the knowledge gained from user study and real estate experts. Then, the case-based reasoning approach was used to find the best solution or case based on the problem definition (user query) in the system. The problem definition is based on the user's search criteria such as price and location.
The authors of [32] formed a domain ontology which contained the semantic relations between different elements such as criteria, objectives, attributes, alternatives, weights, and geographical units. Then, they used the analytic hierarchy process (AHP) to select the best housing choices for users based on this domain ontology. In the proposed model, the user selects the geographical area and the criteria weights to obtain recommendations from the website.

Multi Criteria Decision Making
Multi criteria decision making (MCDM) is a group of methods that aim to find the best solution when there are multiple objective criteria which are mostly conflicting. For instance, in a housing selection context, a user would like a bigger house at a lower price. MCDM methods can be categorized into multi objective decision making (MODM) and multi attribute decision making (MADM) [54]. MODM methods cast the problem as an optimization task with multiple objectives and constraints, aiming at finding the optimal solution. MADM methods select the best alternative among a limited number of pre-specified alternatives with respect to multiple criteria.

Multi Objective Decision Making
Two studies among the selected papers used MODM approaches to provide recommendations in the real estate domain. Daly et al. [33] argued that the travel time between the candidate property and some fixed locations (e.g., workplace or school) is an important decision factor. They used Dijkstra's algorithm to calculate this travel time. Then, a set of Pareto optimal solutions can be found with two main objectives: minimizing travel time to three specified locations and minimizing price. They showed that the proposed method is effective in reducing the traveling time and rent.
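The two ingredients described above, shortest travel times and a Pareto filter over travel time and price, can be sketched as follows. This is a simplified illustration and not the implementation of [33]: it uses a single point of interest rather than three, a hypothetical directed graph format with travel times in minutes, and function names chosen for this example.

```python
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, travel_minutes), ...]}. Returns shortest times."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def pareto_front(candidates):
    """candidates: {name: (travel_time, price)}; both objectives are minimized."""
    front = []
    for a, (ta, pa) in candidates.items():
        # a is dominated if some other candidate is no worse on both
        # objectives and strictly better on at least one.
        dominated = any(
            tb <= ta and pb <= pa and (tb < ta or pb < pa)
            for b, (tb, pb) in candidates.items() if b != a
        )
        if not dominated:
            front.append(a)
    return sorted(front)
```

The Pareto front contains every property for which no other property is both cheaper and closer, which leaves the final trade-off between the two objectives to the user.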
Ho et al. [35] stated that a buyer of a real estate item has mainly two goals related to the future value of the property, namely the maximization of the expected gain and the minimization of the expected loss. Users have different tolerances toward risk, and this should be considered in the recommendations. They also argued that the criteria in the housing selection are fuzzy (e.g., nice neighborhood). Therefore, they proposed a fuzzy goal programming approach with s-shape risk aversion to address the mentioned considerations.

Multi Attribute Decision Making
There are three papers in our corpus [32,34,35] that used analytic hierarchy process (AHP) to specify the weights of each criterion/alternative for users. AHP, which is an MADM method, represents the problem as a hierarchy containing criteria and alternatives. User preferences are assigned to the nodes in the hierarchy reflecting the relative importance of the nodes in the defined problem. Das et al. [36] used the PROMETHEE II [55] method to rank the properties based on four main decision criteria, namely location, price, size, and property type. PROMETHEE is an MADM method that uses the importance of different criteria to rank several alternatives [56].
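To make the PROMETHEE II ranking step concrete, the sketch below computes net outranking flows with the simplest ("usual") preference function, under which an alternative is fully preferred on a criterion as soon as it is strictly better. The preference functions and weights actually used in [36] are not given here, so these choices, along with the function name and the input format, are assumptions for illustration.

```python
def promethee_ii(alternatives, weights, maximize):
    """Rank alternatives by PROMETHEE II net outranking flow.

    alternatives: {name: [criterion values]} with at least two entries
    weights: criterion weights summing to 1
    maximize: per-criterion flags; False means lower values are preferred
    """
    names = list(alternatives)
    n = len(names)

    def pi(a, b):
        # Aggregated preference of a over b using the "usual" preference
        # function: weight counts fully when a is strictly better.
        total = 0.0
        for w, x, y, mx in zip(weights, alternatives[a], alternatives[b], maximize):
            diff = (x - y) if mx else (y - x)
            total += w * (1.0 if diff > 0 else 0.0)
        return total

    net = {}
    for a in names:
        plus = sum(pi(a, b) for b in names if b != a) / (n - 1)   # leaving flow
        minus = sum(pi(b, a) for b in names if b != a) / (n - 1)  # entering flow
        net[a] = plus - minus
    return sorted(names, key=lambda a: -net[a]), net
```

For example, with price (to be minimized, weight 0.6) and size (to be maximized, weight 0.4), a property that is cheaper than most alternatives but slightly smaller can still obtain the highest net flow.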

Reinforcement Learning
Reinforcement learning (RL) is a subdomain of artificial intelligence in which agents with no or limited knowledge learn incrementally by taking actions in a dynamic context so as to maximize cumulative reward [57]. Two studies in our list of selected papers used RL to generate housing recommendations. In Apt Decision [38], each user is linked to an agent, which learns from interactions with not just the real estate items but also the related features. More specifically, in this application, users rank six positive and negative features and compare pairs of properties to reflect their preferences.
In [37], the authors proposed a two-phase method. In the first phase, similar to [38], users specify the desired and undesired features also interacting with popularity-based recommendations. In this phase, the agents gain initial knowledge and are therefore confronted with smaller search spaces. Then, in the second phase, agents learn from user reactions through an interactive learning approach.

Hybrid Approach
There are two studies among the selected papers that applied hybrid RSs in the real estate domain. Tas et al. [39] proposed a hybrid model consisting of a CB RS and a CF RS to generate recommendation lists. They compared IKNN and WRMF for the CF component and showed that IKNN provides more accurate recommendations compared to WRMF. Ojokoh et al. [40] hybridized an IKNN and a fuzzy-rule-based RS to rank the properties on a real estate website in Nigeria. They fuzzified the input variables to address vagueness in property features and user preferences.

Other Approaches
There are some other approaches that do not fit into the provided categorization. Chonwiharnphan et al. [42] proposed a method to generate realistic logs of users for a new real estate item. They used a neural-network-based model to learn item embeddings. This model consists of an autoencoder with six layers to map items to embeddings of size 64 and a GRU (gated recurrent unit)-based predictor that predicts the next item given the embeddings of other items in the user profile. In this way, the generated embeddings are based on both item features and user preferences. Then, they used a conditional generative adversarial network (GAN) based on the learned item embeddings to generate user logs. In the proposed GAN-based model, they applied the straight-through Gumbel estimator [58] to avoid using an additional classifier in order to classify the predicted embeddings.
Li et al. [41] used XGBOOST to calculate relevance scores for recommendable properties given the user current search query, user previous interactions, and item features. To prevent XGBOOST from favoring numerical features in particular, they cascaded the proposed model in a way to balance the focus between categorical features and numerical ones. They showed that their proposed approach performs better compared to a CB RS.
In [34], a probabilistic relational model (PRM) with existence uncertainty is used to provide recommendations. PRM is an extension of Bayesian networks for relational databases that models the uncertainty between the existing attributes of objects and the relations between objects in the database. The proposed PRM-based RS predicts the probability that a particular user would like a candidate property given the search query of the user. They showed that their proposed model outperforms the content-based version of the same model, i.e., the model that only considers the immediate parents of the leaf nodes in the network.

Challenges
There are some profound challenges in housing recommendation tasks. These challenges are rooted in the unique characteristics of the items, users, and decision-making procedure in this domain. After thoroughly investigating the content of the reviewed papers in this study, we outlined the following main challenges in housing recommendations: the cold-start problem, domain-specific item features, complex buying behavior, conflicting criteria, and data sparsity. This categorization is summarized in Table 3, and the specific challenges are thoroughly discussed in the following sections.

Cold-Start Problem
The cold-start problem in RSs refers to a situation where a new user or item enters the system and the RS is unable to generate recommendations for this new entity as there are no or very few interactions in the system for this new entity. In housing recommendation tasks, two types of cold-start problems exist: new users who have had a very limited number of interactions with the real estate platform, and new items that just appeared and should be recommended by the RS to the relevant users. There are five papers in our corpus that considered the cold-start problem in their studies.

Cold-Start Problem for New Items
Knoll et al. [21] addressed the cold-start problem for new items using two methods: FM and NeuMF. Both of these methods have the capability of capturing user and item side information and therefore are able to relate the new items to existing ones and recommend them even if these items have no or a very limited number of interactions.
Zhang et al. [27] used CB to address the cold-start problem for new items. Users and items are represented in the same feature space based on item metadata, and therefore the proposed CB method can generate recommendations for new items even if there is no interaction for these items in users' history logs.
Chonwiharnphan et al. [42] proposed a method to generate realistic user logs for new items. They used a GAN-based approach to predict the order of possible embeddings and corresponding items that these embeddings represent. Then, based on these generated logs for new items, regular RSs can be applied to provide recommendations.

Cold-Start Problem for New Users
In [34], the authors argued that the cold-start problem for new users corresponds to a state of the proposed PRM-based model where the length of the slot chain is one. They showed that this state of the model can be considered as a pure CB model. The recommendations in this state are only based on the domain expert's knowledge and user search criteria.
Rehman et al. [20] stated that normally, users' long histories are not available in real estate websites, and therefore the typical CF methods are not applicable. To address this issue, instead of using a user-based approach, they proposed a session-based recommendation approach that predicts the next items for an active session even with a very short history of click events.

Domain-Specific Item Features
Real estate items have some domain-specific characteristics and features, and therefore generic recommendations may not directly apply to these types of items. The main real estate item features that are mentioned in the selected papers are summarized in Table 4. As shown in this table, price, number of rooms, property type, living area, and geographical-based information are the most common features in housing recommendations.
Among the outlined features in Table 4, the ones that are related to geographical attributes such as city, location, neighborhood attributes, and proximity to places of interest (POIs) are more specific to the real estate context. In [22], it was stated that users are usually interested in properties with larger geographical proximity. Therefore, they adapted the WRMF model with two geographical regularization terms to exploit the importance of geographical information. Daly et al. [33] argued that the housing decision making depends on the geographical distances between the candidate property and some other locations such as workplace and school, which are considered POIs. The aim of their proposed model is to find candidate items in the desired price range with minimum travel time between the candidate item's location and the selected POIs. In [31], the authors stated that the location is the most important feature that the user should specify, and the RS should be able to incorporate this selection in order to find more relevant items near the specified location.
Furthermore, some papers claimed that not all item features have the same effects on housing decision making. In Apt Decision [38], the users need to specify in their profiles what they consider as pivotal or inessential features. This helps the RS to focus more on pivotal features and relax the filtering for inessential features. Similarly, in RentMe [18], users should first start with some more pivotal features to limit the search space and then traverse the remaining search space with the more specific inessential features to obtain a final recommendation list. Burke et al. [18] supported the idea that the unweighted similarity over all item features is a poor measure, as users consider different weights for features based on their preferences.

Complex Buying Behavior
Making decisions for some products such as cars and houses is complex as they are relatively expensive and people usually purchase/rent them infrequently. Therefore, RSs for these types of products should consider the complexity of decision making in this domain. Ojokoh et al. [40] argued that fuzzy logic can be very helpful in complex decision making situations to address uncertainty, impreciseness, and ambiguity in features and user preferences.
In [37], the authors discussed how housing selection is a complex decision-making procedure. As opposed to some other domains where the agent can start with zero knowledge and learn incrementally, in real estate agent-based RSs, an agent will fail to converge when the initial human-supplied knowledge is missing. They also argued that it is not feasible to transfer agents from other applications to the real estate domain without adapting the agents' knowledge with the help of real estate experts. These experts should provide initial knowledge in the form of policies that help the agents in agent-based RSs to find interesting trajectories in the huge search space.
Li et al. [41] stated that transactions in a real estate context are infrequent due to the longevity of housing decision making. To address this issue, they proposed using the whole user history (collaborative information) instead of only using search queries. They argued that this approach would be applicable in other complex decision-making procedures such as car recommendations and recruiting.
Burke et al. [18] looked at the housing decision-making procedure from another perspective. They discussed how a typical user of a real estate website would ask for a property similar to an existing one but with some small differences. For example, a user would say "I would like an apartment similar to apartment A, but a little bit cheaper or a little bit larger". Their platform offered a tweak option, which lets the user select an apartment as the search basis and then retrieve more suitable cases by tweaking it, i.e., by changing some criteria such as the number of bedrooms or the neighborhood environment.

Conflicting Criteria
Users, who are the main decision makers in the housing selection procedure, have multiple criteria which are often conflicting. For instance, a typical user would like to find a house which is not expensive but at the same time spacious or in a nice neighborhood [36]. Ho et al. [35] proposed a hierarchy of criteria where four main ones are housing value, structure attributes, neighborhood attributes, and location attributes. They stated that some of the criteria in this hierarchy are conflicting, and therefore they used AHP to personalize the weights of criteria based on individuals' preferences. Similarly, Malczewski and Jelokhani-Niaraki [32] used AHP on a hierarchy of criteria based on a predefined ontology to personalize the scores of real estate alternatives based on the different weights that users gave to the different (conflicting) criteria.
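As an illustration of how AHP can turn a user's pairwise judgments into criterion weights, the sketch below uses the row geometric-mean method, a common approximation of the principal-eigenvector calculation. The source does not state which variant [35] or [32] used, so this choice, the judgment matrix, and the per-criterion scores of the two alternatives are all assumptions.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def score(alternative, weights):
    """Weighted sum of an alternative's normalized per-criterion scores."""
    return sum(w * s for w, s in zip(weights, alternative))


# Criteria order: housing value, structure, neighborhood, location.
# pairwise[i][j] states how much more important criterion i is than criterion j
# for one hypothetical, price-sensitive user (reciprocal matrix).
pairwise = [
    [1, 2, 4, 4],
    [1 / 2, 1, 2, 2],
    [1 / 4, 1 / 2, 1, 1],
    [1 / 4, 1 / 2, 1, 1],
]
weights = ahp_weights(pairwise)

# Hypothetical normalized scores in [0, 1]; a higher value is better for the user.
cheap_small = [0.9, 0.4, 0.5, 0.5]
pricey_large = [0.3, 0.9, 0.8, 0.8]

print(weights)
print(score(cheap_small, weights), score(pricey_large, weights))
```

Because this user weights housing value most heavily, the cheaper alternative scores higher even though the other one is better on the three remaining criteria; a different judgment matrix would resolve the conflict differently.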

Data Sparsity
In the context of real estate recommendations, the interaction matrix between users and items is usually highly sparse, i.e., each user interacts with only a small number of properties. Tonara and Widyawono [59] stated that the number of user interactions is very limited compared to the number of properties on real estate websites. They proposed replacing the user-item matrix with a user-criteria matrix to address this sparsity issue, as the number of criteria is much smaller than the number of items, and the criteria set is quite static while the item set changes rapidly. Moreover, Oh and Tan [37] stated that the housing search space is highly sparse. They addressed the sparsity problem by providing initial knowledge to the agents to limit the search space and improve the RS performance.

Evaluation and Benchmarking
In this section we assess the selected papers with respect to their evaluation settings. We reviewed the type of datasets that were used, their evaluation strategies, the corresponding performance measures, and the baselines that were employed in the benchmarking.

Datasets
No paper in our corpus publicly shared its datasets or used a publicly available real estate dataset. We describe the datasets used in Table 5. This table reports the size of each dataset (number of users and items) and the type of feedback that users provide. Some papers do not report details about their datasets and are therefore excluded from Table 5. As reflected in this table, there are two types of feedback from users: explicit and implicit. The types of explicit feedback used in our corpus are rating, liking/disliking, bookmarking, inquiring for more information, asking to visit, and filling out a questionnaire. Users usually show their implicit preferences by clicking on property links or by checking the more detailed information of a property.

Evaluation Strategy
Three types of strategies were used in the selected papers for evaluation purposes: offline, online, and user survey. In an offline evaluation strategy, the available historical logs of user interactions with properties are used to assess the capability of the trained model in predicting the hidden (test) interactions. The offline evaluation strategy is the only way of benchmarking when there is no access to online users. On the other hand, the aim of an online evaluation strategy is to assess the model performance on data points that do not exist in a historical dataset. In this strategy, the proposed model can be evaluated by recommending items to real online users and checking their reactions. The performance of the RS can also be assessed by asking users to fill out surveys. The main advantage of conducting user surveys is that users can explicitly give their feedback on the provided service. Of the papers that reported their evaluation strategies, 15 used offline evaluation, 3 used online evaluation, and 3 employed a user survey. Therefore, the offline evaluation approach is the dominant way of evaluating RSs in the real estate context, which is in line with the insights of other studies [60,61].
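A minimal sketch of how hidden (test) interactions can be produced for offline evaluation is given below. It uses a leave-last-out split, which is one common protocol and not necessarily the one used by the reviewed papers; the function name and the sample logs are hypothetical.

```python
def leave_last_out(logs):
    """Split per-user chronological logs into training items and one hidden test item."""
    train, test = {}, {}
    for user, items in logs.items():
        if len(items) < 2:
            # Too few interactions to hide one; keep everything for training.
            train[user] = list(items)
            continue
        train[user] = list(items[:-1])
        test[user] = items[-1]
    return train, test


# Hypothetical interaction logs, ordered from oldest to newest per user.
logs = {"u1": ["p1", "p2", "p3"], "u2": ["p4"]}
train, test = leave_last_out(logs)
print(train)
print(test)
```

The model is then trained on `train`, and its recommendations are compared against the hidden items in `test`.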

Evaluation Measures and Baselines
There are several types of evaluation measures to assess the performance of RSs. While the relevance measures are frequently used for evaluation, there are other measures such as coverage, diversity, and serendipity which are also considered as additional performance measures [62] for RSs. Relevance measures such as precision, recall, and NDCG evaluate the RS in predicting the items that the user will interact with. Coverage measures how well the item catalog [63] or stakeholders [64] are covered in recommendation lists. Diversity measures can be applied, for instance, in news recommendations [65,66] or music recommendations [67] to measure to what extent the recommendation lists contain diverse content. Serendipity metrics measure the novelty and unexpectedness of recommendation lists generated by RSs [68].
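Two of the beyond-relevance measures can be sketched in a few lines. The definitions below (catalog coverage as the share of catalog items that are ever recommended, and intra-list diversity as the mean pairwise Jaccard distance between item feature sets) are common choices, but the cited works may define these measures differently; the item identifiers and features are made up.

```python
from itertools import combinations


def catalog_coverage(rec_lists, catalog):
    """Fraction of catalog items that appear in at least one recommendation list."""
    recommended = set().union(*rec_lists)
    return len(recommended & set(catalog)) / len(catalog)


def intra_list_diversity(rec_list, features):
    """Mean pairwise Jaccard distance between the feature sets of the listed items."""
    pairs = list(combinations(rec_list, 2))
    if not pairs:
        return 0.0

    def jaccard_distance(a, b):
        union = features[a] | features[b]
        return 1.0 - len(features[a] & features[b]) / len(union) if union else 0.0

    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical catalog, recommendation lists for two users, and item features.
catalog = ["p1", "p2", "p3", "p4"]
rec_lists = [["p1", "p2"], ["p2", "p3"]]
features = {
    "p1": {"apartment", "center"},
    "p2": {"apartment", "suburb"},
    "p3": {"house", "suburb"},
    "p4": {"house", "center"},
}

print(catalog_coverage(rec_lists, catalog))
print(intra_list_diversity(rec_lists[0], features))
```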
There are different types of evaluation measures in the selected papers to reflect the performance of the proposed housing RS. The distribution of these measures is summarized in Figure 2. As is shown in this figure, precision and recall, which are classic information retrieval metrics, are the most popular measures in evaluating housing RSs. The first five measures in the figure (AUC, accuracy, precision, recall, and F1) evaluate the ability of the RS in predicting the relevant items for users. MRR (mean reciprocal rank), MAP (mean average precision), and NDCG (normalized discounted cumulative gain) are rank-sensitive measures, which means that they evaluate the ability of the proposed method in recommending relevant items in higher ranks. All the aforementioned measures are applicable in offline evaluation approaches. Moreover, conversion rate is an online evaluation measure which shows whether users find their desired items and therefore are satisfied with the recommendations. When it comes to baseline methods, the set of used baselines in the selected papers is quite diverse. In most of the papers, multiple versions of their proposed approaches are used as the competing methods in their benchmarking. The most frequently used baselines are KNN-based approaches (item-based or user-based) and popularity-based recommendation.
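For reference, the relevance and rank-sensitive measures mentioned above can be computed as in the following sketch. It assumes binary relevance (an item is either relevant to the user or not) and a single user; MRR is the mean of the reciprocal rank over all users. The function names and the example ranking are hypothetical.

```python
from math import log2


def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Share of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """Inverse of the rank of the first relevant item (0 if none is found)."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the best achievable gain."""
    dcg = sum(1.0 / log2(rank + 1)
              for rank, item in enumerate(ranked[:k], start=1) if item in relevant)
    ideal = sum(1.0 / log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


# Hypothetical ranking for one user and the hidden items that user interacted with.
ranked = ["p3", "p1", "p7", "p2"]
relevant = {"p1", "p2"}

print(precision_at_k(ranked, relevant, 3), recall_at_k(ranked, relevant, 3))
print(reciprocal_rank(ranked, relevant), ndcg_at_k(ranked, relevant, 3))
```

Precision and recall ignore where a relevant item appears within the top k, whereas the reciprocal rank and NDCG reward placing relevant items closer to the top of the list.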

Possible Research Directions
In the previous sections, we assessed the selected studies in housing recommendation from various perspectives. As discussed, the selected papers addressed various challenges in housing recommendation tasks using different methodological and evaluation approaches. Nevertheless, there is still much room for better capturing the user preferences and subsequently improving the recommendation performance. In addition, as was thoroughly discussed in the previous sections, there are many bottlenecks in the domain of real estate recommendation that make the corresponding task particularly challenging. To the best of our knowledge, there has not been any proposed method addressing every challenge in the field, and all of the existing approaches have limitations. To this end, we believe that there is still much space for improvement in the field. In our effort to stimulate further research in this domain, we suggest the following research directions.

Cold-start problem for new items
As discussed in Section 5.1, the cold-start problem is one of the most important challenges in real estate RSs. These RSs should be able to recommend new items that just entered the system to the relevant users even if there is no historical log for these items. Items in real estate context usually have very rich metadata that can be used to address the item cold-start problem. Research efforts are still required to better investigate the incorporation of this rich metadata into CF-based RSs to serve new items. This can be achieved, for instance, by using a two-step model [69,70] or joint-optimization [71]. Another direction to address the cold-start problem is assessing the capability of hybrid models by combining models that can serve already known items (e.g., CF) with models that can handle new items (e.g., CB).
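The hybrid direction can be illustrated with a simple switching scheme: a CF score is used when the item already has interactions, and a content-based score computed from item metadata is used otherwise. This is a minimal sketch under assumed data structures, not the two-step or joint-optimization models of [69,70,71]; all names and values are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_score(user, item, cf_scores, interaction_counts, user_profiles, item_metadata):
    """Switching hybrid: CF score for known items, content-based score for new ones."""
    if interaction_counts.get(item, 0) > 0 and (user, item) in cf_scores:
        return cf_scores[(user, item)]
    # Cold-start item: fall back to matching metadata against the user's profile.
    return jaccard(user_profiles[user], item_metadata[item])


# Hypothetical inputs: precomputed CF scores, interaction counts, and metadata.
cf_scores = {("u1", "old_flat"): 0.8}
interaction_counts = {"old_flat": 12}  # "new_flat" has no interactions yet
user_profiles = {"u1": {"2-bed", "balcony", "center"}}
item_metadata = {
    "old_flat": {"2-bed", "center"},
    "new_flat": {"2-bed", "balcony", "suburb"},
}

for item in ("old_flat", "new_flat"):
    print(item, hybrid_score("u1", item, cf_scores, interaction_counts, user_profiles, item_metadata))
```

In practice, the two score types are generally not on the same scale, so a real system would need to calibrate or normalize them before ranking known and new items together.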

Feature extraction
Real estate items usually have unstructured metadata such as description, layout, images, and geospatial data that convey useful information and therefore should be used in RSs. Descriptions usually contain information that does not fit in structured/relational features. For instance, descriptions may reflect fuzzy features such as neighborhood environment and convenience. Images can represent visual features such as style of the property (e.g., modern or classic), lightness, and building exterior design. Geospatial data reflect the relative proximity of the property to other desired locations. These unstructured data can be processed in a separate model (e.g., a text classifier) and then incorporated into the main model, or one can consider an aggregated model that extracts features from these data and uses them in the recommendations. Additional features can also be extracted from the user-platform interactions, as there are different types of such activities (e.g., bookmarking, requesting a visit, or just clicking). These features could weigh or describe the user-property interactions.

Benchmarking
The literature on real estate RSs lacks a comprehensive benchmarking study that evaluates various recommendation models from different perspectives on public datasets. Most of the papers in our corpus compared their models with either a simpler version of the same model or with naive baselines such as the popularity-based RS. Moreover, the proposed approaches have been mainly evaluated with accuracy-based measures. Further assessment should be carried out in order to evaluate real estate RSs with respect to beyond-accuracy measures such as diversity, coverage, and serendipity. Furthermore, providing a publicly available real estate dataset is essential to make the research efforts comparable and reproducible. First, such datasets could provide the RS community with valuable means for developing and testing new models. Second, they can be used in analytical comparison studies, assisting scientists with drawing conclusions regarding the applicability and performance of newly proposed recommendation models.

Multi-stakeholder recommendation
In real estate RSs, renters/buyers are not the only stakeholders in the system. Sellers/landlords and real estate brokers, who are the representatives of the sellers or landlords, also have a stake in recommendations on real estate websites. Therefore, multi-stakeholder RSs should be used in order to consider the preferences of all the stakeholders in generating recommendation lists. In this type of RS, maximizing the predicted accuracy should not be the only objective of recommendations. There are huge real estate agencies that advertise many properties and small agencies that have a limited number of properties in the system. A typical RS that is optimized only for predicted accuracy may be biased toward these huge agencies and favor them disproportionately and unfairly in the recommendations. In this situation, the multi-stakeholder RS should provide fair and calibrated recommendation lists that address this unfair bias [72].

Addressing data sparsity
As discussed in Section 5, usually the interaction matrix between users and real estate items is highly sparse, which can negatively affect the performance of RSs. This issue arises as the users are usually interested in a very small fraction of active real estate items (properties that have not been rented/purchased yet) in the catalog, and in the end, each item can only be rented/purchased by one user. The features and previous interactions of inactive items, i.e., properties that have been rented/purchased and are not available anymore, can still be used to generate better recommendations. This dynamic character of real estate RS, which also exists in recommending second-hand items and job RSs, is not discussed in the reviewed studies and would be an interesting topic for future research. To address this issue, one can densify the interaction matrix using artificial or real user logs. Artificial logs can be generated by applying generative models such as VAE (variational auto encoder) or GAN (generative adversarial network) to fill the interaction matrix. Another solution to address data sparsity is to fill the interaction matrix with real user logs by querying users' feedback on some specific items that are more informative to the model.

Scalability
An important issue in real estate recommendations that is rarely discussed in the selected papers is scalability. A housing RS should be scalable, as the numbers of items and users are usually high, and the RS should provide relevant recommendations in a timely manner. A very accurate model would be useless in the real estate domain if it is not able to serve many users and items. Further research efforts should be carried out in order to address scalability issues both in model design and implementation.

Conclusions
In this paper, we present a survey of RSs in the context of the real estate market. Our corpus contains 26 papers (13 journal papers and 13 conference papers). We reviewed these papers based on the methodological approaches that they proposed, the challenges that they addressed, and the evaluation procedures they employed. For the methodological approaches, we outlined six main categories of RSs, namely collaborative filtering, content-based filtering, knowledge-based filtering, multi-criteria decision making, hybrid approach, and reinforcement learning. Similar to many other domains, collaborative filtering is the most common methodological approach in real estate recommender systems. We also outlined five main challenges in real estate recommendation: cold-start problem for new items and new users, incorporating specific item features, handling the complex buying behavior, decision making based on conflicting criteria, and sparsity. Then, we assessed the evaluation procedures of the papers in the corpus with respect to the used dataset, evaluation strategy, performance measures, and baselines. Finally, we suggested some research directions to advance the state of the art in real estate recommendation.

Table A2. Summary of the selected papers with their journal/conference names and number of citations.

| Ref. | Journal/Conference Name | # Citations |
|---|---|---|
| [18] | National conference on artificial intelligence | 238 |
| [38] | International conference on Intelligent user interfaces | 180 |
| [37] | AI Magazine | 3 |
| [32] | Geo-spatial Information Science | 10 |
| [31] | Information Systems | 92 |
| [24] | International Conference of Modern Computer Science and Applications | 5 |
| [59] | International Journal of the Computer, the Internet and Management | 2 |
| [33] | ACM Conference on Recommender systems | 12 |
| [34] | International Conference on Knowledge-Based and Intelligent Information & Engineering Systems | 33 |
| [35] | European Journal of Operational Research | 26 |
| [30] | Journal of Telecommunication, Electronic and Computer Engineering | 8 |
| [41] | Pacific-Asia Conference on Knowledge Discovery and Data Mining | 4 |
| [21] | International Conference on Innovations for Community Services | 3 |
| [22] | International Conference on Web Information Systems Engineering | 2 |
| [28] | International Conference on Information and Communications Technology | 9 |
| [40] | Information Management and Business Review | 2 |
| [27] | IEEE International Conference on Intelligent Systems and Knowledge Engineering | - |
| [25] | International Conference on Virtual Reality and Intelligent Systems | 1 |
| [39] | International Journal of Technology and Engineering Studies | - |
| [19] | IEEE International Conference on Big Data Computing Service and Applications | 1 |
| [20] | Mediterranean Conference on Pattern Recognition and Artificial Intelligence | - |
| [26] | Mediterranean Conference on Pattern Recognition and Artificial Intelligence | 1 |
| [23] | Sustainability | 6 |
| [42] | IEEE Access | 5 |
| [29] | International Conference on Intelligent and Interactive Systems and Applications | - |
| [36] | International Conference on Recent Trends in Machine Learning, IoT, Smart Cities and Applications | - |