A Map-Based Recommendation System and House Price Prediction Model for Real Estate

Simple Summary: The accessibility of spatial big data helps real estate investors make better judgement calls and earn additional profit. Since location is considered essential for real estate and the decisions that follow from it, digital maps have become a prime resource for real estate purchases, planning and development. Personalisation can support these judgments by identifying user requirements and inclinations: when a user interacts with a digital map, all of the user's activities can be recorded. A personalised real estate portal can use this information to suggest properties, assist homeowners and provide valuable real estate analytics. By monitoring user interactions through an online real estate portal, the framework provided in this article can make personalised recommendations of real estate based on content, collaboration and location. The effectiveness of the recommendations was tested through a user feedback mechanism using mean absolute precision, and the results show that 79% of the generated suggestions were precise. Out of 5 recommendations produced, users were interested in at least 3. A separate house price prediction model based on neural networks and a classical regression technique was also developed. This model was implemented to assist users in making an informed decision regarding prospects of real estate purchase.
Abstract: In 2015, global real estate was worth $217 trillion, which is approximately 2.7 times the global GDP; it also accounts for roughly 60% of all conventional global resources, making it one of the key factors behind any country's economic growth and stability. The accessibility of spatial big data will help real estate investors make better judgement calls and earn additional profit. Since location is deemed essential for real estate and the decisions that follow from it, digital maps have become a prime resource for real estate purchases, planning and development. Personalisation can assist in making judgments by identifying user desires and inclinations, which can be recorded or captured as a user interacts with a digital map. A personalised real estate portal can use this information to suggest properties, assist homeowners and provide valuable real estate analytics. This article presents a novel framework for recommending real estate to users. By monitoring user interactions through an online real estate portal, the framework can make personalised recommendations of real estate based on content, collaboration and location. The effectiveness of the recommendations was tested through a user feedback mechanism using mean absolute precision, and the results show that 79% of the generated suggestions were precise, i.e., out of 5 recommendations produced, users were interested in at least 3. In addition, a separate house price prediction model based on neural networks and classical regression techniques was implemented to assist users in making an informed decision regarding prospects of real estate purchase.


Introduction
Driven by advertising technologies and the goal of producing targeted ads, the personalisation and customisation of websites and services have become the new norm in our society. The need for personalisation has been driven by the increase in available data and information. Information overload, which makes it challenging to find relevant information, has been a phenomenon for the past two decades. For example, a study from 2003 estimated unique information creation at between 1 and 2 exabytes, which implied that each human being must be processing 250 megabytes of information. Almost 20 years later, this demonstrates the mounting need for efficient and accurate user recommendation systems to help find pertinent data and information. Personalised content delivery to any set of users may consist of multiple aspects.
A factor that plays a vital role in most personalised web interfaces is the interactivity and the "user-friendly" nature of the User Interface (UI). Every web user, be it a novice, or an expert, wants the interface to provide meaningful content delivered without having much prior expertise about its functionality. This process involves a lot of work from a web developer's perspective but should be invisible and seamless to the end-user. Therefore, various tools and techniques have been developed to implicitly collect data from users. Implicit data collection, in simpler terms, is just the collection of a user's data through "interface interactions" without the user having to provide the data in a specific manner. The data is then used to determine interests and make recommendations. At the same time, another aspect growing in popularity is having the location information of a user to make recommendations.
Such recommender systems are widely deployed in many consumer domains, such as online shopping; our research focuses on real estate recommendations. Real estate recommendation is often about the location of a property item, so we have incorporated online map interactions as a tool to understand a user's interests. This paper presents four principal approaches for effectively identifying property items in our real estate portal: (1) content-based filtering for suggesting real estate items; (2) collaborative filtering, which reduces the computational cost by suggesting similar items to a group of similar users; (3) a location-based approach for predicting the user's area of interest based on geographical location and user preferences; and (4) a price prediction model to assist users in making an informed decision. The first two approaches were selected because the features of a real estate database closely resemble those of a movie database. Both content-based filtering and collaborative filtering have proven to provide precise recommendations to users [1]. Introducing a location-based approach is essential since property items have an inherent location aspect.
We have used data from the Estatech Maps portal: https://www.the-estatech.com (accessed on 15 September 2021) for the recommendation part of the study. We obtained these data in both explicit and implicit formats. In addition, historical data of properties and price listings were obtained from Zameen.com (accessed on 15 September 2021), a real estate portal for online property listings. The techniques and methods used for the recommendation algorithms were the score tree process, TF-IDF and K-nearest neighbours. For house price prediction, we cross-compared two techniques, namely multiple linear regression and Keras regression based on neural networks.
The remainder of the article is organised as follows: Section 2 presents the related literature review. The methodological approach is given in Section 3. Section 4 presents a discussion and the results, while Section 5 concludes the study and provides future recommendations.

Related Work
Today's recommendation engines have emerged from the domain of information filtering, a term coined by [2], which outlines one solution to the problem of retrieving the correct information from a massive pool of online data: content filters. To ascertain a user's choice correctly, multiple visualisation tools have also been developed to accurately distinguish a user's interests and inclinations. These tools can also be considered a form of content filter. This domain has been progressing ever since. The authors of [3] demonstrate various options for integrating a recommendation engine into a real estate portal's user journey. The same work also validated how additional real estate details can provide more accurate recommendation results when integrated into the proposed model of deep learning and factorisation machines.
Another study by [4] aims to determine whether consumer loyalty can help a recommender system be more accurate. Other techniques, implemented by [5], use intelligent data analysis methods to create a recommender framework that addresses the problem of recommending the most appropriate components for each user at any given time. The authors further addressed the problem of converting an original dataset from a real component-based application into an optimised dataset. After gathering the interaction data and developing a dataset to produce optimised recommendation results, machine learning algorithms using feature engineering techniques and feature selection methods were also applied. Users and developers alike want information processing and its display to be swift. The system developed by [6] is based on an implicit profiling system that tracks the user's interests through mouse movements.
A gap analysis approach by [7] identifies the differences between theory and reality in presenting information on location choice by developing a seven-factor classification tool for evaluating property websites. To capture the relations between the latent feature vectors of real estate items, Ref. [8] utilised the average-based and individual-based geographical regularisation terms. Both terms are integrated with the weighted regularised matrix factorisation framework to model users' implicit feedback behaviours to provide them with personalised property recommendations.
A probabilistic model for collaborative filtering by [9] calculates the predicted values for items against active users, given that there is information already available about those active users. The same research divides collaborative filtering methods into two primary categories: memory-based collaborative filtering and model-based collaborative filtering. Additional probabilistic approaches have been presented, some more sophisticated than others, including the work of [10]. There, the recommendation procedure is treated as a sequential decision-making process, and the use of Markov decision chains has been suggested to create a model. However, the authors do not state any improved accuracy over Breese's projected models. Another recommendation system by [11] applies content-based filtering, a fuzzy technique for identifying similar and different content, and a prediction algorithm for identifying the right set of movie content for the user. Similarly, Ref. [12] developed item-to-item centred algorithms, with the aim of providing better outcomes than user-based algorithms, and compared the approach with K-nearest neighbour.
In the domain of GIS, a complete map personalisation system was developed by [13], in which the users' interests are implicitly recorded and given specific rankings based on the fulfilment of certain criteria upon the user's mouse clicks or movements. As already mentioned, map personalisation has become an area of interest since data overload has become a common scenario in spatial information systems. In the model developed by [14], the entire focus is to understand the map usage patterns of end-users. The goal is again to develop personalised maps for users on a web interface. Working along similar lines, RecoMap [13] is a web-based platform through which each user receives customised spatial recommendations based on their preferences. The results are presented in a map interface highlighting the user's personalised spatial recommendations. The adaptive map also shows the user's preferences and the context in which they are used. A different approach by [15] is to build a recommendation system and map interface, represented in a personalised format so that the user can acquire quick results. Further inferences are made by studying the user's behaviour for system improvement.
Another recommender system designed by [16] is for real estate users who do not have a user profile on any real estate portal. The session-based interaction of the user is made more effective by utilising the user's search context and ranking criteria for any suitable property item. A portal developed by [8], specifically designed for real estate, uses two basic approaches for user profiling: an ontological structure and case-based reasoning. The purpose is to save the end-user from the stress of massive online searching and to deliver quick recommendations based on the user's interests. A recommendation system used by the US-based real estate website "Trulia" utilises a "square counting method" [17]. The method works well with large-scale datasets and delivers swift results per the user's preferences based on love and hate edge configurations.
Things have changed significantly in the real estate industry during the COVID-19 era. In some regions, house prices have shown signs of stagnancy and even, in some cases, decreasing trends as people lost their livelihoods. These conditions have urged people to tread more carefully while making investments in this sector. In such a scenario, a price prediction model can help users make an informed decision. A method by [18] for predicting house prices utilises a Mallows model averaging estimator, which is robust in terms of spatial dependence. Another study on ML models for house price prediction concludes that the random forest regressor provides the best results amongst all compared models, such as linear regression, decision tree and k-means regression [19]. A similar study carried out by [20] applies regression as a predictive model and uses MSE, MAE and RMSE as evaluation metrics for the model's accuracy. Another interesting study by [21] used Multiple Regression Analysis (MRA) to estimate property prices for mass evaluation. The structural qualities and the property's location were viewed as two primary micro factors of house pricing. MRA was utilised to determine the structural characteristics and locational attributes that statistically influence house price using a sample of 106 house sale transactions from 2011 to 2015. An alternative approach by [22] focuses on traditional solutions based on widely known methods and procedures and on faith in the infallibility and objectivity of a human analysing the real estate market. Since modern technologies are also boldly entering the arena, the study's key point is that organisations should stop viewing automated solutions (such as AVM, CAMA, and AAVM) as functioning in opposition to traditional approaches and instead embrace them as supplemental tools.
Our previous work in map personalisation discusses the initial concept of personalisation using real estate analytics [23]. It also evaluates background research relating to the building blocks that lead to a recommendation engine for real-time analytics. Extensive research in this field has revealed gaps between real estate analytics and map-based personalisation, recommendation and prediction; we have therefore tried to bridge this gap in our research and initial development work. We also found motivation for our study and consequent development in the fact that map-based personalised real estate portals do not widely exist in the online real estate market. Having to sift through a plethora of online data is no longer suitable for most users, and personalisation has become a key concept in every aspect of data search. In our scenario, real estate test users have been interacting with a real estate portal, "Estatech Maps", to search and post property items. Our recommendation system is based on three techniques: content-based, collaborative and location-based filtering. The interaction of users is captured via the map-based interface of the real estate application, Estatech Maps, and stored in a database. Based on this data and analysis, a user gets recommendations as per their area of interest. In addition, we have incorporated a module based on traditional regression techniques and the Keras API for predicting the future price trends of property items.
The subsequent section discusses the detailed insight of the research process regarding data collection, its pre-processing, run time environment creation, and model conception. Finally, the section will discuss the following crucial areas of the research process in detail. (1) Data collection and Technology. (2) Property Recommendation. (3) Price prediction model.

Methodology
The main focus of "Estatech Maps" is to provide personalised real estate listings to its users on a map-based interface by making accurate recommendations and providing insight into price trends in a user's area of interest. Recommendation and price prediction were the key focus areas for delivering map-based personalisation to the users. In the first stage, a detailed study of the mathematical interpretation of recommendation algorithms was carried out. The second stage focused on the design of the algorithms, and in the third stage, development based on those algorithms was carried out and the models were implemented. The validation and testing of these models were carried out in the final stage of the research. The sequence of the study is illustrated in Figure 1. Regarding price prediction, after researching various prediction techniques, two models were selected. One is based on a classical regression technique, and the other relies on neural networks.

Data Collection and Technology
User interaction data was extracted from the portal over a period of nearly a year (May 2020-March 2021). The data were exported in JSON format from a MongoDB database and converted to CSV format. They consisted of 1600 recorded user interactions with the portal. The data for house price prediction were acquired from Zameen.com (accessed on 15 September 2021), a Pakistan-based real estate portal, and cover Islamabad City for the two years 2019-2020.
The datasets from both Estatech Maps and Zameen.com were split into training and test sets. The Zameen.com data, used for the house price prediction model, were additionally split to provide a validation set. The data consisted of multiple files: user login information (user demographics), interaction data (most viewed properties list) and item data (properties).
TuriCreate was used to build the recommendation engine for content-based and collaborative filtering, whereas a K-means clustering technique was employed for the location-based recommendation. TuriCreate is an open-source toolkit for building Core ML models for tasks such as image recognition, object detection, style transfer and recommendation generation, among others.
TensorFlow and the Keras API were used as baseline technologies to build the house price prediction model. Model loss and model accuracy were validated using the MSE, MAE and RMSE evaluation metrics. TensorFlow is a free and open-source machine learning software library. It can be used for various tasks but focuses on deep neural network training and inference. The Google Brain team created TensorFlow for internal Google use, and it was released under the Apache License 2.0 in 2015.
TensorFlow was chosen because it builds models using data flow graphs and enables programmers to create large-scale neural networks with multiple layers. Keras is a deep learning API written in Python that runs on top of the TensorFlow machine learning platform. It was built with the objective of allowing fast experimentation.

Property Recommendation
The three areas of focus for the recommendation engine are discussed in detail in each of the following sections.

Content-Based Filtering
The concept behind recommender systems is data analytics. Recommendations can be produced either by score-based algorithms or by suggesting to a user the top N items in a ranked item array. In our scenario, the recommender system is designed to suggest property items listed for sale or rent. If a person has interacted, through the map-based interface, with a property item in area "A" with attribute array "X", the recommender system can display similar items to the user quickly and accurately.
In content-based filtering, the angle between the user's profile vector and the vector of each item the user is interested in is determined. The cosine of this angle measures how close in space the vectors lie to each other and is termed cosine similarity. The closer they are, the more similar they are deemed. Let us consider a vector "U" of users {user1, user2, user3, ...} and a vector "P" of property items {p1, p2, p3, p4, ...}. The similarity between these two vectors can be calculated as:
$$\mathrm{sim}(U, P) = \cos\theta = \frac{U \cdot P}{\lVert U \rVert \, \lVert P \rVert} \tag{1}$$
In other words:
$$\mathrm{sim}(U, P) = \frac{\text{number of people who viewed both P1 and P2}}{\text{number of people who viewed either P1 or P2}} \tag{2}$$
The cosine value or similarity in Equation (1) can range between −1 and 1. Based on this value, the items are organised in descending order, and the top recommendations are made to the user.
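The ranking step described above can be illustrated with a minimal sketch. The feature layout, the item identifiers and the function names (`cosine_similarity`, `rank_items`) are assumptions made for this example only and are not taken from the portal's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_items(user_profile, items):
    """Return (item_id, score) pairs sorted by descending similarity."""
    scored = [
        (item_id, cosine_similarity(user_profile, vector))
        for item_id, vector in items.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature order: [for_sale, for_rent, house, flat]
user_profile = [1.0, 0.0, 1.0, 0.0]
items = {
    "p1": [1.0, 0.0, 1.0, 0.0],  # house for sale
    "p2": [0.0, 1.0, 0.0, 1.0],  # flat for rent
    "p3": [1.0, 0.0, 0.0, 1.0],  # flat for sale
}
ranking = rank_items(user_profile, items)
```

Here a user whose profile points towards houses for sale would see `p1` first, the partially matching `p3` second and the unrelated `p2` last.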
The approach for content-based filtering is further explained in Figure 2, which shows how a tree-based criterion for item selection works. The concept is based on how much interactivity a user has with a specific item or category. Interest ratios are calculated between corresponding categories by incrementing a frequency value for each interaction. For example, buyers' interactions with the rent or purchase categories define the interest ratio between the two categories. The flow of the function that performs the frequency calculation is elaborated in Figure 3, which details another content-based filtering process, namely TF-IDF. For example, suppose a user searches for "the rise of analytics" on Google. In that case, the word "the" will inevitably occur more frequently than "analytics", but the relative importance of "analytics" is higher from the search query's point of view. In such cases, TF-IDF weighting negates the effect of high-frequency words in determining the significance of an item (document).

$$\mathrm{TF}(t) = \frac{\text{Frequency of term } t \text{ in document}}{\text{Total number of terms in document}} \tag{3}$$
$$\mathrm{IDF}(t) = \log_{10}\left(\frac{\text{Total number of documents}}{\text{Number of documents containing term } t}\right) \tag{4}$$
TF(t) is simply the frequency of a word in a document, whereas IDF(t) signifies the rarity of the word: the fewer documents a word occurs in, the higher its IDF value. In Equation (4), the logarithm is used to dampen the effect of high-frequency words. We have utilised both the score tree process and TF-IDF approaches in formulating our content-based filtering algorithm. Initially, user-user similarity and item-item similarity are obtained in an array format. The next step in the process was the creation of the item-user similarity matrix.
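The search-query example can be worked through in a short sketch. It assumes the common definition IDF(t) = log10(N / n_t), where N is the number of documents and n_t is the number of documents containing the term; the three-document corpus and the function names are hypothetical.

```python
import math
from collections import Counter


def term_frequency(term, document):
    """TF(t): occurrences of `term` divided by the number of terms in the document."""
    if not document:
        return 0.0
    return Counter(document)[term] / len(document)


def inverse_document_frequency(term, corpus):
    """IDF(t): log10(total documents / documents containing `term`)."""
    containing = sum(1 for document in corpus if term in document)
    if containing == 0:
        return 0.0  # assumption: terms absent from the corpus carry no weight
    return math.log10(len(corpus) / containing)


def tf_idf(term, document, corpus):
    """Weight of `term` in `document` relative to the whole corpus."""
    return term_frequency(term, document) * inverse_document_frequency(term, corpus)


# Hypothetical corpus of tokenised documents.
corpus = [
    "the rise of analytics".split(),
    "the house for sale".split(),
    "the flat for rent".split(),
]
query = corpus[0]
weight_the = tf_idf("the", query, corpus)
weight_analytics = tf_idf("analytics", query, corpus)
```

Because "the" appears in every document, its IDF is log10(1) = 0 and its weight vanishes, while "analytics" keeps a positive weight.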

Collaborative Filtering Approach
In our approach towards developing a collaborative filter for the portal, the test users were divided into segments based on their preferences, and items were recommended according to the mutual choices of users belonging to that segment. The more the user interacts with the items on display and rates them, the more precisely the system can suggest appropriate items. The algorithms designed for collaborative filtering are mostly based on finding the similarities between users on the grounds of the rank or rating they have given to previous items. So, to predict an item for user "u", the weighted sum of the interactions of other users with an item "i" is computed, with each user weighted by their similarity to "u". The prediction $PD_{u,i}$ would then be calculated as:
$$PD_{u,i} = \frac{\sum_{v} i_{v,i} \, s_{u,v}}{\sum_{v} s_{u,v}}$$
where $PD_{u,i}$ is the prediction term for user "u" against an item "i", $i_{v,i}$ is the interaction by a user "v" with an item "i", and $s_{u,v}$ is the likeness between the two users, i.e., user "u" and user "v".
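The similarity-weighted prediction described above can be sketched as follows. The sketch assumes the common normalised form, in which the weighted sum is divided by the sum of the similarities, and treats a missing interaction as 0; the user names, similarity values and the function name are hypothetical.

```python
def predict_interaction(target_user, item, interactions, similarity):
    """Similarity-weighted average of other users' interactions with `item`.

    interactions: {user: {item: interaction_value}}
    similarity:   {(target_user, other_user): similarity_score}
    """
    numerator = 0.0
    denominator = 0.0
    for other_user, user_items in interactions.items():
        if other_user == target_user:
            continue
        weight = similarity.get((target_user, other_user), 0.0)
        # Assumption: a user who never interacted with the item contributes 0.
        numerator += weight * user_items.get(item, 0.0)
        denominator += weight
    return numerator / denominator if denominator else 0.0


# Hypothetical binary interactions (1.0 marks a recorded interaction).
interactions = {
    "u1": {"p1": 1.0},
    "u2": {"p1": 1.0, "p2": 1.0},
    "u3": {"p3": 1.0},
}
# Hypothetical similarity scores between u1 and the other users.
similarity = {("u1", "u2"): 0.8, ("u1", "u3"): 0.2}

score_p2 = predict_interaction("u1", "p2", interactions, similarity)
score_p3 = predict_interaction("u1", "p3", interactions, similarity)
```

Since `u1` is far more similar to `u2` than to `u3`, the item viewed by `u2` receives the higher predicted score and would be suggested first.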
As per Table 1, the interactions between users and properties are recorded, and suggestions for a new user "u1" are generated. In the table, the symbol "x" represents any interaction between a user and a property item. It is evident that there is more similarity between user 1 and user 2 than between either of them and user 3. Based on this, user 1 and user 2 will be grouped together for future recommendations. Algorithm 1 depicts a generalised algorithm designed for grouping user 1 and user 2 together so that the same properties are recommended to them.

Location-Based Filtering
The purpose of a location-based recommendation system is to recommend items based on the geographical location of a user. In this scenario, recommendations can also be made for a new user (the cold start problem): items are recommended based on users at nearby locations who may align with the new user on other parameters, such as age or gender. A location-based recommendation can greatly benefit people by saving time and travel costs when displayed effectively through an interactive interface. Equation (4) calculates the probability that a user interacts with an item "i", based on its distance from all previous interactions of the user, which, in our case, are other property items. The procedure for test user 1 is specified in Algorithm 2.
Algorithm 2 presents a generalised algorithm for calculating location-based recommendations for users. It considers at least 50 users in a cluster for a similarity score calculation.

Price Prediction Model
The critical aspect to note in the price prediction model is that the data used for this analysis are the "offered set of prices" listed on the real estate portal Zameen.com (accessed on 15 September 2021). These prices can change as per market variations or any redundancy in the real estate sector.
For the prediction and analysis aspect, two regression techniques, namely (1) Multiple linear regression and (2) Keras regression, were selected. The cross-comparison and validation of these techniques were performed. The one that performed better in terms of variance score was selected as the final model for visualising house prices.
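The evaluation metrics named in this study (MSE, MAE and RMSE) and a variance score can be computed as in the sketch below. It assumes that "variance score" refers to the explained variance, 1 − Var(y − ŷ) / Var(y); the price values are hypothetical and not taken from the Zameen.com dataset.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def _variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """Assumed meaning of 'variance score': 1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - _variance(residuals) / _variance(y_true)


# Hypothetical prices (arbitrary units) and model predictions.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

metrics = {
    "mse": mse(y_true, y_pred),
    "mae": mae(y_true, y_pred),
    "rmse": rmse(y_true, y_pred),
    "explained_variance": explained_variance(y_true, y_pred),
}
```

Under this reading, the model with the explained variance closer to 1 on the test set would be the one retained for visualising house prices.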

Multiple Linear Regression
This is a type of linear regression in which the supposition is that the dependent variable y and the independent variables x have a linear or direct relationship. We used the Sklearn library to import the Linear Regression module. As already mentioned, our dataset was divided into a test set and a train set.
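To make the linear assumption concrete, the sketch below fits a multiple linear regression by ordinary least squares on a train set and predicts a held-out row. It is a dependency-free stand-in for illustration, not the Sklearn module used in the study; the two features (bedrooms, area) and all values are hypothetical.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * target for r, target in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coefficients, row):
    """Apply the fitted intercept and coefficients to one feature row."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], row))


# Hypothetical train set: [bedrooms, area]; price = 50 + 10*bedrooms + 2*area.
X_train = [[2, 100], [3, 120], [4, 200], [3, 150]]
y_train = [270.0, 320.0, 490.0, 380.0]
X_test = [5, 260]  # held-out row

coefficients = fit_linear_regression(X_train, y_train)
predicted_price = predict(coefficients, X_test)
```

Because the hypothetical prices are generated without noise, the fit recovers the generating coefficients; real listing data would leave residual error to be measured on the test set.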

Keras Regression
We use regression techniques to predict the dependent variable y, which is price. We have 14 features (property_id, location_id, property_type, price in pkr, price in dollars, location, city, province, bedrooms, bathrooms, area purpose, date of addition to the portal, area in Marla, area in sq. ft); therefore, we selected 14 neurons as a baseline, along with one input and one output layer for the model. There are 4 hidden layers.
The model was trained for 400 epochs, with the training and validation performance recorded at each epoch. Finally, the model was evaluated on both the train and test sets, with the loss function measured at each epoch to keep track of how well the model was performing.

Content-Based and Collaborative Filtering Model Building
We adopted the scikit-learn (Sklearn) library, as it provides pairwise distance functions that identify any two items with similar characteristics or any two users with similar interests. To apply such a distance, we defined a function that takes as parameters the interactions, the similarity measure, and the type (user or item) against which the similarity is obtained. In the first case, the algorithm generates suggestions based on the user's profile (collaborative filtering model). In the second case, the suggestions are based on the item's attributes (content-based filtering). In the end, we were able to obtain recommendations for both users and items. In Tables 2 and 3, it can be seen that for all users, "U", scores "S" are obtained in descending order, with the highest similarity scores at the top. In Tables 4 and 5, it is observed that the scores obtained against each user are not easy to interpret: it is not clear against which property ID the user is getting the suggested items of interest. To make our results clearer, we utilised the Turicreate library, which made the results easier to understand. Table 4 represents the content-based recommendation model results. The model was assessed for five users of the portal, and recommendations were generated for them.
In Table 4, the set of 5 users is recommended the same 5 property items because those items are the most interacted with and hence the most popular. Table 5 shows the properties recommended to users based on grouping with other users having similar interests. Property IDs with a higher score are ranked higher. Each user is recommended a different set of properties, which shows that the recommendations are personalised for each user.
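The pairwise similarity idea behind both cases can be sketched as follows. This is an illustrative example in plain Python, not the authors' implementation: it assumes cosine similarity as the measure, and the interaction counts, the item attributes (bedrooms, bathrooms, area in Marla) and the function names `cosine_similarity` and `most_similar` are all hypothetical.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a vector with no signal is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def most_similar(target, vectors, top_n=3):
    """Rank every other key in `vectors` by similarity to `target`, highest first."""
    scores = [
        (other, cosine_similarity(vectors[target], vec))
        for other, vec in vectors.items()
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Hypothetical interaction counts: one row per user, one column per property.
user_item = {
    "U1": [3, 0, 1, 0],
    "U2": [2, 0, 1, 0],
    "U3": [0, 4, 0, 2],
}

# Hypothetical item attributes: bedrooms, bathrooms, area in Marla.
item_features = {
    "P1": [3, 2, 5],
    "P2": [3, 2, 6],
    "P3": [6, 5, 20],
}

similar_users = most_similar("U1", user_item)      # collaborative case
similar_items = most_similar("P1", item_features)  # content-based case
```

Both result lists are sorted in descending order of score, matching the layout described for the similarity tables. In practice the attribute columns would normally be scaled first so that a large-valued feature such as area does not dominate the similarity.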

Location-Based Recommendation Model Building through K-Means Clustering
K-means clustering identifies "k" centroids within a dataset and then assigns every data point to the closest cluster, so that each data point ends up in the cluster with the nearest mean. In our approach, the purpose of applying K-means clustering is to group similar users based on their respective locations. Once the users are clustered, the most searched or most interacted-with item within each user group is recommended to every user in that group. Figure 4 shows auto-generated locations for users from different places in Islamabad, along with the corresponding cluster IDs. Hovering over any cluster shows the most searched item in that cluster, which is recommended to the users of that cluster. For example, in one of the clusters, property ID 689 has the highest search count; all users falling within that cluster therefore receive property ID 689 as the recommended property to view.
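The two steps described above, clustering users by location and then recommending the most searched item per cluster, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the coordinates and search histories are invented, the centroids are seeded with the first k points, and Euclidean distance on latitude and longitude is used as a rough approximation over a city-sized area. The function names `kmeans` and `top_item_per_cluster` are hypothetical.

```python
from collections import Counter
from math import dist

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

def top_item_per_cluster(labels, searches):
    """Most frequently searched property ID within each cluster."""
    counts = {}
    for label, property_ids in zip(labels, searches):
        counts.setdefault(label, Counter()).update(property_ids)
    return {label: c.most_common(1)[0][0] for label, c in counts.items() if c}

# Hypothetical user locations (latitude, longitude) and searched property IDs.
user_locations = [
    (33.684, 73.048), (33.729, 73.093), (33.686, 73.050),
    (33.731, 73.095), (33.682, 73.046),
]
user_searches = [[689, 120], [455], [689], [455, 301], [689, 120]]

labels, centroids = kmeans(user_locations, k=2)
recommended = top_item_per_cluster(labels, user_searches)
# Every user receives the top item of the cluster they fall in.
per_user = [recommended[label] for label in labels]
```

With this invented data, the three users around the first centroid all receive property 689, mirroring the example in the text.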

Recommender System Validation
To validate the recommendations, one can simulate the user behaviour and fill in the possible or missing ratings or, in our case, the interactions a user might have with any prospective property items. The simulated values can then be evaluated with error metrics such as the mean squared error (MSE) to determine the deviation of the predicted values from the observed values. The overall error of these values gives an overview of the accuracy of the model. Table 6 shows the generated matrix of interactions a user would likely have with property items and the MSE calculation for the overall matrix. Model validation can also be performed through recall and precision. Both are useful, as they show how accurate the recommendations are. However, recall and precision do not take into account the order in which the recommended items are ranked. MAP@k (Mean Average Precision at k) is an evaluation metric that considers the order of the recommended items as well. We recommended 5 items to each of the 5 users, so in our case k = 5. We set up an experimental environment for our group of five portal users, who were provided with a list of recommended items in the order generated by our recommender engine. The users interacted with certain items and provided verbal feedback on whether the generated recommendations were of interest. For example, the statistical accuracy of the recommendation engine is as follows for a given User 1.
The precision is higher for the first three items, which were interacted with, but falls for the last two items, with which the user did not interact. Therefore, for User 1, the average precision is almost 80%. For the whole set of users, the mean average precision is calculated by taking the mean of the individual average precisions.
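A MAP@k calculation of this kind can be sketched as follows. This is an illustrative example, not the study's evaluation code: it uses one common convention in which the precision at each rank holding a relevant item is averaged over the number of hits. The normalisation used in the study is not stated, so values from this sketch need not match the reported figures. The item IDs, the feedback sets and the function names are hypothetical.

```python
def average_precision_at_k(recommended, relevant, k=5):
    """AP@k: mean of precision@i over the ranks i (<= k) that hold a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision among the top `rank` items
    return precision_sum / hits if hits else 0.0

def mean_average_precision_at_k(recommendations, feedback, k=5):
    """MAP@k: the mean of AP@k over all users."""
    users = list(recommendations)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(recommendations[u], feedback.get(u, set()), k)
        for u in users
    )
    return total / len(users)

# Hypothetical ranked recommendations and the items each user reported interest in.
recommendations = {
    "User 1": ["P1", "P2", "P3", "P4", "P5"],
    "User 2": ["P6", "P7", "P8", "P9", "P10"],
}
feedback = {
    "User 1": {"P1", "P3", "P4"},
    "User 2": {"P7"},
}

ap_user_1 = average_precision_at_k(recommendations["User 1"], feedback["User 1"])
map_at_5 = mean_average_precision_at_k(recommendations, feedback, k=5)
```

Because each hit contributes the precision at its own rank, a relevant item placed lower in the list lowers the score, which is the ordering sensitivity that plain precision and recall lack.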

House Price Prediction Model
As previously mentioned, we cross-compared and validated two property price prediction models: (1) Multiple Linear Regression (MLR) and (2) Keras Regression. MLR is based on traditional regression techniques, whereas Keras regression has its basis in neural networks. We tested both approaches in a runtime Python environment. Figure 5 provides a high-level description of the procedure adopted for the model. Both models performed well with the given data and parameters; the model with lower error rates and a better variance score, or coefficient of determination, was chosen as the final model for deployment.

Multiple Linear Regression
After running the first multiple linear regression model, Figure 6 illustrates the top recommended properties based on location recommendations in different localities of Islamabad city, while Figure 7 represents the price prediction visualisation. The actual price is the one already present in the test data, whereas the predicted price was obtained from the model fitted on the training data. Table 7 lists the MAE, MSE and RMSE of these predictions. The variance score is approximately 0.70397.
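The error measures named above can be computed as in the following sketch. It is a plain-Python illustration with invented prices, not the study's data, and it interprets the "variance score" as the explained variance, 1 - Var(actual - predicted) / Var(actual), which is an assumption. The function name `regression_report` is hypothetical.

```python
from math import sqrt

def regression_report(actual, predicted):
    """MAE, MSE, RMSE and explained variance for paired actual/predicted values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = sqrt(mse)
    mean_actual = sum(actual) / n
    mean_residual = sum(residuals) / n
    var_actual = sum((a - mean_actual) ** 2 for a in actual) / n
    var_residual = sum((r - mean_residual) ** 2 for r in residuals) / n
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": rmse,
        # Assumed definition of the "variance score".
        "explained_variance": 1 - var_residual / var_actual,
    }

# Hypothetical test-set prices and model predictions (arbitrary units).
actual_prices = [100.0, 200.0, 300.0, 400.0]
predicted_prices = [110.0, 190.0, 310.0, 390.0]

report = regression_report(actual_prices, predicted_prices)
```

A score closer to 1 means the predictions account for more of the variation in the actual prices, which is the basis on which the two models are compared.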


Keras Regression
The variance score of Keras regression is approximately 0.8028, which is better than that of the multiple linear regression approach. Furthermore, the RMSE in the case of Keras is also lower, as shown in Tables 8 and 9. Therefore, the predictions in this case are closer to the actual prices.

... and what areas will remain stagnant. The data have been analysed for the past 2 years, and the predictions show how prices will change or remain the same in the coming years. The blue areas in the figure depict neighbourhoods where prices are predicted to increase significantly in the coming years. In contrast, red areas indicate stagnancy in prices, giving the user a clearer picture to assist in decision making.

Conclusions
Three different recommendation algorithms for the real estate portal "Estatech Maps" were developed, along with two different models for house price prediction. First, we analysed and implemented content-based filtering for suggesting real estate items. The collaborative filtering approach was used to reduce the computational cost by suggesting similar items to a similar group of users. Then, we applied the location-based approach to predict the areas of interest to the user based on the user's geographical location. All this was achieved with a minimum precision of 79%. Prediction models were created, and results were visualised by price increase, decrease or stagnancy in multiple sectors of Islamabad city to better assist people planning future land asset purchases. Our neural network-based prediction model predicted the changes in house price trends with a minimum accuracy of 80% (a variance score of approximately 0.80). This work can be utilised in any real estate sale and purchase domain and will improve the overall user experience of real estate portals. It demonstrates the viability of our map-based system in providing data and recommendations to users based on the popularity of an item, user similarity and geographical location.
While recommendation and predictive analysis are becoming common even in the smallest of businesses, the real estate industry in Pakistan lags in implementing these techniques, both in terms of a map-based interface and in terms of presenting items to the user effectively. Therefore, our approach of displaying items of interest to the user on a map-based interface would be among the first of its kind in real estate portals in Pakistan.
We have used sequential NN models for our recommendation and prediction in this research. One area of improvement and basis for future work could be exploring and implementing these as parallel models to improve response time and efficiency. Another approach could be combining multiple techniques to create a hybrid model. The same approach was used in a study in which the Cobb-Douglas and linear regression models were combined to form a mathematical model [24], with GIS as an additional tool to organise the regional data of the area under study. A hybrid model, in turn, could cover a broader spectrum of users' behaviours and avoid high computational costs at the server end.