Impact of Green Features on Rental Value of Residential Properties: Evidence from South Africa

Abstract: In recent years, scholars have called for an increase in the usage of green features in the built environment to address climate change issues. Governments across the developed world are implementing legislation to support this increased uptake. However, little is known about how the inclusion of green features influences the rental value of residential properties located in developing countries. Data on 389 residential properties were collected from a webpage. Text mining and machine learning models were used to evaluate the impact of green features on the rental value of residential properties. The results indicated that floor area, number of bathrooms, and availability of furniture are the top three attributes affecting the rental value of residential properties. The random forest model generated better predictions when compared with other modelling techniques. It was also observed that green features are not the most common words mentioned in rental adverts for residential properties. The results suggest that green features add limited value to residential properties in South Africa. This finding suggests that there is a need for stakeholders to create and implement policies targeted at incentivising the inclusion of green features in existing and new residential properties in South Africa.


Introduction
The construction industry assembles the constructed spaces required for productive activities within an economy. Several years of research have shown that investments in infrastructure stimulate economic growth [1]. The products of the construction sector include buildings, highways, power plants, and ports, among others. Despite the sector's importance, its activities generate about 40% of global greenhouse gas (GHG) emissions [2,3]. Moreover, greenhouse gases (such as carbon dioxide) and other air pollutants (such as PM10), which are generated during the construction process, have adverse effects on human health and the environment. For instance, exposure to air pollutants has been linked to an increased risk of respiratory illness [4]. To address these problems, there have been calls for the adoption of sustainable practices in the construction sector.
The number of studies focused on identifying sustainable construction practices has been increasing. Research shows that the use of precast concrete components [5], a reduction in cement required for concrete through material replacement [6], the use of phase-change materials [7], and the inclusion of green areas into new developments [8] can reduce the volume of GHGs attributed to construction works. Also, governments across the globe are implementing policies to encourage the use of green technologies. For instance, the governments of Hong Kong and South Korea grant gross floor area concessions for property developments that incorporate green features [9]. The adoption of sustainable practices is beneficial to society and the environment. For instance, it would reduce emissions and improve air quality. Thus, there is a need to encourage stakeholders to embed green features in residential developments.
Researchers have reported that these green features are beneficial to the environment and occupants of buildings. For instance, gardens have been linked to improved food security, quality of life, productivity of people, and preservation of biodiversity within urban spaces [10–12]. Also, the use of technologies such as solar panels reduces the energy required by the occupants of properties [13]. In recent years, researchers have shown increased interest in examining the effect of green features on the value of real estate properties [14,15]. Voicu and Been [14] evaluated the effect of communal gardens on the value of real estate properties located in New York City. Several models have been developed for the prediction of the value of residential properties in developing countries [16,17]. However, the impact of embedding green features on the rental value of residential properties in developing countries remains unclear.
To address this gap in knowledge, data mining techniques were used to explore data and develop a predictive model for rental values of residential properties located in Cape Town, South Africa. The objectives of the study are (i) to examine the impact of green features on the rental value of residential properties and (ii) to assess the efficacy of using machine learning models for the prediction of rental values of residential properties. This study contributes to the existing knowledge in several ways. First, it provides insights into the effect of green features on the rental value of properties. Second, it shows the efficacy of using machine learning models for the prediction of the rental value of properties. Third, the outcome of this study provides information to stakeholders for the development of strategies to encourage the inclusion of green features in residential properties. Finally, this study shows that the content of advertisements can be used to predict the value of real estate properties.

Literature Review
There is a large volume of research focused on modelling and predicting the value of residential properties. Existing research shows that structural features of real estate properties, the location of a building, and neighbourhood factors [18] are antecedents of the value of residential properties. The remainder of this section is structured into three subsections. The first subsection provides insights into the determinants of the value of residential properties. The second subsection highlights the effect of green features on the value of residential properties. The final subsection is a concise summary of methods used in previous research in predicting the value of residential properties.

Determinants of the Value of Residential Properties
Property values, the preferences of potential buyers, and investment decisions are determined by several factors. Over the years, research has shown that property values can be estimated as a function of these inherent determinants (features) that maximise utility for buyers and investors [19]. These attributes can be broadly grouped into structural, locational, and neighbourhood factors [18,20,21]. The literature therefore indicates that several factors (determinants) influence the value of residential properties.
Research into the determinants of residential property values has a long history. The determinants of the value of residential properties are summarised in Table 1. For example, the number of servants' quarters and number of bedrooms are the most important features influencing the value of residential buildings in Lagos, Nigeria [16]. In contrast, gardens have the highest impact on the value of residential properties in Hong Kong [22]. The difference between the Hong Kong and Nigerian findings could be linked to differences in property market characteristics, government policies, and the attitudes of local market participants.

Impact of Green Features on Residential Property Values
Recently, more attention has focused on the effects of embedding green features on the value of residential properties. In the context of property economics, these studies were designed to provide empirical evidence to justify the inclusion of green features in the design of buildings. As mentioned in the introduction section of this paper, the inclusion of green features in buildings is beneficial to occupants and the environment. However, evidence showing economic benefits is essential for encouraging the inclusion of green features in residential buildings.
To date, several studies have investigated the effects of green features on the value of residential properties. The findings reported in previous research have been contradictory. Some studies have shown that urban green spaces (UGSs) have a positive effect on the value of residential properties [20,38]. In contrast, other studies [39,40] indicated that green features do not contribute to the value of residential properties. The variance in findings reported in previous research could be attributed to three factors: (i) proximity, (ii) characteristics of the neighbourhood, and (iii) preferences of tenants or investors in certain property markets. For example, [38] revealed that property prices tend to decrease as the distance to a UGS increases. Also, the socio-economic background of the local populace influences the acceptance of green roof systems [41]. The presence of communal gardens in poor neighbourhoods has been linked to crimes, anti-social behaviour, and offensive motifs or wall murals [20,36]. Based on the foregoing, the effect of green features on the values of residential properties tends to vary with these three factors.
The current study builds on existing literature by developing a model for predicting the rental value of residential properties. Also, the estimated model measures the effects of embedding green features on the rental value of residential properties in South Africa. To date, little is known about the economic value of embedding green features in residential buildings in the context of developing countries. This study fills the gap in current knowledge by providing insights into the impact of green features on the rental value of residential properties. The next section provides a summary of the literature on methods used for modelling and prediction of residential property values.

Methods Used for Predicting the Value of Residential Properties
Many methods have been used for modelling and prediction of the value of residential properties. A seminal study in this area is the work of [42], which utilised linear regression as a tool for modelling and prediction of the value of residential properties. The terms "regression model" and "Hedonic Price Models" (HPMs) are used interchangeably in the literature. Research has shown that the characteristics of data (such as non-linearity, multicollinearity, and outliers, among others) tend to affect the performance of regression models [43–45]. For example, [45] showed that neural network (NN) algorithms outperform regression models when applied to the prediction of property values. The predictive performance of the NN model indicates that the algorithm has the capability to capture the non-linearity present in real-world data. Thus, there has been a shift towards the use of non-linear models, such as data mining techniques, for the prediction of the value of properties.
The advancement in the field of computer science has led to the emergence of several data mining techniques. These new methods tend to address the limitations associated with the use of older techniques. For instance, local minima and overfitting are some of the weaknesses associated with NN models [46]. Newer methods, such as support vector machines and XGBoost, can overcome these limitations and generate better predictions when compared to older methods [47,48]. However, studies focused on evaluating the efficacy of some of these new methods are limited.
Various data mining techniques have been used for property valuation. The techniques found in the existing literature include support vector machine and decision tree, among others [22,27]. Previously, some of these data mining techniques were referred to as "black box" models, i.e., they provide no information on the effect of independent variables on the value of residential properties. However, this limitation has been addressed with the development of methods such as sensitivity analysis and variable importance [49]. Hence, the current study seeks to use seven data mining techniques for modelling and predicting the rental values of residential properties (see Section 3.4). The predictive performance of these methods is compared.

Methods
Multiple methods were used to model the relationship between the attributes of a residential property and its value. Subsequently, the estimated model could be used for the prediction of the value of residential properties. The methods used for modelling the value of properties were discussed in the literature review section. The use of models offers an effective way to explore the relationship between property attributes and their economic value [50]. Two data mining techniques (text mining and logistic regression) were utilised in the current study. The coefficients of the logistic regression provided insight into the strength of the relationship between the attributes of a residential property and its value.
Like previous research [16], the estimated model was validated by using it for out-of-sample prediction, and the data were collected from reliable secondary sources. This is a form of triangulation that improves the validity and credibility of the outcomes of research [51]. The combination of the quantitative findings from the logistic regression model and the qualitative findings from the text mining contributes to the rigour of the current study. Figure 1 shows the research framework utilised in this study.

Data Collection
The data used in this study were collected from Cape Town, South Africa. Cape Town is one of the largest residential settlements in South Africa [52]. The city attracts many tourists due to its shoreline and beaches [52]. The city's seaport and a combination of public and private organisations provide employment and business opportunities to its residents [53]. Thus, the attraction of people to the city can be attributed to the opportunities that it provides.
The data used for this research were collected from the Property24 webpage (https://www.property24.com/). Property24 is one of the largest online platforms for advertising residential and commercial properties that are available for rent and sale. Each page contains information about the features of each property and its rental value. This webpage was adopted in similar previous studies on the value of real estate properties [17,54]. A total of 389 adverts were retrieved and used for analysis in this study. Because location has an impact on the value of residential properties, the data collected for this study were limited to three areas (i.e., City Centre, Sea Point, and Green Point) in Cape Town. These areas are in close proximity to Cape Town's City Centre. Each advert contains two types of information: numerical and text. The numerical data provide potential tenants with information on the property features, such as number of bedrooms and number of bathrooms, among others. The text data contain descriptive information about the residential property. For instance, text data provide information on the availability of furniture within the residential property.
The numerical and textual data for each residential property were collected. The data contained information on 13 independent variables, which included the number of bedrooms, number of bathrooms, number of parking spaces, type of car parking facility (covered, open or street), dining room, lounge, availability of swimming pool, floor area, balcony, furniture (furnished or unfurnished), services (such as concierge), garden, and the distance between each advertised property and the nearest police station. The dependent variable was property rent. The numerical and textual data were analysed using several data mining techniques.
The statistics that describe the data collected for each residential property are presented in Table 2.

Data Processing
Textual and numerical data were collected from the Property24 webpage. The text data were saved in .txt files and the numerical data were saved in .xls file format. In South Africa, the rent paid by tenants to a property owner or investor is classed as personal income [55]. The tax payable by the property owner is computed as a fraction of the annual rental income paid by tenants. In this study, the annual rents of the residential properties were computed by multiplying the monthly rental value by 12 months. Based on income tax brackets for South Africa [55], the residential properties were classified into two groups (A and B) using the annual rents payable by tenants to property owners (Table 3).
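The annualisation and two-group labelling described above can be sketched as follows. This is a minimal Python illustration rather than the authors' code (the study itself used R). The cut-off value and the direction of the labels are placeholders, since the actual class boundaries are those given in Table 3 and are not reproduced here; the function names are also hypothetical.

```python
# Placeholder cut-off in rand; the real boundary is defined in Table 3 of the paper.
ASSUMED_ANNUAL_RENT_THRESHOLD = 300_000


def annual_rent(monthly_rent: float) -> float:
    """Annual rent is the advertised monthly rent multiplied by 12 months."""
    return monthly_rent * 12


def rent_class(monthly_rent: float,
               threshold: float = ASSUMED_ANNUAL_RENT_THRESHOLD) -> str:
    """Label a property "A" or "B" from its annual rent.

    Assumption: "A" is taken to be the lower band. The paper does not state
    the direction in the text, so swap the labels if Table 3 says otherwise.
    """
    return "A" if annual_rent(monthly_rent) <= threshold else "B"


if __name__ == "__main__":
    for monthly in (10_000, 30_000):
        print(monthly, annual_rent(monthly), rent_class(monthly))
```

Keeping the threshold as a parameter makes it straightforward to substitute the bracket boundary actually used in the study.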

Text Mining
Due to advances in computing, the volume of unstructured (textual) data has increased significantly. Traditionally, textual data are analysed using content analysis [56]. The growth in the quantity of textual data makes it difficult to use conventional methods to analyse these data. In recent years, text mining has emerged as an automated technique that can be used for the analysis of text. Text mining offers a robust approach to uncovering the underlying trends and patterns in text for making informed decisions. In previous studies, text mining was used to identify the determinants of hotel customers' satisfaction [57] and the vulnerability of software components [58], among others. In the current study, text mining was used to extract and identify the features (determinants) of residential properties located in Cape Town that influence their rental value.
A growing body of research has shown that the content of an advert provides insight into the needs of the consumers and the value of the advertised product. For example, [59] found that the content of adverts influenced consumers' food choices. The findings presented in [60] indicate that the demand for a product is strongly influenced by the advertising content. It can be inferred that the content of adverts placed for residential properties on the Property24 webpage is designed to provide information on attributes that meet the needs of potential tenants. Thus, the application of text mining to adverts for residential properties will provide insights into the factors influencing their values.
For the present study, a total of 389 adverts relating to residential property rental were collected from the Property24.com webpage. The process of using text mining for extracting relevant determinants of the rental value of residential properties is illustrated in Figure 1. R programming was used to implement the text mining phase of this study [61].
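The word-frequency step of such a text mining process can be sketched as below. This is an illustrative Python sketch only (the study used R); the stop-word list, the tokenisation rule, and the two sample adverts are assumptions made for the example and are not taken from the collected data.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list for illustration.
ASSUMED_STOPWORDS = {"a", "an", "the", "and", "with", "in", "of", "to", "is", "for"}


def tokenize(text: str) -> list[str]:
    """Lower-case the text and keep alphabetic (optionally hyphenated) words."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [w for w in words if w not in ASSUMED_STOPWORDS]


def top_words(adverts: list[str], n: int = 20) -> list[tuple[str, int]]:
    """Return the n most frequent words across all adverts."""
    counts = Counter()
    for advert in adverts:
        counts.update(tokenize(advert))
    return counts.most_common(n)


# Made-up adverts, not from the Property24 data set.
sample_adverts = [
    "Furnished apartment with 2 bedrooms, 2 bathrooms and secure parking.",
    "Two bedrooms, one bathroom, garden and parking bay.",
]

if __name__ == "__main__":
    for word, count in top_words(sample_adverts, n=5):
        print(word, count)
```

A fuller pipeline would normally also stem or lemmatise words (so that "bathroom" and "bathrooms" are counted together), which this sketch omits.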

Machine Learning
As stated previously, text mining was used for the analysis of the collected textual data. In contrast, machine learning techniques were used for modelling numerical data. To assess the efficacy of using these methods for modelling and forecasting the rental prices of residential properties, seven techniques were applied in the current study. The techniques applied are neural network (NN), K-nearest neighbour (k-NN), logistic regression, boosted tree (XGBoost), random forest, and support vector machine. Additional details on the machine learning algorithms applied in the current study can be found in [62]. The logistic regression model was used as the benchmark for evaluating the predictive performance of other machine learning techniques.
Machine learning techniques can be applied to two types of predictive tasks, i.e., classification and regression. The classification task refers to the prediction of class labels. For example, [63] used machine learning models to predict two classes, i.e., whether or not a student will drop out. For regression tasks, machine learning models are used for the prediction of numbers. Abidoye and Chan [45] used the neural network model for the prediction of the sales price of residential properties. In the current study, machine learning techniques were used for a classification task. The model predicted two class labels, i.e., class "A" or "B" (see Table 3).
The process of developing machine learning models can be divided into four distinct but interrelated phases. First, the independent variables to be included in the estimated models were identified. The outcome of the initial review of the literature and text mining provided justification for the variables used in model estimation. Second, the data used for the development of the models were collected from the Property24 webpage. Data relating to 389 residential properties were collected. Third, the machine learning models were trained. For the training process, the data were divided into two groups, i.e., training and test data sets.
The ratio of training data to test data used in previous research can be grouped into three clusters, i.e., 90:10, 80:20, and 70:30 [64]. In the current study, the collected data were randomly divided into two parts (70% training and 30% test data). To ensure that a robust model was identified, the cross-validation technique was used during the model training process. The test data set was used for model validation, i.e., to evaluate the ability of the model to predict unseen data. Finally (i.e., in the fourth stage), the trained machine learning models were used to predict the rental value of the residential properties in the test data set. The machine learning models were implemented using the R programming software [61] and the "tidymodels" package [65].
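The random 70:30 division can be illustrated with a short, standard-library-only Python sketch. It is not the authors' implementation (which used R and tidymodels); the function name and the random seed are assumptions. With 389 records, a 70% share rounded down gives 272 training and 117 test records.

```python
import random


def train_test_split(rows, train_fraction=0.70, seed=42):
    """Shuffle the rows and split them into training and test subsets.

    The seed is an arbitrary assumption, used only to make the split repeatable.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


# Row identifiers standing in for the 389 adverts.
train_rows, test_rows = train_test_split(range(389))

if __name__ == "__main__":
    print(len(train_rows), len(test_rows))
```

The test subset is set aside at this point and is only used again for the final out-of-sample evaluation.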

Methods for Evaluating Accuracy of Models
There are several metrics available for evaluating the predictive accuracy of classification models. However, the suitability of a particular metric is largely dependent on the circumstance of its application. Accuracy, sensitivity, specificity, recall, and precision are some examples of metrics used for evaluating models [66,67]. In this study, the trained models were evaluated using accuracy, F1 score, and Matthews Correlation Coefficient (MCC). The formulae used for computing these metrics can be found in published studies [67–69]. These metrics were adopted due to their usage in similar previous research [68,70]. The values of accuracy and F1 score range between 0 and 1 (a value close to 1 indicates that the model is generating reliable predictions). The value of MCC ranges between +1 and −1 (+1 refers to a perfect match between predicted and actual data; 0 indicates that the model is no better than a random flip of a coin; −1 represents a model in which there is no agreement between predicted and actual data).
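For reference, the three metrics can be computed from the counts of a binary confusion matrix, as in the Python sketch below. The formulas are the standard definitions of accuracy, F1 score, and MCC; the choice of "A" as the positive class and the toy labels are assumptions made only for this example.

```python
import math


def confusion_counts(actual, predicted, positive="A"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(tp, fp, tn, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels, not results from the study.
actual = ["A", "A", "B", "B", "A", "B"]
predicted = ["A", "B", "B", "B", "A", "A"]
counts = confusion_counts(actual, predicted)

if __name__ == "__main__":
    print(accuracy(*counts), f1_score(*counts), mcc(*counts))
```

On the toy labels, accuracy and F1 are both about 0.67 while MCC is about 0.33, which illustrates why the metrics can rank models differently.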

Results
The results from the application of the two data mining techniques are presented in the subsequent subsections.

Text Mining
The files containing the textual data from the adverts were imported into the R programming environment, where the text mining process was implemented. Based on the analysis of text, the most frequent words and the co-occurrence of words are presented in the two subsequent sections (Figures 2 and 3).

Word Frequency
Figure 2 shows the 20 most frequent words in rental adverts for residential properties. Figure 2 shows that "bathrooms" is the most frequent word mentioned in the rental adverts for residential properties. As expected, the number of bathrooms, number of bedrooms, parking, floor size, and furniture are the most mentioned attributes in the adverts for residential properties. The most frequently mentioned words give potential tenants insights into the size of spaces and the facilities available in the residential properties. Surprisingly, sustainable features were not among the most frequent words included in the rental adverts of residential properties, with the exception of "garden". This finding suggests that potential tenants may not be willing to pay a premium for green features incorporated into residential properties.
Real Estate 2024, 1, FOR PEER REVIEW
Figure 3. Word co-occurrence analysis of adverts for residential properties.

Word Co-Occurrence
Co-occurrence analysis was applied to the rental adverts of residential properties. In this phase, the co-occurrence analysis was used to identify the words that are frequently mentioned together in the rental adverts for residential properties. The results of the word co-occurrence analysis are presented in Figure 3. Figure 3 shows that the words "bathrooms", "bedrooms", "parking", and "floor-size" (floor area) were mentioned in all 389 rental adverts for residential properties. This finding suggests that "bathrooms", "bedrooms", "parking", and "floor-size" (floor area) are the most important determinants of the rental value of residential properties. In South Africa, inefficient public transport services have contributed to high levels of car ownership [71]. As expected, proximity to public transportation was rarely mentioned in the adverts for residential properties.
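One simple way to compute advert-level co-occurrence is to count, for each pair of terms, the number of adverts in which both appear. The Python sketch below illustrates that idea; it is an assumption about the general technique rather than a reproduction of the procedure behind Figure 3, and the vocabulary and sample adverts are invented for the example.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence(adverts, vocabulary):
    """Count how many adverts mention both terms of each vocabulary pair."""
    vocab = set(vocabulary)
    pair_counts = Counter()
    for advert in adverts:
        words = set(re.findall(r"[a-z]+", advert.lower()))
        present = sorted(words & vocab)  # sorted so each pair has one canonical order
        pair_counts.update(combinations(present, 2))
    return pair_counts


# Made-up adverts and vocabulary for illustration only.
sample_adverts = [
    "2 bedrooms, 2 bathrooms, parking",
    "bedrooms and bathrooms with garden",
    "parking and garden",
]
sample_vocabulary = ["bedrooms", "bathrooms", "parking", "garden"]
pairs = cooccurrence(sample_adverts, sample_vocabulary)

if __name__ == "__main__":
    for pair, count in pairs.most_common():
        print(pair, count)
```

Pairs that appear together in every advert would receive a count equal to the number of adverts, which corresponds to the situation reported above for "bathrooms", "bedrooms", "parking", and "floor-size".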

Predictive Performance of Machine Learning Models
To assess the efficacy of using machine learning models for the prediction of rental prices of residential properties, several models were trained, and the predictive performance of the models was compared. As stated previously, the collected data were divided into two groups: a training set (272 of 389 records; 70%) and a test set (117 of 389 records; 30%). The variables used to train the models are presented in Table 2. Table 3 shows the classification of rental prices that the models were trained to predict. The optimal parameters for the machine learning models were identified using grid search and 10-fold cross-validation of the training data. To prevent data leakage, only the training data were used for the process of training the models.
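The combination of grid search and 10-fold cross-validation can be sketched generically as below. This Python sketch is illustrative only: the study used R and tidymodels, and the parameter name "depth" and the scoring function are hypothetical stand-ins for fitting and scoring a real model on each fold.

```python
import random


def k_fold_indices(n_rows, k=10, seed=1):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def grid_search(candidates, score_fn, n_rows, k=10):
    """Return the candidate with the best mean cross-validated score."""
    best_params, best_score = None, float("-inf")
    for params in candidates:
        scores = [score_fn(params, tr, va) for tr, va in k_fold_indices(n_rows, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, training, validation):
    """Hypothetical stand-in for 'fit on training, score on validation'."""
    return -abs(params["depth"] - 3)


# Hypothetical grid; 272 is the size of the training set reported above.
grid = [{"depth": d} for d in (1, 3, 5)]
best, score = grid_search(grid, toy_score, n_rows=272)

if __name__ == "__main__":
    print(best, score)
```

Because the folds are drawn only from the 272 training records, the 117 test records play no part in parameter selection, which is the leakage safeguard described above.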
The optimal parameters of the machine learning models are shown in Table 4. The trained models were first used to generate predictions of rental prices for the training data set, and their predictive performance is also summarised in Table 4. For this in-sample evaluation, two metrics (accuracy and F1 score) indicate that the neural network model generates better predictions than the other machine learning models. However, the Matthews correlation coefficient (MCC) indicates that the predictions from the support vector machine model are the most accurate, i.e., they show the highest level of agreement between the predicted and actual rental values of residential properties. This finding reiterates the importance of using multiple metrics to evaluate the performance of prediction models, as suggested in [69].
The trained models were then used to generate predictions for the test data set, i.e., the model validation process. Model validation evaluates the ability of a model to predict previously unseen data. As with the in-sample evaluation, three metrics were used to compare the predictions with the actual rental values of residential properties in the test data set. Table 5 provides an overview of the performance metrics emanating from the model validation process. The values of accuracy, F1 score, and MCC for the random forest model are closer to 1 than those of the other models. This finding shows that the random forest model generates reliable predictions of the rental value of residential properties.
Traditionally, machine learning models are referred to as "black-box" models because the effect of the independent variables on the predicted variable is unknown. Sensitivity analysis has been proposed as one of the methods that can be used to uncover the strength of the relationship between variables incorporated into machine learning models [49]. Sensitivity analysis quantifies the contribution of each independent variable to improvement in the model's ability to make reliable predictions of rental value. Given the predictive performance of the random forest model, sensitivity analysis was carried out on this model (see Figure 4). It can be seen from Figure 4 that floor area, number of bathrooms, and furniture are the best predictors of the rental value of residential properties. In contrast, gardens, pools, and dining areas had the least impact on the ability of the model to predict the rental value of residential properties.
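To make the three evaluation metrics and the idea of sensitivity analysis concrete, the sketch below computes accuracy, F1 score, and MCC for a multi-class prediction, and then estimates the influence of each input by shuffling it and measuring the drop in MCC. Several points are assumptions rather than details taken from the study: macro-averaging of the F1 score, the multi-class generalisation of MCC, the class labels, and the toy data. The permutation approach is a simple stand-in and is not necessarily the sensitivity analysis method of [49] used to produce Figure 4.

```python
import math
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of predictions that match the actual class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient; 1 means perfect agreement."""
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    cov = c * s - sum(true_counts[k] * pred_counts[k] for k in labels)
    var_true = s * s - sum(v * v for v in true_counts.values())
    var_pred = s * s - sum(v * v for v in pred_counts.values())
    denom = math.sqrt(var_true * var_pred)
    return cov / denom if denom else 0.0


def permutation_sensitivity(predict, rows, y_true, metric=mcc, seed=0):
    """Drop in `metric` after shuffling each feature column in turn.

    `rows` is a list of equal-length tuples; a larger drop suggests that
    the model relies more heavily on that feature.
    """
    rng = random.Random(seed)
    baseline = metric(y_true, [predict(r) for r in rows])
    drops = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drops.append(baseline - metric(y_true, [predict(r) for r in shuffled]))
    return drops


# Toy example (hypothetical labels and data, not taken from the study).
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]
scores = (accuracy(y_true, y_pred), macro_f1(y_true, y_pred), mcc(y_true, y_pred))

# Toy "model" that looks only at floor area (column 0) and ignores pool (column 1).
rows = [(40 + 10 * i, i % 2) for i in range(20)]
rule = lambda r: "high" if r[0] > 130 else "low"
labels = [rule(r) for r in rows]
drops = permutation_sensitivity(rule, rows, labels)
```

In the toy example, shuffling the pool column leaves the predictions unchanged, so its sensitivity is zero. This mirrors the kind of low-impact ranking reported for gardens, pools, and dining areas.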

Discussion
Very little was found in the existing literature on the impact of green building features on the rental values of residential properties. With respect to this gap in current knowledge, the present study shows that gardens had little effect on the model's ability to predict the rental value of residential properties when compared with floor area. Also, floor area, number of bathrooms, and availability of furniture were found to be the attributes that had the largest impact on the rental value of residential properties. Interestingly, the random forest model generates more accurate predictions of the rental value of residential properties than the other machine learning models developed in this study. Comparing the qualitative and quantitative data, it is evident that the words frequently mentioned in adverts are good predictors of the rental value of properties.
The results emanating from the current study are consistent with those reported in previous studies [16,72,73]. Abidoye and Chan [16] showed that the number of rooms in boys' quarters, number of bedrooms, sea view, and number of bathrooms are good predictors of the value of residential properties. Also, Nor et al. [72] found that size, location, and security have significant impacts on the rental value of residential properties. A study conducted in Germany showed that the age and size of a residential property have significant impacts on its rental value [73]. These results are also consistent with those reported in Cespedes-Lopez's study [74], which showed that green features (such as green rating) do not influence the value of residential properties located in Alicante, Spain. Overall, the findings provide further support for the hypothesis that the size of residential properties is a good indicator of their value.
The findings reported here are contrary to those of previous studies, which have suggested that green features have a significant impact on the rental value of residential properties. For instance, Choy et al. [28] showed that a positive relationship exists between the presence of a garden and the value of residential properties. Davis et al. [75] showed that improving the energy performance of a residential building results in an increase in its value. In the present study, gardens were rarely mentioned in adverts when compared to other significant features, such as the number of bedrooms. In addition, sustainable features, such as photovoltaic systems, are not among the most commonly mentioned words in rental adverts for residential properties. The findings of the current study suggest that green features have a limited impact on the rental value of residential properties when compared to floor area.
The non-convergence of the findings could be due to the uniqueness of the property market in various countries. For instance, Hong Kong has laws to encourage developers to incorporate green features into new residential developments. However, such laws do not exist in the study area. The low impact of green features suggests that there is little or no incentive for home owners and property developers to incorporate them into existing and new homes. One of the main issues that emerges from these findings is the need for the development and implementation of policies targeted at encouraging the inclusion of sustainable features in residential properties.

Conclusions
An understanding of the effect of green features on the value of residential properties is a topical issue within the built environment. This information would provide a justification for the inclusion of green features in the designs of residential properties. This study has identified that floor area has a great impact on the ability to predict the rental value of residential properties. The impact of gardens on the rental value of residential properties is limited. This study has also shown that the estimated random forest model is a good tool for predicting the rental value of residential properties.
Based on the findings of this study, the following conclusions can be drawn: (i) there is a need to incentivise the inclusion of green features in residential properties; (ii) machine learning models are useful tools for the prediction of the value of residential properties; and (iii) the content of advertisements gives insights into the features that influence the value of a product. The predictive models reported here have extended our knowledge of the impact of size and other features of a residential property on its rental value. Although this study focuses on residential property valuation, the findings may help to explain the underlying reasons for the slow uptake of green technologies in developing countries.
Being limited to Cape Town, the developed model may not be applicable to the whole of South Africa. The data used for the development of the models were secondary data (i.e., data published on the website). The authors did not check whether there were differences between the listed price and the finally agreed price. Notwithstanding these limitations, the triangulation of qualitative and quantitative data indicates that a robust approach was utilised in the present study. Advancements in the field of data science have led to the emergence of new methods. Further research should be undertaken in other cities in the developing world to validate or disprove the findings of the current study. Unless governments incentivise the inclusion of green features in residential properties in the developing world, sustainable development will not be achieved.

Figure 2. Most frequent 20 words in the adverts for residential properties.

Figure 3. Word co-occurrence analysis of adverts for residential properties.

Figure 4. Sensitivity analysis of the random forest model.

Table 1. Summary of studies on residential property value determinants.

Table 2. Descriptive statistics relating to the collected data.
* This classification is based on income tax. See Section 3.2 and Table 3 for an explanation.

Table 3. Classification of rental price.

Table 4. Predictive performance of machine learning models on training set.

Table 5. Predictive performance of machine learning models on test set.