Exploring the Spatial Distribution Characteristics of Emotions of Weibo Users in Wuhan Waterfront Based on Gender Differences Using Social Media Texts

Abstract: The benefits of the natural environment in urban space have been explored in numerous studies. However, only a few statistics and studies have been conducted on the correlation between emotion and urban waterfront space, especially considering gender differences. Taking Wuhan city as an example, this study puts forward a new approach and perspective. Text emotion analysis is combined with the spatial analysis technique based on big data of social media. Based on the emotions of the public of different genders in urban space, suggestions are provided for urban planning and development from the perspective of POI (Point of Interest). The main steps are: (1) analyzing the emotional score of Weibo texts published by citizens in the waterfront area of 21 lakes in Wuhan City; (2) exploring the public emotion characteristics of different genders in the urban waterfront; (3) classifying the waterfront according to the emotional response (score) of the public of different genders; (4) exploring the relationship between different POI types and waterfront types and proposing planning suggestions. The results of this study provide evidence for gender differences and spatial distribution of public emotions in the Wuhan waterfront area. It can help decision-makers to judge the prior protection and development direction of waterfront space, thus demonstrating the feasibility of this approach.


Introduction
Waterfront refers to the land or buildings adjacent to rivers, lakes, and oceans in a city, specifically the part of the town adjacent to the water body [1]. Waterfront is defined as an area of water interaction between urban development and the needs of the city and its residents, and is considered an essential part of the urban public space system [2,3]. The open waterfront space between the land and the water can enhance the accessibility and intimacy between people and nature [4]. A waterfront functions similarly to a park, but it has unique charm [4]. In the modern development of Chinese towns, the rapid development of cities caused the destruction of various waterfront areas [5,6]. Subsequent waterfront development often causes social, economic, and environmental problems [7][8][9][10][11]. Thus, since the 1980s, waterfront areas have become the focus of intensified planning intervention and urban renewal [3]. Urban waterfront is currently one of the most sensitive areas in urban ecological environment and lifestyle [12]. In recent years, researchers have come to realize the need to protect green sanctuaries because of numerous unhealthy living conditions in cities [13][14][15][16][17][18][19][20].
Emotion is a complex multidimensional feature, which reflects people's personality and behavioral characteristics [21]. People use various forms of communication to convey their emotions to others [21]. Most of the early studies have collected data through social surveys, which have a limited sample

Data Sources and Preprocessing
The social media dataset used in this study was obtained from Sina Weibo, a platform launched by Sina.com for sharing, disseminating, and accessing information based on user relationships. The platform accounts for 57% of all Chinese microblog users (across services including Tencent Weibo, NetEase Weibo, and Sohu Weibo) and 87% of all Chinese microblog activity, making it one of the most visited websites in Mainland China [1]. All further references to Weibo in this paper are to Sina Weibo. Users can publish information through web pages, WAP pages, external programs, and mobile phone text messages or MMS to achieve instant sharing. The Weibo dataset used in this study includes the text content posted by the user, the geographic location (i.e., longitude and latitude) of the user at the time of publishing, the time of publication, and the gender of the user. For space reasons, Figure 2 shows only part of the raw data. The data cover the Wuhan area from 1 January to 31 December 2018 and contain 997,832 texts, including 343,973 texts from male users and 653,859 texts from female users.

ISPRS Int. J. Geo-Inf. 2020, 9

or computer science (CS), we propose a trans-disciplinary method. This paper is structured as follows: After this introduction, the second part describes the basic natural and social background of the study area, the data content, and the preprocessing used in this study. The third part introduces the method of scoring the sentiment of Weibo texts and the geographically weighted regression (GWR) and multiple linear regression methods used in the subsequent analyses. Results, discussion, and conclusions are presented in the next three sections.

Study Area
Wuhan (113°41'-115°05' E, 29°58'-31°22' N) is the central city of Central China (Figure 1), the largest city in the middle reaches of the Yangtze River and the most populous city in Central China [74]. The total land area of Wuhan is 8,589.15 km², with a permanent population of 59.17 million in 2018 [75]. Wuhan is traversed by the world's third largest river, the Yangtze River, and its largest tributary, the Han River, and is geographically divided into three towns: Wuchang, Hankou, and Hanyang [76]. The rivers in the city are intertwined, and the water area accounts for a quarter of the city's total area [77]. According to the "Wuhan City Lake Protection Regulations" issued by the Wuhan Municipal Water Affairs Bureau, Wuhan City has a total of 166 lakes. Among them, Tangxun Lake, with an area of 47.6 km², is currently the largest urban lake in China [78].
Wuhan has three ring roads, and the Third Ring Road roughly marks the boundary between the city and the suburbs [79]. The Third Ring Road has a total length of 91 km and surrounds the entire downtown area. The area within the Third Ring Road is relatively mature in its urban development and has developed rapidly in the economic, cultural, and other fields. The geographic locations of the Weibo data are likewise concentrated within the Third Ring Road. Therefore, the study area is the buffer zone of the 21 major lakes within the Third Ring Road (Figure 1).

The POI data were obtained in 2018 through Python calls to the open API (Application Programming Interface) of the online mapping service Amap (http://lbs.amap.com/), with a total of 250,000 records. An API is a set of routines through which an application can access the functions of a piece of software or hardware. Owing to limitations of the Sina.com data, the position coordinates of some text records are incomplete or fall beyond the boundary of Wuhan, so the data need to be cleaned and preprocessed. After processing, the complete and valid Weibo texts within the Third Ring Road are selected and separated by gender, giving a total of 651,177 texts. Following previous research on the waterfront buffer zone in Wuhan City [80], the area extending 1000 m outward from the 21 main lakes within the Third Ring Road is selected as the waterfront area in this study. All references to "buffer" in this study refer to this range. The buffer contains 266,294 Weibo texts, of which 87,861 are from male users and 178,433 from female users. These data are visualized using ArcGIS 10.5 (Figure 3). The same method is used to extract the POI data in the buffer. After cleaning, 90,000 POI records are obtained. According to their attributes, these POI data are divided into five categories: shopping, leisure, infrastructure, culture, and restaurant.
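The coordinate cleaning and gender split described above can be sketched as follows. This is only an illustration under stated assumptions: the `Post` record, the gender codes "m"/"f", and the use of a simple bounding box (taken from the coordinates quoted in the Study Area section) are hypothetical simplifications; the study itself clips the data to the Third Ring Road and to the 1000 m lake buffers in a GIS.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Post(NamedTuple):
    """Hypothetical record layout for one Weibo text."""
    text: str
    lon: Optional[float]   # longitude in decimal degrees, None if missing
    lat: Optional[float]   # latitude in decimal degrees, None if missing
    gender: str            # "m" or "f" (assumed encoding)


# Bounding box of Wuhan (113°41'-115°05' E, 29°58'-31°22' N) in decimal degrees.
LON_MIN, LON_MAX = 113 + 41 / 60, 115 + 5 / 60
LAT_MIN, LAT_MAX = 29 + 58 / 60, 31 + 22 / 60


def is_valid(post: Post) -> bool:
    """Keep only posts with complete coordinates inside the bounding box."""
    if post.lon is None or post.lat is None:
        return False
    return LON_MIN <= post.lon <= LON_MAX and LAT_MIN <= post.lat <= LAT_MAX


def clean_and_split(posts: Iterable[Post]) -> Dict[str, List[Post]]:
    """Drop invalid posts and group the remaining ones by gender."""
    by_gender: Dict[str, List[Post]] = {"m": [], "f": []}
    for post in posts:
        if is_valid(post) and post.gender in by_gender:
            by_gender[post.gender].append(post)
    return by_gender


sample = [
    Post("nice walk by the lake", 114.40, 30.55, "f"),
    Post("missing coordinates", None, 30.55, "m"),
    Post("outside Wuhan", 116.40, 39.90, "m"),
    Post("evening run", 114.30, 30.60, "m"),
]
groups = clean_and_split(sample)
```

A bounding box is the cheapest possible filter; a point-in-polygon test against the ring road and buffer geometries would replace `is_valid` in a fuller pipeline.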

Extended Weibo Text Emotion Dictionary
Different from ordinary spoken, written, and natural languages, Weibo texts contain considerable online vocabulary and network symbols, such as hyperlinks, popular words, multiple symbols, spaces, hashtags, and mentions of other users. These words increase the difficulty of Weibo sentiment analysis, and the existing common dictionaries are still lacking in these aspects.
In terms of Chinese sentiment dictionary resources, HowNet is a relatively comprehensive knowledge base in China. With the help of HowNet, scholars have built emotion dictionaries for specific fields. In sentiment analysis, the sentiment information carried by sentiment words is extremely important for accurately determining the sentiment polarity of a sentence. To make full use of this information, this study calculates the sentiment score according to the frequency of sentiment words in documents of different polarities, using a more suitable and targeted dictionary. First, starting from the HowNet dictionary [81], social emotion categories are determined based on the existing literature on social emotion and the goals of the analysis; existing emotion dictionaries are integrated, typical Weibo emotion words are added, and a small-scale benchmark emotion dictionary is established. Second, the deep learning tool Word2Vec is used to analyze a corpus of trending social events and comments on the Weibo platform and to expand the benchmark dictionary incrementally. The construction process is shown in Figure 4.

Pointwise mutual information (PMI) is a common information measure, mainly used to calculate the semantic similarity between words [82]. The main principle is to count the probability of two words appearing in the same text: the higher the probability, the stronger the correlation. PMI is calculated as follows:

$$\mathrm{PMI}(word_1, word_2) = \log \frac{N \cdot df(word_1 \,\&\, word_2)}{df(word_1)\, df(word_2)} \tag{1}$$

where $N$ is the total number of documents in the corpus, $df(word_1)$ and $df(word_2)$ are the numbers of documents in which $word_1$ and $word_2$ appear, and $df(word_1 \,\&\, word_2)$ is the number of documents in which both words appear.

PMI is combined with semantic orientation (SO) to capture sentiment words. The basic idea of the SO-PMI algorithm is as follows. First, sets of commendatory and derogatory words are selected as reference words; let Pwords and Nwords denote these two sets. These emotional words must be unambiguous and highly representative in the field. The mutual information between $word_1$ and Nwords is then subtracted from the mutual information between $word_1$ and Pwords, and the emotional tendency of $word_1$ is judged from the difference:

$$\mathrm{SO\text{-}PMI}(word_1) = \sum_{p \in Pwords} \mathrm{PMI}(word_1, p) - \sum_{n \in Nwords} \mathrm{PMI}(word_1, n)$$

Generally, 0 is taken as the neutral value of the SO-PMI algorithm: a word with a positive SO-PMI value is treated as a positive word, and a word with a negative value as a negative word.
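As an illustration of PMI and SO-PMI, the sketch below computes both from document frequencies on a toy corpus. The corpus, the seed words, the base-2 logarithm, and the convention that a pair that never co-occurs contributes 0 (rather than negative infinity) are assumptions made for the example, not details taken from the study.

```python
import math
from typing import List, Sequence, Set


def doc_freq(docs: Sequence[Set[str]], *words: str) -> int:
    """Number of documents that contain all of the given words."""
    return sum(1 for doc in docs if all(w in doc for w in words))


def pmi(docs: Sequence[Set[str]], w1: str, w2: str) -> float:
    """PMI from document frequencies: log2(N * df(w1 & w2) / (df(w1) * df(w2)))."""
    joint = doc_freq(docs, w1, w2)
    if joint == 0:
        return 0.0  # assumed convention: no co-occurrence gives no evidence
    n = len(docs)
    return math.log2(n * joint / (doc_freq(docs, w1) * doc_freq(docs, w2)))


def so_pmi(docs: Sequence[Set[str]], word: str,
           pwords: List[str], nwords: List[str]) -> float:
    """Association with positive seed words minus association with negative ones."""
    positive = sum(pmi(docs, word, p) for p in pwords)
    negative = sum(pmi(docs, word, n) for n in nwords)
    return positive - negative


# Toy corpus: each document is the set of its segmented words (hypothetical).
corpus = [
    {"lake", "beautiful", "happy"},
    {"lake", "beautiful", "good"},
    {"traffic", "terrible", "bad"},
    {"traffic", "noisy", "bad"},
    {"park", "happy", "good"},
    {"noisy", "sad"},
]
P_WORDS = ["happy", "good"]   # commendatory seed words (assumed)
N_WORDS = ["bad", "sad"]      # derogatory seed words (assumed)

beautiful_score = so_pmi(corpus, "beautiful", P_WORDS, N_WORDS)
noisy_score = so_pmi(corpus, "noisy", P_WORDS, N_WORDS)
```

With this corpus, "beautiful" receives a score above 0 and "noisy" a score below 0, matching the sign rule stated in the text.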
Given the large number of network terms on Weibo, this study manually annotates the emotion of the commonly used words, emotion symbols, and emoticons among these terms. The annotated emotion words are then merged with the HowNet emotion dictionary, yielding a dictionary of 4968 emotion words.

Scoring of Sentiment Words in Weibo Texts
To reduce errors and improve the efficiency of word segmentation, the text must first be filtered. Regular expression operations in Python (the "re" module) are used to remove interference such as hyperlinks, hashtags, and mentions of other users. A short Python 3.5 script cleans and preprocesses the Weibo text, and a word segmentation tool is then applied to split the text into words.
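A minimal sketch of such a filtering step with the standard `re` module is shown below. The specific patterns (URLs, `@` mentions, `#topic#` hashtags, repeated whitespace) and the function name `clean_weibo_text` are assumptions chosen for illustration; the study does not list its exact expressions.

```python
import re

# Assumed patterns for common kinds of interference in Weibo text.
URL_RE = re.compile(r"https?://\S+")    # hyperlinks
MENTION_RE = re.compile(r"@[\w\-]+")    # mentions of other users
HASHTAG_RE = re.compile(r"#[^#]+#")     # Weibo-style #topic# hashtags
SPACE_RE = re.compile(r"\s+")           # runs of whitespace


def clean_weibo_text(text: str) -> str:
    """Remove links, mentions and hashtags, then normalize whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


cleaned = clean_weibo_text(
    "East Lake is lovely! #Wuhan# @friend http://t.cn/abc  so happy"
)
```

URLs are removed first so that characters inside a link cannot be mistaken for a mention or hashtag. Punctuation such as "!" is deliberately kept, because the later scoring steps rely on it.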
Based on the extended microblog sentiment dictionary obtained through the above method, a sentiment value is calculated for each collected microblog text. First, the text of a single microblog is preprocessed: punctuation marks are used as delimiters to divide the microblog into n sentences (clauses), and the emotion words in each clause are extracted. The next two steps operate on clauses. Second, each sentiment word is looked up in the sentiment vocabulary list; the degree adverbs and negative words that modify it are searched for in turn, the corresponding score is calculated, and the scores of all sentiment words in the clause are summed. Third, each clause is checked to determine whether it is an exclamatory sentence, a rhetorical question, or a sentence containing emoji; if one of these conditions is met, the corresponding weight is added to or subtracted from the clause score. Finally, the scores of all clauses of the microblog are accumulated to obtain its final score. The flow chart is shown in Figure 5.

A Weibo text (i.e., plain text) can be divided into three categories according to its elements: text, punctuation, and emoji. Text is the core of sentiment analysis. Emotional text can be divided into adjectives, adverbs, negative words, and interrogative words. Different vocabularies have distinct emotional tendencies [83]. The specific weights are shown in Table 1.
For punctuation, this study mainly considers exclamation marks and question marks. The exclamation mark "!", also known as the sentiment sign, is mainly used at the end of exclamatory sentences to express strong feelings. Exclamatory sentences in Weibo messages mostly intensify the emotion expressed by the user, changing the degree of the emotional tendency. An exclamatory sentence usually takes the emotional polarity of the sentiment sentence to which it belongs and deepens its positive or negative emotion. Most rhetorical questions in Weibo messages carry strong emotion and express negative feelings, typically questioning an event, a product, a person, or an organization, and their tone is relatively strong. Therefore, when a Weibo message contains no emotion words, its emotional tendency can be obtained by judging whether an interrogative sentence is a rhetorical question.

This study treats the exclamation mark as an intensifier of the emotion words in an exclamatory sentence. To some extent, "!" is the marker of an exclamatory sentence, so simple processing can be performed by setting the weight of "!" to 2. To process exclamatory sentences, first read the feature word $w$ in the preprocessed string $S_1$ and determine whether $w$ is "!". If $w$ is not "!", read the next feature word $w_1$. If $w$ is "!", search the preceding text for the nearest sentiment word. If a sentiment word exists, its weight is multiplied by the weight of "!"; if none exists, processing continues directly.

Rhetorical questions are processed in the same way as exclamatory sentences. A rhetorical question is marked by "?". First, find "?", and then check whether rhetorical marker words accompany it. The presence of rhetorical marker words confirms a rhetorical question, in which case the weight of "?" is read; otherwise, processing continues directly. In this study, the weight of "?" is set to −2; that is, when a rhetorical question exists, its weight is directly −2.
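The punctuation rules above can be sketched as follows. The weights 2 and −2 come from the text; the toy lexicon, the English rhetorical marker words, and the function name `score_clause` are hypothetical, and the clause is assumed to be already segmented into tokens.

```python
from typing import List

EXCLAMATION_WEIGHT = 2    # weight of "!" as stated in the text
RHETORICAL_WEIGHT = -2    # weight of "?" when a rhetorical question is found

# Hypothetical toy lexicons, for illustration only.
SENTIMENT = {"beautiful": 1.0, "happy": 1.0, "terrible": -1.0}
RHETORICAL_MARKERS = {"really", "seriously"}


def score_clause(tokens: List[str]) -> float:
    """Score one clause using the "!" and "?" rules."""
    word_scores: List[float] = []
    rhetorical = False
    for tok in tokens:
        if tok in SENTIMENT:
            word_scores.append(SENTIMENT[tok])
        elif tok == "!" and word_scores:
            # Multiply the nearest preceding sentiment word by the "!" weight.
            word_scores[-1] *= EXCLAMATION_WEIGHT
        elif tok == "?" and any(t in RHETORICAL_MARKERS for t in tokens):
            # A "?" accompanied by a marker word signals a rhetorical question.
            rhetorical = True
    if rhetorical:
        return float(RHETORICAL_WEIGHT)
    return float(sum(word_scores))


exclaim = score_clause(["lake", "beautiful", "!"])
no_sentiment = score_clause(["look", "!"])
rhetorical_q = score_clause(["really", "clean", "?"])
plain_q = score_clause(["where", "is", "it", "?"])
```

An exclamation mark with no preceding sentiment word leaves the score unchanged, and an ordinary question without marker words is not penalized, mirroring the "continue the subsequent processing" branches in the description.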

Weibo Text Sentiment Weighting Calculation
Some Weibo messages contain both emotion words and emojis, and they may also contain various modifiers or even mood sentences. To simplify the calculation and reduce the time overhead, emoticons are incorporated into the Weibo sentiment dictionary according to their sentiment polarity. After the microblog text is preprocessed, the feature items of different polarities are first identified using the constructed vocabularies of emotion words, negative words, degree adverbs, and rhetorical question markers, and are processed accordingly. The total weight of the feature items in a message is then calculated to obtain the sentiment tendency value of the entire message, from which its sentiment tendency is judged.

Because each Weibo message is a short text of no more than 140 characters, each message $D_j$ is divided into $n$ sentences $S_1, S_2, \ldots, S_n$, using punctuation marks as sentence delimiters. The sentiment words $w_i$ in each sentence are then extracted. If a degree adverb $w_a$ or a negative word $w_b$ modifies the sentiment word $w_i$, or the sentence is an exclamatory sentence containing sentiment words, the score of $w_i$ is adjusted by the corresponding weights, where $R_{w_a}$ represents the weight of the degree adverb or exclamation mark, $R_{w_b}$ represents the weight of the negative word, and $S_{w_i}$ represents the weight of the emotion word $w_i$ in the text.

In each Weibo text, a sentence $S_i$ contains $k$ sentiment words $w_1, w_2, \ldots, w_k$, and the overall sentiment value of the sentence is obtained from the scores of these $k$ words. Consistent with the sentiment dictionary, 0 is set as the central sentiment value. With an adjustable parameter $\mu$, the user's emotional tendency score is mapped to the interval $(-6, 6)$ to facilitate calculation, where $Score_{D_j}$ represents the sentiment score of the user's microblog text and $\mu$ adjusts the importance of sentiment scores of different polarities.
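The weighting scheme can be illustrated with the sketch below. Since the formulas themselves are not reproduced in this text, the sketch assumes that the modifier weights act multiplicatively on the word weight and that sentence scores are summed; the lexicons and their values are hypothetical, and the mapping to $(-6, 6)$ with the parameter $\mu$ is not implemented.

```python
import re
from typing import List

# Hypothetical toy lexicons; the real weights are those of the study's Table 1.
SENTIMENT = {"happy": 1.0, "beautiful": 1.0, "sad": -1.0}   # word weights S_w
DEGREE = {"very": 2.0, "slightly": 0.5}                      # degree adverbs R_wa
NEGATION = {"not": -1.0}                                     # negative words R_wb


def score_sentence(tokens: List[str]) -> float:
    """Sum of modified word scores, assuming score = R_wa * R_wb * S_w."""
    total = 0.0
    r_a, r_b = 1.0, 1.0
    for tok in tokens:
        if tok in DEGREE:
            r_a *= DEGREE[tok]
        elif tok in NEGATION:
            r_b *= NEGATION[tok]
        elif tok in SENTIMENT:
            total += r_a * r_b * SENTIMENT[tok]
            r_a, r_b = 1.0, 1.0   # modifiers apply to the next sentiment word only
    return total


def score_message(text: str) -> float:
    """Split a message into sentences at punctuation and add up their scores."""
    sentences = [s for s in re.split(r"[,.;!?]", text.lower()) if s.strip()]
    return sum(score_sentence(s.split()) for s in sentences)


message_score = score_message("The lake is very beautiful, I am not sad.")
```

In this example "very beautiful" contributes 2.0 and "not sad" contributes 1.0. Whitespace tokenization stands in for the Chinese word segmentation step used in the study.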

GWR
To account for spatial correlation and local differences in the values, and to characterize spatially varying relationships more accurately, the GWR method is introduced to measure the correlation across different levels of emotional values.
Fotheringham et al. proposed the GWR model by building on local regression analysis and varying-parameter research and by drawing on the idea of local smoothing [84].
Similar to general regression analysis, GWR defines a research area. The spatial position of each element is used to evaluate an attenuation function, which is a continuous function of distance. When the spatial position and the value of an element are introduced into this function, a weight is obtained, and this weight is integrated into the regression equation. In space, the GWR model is expressed as follows:

$$y_i = \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$

where $y_i$ is the dependent variable at sampling point $i$, $x_{ik}$ is the value of the $k$-th explanatory variable at that point, $\varepsilon_i$ is the error term, $\beta_0$ is a constant term, $(u_i, v_i)$ are the coordinates of sampling point $i$, and $\beta_k(u_i, v_i)$ is the characteristic elastic coefficient at sampling point $i$. For any point $(u_i, v_i)$ in the study area, the coefficients are estimated by weighted least squares. This study uses the commonly used Gaussian function as the weight function and the cross-validation method to determine the bandwidth that relates the weight to the distance.
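To make the local estimation concrete, the sketch below fits a GWR with a single explanatory variable using the closed-form weighted least squares solution at each point. The Gaussian kernel form $\exp(-\tfrac{1}{2}(d/b)^2)$, the fixed bandwidth, and the synthetic data are assumptions for illustration; bandwidth selection by cross-validation is not implemented here.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_weight(dist: float, bandwidth: float) -> float:
    """Gaussian distance-decay weight (assumed kernel form)."""
    return math.exp(-0.5 * (dist / bandwidth) ** 2)


def gwr_local_fit(coords: Sequence[Tuple[float, float]], x: Sequence[float],
                  y: Sequence[float], i: int, bandwidth: float) -> Tuple[float, float]:
    """Weighted least squares at point i; returns (beta0, beta1) for that location."""
    ui, vi = coords[i]
    w = [gaussian_weight(math.hypot(u - ui, v - vi), bandwidth) for u, v in coords]
    sw = sum(w)
    x_bar = sum(wj * xj for wj, xj in zip(w, x)) / sw
    y_bar = sum(wj * yj for wj, yj in zip(w, y)) / sw
    sxx = sum(wj * (xj - x_bar) ** 2 for wj, xj in zip(w, x))
    sxy = sum(wj * (xj - x_bar) * (yj - y_bar) for wj, xj, yj in zip(w, x, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


# Synthetic example: the slope is 1 in the west and 3 in the east.
points: List[Tuple[float, float]] = [(float(j), 0.0) for j in range(10)]
xs = [float(j % 3) for j in range(10)]
ys = [(1.0 if j < 5 else 3.0) * xs[j] for j in range(10)]

west = gwr_local_fit(points, xs, ys, 0, bandwidth=1.0)
east = gwr_local_fit(points, xs, ys, 9, bandwidth=1.0)
```

A global regression would return one compromise slope, whereas the local fits recover a slope close to 1 at the western end and close to 3 at the eastern end, which is the kind of spatial variation GWR is meant to expose.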

Multiple Linear Regression
In the classical regression model, ordinary least squares (OLS) is often used to fit known data and make predictions, establishing the quantitative relationship among multiple factors. The general form of the multiple linear regression model is as follows:

$$y = w_1 x_1 + w_2 x_2 + \cdots + w_d x_d + b$$

where $y$ is the dependent variable, $x_i$ $(i = 1, 2, \ldots, d)$ are the independent variables, $w_i$ $(i = 1, 2, \ldots, d)$ are the partial regression coefficients, and $b$ is the intercept.
Once a multiple linear regression equation and its coefficients are established, they can be interpreted as "the dependent variable affected differently by different independent variables" or "the component of the dependent variable on different independent variables and constants." In this way, the differences in the distribution of men's and women's emotions in the lakeside districts can be verified.
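A small self-contained sketch of OLS for this model is given below. It solves the normal equations with Gaussian elimination so that no external library is needed; the function names and the example data (two hypothetical predictors, which could stand for POI counts of two categories) are illustrative assumptions, not the study's variables.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_fit(rows: Sequence[Sequence[float]],
            y: Sequence[float]) -> Tuple[List[float], float]:
    """Return (weights w_1..w_d, intercept b) minimizing squared error."""
    design = [list(r) + [1.0] for r in rows]        # append intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    coef = solve_linear_system(xtx, xty)
    return coef[:-1], coef[-1]


# Hypothetical data generated from y = 0.5*x1 - 1.5*x2 + 2.
predictors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (0.0, 0.0)]
targets = [0.5 * x1 - 1.5 * x2 + 2.0 for x1, x2 in predictors]
weights, intercept = ols_fit(predictors, targets)
```

Fitting the same model separately to male and female scores and comparing the resulting coefficients is one way to read the gender comparison described above.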
We provide a relatively new perspective for research on emotional feedback to the urban environment. To extract the emotion characteristics, the Weibo texts are scored by a newly proposed method, which requires a new emotion word dictionary and scoring standards. Then, using statistical methods, the quantity distribution and spatial distribution are visualized to capture their characteristics. Finally, the OLS and GWR methods are both applied, with POIs added, to verify the gender differences. The flow chart is shown in Figure 6.

Score Distribution of Emotional Points
To investigate the distribution characteristics of emotions in the waterfront, this study scored the preprocessed Weibo texts. Each text is presented as a point on the map according to its geographic coordinates and is scored according to the sentiment of its content. Each point thus has a sentiment score and is called an emotional point in this study. A positive score indicates that the emotion of the text is positive; the larger the score, the more positive the emotion. A negative score indicates that the emotion of the text is negative; the greater the absolute value of the score, the more negative the emotion. The scores of the emotional points in the buffer zone are counted (Table 2). The results show that the emotional points of men and women in the buffer zone are distributed between −3 and 4 and are normally distributed. Both genders have more positive than negative emotional points, and the number of emotional points in the range 0-1 is the largest. The relative deviation is also used to calculate the gender difference in the distribution of emotional points across score segments (Table 3). Because men published fewer microblog texts than women in the source data, the ratio of female to male emotional points in each score segment is greater than or equal to one. Except for the score band from −3 to −2, which has the least data, the gender differences in the remaining score bands are all greater than 0.3. The ratio of female to male emotional points in each score segment is then compared with the ratio of the total number of female emotional points to the total number of male emotional points (the average) (Figure 7).
Figure 7 shows that the ratio of female to male emotional points in the ranges −2.8 to −1 and 2.2 to 3.2 differs significantly from the average. This indicates that, compared with other levels of emotion, the gender differences in the slightly positive emotions expressed by users in the waterfront area through Weibo texts are relatively subtle.
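The comparison of per-segment ratios with the overall ratio can be sketched as follows. The segment width, the function names, and the sample scores are hypothetical; the relative deviation used for Table 3 is not defined in this text and is therefore not reproduced. For reference, the overall ratio in the buffer implied by the counts reported earlier is 178,433 / 87,861, or about 2.03.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple


def bin_counts(scores: Sequence[float], width: float = 1.0) -> Counter:
    """Count scores per segment [b*width, (b+1)*width)."""
    return Counter(math.floor(s / width) for s in scores)


def ratio_by_bin(female: Sequence[float], male: Sequence[float],
                 width: float = 1.0) -> Tuple[float, Dict[Tuple[float, float], Tuple[float, float]]]:
    """Female-to-male ratio per segment and its difference from the overall ratio."""
    f_counts, m_counts = bin_counts(female, width), bin_counts(male, width)
    overall = len(female) / len(male)
    result = {}
    for b in sorted(set(f_counts) | set(m_counts)):
        if m_counts[b] == 0:
            continue   # ratio undefined when a segment has no male points
        ratio = f_counts[b] / m_counts[b]
        result[(b * width, (b + 1) * width)] = (ratio, ratio - overall)
    return overall, result


# Hypothetical sentiment scores of emotional points.
female_scores = [0.2, 0.5, 0.7, 0.9, 1.5, -0.5]
male_scores = [0.3, 0.6, 1.2]
overall_ratio, per_bin = ratio_by_bin(female_scores, male_scores)
```

Segments whose ratio lies far from the overall ratio are the ones where one gender is over- or under-represented relative to its share of all emotional points.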

Spatial Distribution of Emotional Points
Using the geographic coordinates of the emotional points, their spatial distribution is further explored. The spatial distribution maps for the two genders show no large difference in the overall distribution of emotional points between men and women. Emotional points are denser in the buffer zones of the western lakes in Hankou and Wuchang and relatively sparse in other regions.
Hot spot analysis is also conducted on the emotional points using their scores and geographic coordinates (Figure 8). A Z score indicating statistical significance is calculated for each emotional point from its own score and the scores of the surrounding points. The higher a positive Z score, the more intense the clustering of high values (hot spots); the lower a negative Z score, the more intense the clustering of low values (cold spots). The results show differences in the distribution of hot clusters of male and female emotions. Women's positive emotional hot spots are clustered around the lake in the south of Wuchang, and women's negative emotional hot spots are clustered in the buffer zones of Hankou and southern Wuchang. Men's emotional hot spots, however, are not clustered in these areas.
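The text does not name the statistic behind the hot spot analysis. The sketch below assumes the Getis-Ord Gi* statistic, which is what the Hot Spot Analysis tool in ArcGIS computes, with binary weights inside a fixed distance band; the radius, the coordinates, and the scores are hypothetical.

```python
import math
from typing import Sequence, Tuple


def gi_star(coords: Sequence[Tuple[float, float]], values: Sequence[float],
            i: int, radius: float) -> float:
    """Getis-Ord Gi* z-score for point i (binary weights, point i included)."""
    n = len(values)
    mean = sum(values) / n
    variance = max(0.0, sum(v * v for v in values) / n - mean ** 2)
    s = math.sqrt(variance)
    ui, vi = coords[i]
    w = [1.0 if math.hypot(u - ui, v - vi) <= radius else 0.0 for u, v in coords]
    sw = sum(w)
    sw2 = sum(wj * wj for wj in w)
    denom = s * math.sqrt(max(0.0, n * sw2 - sw ** 2) / (n - 1))
    if denom == 0.0:
        return 0.0   # no variation, or every point is a neighbor
    numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sw
    return numerator / denom


# Hypothetical emotional points: a positive cluster, a negative cluster, neutral points.
locations = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5), (5, 6)]
scores = [3.0, 3.0, 3.0, -2.0, -2.0, -2.0, 0.0, 0.0]

z_hot = gi_star(locations, scores, 0, radius=1.5)
z_cold = gi_star(locations, scores, 3, radius=1.5)
```

A large positive z-score marks a point surrounded by high scores (a hot spot), and a large negative one marks a cold spot; running this separately on male and female points gives the kind of comparison shown in Figure 8.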

Gender Differences of Positive and Negative Emotions in Each Lake
To take the analysis further, the positive and negative scores of the emotional points of each gender are averaged separately for each lake, and the absolute value of each average is taken. These values are called the lake's positive and negative emotion scores. The calculation results are then ranked. The results show that the positive and negative emotion scores of the two genders differ in each lake.
The rankings of each lake's emotion scores among male and female users are compared with one another (Table 4). The ranking of positive sentiment scores shows significant differences in the ranking of the lakes by gender. In the buffer zones of Fruit Lake, Longyang Lake, North Lake, West Lake, and Machine Pond, a significant difference exists in the degree of positive emotions expressed by men and women through microblog texts. In the buffer zones of Small South Lake and Machine Pond, a clear difference exists in the degree of negative emotions expressed by men and women through microblog texts. Around the other lakes, the intensity of positive and negative emotions expressed by men and women through Weibo texts is similar.
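The lake-level scores described above can be sketched in Python as follows. This is an illustrative sketch only: the function name `lake_emotion_scores`, the tuple layout of the input, the decision to ignore points with a score of exactly zero, and the sample values are assumptions.

```python
from collections import defaultdict
from statistics import mean


def lake_emotion_scores(points):
    """Compute (positive score, negative score) per (lake, gender).

    `points` is an iterable of (lake, gender, score) tuples. The positive
    score is the mean of the positive point scores; the negative score is
    the absolute value of the mean of the negative point scores. Points
    scoring exactly zero are ignored (an assumption).
    """
    pos, neg = defaultdict(list), defaultdict(list)
    for lake, gender, score in points:
        if score > 0:
            pos[(lake, gender)].append(score)
        elif score < 0:
            neg[(lake, gender)].append(score)
    keys = pos.keys() | neg.keys()
    return {
        k: (
            mean(pos[k]) if pos[k] else 0.0,
            abs(mean(neg[k])) if neg[k] else 0.0,
        )
        for k in keys
    }


# Hypothetical emotional points, not data from the study.
points = [
    ("East Lake", "F", 2.0), ("East Lake", "F", 1.0), ("East Lake", "F", -1.0),
    ("East Lake", "M", 0.5), ("East Lake", "M", -2.0), ("East Lake", "M", -1.0),
]
lake_scores = lake_emotion_scores(points)
```

Sorting the lakes by either component of the resulting pairs, separately for each gender, gives rankings of the kind compared in Table 4.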

Gender Differences in Comprehensive Emotions of Various Lakes
As the research moves to a finer scale, the positive and negative emotion scores of both genders in each lake are normalized, and a comparative analysis is conducted to obtain the comprehensive emotional state of each lake (Figure 9). For both genders, some lakes show a large difference between positive and negative emotion scores, with one much higher than the other. In these lakes, the sentiment expressed through text by microblog users in the lake buffer is clearly biased toward positive or negative. In other lakes the difference between the positive and negative sentiment scores is small, indicating that the emotions expressed by microblog users in the lake buffer are contradictory, or that positive and negative emotions are balanced.
There are different ways of categorizing emotions, the most basic of which divides them by polarity into positive and negative emotions. Furthermore, work with existing classifiers such as Max Entropy [85] and SVMs [86] suggests that distinguishing a neutral class can help improve the overall accuracy of the classification algorithm. On this basis, the difference between the positive and negative emotion scores of each lake for each gender is compared with the average of these differences, and the lakes are divided into three categories. Lakes whose difference between positive and negative sentiment scores is higher than the mean and whose positive sentiment score is higher than the negative one are called PLs (Positive Lakes). By contrast, lakes whose negative sentiment score is higher than the positive one are called NLs (Negative Lakes). Lakes whose difference between positive and negative sentiment scores is at the level of the regional mean are called BLs (Balanced Lakes).
According to the statistics, five of the 21 lakes have different attributes for male and female users (Table 5). In other words, in terms of comprehensive emotions, lakes with gender differences account for 23.8%. Among the 21 lakes, none is positive for one gender and negative for the other. Within the scope of this research, a gender difference therefore exists in the emotions expressed by users in the waterfront area through microblog texts, but no complete opposition of comprehensive emotions is observed. The lakes with negative attributes are exactly the same for male and female users. In the waterfront area, the gender difference in the negative emotions expressed by users through microblog texts is smaller than the gender difference in positive and balanced emotions.
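The three-way classification and the gender comparison can be sketched as follows. The exact decision rule is an interpretation of the text: the absolute gap between the normalized positive and negative scores is compared with the mean gap across lakes, and lakes at or below the mean are treated as balanced. The function names and sample values are hypothetical.

```python
def classify_lakes(scores):
    """Label each lake PL, NL, or BL for one gender.

    `scores` maps lake -> (positive score, negative score), both normalized.
    Assumed rule: gap <= mean gap -> BL; otherwise PL if positive > negative,
    else NL.
    """
    gaps = {lake: abs(p - n) for lake, (p, n) in scores.items()}
    threshold = sum(gaps.values()) / len(gaps)
    labels = {}
    for lake, (p, n) in scores.items():
        if gaps[lake] <= threshold:
            labels[lake] = "BL"
        elif p > n:
            labels[lake] = "PL"
        else:
            labels[lake] = "NL"
    return labels


def lakes_with_gender_difference(female_labels, male_labels):
    """Return the lakes whose label differs between the two genders."""
    return sorted(
        lake for lake in female_labels
        if female_labels[lake] != male_labels.get(lake)
    )


# Hypothetical normalized scores for three lakes.
female = classify_lakes({"A": (0.9, 0.1), "B": (0.2, 0.8), "C": (0.5, 0.45)})
male = classify_lakes({"A": (0.6, 0.5), "B": (0.2, 0.8), "C": (0.5, 0.45)})
different = lakes_with_gender_difference(female, male)
share = round(5 / 21 * 100, 1)  # five of 21 lakes, as reported in the text
```

Here lake A is a PL for women but a BL for men, which is the kind of attribute difference counted in Table 5, and `share` reproduces the 23.8% quoted in the text.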

Method Comparison
OLS and GWR models were constructed with the Weibo text emotion values as the dependent variable and the five kinds of POIs as explanatory variables, to describe how men's and women's emotions differ as POIs change.
In this comparison, the OLS method is used to illustrate the gender differences across the 21 lakes and their lakeside areas, whereas the GWR method works on the 71 official statistical units to eliminate errors in overlapping areas. In other words, in this research OLS focuses on the overall characteristics, whereas GWR reflects more detailed spatial characteristics.
As for the methods themselves, OLS finds the best-fitting function for the data by minimizing the sum of squared errors. It reflects the quantitative relationship between the explanatory variables and the explained variable, but only as a point-to-point relationship. By considering the relationship between the emotional value and POIs as a whole, OLS is used to verify the gender difference in the distribution of emotional values around the lakes overall. GWR explores the spatial variation of the research object and its driving factors at a given scale by establishing a local regression equation at each point in the study area, and it can be used to predict future results. It builds on OLS by adding spatial variation to the overall quantitative relationship, so it can reflect how the relationship between the variables differs across geographic units. Because it takes into account the local effects of spatial objects, its advantage is higher accuracy. By using GWR to analyze the relationship between emotional value and POIs for the smallest statistical units in the lakeside district, the spatial difference in the gender distribution of emotions can be depicted.
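The contrast between the two methods can be made concrete with a minimal NumPy sketch. It is not the software or configuration used in the study: the Gaussian distance kernel, the fixed bandwidth, the function names `ols` and `gwr`, and the synthetic data are all assumptions, and a single explanatory variable stands in for the five POI types.

```python
import numpy as np


def ols(X, y):
    """Global least-squares fit; returns [intercept, slope(s)]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def gwr(coords, X, y, bandwidth):
    """Local weighted least-squares fit at every observation location.

    Each location gets its own coefficients, estimated with Gaussian
    weights that decay with distance (kernel choice is an assumption).
    """
    coords = np.asarray(coords, dtype=float)
    X1 = np.column_stack([np.ones(len(X)), X])
    betas = np.empty((len(X1), X1.shape[1]))
    for i, c in enumerate(coords):
        d = np.linalg.norm(coords - c, axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = X1.T * w  # weight each observation's column
        betas[i] = np.linalg.solve(XtW @ X1, XtW @ y)
    return betas


# Synthetic example: the POI effect is 1 in the west and 5 in the east.
x = np.arange(1.0, 11.0)
X = np.concatenate([x, x])[:, None]
coords = np.array([(u, 0.0) for u in range(10)]
                  + [(100.0 + u, 0.0) for u in range(10)])
y = np.concatenate([1.0 * x, 5.0 * x])

global_beta = ols(X, y)                          # one slope for the whole area
local_betas = gwr(coords, X, y, bandwidth=2.0)   # one slope per location
```

In this synthetic case the global OLS slope is 3, an average that describes neither region, whereas the local GWR slopes recover 1 in the west and 5 in the east. That is the sense in which GWR adds spatial detail to the overall relationship estimated by OLS.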

Lake Weight of Emotional Gender Differences
Like Weibo sentiment big data, POI data reflect urban area functions in considerable depth. POI data can thoroughly describe the characteristics and changes of urban functions in the time and space dimensions. The POIs used in this study come from the 2018 POI big data collected through Gaode maps. A total of 250,000 records are located within the Third Ring Road of Wuhan, and approximately 90,000 of them are located in the 1 km buffer zones of the 21 lakes in the main city. Five categories, namely, shopping, leisure, infrastructure, culture, and restaurant, are selected according to the attributes of the POIs to verify the differences in the POI components of waterfront mood. It is assumed that female positive and negative emotions and male positive and negative emotions can each be expressed in terms of the numbers of the different POIs. The independent variables are sho, lei, inf, cul, and res, which represent shopping, leisure, infrastructure, culture, and restaurant. Using the method in Section 3.3, OLS can be used to estimate the "weight" of lake emotions expressed on POIs. The collinearity test of the five types of POI found that, in the lake regressions, the variance inflation factor of some variables is too large, indicating significant collinearity. The method used to calculate the dependency between the two is shown in Table 6.
According to the regression parameters, in positive emotions the POI component of both sexes can be expressed in culture and restaurant. The culture coefficient is positive and promotes the emotion value, whereas the restaurant coefficient is negative and inhibits the emotion value. Male emotions are more likely to be expressed in culture, whereas female emotions are more likely to be expressed in restaurant. This finding means that men's positive emotions are more likely to improve because of culture, and women's emotions are less likely to worsen because of restaurant. In negative emotions, men's emotions are expressed in culture, leisure, and shopping, whereas women's emotions are expressed in culture, restaurant, leisure, and infrastructure. The culture and infrastructure coefficients are positive, whereas the others are negative. This finding indicates that women's negative emotional sources are more concentrated on POIs, that their emotional changes are more likely to be amplified by a specific POI, and that cultural facilities have an evident catalytic effect on emotions.
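The collinearity check can be sketched with the standard variance inflation factor, VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing explanatory variable j on the others. The function name `vif` and the sample matrix are hypothetical, and the columns only stand in for POI counts such as sho, lei, and inf.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns.
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Hypothetical predictors: the third column nearly duplicates the first.
a = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
b = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
e = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
X = np.column_stack([a, b, a + 0.1 * e])
factors = vif(X)
```

In this toy matrix the first and third columns have a VIF of about 101 and the second a VIF of 1. A common rule of thumb treats values above roughly 10 as a sign of problematic collinearity, although the threshold applied in the study is not stated in this section.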

Spatial Component of Emotional Gender Differences in the Waterfront
The emotions of men and women in different waterfront areas have different sensitivities to different POIs and show more distinctive characteristics. The GWR calculation performed with the method in Section 3.3 shows that the five types of POI have no significant collinearity in the refined waterfront areas and can be used simultaneously. Positive emotions show more obvious gender differences in POI weight (Figure 10).
The R² values of the model regressions are 0.823 and 0.808, and the adjusted R² values are 0.762 and 0.744, all of which are greater than 0.7. The models therefore fit well and have a high degree of credibility.
In general, a significant difference exists in the geographical distribution of POI components between male and female emotions. Female emotions are more sensitive and flexible, whereas male emotions are relatively calm. The partial regression coefficients of the five types of POI for women are (−0. 39, 9. (12.20, 29.58), and (−8.32, −2.46) (i.e., the redder the color in the picture, the greater the value). The female interval is often wider than the male interval, with more extreme maximum and minimum values, reflecting greater variability and more ups and downs of emotion. By contrast, male emotion varies less and is concentrated in a narrower range, which reflects its stability and calm.
Judging from the weight of emotions on different POIs, the spatial differences in the emotional components of men and women converge, showing strong anisotropy and consistent trends. In terms of the shopping factor, the influence is high on the southeast side of East Lake and falls toward Tazi Lake. In particular, around Hankou North Lake, West Lake, Machine Lake, Small South Lake, Houxiang River, Chestnut Lake, Wuchang Sand Lake, and other lakes, women's emotions seem to "have a greater shopping preference" and are more prone to fluctuate in response to shopping facilities. For the leisure factor, the emotional component has a large value at East Lake and a small value at Longyang Lake-Moshui Lake. The male emotional component increases significantly at South Lake. This finding indicates that male emotions are more responsive to leisure facilities there, and it also shows the sensitivity of female emotions at South Lake. Regarding the infrastructure factor, the difference in the weights of the two genders' emotions is evident. In East, Fruit, and Shai Lakes, the weights for women's emotions are significantly higher than those for men. This finding means that around these lakes, women's emotions are more likely than men's to be affected by infrastructure. Regarding the cultural factor, the weights of the two genders are highly consistent in both trend and range, indicating that culture has an important influence on emotion. Similarly, the positive emotions of men and women show only a slight difference for the restaurant factor. In the surroundings of East, Fruit, and Sand Lakes, the influence of the restaurant factor on female emotions is significantly higher than on male emotions. This finding indicates that female emotions are more likely to be produced by restaurants around East Lake, and changes in women's emotions may be related to East Lake being a scenic tourist area.
However, the POI component (i.e., coefficient value) of men and women's negative emotions is highly concentrated near the mean, and only a slight change in gender is observed.
The above verification supports the statement that "a difference exists between men and women's emotions in the waterfront." The emotional fluctuations of men and women are closely correlated with the number of POI facilities in the waterfront. In the analysis in which POIs are used as the independent variables, the weight of POIs on emotions is significantly associated with gender differentiation, and gender greatly affects the generation, dissolution, and change of emotions.

Discussion
Research on analyzing emotions and public space through social media data is emerging in the context of other cities around the world. Lim [89]. However, existing studies have focused on green spaces in urban areas, paid little attention to the waterfronts in cities, and used short research periods or small numbers of data sets. Our study takes Wuhan city as an example and puts forward a people-oriented approach and perspective. Combining multiple disciplines based on the public's emotion towards the urban environment, this study proposes suggestions for the planning and development of urban waterfront space. The research results in Wuhan city demonstrate the feasibility of this approach. Besides, Wuhan is a metropolis with developed transportation and a dense population. There are many lakes and abundant water resources in the city. These representative characteristics enable the research results of Wuhan to be extended to similar cities.
This study has used iconic vocabulary and symbols to score emotions in Weibo text. However, this kind of tool cannot fully and accurately identify the emotions expressed by users. Future research can improve methods to accurately capture emotional goals and determine how people feel about various topics, people, events, places, and objects. Weibo users do not represent all actual waterfront tourists. Social media data using geolocation must not replace traditional survey methods, but can be regarded as a powerful complementary tool provided by modern technology. The accuracy of user positioning is limited by the GPS accuracy of mobile devices, and the nature of 2D geographic data may cause measurement errors when determining the actual location of the users. For example, a user may be in an underground subway station or sitting in a waterfront area.
Finally, this study discusses the influence of gender on public emotion in order to focus on individual differences. However, many other individual factors affect the public's emotional response to urban space, such as age, occupation, and income. We chose to study gender differences, but the influence of other factors cannot be ignored. In follow-up research, we will further improve our approach to quantitatively explore the influence of other factors and the relationships among them. We hope that this kind of research will enable planners to focus on individual differences among the public and make urban planning more sophisticated and humane.

Conclusions
Recall that the main novel contributions of this paper are a new approach and perspective combining textual emotion analysis and spatial analysis based on big data of social media. Moreover, in this process, attention is paid to gender differences in the public's feelings towards urban space. The approach we provide consists of four steps: (1) giving emotional ratings to microblog texts published by citizens in the study area; (2) exploring the emotional characteristics of the public of different genders in the urban waterfront; (3) classifying the waterfront according to the emotional response of the public of different genders; (4) exploring the relationship between different types of POI and various types of waterfront, so as to make planning suggestions. The results can be used in the domain of urban planning for decision support and the evaluation of ongoing planning processes.
The study in Wuhan city demonstrates the feasibility of the approach. The results show that there are differences in the scores and spatial distribution of the public's emotions in the waterfront space in Wuhan, and that the distribution of POIs is associated with these differences in public sentiment. Taking Wuhan as an example, the preliminary proposal is to improve the environmental quality of the lakes and balance the layout of POIs. Specifically, to improve the positive emotions of men in Wuhan waterfront space, attention can be paid to the construction of culture facilities, while reducing the negative emotions of women can start from the restaurant facilities. Spatially, the impact of each POI is different, and more targeted suggestions can be put forward. For example, the construction of shopping facilities on the southeast side of East Lake has a greater impact on public emotion, while the construction of shopping facilities at Tazi Lake has a smaller impact. In particular, in the surrounding areas of North Lake, West Lake, Machine Lake, Small South Lake, Houxiang River, Chestnut Lake and Sand Lake, the construction of shopping facilities is more likely to cause emotional swings among women. The results of this study can be used for further discussion by urban planning managers and decision makers.
The innovation and uniqueness of our approach is fourfold: First, the concept improves previous research in that it proposes a trans-disciplinary approach combining methods from GIScience, CL and urban sociology by merging the concepts of semantic, geographic and gender difference. Second, using social media data provides the public emotions in near real time in an urban context. Social media data is continuous 24/7, which provides a conscious stream of information and a collective depiction of the social response to specific situations and environments [84]. Therefore, contrary to the traditional practice of focusing on specific problems at specific timelines, social media data can be used as a tool to assist overall design decision-making and planning. These data also represent the users' cognition. Thus, these data have helped the continuous progress in the understanding of human interaction with the environment. Third, our approach focuses on gender differences among the public. Focusing on individual differences among the public can help to identify the seemingly invisible urban problems. Moreover, this will provide support for the future development direction of more humanized and detailed urban planning. Fourth, unlike other research efforts, our approach offers direct feedback to real-world processes in urban management and planning, and will help to detect previously unseen urban patterns. Finally, this approach of combining textual emotion analysis with spatial techniques is generic so that it is usable in other areas like public health, traffic analysis and management, public safety, tourism, etc.