Abstract
Location-based social media allows people to communicate and share information about popular landmarks. With millions of data records generated, it provides new knowledge about a city. Land-use identification aims to determine accurate locations for future urban development planning. The purpose of this research is to investigate the use of social networking check-in data as a source of information to characterize dynamic urban land use. The data for this study were obtained from the social media application Twitter. Three kinds of data are prioritized in this research: check-ins (specific locations), timestamps, and the user’s status text or post activities. In this study, we propose a grid-based aggregation method to divide the urban area. Two different approaches, a rank method and a clustering method, are compared for grouping place activities. We then use the time distribution frequency to infer the land-use function. Makassar City, Indonesia, was selected as the case study. The analysis shows that the check-in activity and the proposed method can be used to group the actual land-use types.
1. Introduction
Urban planning is a technical process in the formation, arrangement, and development of a city. One kind of study in urban planning is land-use mapping, which concerns accurate land determination for urban zoning. The problem in urban land-use mapping is deciding which region is suited to a certain land use. Previous studies have been conducted to detect land use over time, such as the use of aerial photographs for mapping and quantifying the change in forest land-use patterns [1], remote sensing [2], geographic information systems techniques [3], and Landsat images via satellite, which provide an efficient means for land-use detection [4,5]. However, these approaches have weaknesses: many sensors cannot obtain data in cloudy areas, the resolution of satellite imagery is often too coarse for detailed mapping and for distinguishing small contrasting areas, and high-resolution satellite imagery is very costly and time-consuming to acquire [6].
With the development of embedded systems in smartphones, a user’s movement can be tracked [7]. Researchers have used mobile phone footprints to predict user behavior [8], as well as Bluetooth traces [9], Global Positioning System (GPS) data [10], and smart card data [11]. In the literature, we find that some researchers use these data sources for land-use identification, for instance, GPS data for discovering regions and sensing human activity [12], urban Wi-Fi characterization [13], and land-use and landscape identification using cell-phone data [14,15,16]. However, these models concentrate on a particular region in a specific area, the data carry limited information [17], and the user’s footprint is difficult to identify.
To overcome these research challenges, some scientists use location-based social network (LBSN) data to capture people’s travel behavior as an alternative approach. These data contain information on users’ interests, hobbies, and place activities. Recently, geolocated social media data have provided new information for understanding an individual’s activity pattern. In the literature, we found that some researchers use social media, namely Foursquare check-in data, to capture the distribution of people’s social events, such as by investigating human travel activity patterns [18], inferring individual lifestyle patterns [19], and predicting the next venue [20]. Additionally, many researchers have used Twitter’s check-in data to capture individual activity in urban areas, such as in home-location identification [21] and user-location estimation [22,23].
On the basis of the above description, information on the people visiting a particular place is pertinent to the formation of a new area. From the perspective of urban planning, geolocation becomes an indicator for identifying a specific urban area. In this paper, we analyze social media data from Twitter to detect the dynamics of urban land use. The data include the timestamp, the user’s status text or post information (tweet), and the geolocation, that is, the point of interest where and when people check in. To analyze the data, we propose a grid-based aggregation method combined with text mining to partition the Twitter land map. The proposed method uses a grid to divide the urban area and text mining to count popular keywords among different categories. We compare two distinct methods, a rank method and k-means clustering, to classify different areas. To validate the analysis, we use the distribution of individuals’ travel times on weekdays and weekends as the parameter to define the land use.
2. Related Work
Various studies have been conducted to describe urban structure. For example, a study [24] used large-scale taxicab data to characterize the urban dynamics in New York City on the basis of three aspects. First, they checked the urban activity pattern by aggregating pick-up and drop-off locations using trip dynamics. Second, they analyzed similar taxi travel patterns. Third, they explored the connection between the taxi trips and people’s mobility. They used a clustering algorithm to classify the trip origin and the destination data. They concluded that there is a tendency for taxi travel to represent human mobility. With other datasets, another study [25] used GPS trajectory data to discover regions in different cities: New York, Tokyo, and Paris. They analyzed the individual’s movement using a probability model to categorize the point of interest (POI). As a result, they created a framework that could support applications for urban planning, business location, and social recommendation.
The development of social media data geolocation has provided new insights into the shape of a city. From the literature review, the authors found some studies that have used location-based social media data to capture the individual’s journey pattern. For instance, a study [26] used Twitter check-in data for land-use identification in three cities: Madrid (Spain), Manhattan (USA), and London (UK). They used the spectral clustering technique to analyze the individual’s travel pattern every 20 minutes, deducing the user’s trip average on weekday and weekend activities. They concluded that Twitter geolocation is a useful data source for urban planning applications and could potentially provide information for urban land use. Another study [27] explored Flickr location tags to describe the city’s center. To deal with this issue, a kernel density method was used to estimate the number of check-ins in each area. They argued that these data not only cover all the city activities but can also describe the city boundaries. Similarly, a study [28] used location-based social network data to identify the city’s center. Three methods were used to find an accurate location with a vital and precise boundary: local Getis-Ord (LGOG), density-based spatial clustering of applications with noise (DBSCAN), and the Girvan–Newman (GN) algorithm. They deduced that the three methods could describe the geometrically regular boundaries of a monocentric city and that the last method was suitable for polycentric cities. In our review, we found some weaknesses in the previous studies, such as a lack of validation and a bias toward using only specific data sources to characterize land-use types. In general, previous studies have focused on the geolocation check-in (latitude and longitude coordinates) as the only criterion for measurement, without explaining this feature, or the name of the location, in detail.
In this research, we focus not only on the check-in data but also on the Twitter text record, to which we apply a specific filter for the location-name search. The use of both features (check-in and user text posting) has been studied by some researchers. For instance, a study [29] used Twitter data to characterize a place’s activity. The researchers proposed an unsupervised learning algorithm with the latent Dirichlet allocation (LDA) approach to classify geotagged tweets. Another study [30] used geotagged Chinese social media (Sina Weibo) to model urban land use. To define land use, both articles used one parameter, namely the time distribution pattern on weekdays and weekends. Besides using the grid-based aggregation method and status update posts as additional criteria, we compare two techniques to characterize a place’s activity. This comparison between the two approaches distinguishes our work from others’.
3. Data and Methodology
3.1. Data Collection
Twitter is an application operated by Twitter Inc. (San Francisco, CA, USA) that offers a social networking microblogging service, allowing users to post and read text-based messages of up to 140 characters, called tweets. From its members, Twitter has gathered a vast amount of personal information, such as names, genders, phone or e-mail addresses, and passwords. For data collection, we used the Twitter streaming application programming interface (API) through a Windows application, which allows developers to access the user’s profile data in the JavaScript Object Notation (JSON) format. Through the service, Twitter provides data for download, such as names, locations, profile locations, descriptions, follower counts, friend counts, account creation dates, time zones, and coordinate positions (latitude and longitude) [31]. One important Twitter feature is that users can display a location map that reveals the time and place at which the status was posted or where they were. This feature is key to capturing the individual’s behavioral activity in urban areas. For our research, we focused on Makassar City, Indonesia. We analyzed 170,595 user check-in records covering 43 days (about 6 weeks) of Twitter activity from 24 August to 5 October 2016.
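To make the three fields used in this study concrete, the following minimal sketch parses one JSON record into a check-in tuple. It assumes the classic tweet object layout (a `created_at` string in UTC, a `text` field, and a GeoJSON `coordinates` point in longitude, latitude order); the sample record and the function name `parse_checkin` are hypothetical, and the conversion to Central Indonesia Time (UTC+8) is our assumption about how local hours would be obtained, not a step described in the paper.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: Makassar local time is Central Indonesia Time (UTC+8).
WITA = timezone(timedelta(hours=8))

# Hypothetical sample record shaped like a classic geotagged tweet object.
SAMPLE = """{"created_at": "Wed Aug 24 14:05:00 +0000 2016",
 "text": "eating at #thexxxrestaurant",
 "coordinates": {"type": "Point", "coordinates": [119.4221, -5.1477]}}"""


def parse_checkin(raw):
    """Return (lat, lon, local_timestamp, text), or None if the tweet has no geotag."""
    tweet = json.loads(raw)
    point = tweet.get("coordinates")
    if not point:
        return None  # tweets without a point geotag cannot be used as check-ins
    lon, lat = point["coordinates"]  # GeoJSON order: longitude first
    utc_ts = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return lat, lon, utc_ts.astimezone(WITA), tweet["text"]


record = parse_checkin(SAMPLE)
```

Converting to local time before any hourly analysis matters here, because a tweet stamped 14:05 UTC was posted at 22:05 in Makassar.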
Makassar is the most populous city in eastern Indonesia. The 2010 population census registered 1.34 million residents in an area of 175.7 km² [32]. From the data collection, we identified that Twitter users are mostly aged 15–40 years, where 34% are males and 66% are females [33]. This research is essential, as the land-use map of Makassar City is not up-to-date, while the current plan projects 20 years ahead [34].
3.2. Text Mining for Place-Name Identification
The main purpose of text mining is to support the process of knowledge discovery in large document collections. In principle, text mining is a field that involves information retrieval, text analysis, natural language processing, and logic-based machine learning [35]. In this study, text mining identifies the places at which individuals post their tweets. The check-in locations are grouped using the clustering method, and the place-names are individually identified from the user’s status post on Twitter, where the symbols # and @ mark the place-name (e.g., “eating at #thexxxrestaurant” and “playing soccer at @theyyystadium”). This step is needed because the Twitter API does not return the location name; it provides only the geographic location in the form of latitude and longitude coordinates. We use Voyant Tools, an open-source web-based application for discovering the most frequently used words, to analyze and count the documented texts and to ease text separation.
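The place-name extraction described above can be approximated in a few lines. This is a minimal sketch, not the Voyant Tools workflow used in the study: it assumes that any token prefixed with # or @ is a candidate place-name, and the helper `place_tokens` and the sample posts are hypothetical.

```python
import re
from collections import Counter

# A '#' or '@' followed by word characters marks a candidate place-name.
PLACE_TAG = re.compile(r"[#@](\w+)")


def place_tokens(text):
    """Return the lower-cased tokens that follow '#' or '@' in a status post."""
    return [token.lower() for token in PLACE_TAG.findall(text)]


# Hypothetical posts modeled on the examples in the text.
posts = [
    "eating at #thexxxrestaurant",
    "playing soccer at @theyyystadium",
    "dinner again #TheXXXRestaurant",
]

# Count how often each candidate place-name occurs across all posts.
place_counts = Counter(token for post in posts for token in place_tokens(post))
```

Lower-casing merges spelling variants of the same tag; real data would likely also need a filter to separate place-names from ordinary hashtags and user mentions.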
Figure 1 shows the data flow and methods proposed for urban land-use identification, where two data-grouping methods are compared. To arrive at the land-use hypothesis, we used the daily time distribution of weekday and weekend activities.
Figure 1.
Data flow diagram of method used.
3.3. Aggregation Grid for Dividing Land Area
In a Twitter dataset, check-ins are separate points; thus the issue arises of how to combine them into one or several information units. We propose a grid-based aggregation method, a technique that combines distinct objects into groups, to identify each area for detecting urban land use. Figure 2 shows the 16 × 6.5 km² tweet distribution map of Makassar City.
Figure 2.
Grid distribution of check-ins with 500 × 500 m² blocks. The dots represent the user location tags, and the color describes the Twitter activity frequency.
To facilitate the analysis, we divided the grid into 500 × 500 m² areas, which produced 558 blocks. We then removed the 160 blocks without check-in activity, leaving 398 blocks with tweet activity for the analysis. Figure 2 illustrates the spread of Twitter check-ins. The dots represent the locations, and the block gradations indicate the check-in frequency of each block.
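The aggregation step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: it converts degrees to meters with a flat-earth approximation (about 111,320 m per degree of latitude), and the grid origin, the sample coordinates, and the names `block_id` and `aggregate` are hypothetical.

```python
import math
from collections import Counter

CELL_M = 500.0              # block edge length in meters
M_PER_DEG_LAT = 111_320.0   # approximate meters per degree of latitude


def block_id(lat, lon, lat0, lon0, cell_m=CELL_M):
    """Map a coordinate to the (row, col) of its block, measured from the
    south-west grid corner (lat0, lon0)."""
    dy = (lat - lat0) * M_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dx = (lon - lon0) * M_PER_DEG_LAT * math.cos(math.radians(lat0))
    return int(dy // cell_m), int(dx // cell_m)


def aggregate(checkins, lat0, lon0):
    """Count check-ins per block; blocks without check-ins never appear."""
    return Counter(block_id(lat, lon, lat0, lon0) for lat, lon in checkins)


# Hypothetical check-ins near a hypothetical grid origin.
checkins = [(-5.1990, 119.3810), (-5.1985, 119.3815), (-5.1900, 119.3810)]
blocks = aggregate(checkins, lat0=-5.20, lon0=119.38)
```

Because the counter only holds blocks that received at least one check-in, the removal of empty blocks happens implicitly.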
To recognize the place type on each block, we used the user’s text-posting activity on Twitter. A total of 85 venues were found across all blocks. We then divided the venues into six categories (Table 1). This result provides a description of the land in each block.
Table 1.
Location categories visited by user.
To summarize the number of check-ins on each block, we grouped the blocks into 32 classes with an interval of 100 check-ins. The grouping describes how check-in frequency is distributed across the blocks. Figure 3 shows the graph of block allocation based on each class (e.g., the class C100 contains 112 blocks).
Figure 3.
Frequency distribution classes with each group of 100 check-ins.
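The class partition can be illustrated with a short sketch. It assumes that a class is named after the upper bound of its interval (1 to 100 check-ins gives C100, 101 to 200 gives C200, and so on); the text does not state whether the bounds are inclusive, so that detail, the function name `class_label`, and the sample counts are assumptions.

```python
from collections import Counter


def class_label(n_checkins, width=100):
    """Name a block's class after the upper bound of its check-in interval."""
    # Integer ceiling division avoids floating-point rounding at the bounds.
    upper = ((n_checkins + width - 1) // width) * width
    return f"C{upper}"


# Hypothetical check-in totals per block, keyed by (row, col).
block_counts = {(0, 0): 42, (0, 1): 100, (1, 0): 101, (2, 3): 7550}

# Number of blocks that fall into each class.
blocks_per_class = Counter(class_label(n) for n in block_counts.values())
```

Only classes that actually contain blocks appear in the result, which is consistent with the class labels in Table A1 reaching C7600 even though 32 classes are reported.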
3.4. K-means Clustering for Land-Use Characterizing
Clustering is a method to group objects into classes with identical characteristics [36]. The k-means clustering is one algorithm of unsupervised learning that uses a nearest mean approach. This reliable algorithm can quickly process huge amounts of data [37]. The k-means clustering attempts to group objects into two or more clusters so that the objects within one cluster share similarities. To measure the similarity among objects, k-means clustering utilizes the distance function as the parameter to determine the group members. The k-means algorithm uses the following steps:
- Decide the number of clusters (in this research, five clusters are specified).
- Determine the centroid value (center of measurement) randomly.
- Calculate the distance between the centroid points and the point of each object. To measure, we use the Euclidean distance: $D_e = \sqrt{(x_i - s)^2 + (y_i - t)^2}$, where $D_e$ is the Euclidean distance, $i$ is the number of the object, $(x, y)$ are the object coordinates, and $(s, t)$ are the centroid coordinates.
- Assign object to closest cluster.
- Go back to step 2 and recalculate the centroid value until the cluster members do not move to other clusters.
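The steps above can be sketched in plain Python. This is a minimal illustration, not the implementation used in the study: it works on two-dimensional points, the toy data and the function names are hypothetical, and it uses two clusters for brevity where the study specifies five.

```python
import math
import random


def euclidean(point, centroid):
    """Euclidean distance between an object (x, y) and a centroid (s, t)."""
    return math.hypot(point[0] - centroid[0], point[1] - centroid[1])


def kmeans(points, k, seed=0, max_iter=100):
    """Return (assignment, centroids) for a list of (x, y) points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    assignment = None
    for _ in range(max_iter):
        # Assign every object to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no object changed cluster, so the result is stable
        assignment = new_assignment
        # Recalculate each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignment, centroids


# Hypothetical toy data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
```

The cluster numbers themselves are arbitrary; only the grouping is meaningful, so results from different seeds should be compared by membership rather than by label.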
From the place activity (see Appendix A), we then grouped the data and produced five clusters. Table 2 shows the different places visited by people.
Table 2.
K-means clustering result for land use type.
Figure 4 shows the time distribution pattern on weekdays and weekends from k-means clustering. To analyze the land-use type, these clusters are compared with the groups produced by the ranking method to determine the potential land use.

Figure 4.
The graph results of k-means clustering in different time frequencies on weekdays and weekends. (a) Clusters’ comparison on weekdays and weekends; (b) cluster 1; (c) cluster 2; (d) cluster 3; (e) cluster 4; (f) cluster 5.
4. Land-Use Segmentation
We applied the grid-based aggregation method to divide the urban area. After class grouping (see Table 3), we characterized each region to understand the type of land use. To identify the land area, we grouped the check-in activity blocks on the basis of the following:
Table 3.
Class partition.
- We determined the frequency of places visited by comparing the percentage data of each block. We then combined the blocks into several classes and grouped the classes into several clusters. In this case, each cluster was decided by the place with the highest frequency as a decision-making indicator. For example, on the basis of tweets, we found that classes C100, C200, C300, and C400 were dominated by the individual’s activities in residential areas (see Table A1). Thus, the combination of these classes was called cluster 1.
- To identify the land-use type, we ranked every place on each cluster to determine the most visited venue (see Table 4).
Table 4.
Place ranking for land-use-type clustering.
- We then analyzed the time distribution frequency on each class to determine the trends of each region by comparing weekday and weekend check-in patterns. In doing so, the identification of land use could be detected.
On the basis of the above criteria, we classified the class intervals (see Table 3) into four clusters, which are illustrated in Table 4.
Figure 5a illustrates the users’ daily check-in frequency by hour. We observe that the peak of individual activity occurs at 10 p.m. and the lowest check-in activity at 6 a.m. In Figure 5b, we see that the majority of users made between 20 and 100 check-ins.
Figure 5.
Daily time distribution activity (a) and trip flow distribution for each user (b).
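The weekday and weekend time distributions used throughout this section can be computed as in the sketch below. It assumes that Saturday and Sunday count as the weekend and that timestamps are already in local time; neither detail is stated in the text, and the function name `hourly_profile` and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_profile(timestamps):
    """Return two 24-slot lists with the share of check-ins per hour:
    (weekday_profile, weekend_profile)."""
    counts = {"weekday": Counter(), "weekend": Counter()}
    for ts in timestamps:
        # datetime.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
        kind = "weekend" if ts.weekday() >= 5 else "weekday"
        counts[kind][ts.hour] += 1
    profiles = {}
    for kind, counter in counts.items():
        total = sum(counter.values())
        profiles[kind] = [
            counter[hour] / total if total else 0.0 for hour in range(24)
        ]
    return profiles["weekday"], profiles["weekend"]


# Hypothetical local timestamps: Wednesday, Thursday, Friday, Saturday.
stamps = [
    datetime(2016, 8, 24, 22, 5),
    datetime(2016, 8, 25, 22, 30),
    datetime(2016, 8, 26, 8, 0),
    datetime(2016, 8, 27, 11, 0),
]
weekday, weekend = hourly_profile(stamps)
peak_weekday_hour = weekday.index(max(weekday))
```

Normalizing each profile to shares rather than raw counts keeps weekdays and weekends comparable even though there are more weekdays in the collection period.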
From the 85 places (see Table 1), we then identified the venue type and found 31 places with significant check-ins. Table 4 depicts the spatial distribution cluster showing the check-in numbers and percentages in each place. This cluster would provide an overview of potential land use.
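The rank method can be illustrated with the check-in counts listed for class C100 in Table A1. The sketch below orders the places by check-ins and reports each place’s share; the function name `rank_places` is hypothetical, and computing the share within a single class rather than over a whole cluster is a simplification made here for brevity.

```python
def rank_places(place_counts):
    """Return (rank, place, check_ins, percent) tuples, most visited first."""
    total = sum(place_counts.values())
    ordered = sorted(place_counts.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, place, n, 100.0 * n / total)
        for rank, (place, n) in enumerate(ordered, start=1)
    ]


# Check-in counts for class C100, taken from Table A1.
c100 = {
    "Housing": 779, "University": 135, "Street": 111, "School": 101,
    "Café": 41, "Coffee": 37, "Restaurant": 36, "Office": 29, "Culinary": 28,
}

ranking = rank_places(c100)
top_rank, top_place, top_count, top_share = ranking[0]
```

Here the top-ranked place is Housing, which matches the observation that class C100 is dominated by activity in residential areas.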
4.1. Housing Area (Cluster 1)
To understand the land use of this region, we compared the classes by considering the most frequently visited places. We observed that, in general, the tweet activity in cluster 1 was closely related to the activities of people in and around residential areas (see Figure 6a). About 26% of the tweet activity was covered by this group (see Figure 7b). We then analyzed the daily tweet pattern and found that the peak of tweet activity occurred at 10:00 p.m. (Figure 7a), which relates to the individual’s activity before bed. Meanwhile, other activities, such as being in or going to a university or a café, took place during the day and peaked from 11 a.m. to noon. We observed that about 70% of this area was covered by this cluster. Compared with the k-means clustering, this group is identical to clusters 1 and 3 (see Table 2 and Figure 4). Thus, we associate this cluster with the housing area.

Figure 6.
The word frequency (a) and housing distribution map (b).
Figure 7.
The daily time spread (a) and percentage of check-ins in different places in cluster 1 (b).
4.2. Education Area (Cluster 2)
As shown in Figure 8d, we compared the pattern of weekday and weekend activities. On weekdays, tweet activity increased at 8 a.m., and we observed a changing trend between 10:00 a.m. and 2:00 p.m. On weekends, the peaks of activity were at 8:00 a.m. and 11:00 p.m. Comparing the two patterns, we found a very significant difference: tweet activity decreased on weekends. This is because the frequency of university visits increases on weekdays, while only a handful of individuals come to the university on weekends.

Figure 8.
The analysis of user text posted (a), the map of the education area (b), a graph of different visited places in the education cluster (c), and the difference in activity on weekdays and weekends (d).
In general, this cluster was more populated in places such as universities and schools. The presence of other venues, such as restaurants (Pizza Hut and McDonald’s) and malls, is attributable to the university and was not influenced by other regions. On the basis of this analysis, we concluded that this cluster is related to education. This can be seen in the word frequency and the percentage graph of each place (Figure 8a,c). This group is similar to clusters 2 and 5 (see Table 2 and Figure 4) from the k-means result. If we compare Figure 4c,f, we find that there are contrasting activities during weekdays and weekends, except at night.
4.3. Commercial, Business, and Work Area (Cluster 3)
In cluster 3, we divided the time spread into two parts (evening and morning). In the evening, the peak of tweet activity occurred at 9 p.m. We observed that this cluster was dominated by individual activity at places such as culinary venues, coffee shops, and restaurants (see Figure 9c). It is therefore most likely that people go out for dinner. We would argue that this cluster represents the commercial area for eating or other culinary activities, which is supported by the decrease in check-in activity one hour later (see Figure 9d).

Figure 9.
The word frequency analysis (a), the user distribution map in cluster 3 (b), check-in activity in different places (c), and the time difference of user distribution on weekdays and weekends (d).
Then in the morning, the peak occurred at around 8–9 a.m., and then the trend fluctuated until noon or 2 p.m. (see Figure 9d). We observed that this cluster was populated in places such as hotels, offices, and malls. We argue that in addition to visitors, this check-in was also made by employees and office staff. We therefore concluded that this was a working or business area. There was a large difference when we compared the tweet pattern on weekdays and weekends; weekends showed a decrease in tweet activity when compared to weekdays. Thus, we concluded that check-in at work places started from the morning and continued until noon. Then in the afternoon (returning home from work), people would look for other activities, such as shopping or going to dinner. Comparing this with the k-means result, we find that cluster 4 (see Table 2) has a similarity with the group pattern of the rank method. We concluded that this is a work area.
4.4. Mixed Area (Cluster 4)
We could not explain specifically the land use of this region. We called this the mixed cluster, because in this region, there were various activities in venues such as hotels, shopping centers, office centers, and sports centers (see Figure 10a). In the morning, check-in activity for this cluster began at 7 a.m. and increased until the afternoon. The spread of time on weekdays and weekends had similar patterns. We concluded that this area was the most active area as the tendency of check-in activity did not decrease until 10:00 p.m. (see Figure 10b).

Figure 10.
The graph of check-ins at different places (a), user time deployment activity over 24 h (b), word frequency for analysis and place identification (c), and the physical layout of tweeting activity in cluster 4 (d).
5. Conclusions
In this study, we used Twitter as a source of data to analyze urban land use. To investigate the regional profile, we collected information from Twitter in the form of users’ text posts, time zones, and coordinates. In this paper, we proposed a grid-based aggregation method to explore urban areas. The proposed approach divided the region into a grid of 500 × 500 m² blocks, yielding 398 blocks with check-in activity. We divided the blocks into 32 classes, where each class spanned an interval of 100 check-ins, and then classified the existing classes into clusters. Land identification was determined on the basis of, firstly, the highest number of check-ins and, secondly, the result of a comparison of check-in patterns on weekdays and weekends.
Our proposed method could characterize the urban area, particularly for land-use identification. The model produces a polycentric structure, one not centered on a particular region, which means that the city contains more than one area of a similar land-use type (see Figure 11). For example, the education and commercial areas are not centered on one area but are spread over several regions. We concluded that Twitter check-in data can be used to understand the actual urban land use. Our method can contribute additional data or input for city planners and stakeholders, specifically for the analysis of urban land use. The method we propose is also cheap to implement and easy to use. In this regard, this research could contribute to the city’s sustainability, specifically to the development of urban land use. The measurement results, however, depend on the size of the grid used: larger grid sizes will provide at least twice as many land-use functions in a region. Grid-size standardization is therefore necessary for the partition of land types. This challenge needs to be considered in future research.
Figure 11.
Land use hypothesis (education, commercial and mixed area).
Comparing the ranking and k-means clustering methods, we found that the rank method measures on the basis of the order of the data; the highest-ranking place became the standard for determining the state of the region. Meanwhile, the k-means clustering method used a similarity-and-distance approach to group the data. Besides being reliable, both methods can process large amounts of data.
Acknowledgments
This research was supported by the University of Kitakyushu, the Directorate General of Higher Education of Indonesia (DIKTI), and STMIK Handayani Makassar, Indonesia.
Author Contributions
Yuyun performed the experiment and wrote the paper. Fritz Akhmad Nuzir reviewed the writing. Bart Julien Dewancker contributed to the conceptual design and as a supervisor in guiding this research.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the writing of this paper.
Appendix A
Table A1.
The classes group of place activity.
| C200 | Check-In | C300 | Check-In | C400 | Check-In | C500 | Check-In |
| Housing | 1350 | Housing | 1135 | Housing | 1060 | School | 866 |
| University | 469 | University | 885 | Street | 877 | University | 696 |
| Office | 400 | School | 407 | University | 600 | Housing | 457 |
| Street | 303 | Street | 305 | Coffee | 619 | Coffee | 556 |
| School | 299 | Coffee | 267 | Office | 427 | Hospital | 318 |
| Restaurant | 269 | Café | 253 | School | 349 | Office | 189 |
| Coffee | 454 | Office | 248 | Culinary | 268 | Hotel | 119 |
| Pool | 133 | Park | 194 | KFC | 268 | Bank | 109 |
| Seafood | 117 | Meatball | 165 | Meatball | 255 | Ice | 107 |
| Beach | 116 | Culinary | 154 | Hospital | 222 | Street | 107 |
| Culinary | 112 | Restaurant | 149 | Beach | 188 | Unhas | 106 |
| Shop | 96 | Hospital | 119 | Eating | 181 | Meatball | 104 |
| Park | 92 | Noodle | 113 | Hotel | 150 | Eating | 97 |
| Cinema21 | 88 | Hotel | 109 | Noodle | 114 | Culinary | 66 |
| Meatball | 82 | Hall | 92 | Seafood | 108 | Chicken | 62 |
| Field | 81 | Mosque | 83 | Mosque | 97 | ||
| C900 | Check-In | C1100 | Check-In | C5200 | Check-In | C700 | Check-In |
| KFC | 427 | McDonald | 390 | Mall | 845 | Coffee | 763 |
| Coffee | 450 | Ice | 217 | KFC | 631 | Hospital | 268 |
| Housing | 288 | Stadium | 191 | Cinema21 | 364 | University | 366 |
| Hospital | 242 | Restaurant | 319 | McDonald | 333 | Street | 212 |
| Eating | 329 | Office | 181 | Eating | 245 | Office | 206 |
| Mall | 164 | Coffee | 105 | Coffee | 501 | Meatball | 175 |
| Noodle | 126 | Noodle | 101 | Pizza | 216 | Housing | 174 |
| Pizza | 97 | Meatball | 91 | Street | 148 | Seafood | 164 |
| Office | 149 | Café | 68 | Hotel | 133 | Eating | 124 |
| Soccer | 73 | Hotel | 64 | Karaoke | 121 | School | 118 |
| Hotel | 64 | Karaoke | 51 | Restaurant | 117 | Restaurant | 101 |
| Street | 60 | Shop | 45 | Culinary | 117 | Skincare | 98 |
| Porridge | 59 | Mall | 44 | Supermarket | 100 | Cheese | 88 |
| Noodle | 55 | Housing | 39 | Office | 165 | Eating | 85 |
| Cinema21 | 53 | Church | 37 | Shop | 91 | ||
| C1200 | Check-In | C3300 | Check-In | C600 | Check-In | C1000 | Check-In |
| School | 592 | Coffee | 606 | Hotel | 1354 | Coffee | 488 |
| Church | 91 | KFC | 219 | Hall | 341 | University | 344 |
| Coffee | 86 | Cinema21 | 194 | University | 314 | School | 322 |
| Culinary | 147 | Market | 142 | Café | 168 | Culinary | 140 |
| Coffee | 73 | Mall | 122 | School | 136 | Restaurant | 119 |
| Restaurant | 138 | Hotel | 52 | Corner | 87 | Housing | 108 |
| Office | 51 | Street | 87 | Office | 66 | Cinema21 | 107 |
| Hotel | 42 | Bar | 80 | Street | 65 | Noodle | 73 |
| Culinary | 38 | Tea | 77 | School | 65 | Shop | 67 |
| Mall | 38 | Eating | 72 | Building | 57 | Mall | 56 |
| Clinic | 35 | Karaoke | 65 | Wedding | 56 | Office | 48 |
| Store | 29 | Culinary | 182 | Swimming | 55 | Eating | 45 |
| Mall | 28 | Pizza | 63 | Garden | 47 | Bank | 38 |
| Donuts | 26 | Snack | 58 | ||||
| C1800 | Check-In | C2500 | Check-In | C6600 | Check-In | C1300 | Check-In |
| Pizza | 291 | Culinary | 544 | Mall | 1145 | University | 928 |
| Coffee | 361 | Hotel | 263 | Cinema21 | 1025 | McDonald | 371 |
| University | 224 | Office | 114 | Tea | 347 | KFC | 199 |
| Culinary | 190 | Bar | 92 | Supermarket | 250 | Hospital | 186 |
| School | 305 | Mall | 159 | Pizza | 191 | Office | 153 |
| Beach | 187 | Culinary | 102 | Mall | 188 | Street | 152 |
| Restaurant | 482 | Tower | 82 | Coffee | 209 | Coffee | 247 |
| Bar | 131 | Park | 60 | Eating | 102 | Restaurant | 77 |
| Meatball | 119 | Bank | 52 | Restaurant | 313 | Monument | 161 |
| Office | 106 | Hospital | 73 | Bank | 61 | Pizza | 60 |
| Hall | 84 | Eating | 44 | Bookstore | 70 | Noodle | 56 |
| Bank | 70 | Coffee | 32 | ||||
| C7600 | Check-In | C3600 | Check-In | C2900 | Check-In | C1900 | Check-In |
| Mall | 1720 | Hotel | 1106 | Restaurant | 444 | McDonald | 914 |
| Cinema | 935 | Office | 571 | Fort | 296 | Coffee | 344 |
| KFC | 159 | University | 325 | Office | 239 | Office | 172 |
| Tea | 243 | Café | 165 | Coffee | 200 | Eating | 163 |
| Eating | 183 | School | 187 | Park | 67 | Culinary | 271 |
| Coffee | 294 | Ballroom | 100 | Food | 44 | Hotel | 77 |
| Pizza | 141 | Happy | 98 | Bar | 87 | Steak | 77 |
| Restaurant | 180 | Corner | 87 | Culinary | 76 | Ice | 74 |
| Snack | 95 | Wedding | 56 | Hotel | 36 | University | 64 |
| Bookstore | 70 | Street | 131 | Eating | 137 | ||
| C1700 | Check-in | C2100 | Check-in | C2400 | Check-in | C800 | Check-in |
| Mall | 687 | Field | 332 | School | 236 | University | 2138 |
| Cinema21 | 318 | KFC | 125 | KFC | 137 | Office | 564 |
| Restaurant | 101 | School | 246 | Culinary | 179 | School | 392 |
| Coffee | 119 | Field | 86 | Hotel | 142 | Culinary | 280 |
| Tea | 39 | Office | 152 | Field | 67 | KFC | 221 |
| Dinner | 33 | Mall | 70 | Bank | 60 | Seafood | 139 |
| Lunch | 27 | Street | 70 | Coffee | 137 | Pizza | 123 |
| Bank | 25 | Pizza | 53 | Hospital | 41 | Coffee | 116 |
| Snack | 25 | Coffee | 96 | Restaurant | 64 | Soccer | 108 |
| Fitness | 18 | Bank | 42 | ||||
| C2000 | Check-in | C2200 | Check-in | C1500 | Check-in | C100 | Check-in |
| Restaurant | 568 | University | 1897 | University | 1715 | Housing | 779 |
| Hotel | 175 | Beach | 438 | Café | 992 | University | 135 |
| Café | 169 | Restaurant | 258 | Cinema21 | 228 | Street | 111 |
| Bar | 139 | KFC | 241 | Mall | 241 | School | 101 |
| Guesthouse | 84 | Culinary | 200 | Building | 130 | Café | 41 |
| Office | 42 | Coffee | 325 | Library | 233 | Coffee | 37 |
| Hospital | 34 | Hotel | 508 | Meatball | 61 | Restaurant | 36 |
| Eating | 52 | Hall | 84 | Hotel | 59 | Office | 29 |
| Culinary | 62 | Hospital | 106 | School | 57 | Culinary | 28 |
| C1600 | Check-in | C3000 | Check-in | C1400 | Check-in | C2300 | Check-in |
| Stadium | 849 | Mall | 1600 | University | 2117 | University | 574 |
| Office | 101 | Restaurant | 342 | Office | 603 | School | 958 |
| Photography | 62 | Cinema21 | 225 | Hospital | 191 | Futsal | 39 |
| Soccer | 57 | Coffee | 150 | Building | 92 | Hospital | 13 |
| School | 45 | Bar | 77 | School | 126 | Mosque | 18 |
| University | 87 | Snacks | 133 | Hall | 164 | | |
| Culinary | 22 | Eating | 31 | Canteen | 52 | | |
| Television | 17 | Fitness | 30 | | | | |
References
- Al-Tahir, R.; Rajack, F.; Oatham, M. Aerial photographs for detecting land use changes in Valencia Wildlife Sanctuary and Forest Reserve, Trinidad. Caribb. J. Earth Sci. 2005, 38, 35–42.
- Modara, M.; Belaid, M.A. Mapping and assessing land use/land cover change in Muharraq island based on GIS and remote sensing integration. Remote Sens. Spat. Inf. Sci. 2013, XL-4/W1, 57–63.
- Reis, S. Analyzing land use/land cover changes using remote sensing and GIS in Rize, North-east Turkey. Sensors 2008, 8, 6188–6202.
- Fonji, S.F.; Taff, G.N. Using satellite data to monitor land-use land-cover change in North-eastern Latvia. Springerplus 2014, 3, 1–15.
- Hu, T.; Yang, J.; Li, X.; Gong, P. Mapping urban land use by using Landsat images and open social data. Remote Sens. 2016, 8, 151.
- Kawakubo, F.S.; Morato, R.G.; Nader, R.S.; Luchiari, A. Mapping changes in coastline geomorphic features using landsat TM and ETM imagery: examples in South Eastern Brazil. Int. J. Remote Sens. 2011, 32, 2547–2562.
- Haeusler, M.H. Enabling low cost human presence tracking. In Proceedings of the International Conference of the Association for Computer-Aided Architectural Design Research in Asia CAADRIA, Melbourne, Australia, 30 March–2 April 2016; pp. 45–54.
- Song, J.; Tang, E.Y.; Liu, L. User behavior pattern analysis and prediction based on mobile phone sensors. In Proceedings of the 2010 IFIP International Conference on Network and Parallel Computing, Zhengzhou, China, 13–15 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 177–189.
- Zhang, A. Detecting Human Behavior Patterns from Mobile Phone. J. Comput. Inf. Syst. 2012, 8, 2671–2679.
- Jankowska, M.M.; Schipperijn, J.; Kerr, J. A framework for using GPS data in physical activity and sedentary behavior studies. Exerc. Sport Sci. Rev. 2015, 43, 48–56.
- Munizaga, M.; Devillaine, F.; Navarrete, C.; Silva, D. Validating travel behavior estimated from smartcard data. Transp. Res. Part C 2014, 44, 70–79.
- Van der Spek, S.; van Schaick, J.; de Bois, P.; de Haan, R. Sensing human activity: GPS tracking. Sensors 2009, 9, 3033–3055.
- Farshad, A.; Marina, M.K.; Garcia, F. Urban wifi characterization via mobile crowdsensing. In Proceedings of the IEEE NOMS, Krakow, Poland, 5–9 May 2014.
- Soto, V.; Martinez, E.F. Automated land use identification using cell-phone records. In Proceedings of the 3rd ACM International Workshop on MobiArch, Bethesda, MD, USA, 28 June 2011; ACM: New York, NY, USA, 2011; pp. 17–22.
- Toole, J.L.; Ulm, M.; González, M.C. Inferring land use from mobile phone activity. In Proceedings of the ACM SIGKDD International Workshop on Urban Computing, Beijing, China, 12 August 2012; ACM: New York, NY, USA, 2012; pp. 1–8.
- Ratti, C.; Pulselli, R.M.; Williams, S.; Frenchman, D. Mobile Landscapes: Using location data from cell-phones for urban analysis. Environ. Plan. 2006, 33, 727–748.
- Hasan, S.; Zhan, X.; Ukkusuri, S.V. Understanding urban human activity and mobility patterns using large-scale location-based data from online social media. In Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing, Chicago, IL, USA, 11–14 August 2013; ACM: New York, NY, USA, 2013; p. 6.
- Sun, Y.; Li, M. Investigation of travel and activity patterns using location-based social network data: A case study of active mobile social media users. ISPRS Int. J. Geo. Inf. 2015, 4, 1512–1529.
- Hasan, S.; Ukkusuri, S.V. Location contexts of user check-ins to model urban geo life-style patterns. PLoS ONE 2015, 10, e0124819.
- Noulas, A.; Scellato, S.; Lathia, N.; Mascolo, C. Mining user mobility features for next place prediction in location-based services. In Proceedings of the 2012 IEEE 12th International Conference on Data Mining (ICDM), Brussels, Belgium, 10–13 December 2012.
- Mahmud, J.; Nichols, J.; Drews, C. Home Location Identification of Twitter Users. ACM Trans. Intell. Syst. Technol. 2013, 5, 1–47.
- Williams, E.; Gray, J.; Dixon, B. Improving geolocation of social media posts. J. Pervasive Mob. Comput. 2017, 36, 68–79.
- Kong, L.; Liu, Z.; Huang, Y. SPOT: Locating social media users based on social network context. Proc. VLDB Endow. 2014, 7, 1681–1684.
- Qian, X.; Zhan, X.; Ukkusuri, S.V. Characterizing Urban Dynamics Using Large Scale Taxicab Data. In Engineering and Applied Science Optimization; Springer: Berlin, Germany, 2015; pp. 17–33.
- Yuan, J.; Zheng, Y.; Xie, X. Discovering regions of different functions in a city using human mobility and POIs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; ACM: New York, NY, USA, 2012; pp. 186–194.
- Frias-Martinez, V.; Frias-Martinez, E. Spectral clustering for sensing urban land use using Twitter activity. Eng. Appl. Artif. Intell. 2014, 35, 237–245.
- Hollenstein, L.; Purves, R.S. Exploring place through user-generated content: Using Flickr tags to describe city cores. J. Spat. Inf. Sci. 2010, 1, 21–48.
- Sun, Y.; Fan, H.; Li, M.; Zipf, A. Identifying the city center using human travel flows generated from location-based social networking data. Environ. Plan. B Plan. Des. 2015, 43, 480–498.
- Lansley, G.; Longley, P.A. The geography of Twitter topics in London. Comput. Environ. Urban Syst. 2016, 58, 85–96.
- Wang, Y.; Wang, T.; Tsou, M.H.; Li, H.; Jiang, W.; Guo, F. Mapping dynamic urban land use patterns with crowdsourced geo-tagged social media (sina-weibo) and commercial points of interest collections in Beijing, China. Sustainability 2016, 8, 1202.
- Open Twitter Streaming Api. Available online: https://dev.twitter.com/docs/streaming-api (accessed on 26 August 2016).
- Central Bureau of Statistic. Available online: http://sp2010.bps.go.id/ (accessed on 18 October 2017).
- Wabula, Y.; Dewancker, B.J. Analysis of urban population using twitter distribution data: Case study of Makassar city, Indonesia. Int. Comput. Electr. Autom. Control Inf. Eng. Waset 2016, 10, 1627–1631.
- Land Use Map of Makassar City. Available online: http://darimakassar.com/rtrw-kota-makassar-2010-2030-2/ (accessed on 18 October 2017).
- Irfan, R.; King, C.K.; Es, G.; Ewen, S.; Khan, S.U.; Madani, S.A.; Kolodziez, J.O.; Wang, L.; Chen, D.; Rayes, A.; et al. A survey on text mining in social networks. Knowl. Eng. Rev. 2015, 30, 157–170.
- Varghese, B.M.; Jose, J.T.; Unnikrishnan, A.; Poulose, K.J. Clustering Student Data to Characterize Performance Patterns. Int. J. Adv. Comput. Sci. Appl. Spec. Issue Artif. Intell. 2011, 2, 138–140.
- Mihai, D.; Mocanu, M. Statistical considerations on the k-means algorithm. Ann. Univ. Craiova Math. Comput. Sci. Ser. 2015, 42, 365–373.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).