Inferring the Economic Attributes of Urban Rail Transit Passengers Based on Individual Mobility Using Multisource Data

Socioeconomic attributes are essential characteristics of people, and many studies on economic attribute inference focus on data that contain user profile information. For data without user profiles, like smart card data, there is no validated method for inferring individual economic attributes. This study aims to bridge this gap by formulating a mobility to attribute framework to infer passengers’ economic attributes based on the relationship between individual mobility and personal attributes. This framework integrates shop consumer prices, house prices, and smart card data using three steps: individual mobility extraction, location feature identification, and economic attribute inference. Each passenger’s individual mobility is extracted from smart card data. Economic features of stations are described using house price and shop consumer price data. Then, each passenger’s comprehensive consumption indicator set is formulated by integrating these data. Finally, individual economic levels are classified. In the case study of Beijing, commuting distance and trip frequency using the metro have a negative correlation with passengers’ income, and the results confirm that metro passengers are mainly in the low- and middle-income groups. This study enriches the passenger information that can be extracted from data without user profile information and provides a method for integrating multisource big data to mine more information.


Introduction
With the development of information and smart city technology, urban big data, such as smart card data (SCD), call detail records (CDRs), and data from social media such as Twitter, have become a new source for the analysis of transportation demand and travel patterns [1]. These data are timely and offer wider coverage at a relatively lower cost than traditional survey data [2,3]. However, these passively collected big urban datasets with heterogeneous structures lack some information, particularly individual attributes. Passengers' socioeconomic attributes are essential data for transportation demand analysis and forecasting [4]. Therefore, the investigation of individual attributes using urban big data is an important endeavor that can enrich the data and offer the potential for more in-depth study.
Urban big data mainly contain spatial and temporal information. This is especially true of SCD, which are normally nonregistered, contain no personal information at the raw data level, and exhibit home and work as the main activity locations with high regularity [5]. Therefore, much work so far has focused on identification of individual home and work locations using these data [6][7][8][9].

Literature Review
The lack of individual attributes is a general disadvantage of urban big data. Therefore, the inference of individual attributes has been actively studied in urban data research. Most studies focus on data with user profiles such as social media data, which are produced by registered entities.
For advertising purposes, user profile data and tweets have generally been used to mine latent user attributes [14]. Daniel et al. [15] mapped Twitter users' job titles with an annual survey of hours and earnings for various job classes to discover individuals' mean yearly income and analyzed the interplay between user emotions and sentiment and income. In addition, the tweet content and behavior of Twitter users are strongly related to their socioeconomic status [16]. Based on this relationship, machine learning algorithms [17] and deep learning methods [18] are used to predict certain personal attributes, and the results can achieve satisfactory precision. Meanwhile, Aletras and Chamberlain [19] conducted an in-depth analysis of the relationship between income and social network structure and written content and used the model to predict user income. Multisource data can provide more perspectives than a single data source from which to derive solutions. Examining the CDRs of the caller and the person called can establish a stronger and truer relationship network. Meanwhile, the phone number can be used to combine this social network with banking information [20]. Consequently, a Bayesian approach could be used to infer other users' economic status based on known users' communication network characteristics [20,21]. Location features can also be integrated with social media check-in data to infer users' demographic attributes, and the results have been verified by registered user profiles [12].
Data without user profiles produced by nonregistered entities, such as SCD, has received less attention in terms of the inference of individual economic attributes. However, many studies have demonstrated that economic attributes are related to individual trip patterns. Populations with higher socioeconomic levels are strongly linked to larger mobility ranges than populations with lower income levels [22], with diversity of mobility exhibiting a strong correlation with socioeconomic attributes [23]. Besides, trip chain type choices and non-work stops in a trip chain are strongly distinct between income groups [24]. Meanwhile, trip pattern analysis based on SCD has shown that the activity sequence structure category is associated with income for full-time employment [25]. In addition, traditional trip survey data has shown that passengers' income is correlated with their commuting distance [11,26]. Because the urbanization rate significantly impacts the commuting time for different income levels [27], different relationships have been found in studies of different countries [11]. Therefore, this relationship provides a method to infer individual economic attributes using data without user profiles.
From the above, most previous studies have overlooked the relationship between individual mobility and economic attributes, which can be used to infer individual economic status. In practice, individuals' economic and consumption attributes are related to the economic characteristics of the places they visit, such as living or entertainment costs, and to mobility features such as visit frequency, mobility diversity, and commuting distance. Integrating location features with passengers' mobility characteristics could be a novel solution to the inference of individual economic attributes.
Based on the relationship between individual economic attributes and passenger mobility [13], this study built a mobility to attribute (M2A) framework to detail the economic attributes of smart card users. It was developed based on three kinds of urban data: SCD, shop consumer data, and house price data [28]. SCD, which record visited stations of each passenger, can be used to generate individual mobility. Shop consumer data and house price data, which are collected from the Internet, can identify location economic features. The framework integrates these data to formulate passengers' consumption attributes. Based on the consumption attributes, their economic levels can be inferred. We used this framework to systematically infer individual economic attributes and analyze individual mobility using urban data of Beijing.

Methods
The inference framework, M2A, is shown in Figure 1. It includes two steps: individual mobility formulation and economic attributes inference. In order to formulate individual mobility, trip chains should be generated from SCD. Then individual mobility is described using commuting distance, mobility diversity, home location, and other types of activity locations extracted from the trip chains. Next, from individual mobility to individual attribute, location economic profile should be calculated using house price and shop consumer price. Then, passengers' economic attributes are described by their mobility indicators and visited stations, mapped with the location economic profiles, including inferior good consumption, normal good consumption, and superior good consumption. Finally, a comprehensive consumption attribute is formulated for each passenger and his/her consumption levels are classified using a clustering method.

Extraction of Trip Chains
Trip chains are the base data for analyzing passengers' mobility. Some studies have verified that trip chains inferred from urban big data differ from, and are more reasonable than, those obtained from traditional survey data [29,30]. For SCD, a trip chain without trip purposes can be formulated for each card number by ordering its trips chronologically. Therefore, the major task is to infer the trip purpose of each trip in all trip chains. Most existing methods that model individual mobility are based on the Markov Chain (MC) [31] and, hence, this study formulated a hidden Markov model (HMM) to infer trip purpose, as illustrated in Figure 2.
In a trip chain, the trip purpose, or activity type [32], $x_t$ is a hidden state in the hidden state space HS with J states. A hidden state may appear as observation state k in the observation state space OS with K states, with emission probability $g_{x_t k}$. For hidden states, the next state is affected only by the current state, so the transition probability between hidden states is $a_{x_t x_{t+1}}$.
Trip purpose or activity type mainly relates to three elements: activity start time, land usage of destination, and stay duration, especially for commuting trips [6][7][8][9]. Therefore, an activity with all three elements is chosen to analyze trip purpose. More specifically, an activity is identified when the exit station of a trip is the same as the entry station of the next trip, according to the records of one card number. Activity start time is defined as the exit time of the trip, while stay duration is the entry time of the next trip minus the exit time of the current trip.
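The activity rule described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the record fields (`entry_station`, `exit_time`, and so on) and the helper `extract_activities` are hypothetical and are not taken from the study's data schema.

```python
from datetime import datetime


def parse(ts):
    """Parse a timestamp string (format chosen for this example only)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def extract_activities(trips):
    """Return the activities of one card from its trip records.

    An activity is recorded when the exit station of a trip equals the
    entry station of the next trip. Its start time is the exit time and
    its stay duration is the next entry time minus that exit time.
    """
    trips = sorted(trips, key=lambda t: t["entry_time"])
    activities = []
    for cur, nxt in zip(trips, trips[1:]):
        if cur["exit_station"] == nxt["entry_station"]:
            stay = nxt["entry_time"] - cur["exit_time"]
            activities.append({
                "station": cur["exit_station"],
                "start_time": cur["exit_time"],
                "stay_minutes": stay.total_seconds() / 60,
            })
    return activities


# Hypothetical records for a single card number.
trips = [
    {"entry_station": "A", "entry_time": parse("2016-03-07 08:00"),
     "exit_station": "B", "exit_time": parse("2016-03-07 08:40")},
    {"entry_station": "B", "entry_time": parse("2016-03-07 18:10"),
     "exit_station": "A", "exit_time": parse("2016-03-07 18:50")},
    {"entry_station": "C", "entry_time": parse("2016-03-08 09:00"),
     "exit_station": "A", "exit_time": parse("2016-03-08 09:30")},
]
activities = extract_activities(trips)
```

Here only the stay at station B qualifies as an activity; the second trip ends at A but the third starts at C, so no activity is recorded between them.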
Mapping to the HMM, the observations are the three elements of an activity and, hence, the observable parameter set $o_t^l$ of the tth activity in the lth trip chain can be constructed as in Equation (1), $o_t^l = (s_t^l, d_t^l, c_t^l)$, in which $s_t^l$ is the activity start time, $d_t^l$ is the stay duration, and $c_t^l$ is the vector of the degree of land usage mixture of the station, which can be inferred from the passenger flow distribution at the station [33,34]; for more detailed information, see Yue et al. [35].
Sustainability 2018, 10, 4178
In this model, observations are continuous variables and, thus, they should be classified into a discrete observation state space OS with K states. Assuming that observations of state k in every trip chain follow a Gaussian distribution with mean $\mu_k$ and variance $\sigma_k$, observation $o_t$ is classified under state k using Equation (2). Then, a discrete time-homogeneous Markov model is formulated [36]. Compared with the standard HMM, a discretization step using Equation (2) is added. The optimal parameter set is $\lambda = [\pi, A, G, \mu, \sigma]$, in which the initial activity is $\pi = (\pi_1, \pi_2, \dots, \pi_M)$, the transition probability is $A = (a_{11}, a_{12}, \dots, a_{J,J-1}, a_{J,J})$, the emission probability is $G = (g_{11}, g_{12}, \dots, g_{J,K-1}, g_{J,K})$, the observations' mean value is $\mu = (\mu_1, \mu_2, \dots, \mu_{K-1}, \mu_K)$, the variance is $\sigma = (\sigma_1, \sigma_2, \dots, \sigma_{K-1}, \sigma_K)$, and the total number of trip chains is M. According to the observed trip chains, the optimal parameter set can be estimated using the Baum-Welch and forward-backward algorithms [9,36].
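Equation (2) is not reproduced in this text, so the following is only a sketch of one plausible discretization: each continuous observation is assigned to the observation state whose Gaussian gives it the highest likelihood. The independence of the observation dimensions, the omission of the land usage vector, and all numeric values are assumptions made for illustration.

```python
import math


def gaussian_log_density(x, mean, var):
    """Log-density of a one-dimensional Gaussian with the given variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def classify_observation(obs, means, variances):
    """Return the index k of the most likely observation state.

    obs is a tuple such as (start hour, stay hours); means[k] and
    variances[k] hold one value per dimension for state k. Dimensions
    are treated as independent (an assumption of this sketch).
    """
    best_k, best_ll = None, -math.inf
    for k, (mu, var) in enumerate(zip(means, variances)):
        ll = sum(gaussian_log_density(x, m, v)
                 for x, m, v in zip(obs, mu, var))
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Two hypothetical observation states over (start hour, stay hours).
means = [(8.5, 9.0), (19.0, 12.0)]
variances = [(1.0, 2.0), (2.0, 4.0)]
morning_state = classify_observation((8.7, 8.5), means, variances)
evening_state = classify_observation((19.5, 11.0), means, variances)
```

In practice the means and variances would come from the parameter estimation step rather than being set by hand.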
To infer trip purpose, the Viterbi algorithm [36] was improved and adapted in this model, as indicated in Equation (3). $V_{t,x_t}$ is the probability of hidden state $x_t$ corresponding to observation $o$ at $t$, given the former states. After all trips in a trip chain are calculated, the state chain with the largest probability gives the trip purposes of the corresponding trip chain, and the trip chains with their trip purposes are extracted from SCD.
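For orientation, a minimal sketch of the standard Viterbi recursion over discrete observation states is given below. It is the textbook form, not the adapted version of Equation (3), and the three hidden states (0 = home, 1 = work, 2 = other) and all probabilities are invented for the example.

```python
import math


def safe_log(p):
    """Logarithm that maps zero probability to minus infinity."""
    return math.log(p) if p > 0 else -math.inf


def viterbi(obs, pi, A, G):
    """Most probable hidden state chain for a sequence of observation states.

    pi[j] is the initial probability of state j, A[i][j] the transition
    probability from i to j, and G[j][k] the probability that hidden
    state j emits observation state k. Log-probabilities avoid underflow.
    """
    n = len(pi)
    V = [[safe_log(pi[j]) + safe_log(G[j][obs[0]]) for j in range(n)]]
    back = []
    for o in obs[1:]:
        row, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: V[-1][i] + safe_log(A[i][j]))
            row.append(V[-1][best_i] + safe_log(A[best_i][j]) + safe_log(G[j][o]))
            ptr.append(best_i)
        V.append(row)
        back.append(ptr)
    # Backtrack from the most probable final state.
    state = max(range(n), key=lambda j: V[-1][j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical parameters: hidden states 0 = home, 1 = work, 2 = other.
pi = [0.2, 0.6, 0.2]
A = [[0.1, 0.7, 0.2],
     [0.6, 0.1, 0.3],
     [0.5, 0.3, 0.2]]
G = [[0.1, 0.8, 0.1],
     [0.8, 0.1, 0.1],
     [0.1, 0.1, 0.8]]
purposes = viterbi([0, 1, 0, 2, 1], pi, A, G)
```

With these made-up parameters the decoded chain is work, home, work, other, home.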

Individual Mobility
Commuting distance and mobility diversity have a correlation with each passenger's economic attributes [11,23,26]. Commuting distance is calculated as the shortest path in the network between the station where home is located and the station where work is located for the passenger. All other types of activities, including shopping, entertainment, and eating, can be used to describe mobility diversity that is measured using the Shannon entropy [13,23].
In this study, three trip purposes are identified: going home (H), going to work (W), and other-type (O). Consequently, the home location ($S_H$) and work location ($S_W$) are identified using a rule-based method, as shown in Figure 3, to calculate commuting distance.
Home and work locations are generally fixed for each passenger [7,37]. Therefore, the destination stations and frequencies of trip purposes H and W are computed for every passenger, and each location is identified as the station with the highest corresponding frequency. However, if multiple stations share the highest frequency for trip purpose H/W, the corresponding location is identified as the station with the highest proportion of residential/office land usage. Finally, each passenger's commuting distance (cd) is calculated using the Dijkstra algorithm.
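The rule-based identification and the shortest-path step can be sketched as follows. The station names, link lengths, land usage shares, and function names are hypothetical; only the rule itself (most frequent destination, ties broken by land usage share, distance by Dijkstra's algorithm) follows the description above.

```python
import heapq
from collections import Counter


def most_frequent_station(destinations, land_use_share):
    """Pick the most frequent station; break ties by land usage share."""
    counts = Counter(destinations)
    top = max(counts.values())
    tied = [s for s, c in counts.items() if c == top]
    return max(tied, key=lambda s: land_use_share.get(s, 0.0))


def dijkstra(graph, source, target):
    """Shortest network distance between two stations."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")


# Hypothetical metro network with link lengths in km.
graph = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"A": 2.0, "C": 1.5, "D": 4.0},
    "C": {"A": 5.0, "B": 1.5, "D": 1.0},
    "D": {"B": 4.0, "C": 1.0},
}
home_destinations = ["A", "A", "B", "B"]   # trips with purpose H
work_destinations = ["D", "D", "C"]        # trips with purpose W
residential_share = {"A": 0.7, "B": 0.4}
office_share = {"C": 0.5, "D": 0.6}

s_home = most_frequent_station(home_destinations, residential_share)
s_work = most_frequent_station(work_destinations, office_share)
cd = dijkstra(graph, s_home, s_work)
```

In this toy case stations A and B tie for home, A wins on residential share, and the commuting distance follows the path A, B, C, D.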
Meanwhile, destination stations with trip purpose O, $\{S_1, \dots, S_M\}$, are also extracted to measure mobility diversity. The mobility entropy $E(u)$ of individual u is shown in Equation (4). D is the set of all trip destination stations, $p(d)$ is the probability of station d, and H is the total number of trips.
Finally, for each passenger u, we can formulate his/her individual mobility characteristic $IM_u$ by Equation (5).
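As a rough illustration of the mobility entropy, the plain Shannon entropy of a passenger's destination stations can be computed as below. Equation (4) also involves the total number of trips H, whose exact role cannot be recovered from the text, so this sketch uses the unnormalized form with the natural logarithm; that choice is an assumption.

```python
import math
from collections import Counter


def mobility_entropy(destinations):
    """Shannon entropy of the empirical destination distribution.

    p(d) is estimated as the share of trips that end at station d.
    """
    total = len(destinations)
    if total == 0:
        return 0.0
    counts = Counter(destinations)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# A passenger who always visits one station versus four different ones.
e_single = mobility_entropy(["S1", "S1", "S1", "S1"])
e_uniform = mobility_entropy(["S1", "S2", "S3", "S4"])
```

A passenger who always visits the same station has zero entropy, while one who spreads trips evenly over four stations reaches the maximum for four stations, log 4.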

Attributes Inference Model
Individual mobility characteristics are related to economic attributes. The visited locations contain far more information than just the category [12], and a location's economic feature is related to its visitors' economic attributes. More specifically, the price of a passenger's family home can reflect the living cost he/she can afford, and the price of goods in the shops he/she visits can reflect his/her consumption level. Consequently, a location's economic features can be used to further detail passengers' economic attributes. In this study, a location's economic feature is derived from two aspects: living cost and entertainment cost.

Location Economic Feature
Living cost is captured by the rental or sale price of houses within a station's catchment area, as shown in Equation (6). In the equation, $av_s$ is the average sale price, $av_r$ is the average rental price, $v_s$ is the variance of the sale price, and $v_r$ is the variance of the rental price.
The entertainment cost at a station can be formulated by the average price and variance of each shop type in its catchment area, as indicated by Equation (7). In the equation, $av_c$ is the average price of shop type c, $v_c$ is the price variance of shop type c, and $N_c$ is the total number of shops of type c, c = (1, 2, 3).

Individual Consumption Characteristic
From the above, each passenger's consumption characteristic can be formulated by integrating his/her individual mobility characteristic and location feature. To analyze the relationship between consumption behavior and individual economic attributes (income level), income elasticity of demand (IED) is introduced from economics. IED describes the economic feature of a good and is defined as the ratio of the percentage change in the demand for the good to the percentage change in consumer income, measured by the income expenditure (price) for the good [38][39][40]. According to the value of IED, goods can be classified into three types: inferior goods (IED < 0), such as bus travel and canned food; normal goods (0 < IED < 1), such as a house and food; and superior goods (IED >> 1), such as entertainment and fashion items [40].
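Written out, the verbal definition of IED above corresponds to the usual elasticity ratio. The symbols Q (quantity demanded) and I (consumer income) are introduced here only for illustration.

```latex
\mathrm{IED} = \frac{\Delta Q / Q}{\Delta I / I}
```

For example, if income rises by 10% and demand for a good rises by 5%, then IED = 0.05 / 0.10 = 0.5, which falls in the normal good range; if demand instead falls by 2%, then IED = -0.02 / 0.10 = -0.2, which marks an inferior good.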
This study derives each smart card user's consumption characteristic from three aspects: inferior good consumption, normal good consumption, and superior good consumption.
• Inferior good consumption (IGC)
The most common view of public transit is as an inferior good, which means that as income rises, people will travel less by public transit; however, this is often debated [41,42]. Holmgren [41] reviewed 22 IED values for public transit, and found a range of −0.82 to 1.18, with a mean of 0.17. This shows that the usage of public transit is highly dependent on a passenger's income, but the relationship is ambiguous.
Here, we assume that public transit is an inferior good and the expected IED is negative. Commuting distance and the total number of subway trips of each passenger are chosen as indicators to describe inferior good consumption. Then, the relationship between these indicators and individual economic attributes is analyzed to test this assumption.

• Normal good consumption (NGC)
Housing has been verified as a normal good, and the IED value ranges from 0.69 to 1.43 [38,39,43]. More specifically, IED values in China range from 0.786 to 1.430 with an average of 1.044 [43], which indicates that as income rises, people increase housing expenditure, and the increment is approximately equal to the income increment. In this study, the living cost of each passenger's home location is chosen to measure the normal good consumption. In addition, we assume that the expenditure distribution on normal goods for all passengers is consistent with their income distribution, and the relationships between other factors and income are consistent with the relationships between them and passengers' living costs.

• Superior good consumption (SGC)
A superior good is generally viewed as a special normal good that has a higher IED value. Dining out, shopping, and entertainment have been verified as superior goods in some research [44,45] and are always related to activities other than working and staying at home. Therefore, the mobility of the other-type activity and the corresponding location's economic features are used to formulate each passenger's superior good consumption. More specifically, for each passenger, entertainment expenditure is calculated using the station entertainment costs of all stations visited for other-type activities by Equation (8) [46]. The passenger's superior good consumption is described using entertainment expenditure and mobility diversity.

Economic Attributes Inference
Based on these consumption indicators, a comprehensive consumption indicator set $C_u$ is formulated for passenger u, as shown in Equation (9), and all terms are as previously defined.
To reduce the correlation among variables, principal component analysis (PCA) is used to extract the principal components of the indicators. Then, a k-means clustering method is used to obtain passengers' consumption levels, and the optimal number of clusters is chosen by the Davies-Bouldin criterion [47]. These levels are taken as the passengers' economic attribute levels.
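A compact sketch of the clustering step is shown below, using only the standard library. It assumes the inputs are already principal component scores (the PCA step is omitted), uses a deterministic farthest-first initialization that is chosen here for reproducibility, and runs on invented two-dimensional points. In practice a library such as scikit-learn, which provides `PCA`, `KMeans`, and `davies_bouldin_score`, would be a more typical choice.

```python
import math


def farthest_first(points, k):
    """Deterministic initial centres: points[0], then the farthest points."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centers)."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centre.
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def davies_bouldin(points, labels, centers):
    """Davies-Bouldin index: lower values mean better separated clusters."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(math.dist(p, centers[j]) for p in members)
                       / len(members) if members else 0.0)
    worst = []
    for i in range(k):
        ratios = []
        for j in range(k):
            if j != i:
                gap = math.dist(centers[i], centers[j])
                ratios.append((scatter[i] + scatter[j]) / gap
                              if gap > 0 else math.inf)
        worst.append(max(ratios))
    return sum(worst) / k


# Hypothetical component scores forming three groups of passengers.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 10), (5, 11), (6, 10), (6, 11)]
scores = {}
for k in range(2, 5):
    labels, centers = kmeans(points, k)
    scores[k] = davies_bouldin(points, labels, centers)
best_k = min(scores, key=scores.get)
```

The number of clusters with the smallest Davies-Bouldin index is selected, which for these three well-separated groups is three.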

Model Implementation and Results
This study utilizes one-week SCD from the Beijing subway for March 2016, consumer data for the shops around the subway stations for 2016, and house sale and rental price data for 2016 to implement the M2A framework.

Data Preparation
Urban big data are collected passively and include abundant bad and redundant data. Therefore, they need to be cleaned and preprocessed initially. To simplify, we assume passengers come from or are destined for a location within walking range of a station, and do not require other transportation modes.
From the many fields in the SCD, card ID, entry line and station ID, entry time, exit line and station ID, and exit time are extracted. Then, line and station IDs are replaced by station names so that passengers who use the entrances of different lines at the same interchange station are matched to one station. Finally, the data are cleaned by deleting the records in which the entry and exit stations are identical or the exit time is earlier than the entry time. After preprocessing and data integration, the trip records of 50,141 passengers remain to be analyzed.
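The two cleaning rules can be expressed as a simple filter. The field names and the sample records below are hypothetical and serve only to show the rule.

```python
from datetime import datetime


def clean_records(records):
    """Drop records with identical entry and exit stations or with an
    exit time earlier than the entry time."""
    return [
        r for r in records
        if r["entry_station"] != r["exit_station"]
        and r["exit_time"] >= r["entry_time"]
    ]


# Hypothetical raw records: one valid, one same-station, one time-reversed.
raw = [
    {"card_id": "c1", "entry_station": "A", "exit_station": "B",
     "entry_time": datetime(2016, 3, 7, 8, 0),
     "exit_time": datetime(2016, 3, 7, 8, 40)},
    {"card_id": "c2", "entry_station": "A", "exit_station": "A",
     "entry_time": datetime(2016, 3, 7, 9, 0),
     "exit_time": datetime(2016, 3, 7, 9, 5)},
    {"card_id": "c3", "entry_station": "B", "exit_station": "C",
     "entry_time": datetime(2016, 3, 7, 10, 0),
     "exit_time": datetime(2016, 3, 7, 9, 50)},
]
cleaned = clean_records(raw)
```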
Shop consumer data are collected from dianping.com, a business review website in China that is similar to Yelp in the US. The data contain five items: station name, shop name, shop location, average price of each shop, and review score of each shop. First, we calculate the Euclidean distance between each shop and its nearest station and filter out shops that are more than 800 m away. Then, according to shop category (c), namely catering, entertainment, and shopping, we calculate the average price (av) and price variance (v) of consumption for every station using Equations (10) and (11), where $N_c$ is the number of shops in category c in the catchment area of a station, $pr_i$ is the average price of the ith shop, and $cf_i$ is the ratio of the review score of the ith shop to the sum score of all category c shops, which reflects the attractiveness of a shop.
Shop consumer data of Xizhimen station, available online, are used to illustrate the preprocessing. First, we group the shops into three categories based on their types; then, we calculate the average price and price variance for each category. Taking catering-type shops as an example, each shop is reviewed on three items, each on a 10-point scale: taste, environment, and service. For each item, we calculate the average score and assign it to shops whose score is missing. We sum all of the scores to obtain the total score $ts_i$ for each shop and the average total score for the catering-type shops in the Xizhimen area, $as = 22.29$. The attractiveness of each shop is calculated as $cf_i = ts_i / as$. Combined with the price of each shop, the average price of catering in the Xizhimen area is obtained using Equation (10), $av_{cater} = 45.92$. Based on the average price, the price variance is calculated using Equation (11), $v_{cater} = 1261.67$.
The same processing is applied to entertainment and shopping in the Xizhimen area, giving $av_{enter} = 36.02$, $v_{enter} = 999.82$, $av_{shop} = 477.5$, and $v_{shop} = 415,973.4$.
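Equations (10) and (11) are not reproduced in this text. The sketch below shows one reading of the worked example, in which each shop's price is weighted by its attractiveness, the total score divided by the average total score, before averaging; the weighted form of the variance in particular is an assumption, and the prices and scores are invented.

```python
def attractiveness_weights(total_scores):
    """cf_i = ts_i / as, where as is the average total score."""
    mean_score = sum(total_scores) / len(total_scores)
    return [ts / mean_score for ts in total_scores]


def weighted_price_stats(prices, total_scores):
    """Attractiveness-weighted average price and price variance.

    The weighted variance is an assumed form; the original equation
    may differ.
    """
    cf = attractiveness_weights(total_scores)
    n = len(prices)
    av = sum(p * w for p, w in zip(prices, cf)) / n
    v = sum(w * (p - av) ** 2 for p, w in zip(prices, cf)) / n
    return av, v


# Three hypothetical catering shops near one station.
prices = [30.0, 50.0, 80.0]   # average price per shop
scores = [20.0, 25.0, 15.0]   # total review score per shop
av_cater, v_cater = weighted_price_stats(prices, scores)
```

Because the weights average to one, a station whose shops all have the same score reduces to the ordinary mean and variance.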
The house price data are collected from a real estate website. They contain three items: station name, house location, and rental or selling price. Houses that are located in the catchment area of 800 m are chosen for the analysis.

Model Implementation
First, we implement HMM based on the algorithm flow, as shown in Figure 4; a separate numerical simulation study was conducted [48]. From the results, we identified six observation clusters, as shown in Table 1. Based on the observation clusters, four activity types were inferred with three trip purposes, as presented in Table 2. Activities 1 and 4 are included in the trip purpose of "Work", which may be because of different attendance management of different companies in Beijing.  Based on these results, each passenger's trip purpose is inferred from the SCD to generate trip chains using HMM; the percentages of passengers traveling to work and home at different times of the day are shown in Figure 5. Due to a lack of detailed survey data for Beijing, the results of the Household Interview Travel Survey (HITS) and the Future Mobility Survey (FMS) in Singapore [49] are used to verify the results of this study. Based on these, the results of this study using HMM are consistent with the results of the FMS during peak hours, while they match better with the results of HITS during off-peak hours. Because trip purposes during peak hours mainly are going home or to work [6,7], they have higher regularity and predictability than other periods [5]. In this study, HMM can capture these trips with high accuracy as does the FMS [49]. During off-peak hours, trip purposes vary; however, this study only analyzed continuous trips by metro using smart cards. Therefore, its limited sample size may lead to under-reporting of related trips as in the HITS [49]. Notably, in the results of the HITS and FMS, the morning peak is earlier than in the HMM result. This is because work time is earlier in Singapore than in Beijing. From the above, accuracy of the results of HMM is similar to or even higher than that of the HITS and are suitable for in-depth analysis.  
Based on these results, each passenger's trip purpose is inferred from the SCD using the HMM to generate trip chains; the percentages of passengers traveling to work and home at different times of the day are shown in Figure 5. Due to a lack of detailed survey data for Beijing, the results of the Household Interview Travel Survey (HITS) and the Future Mobility Survey (FMS) in Singapore [49] are used to verify the results of this study. The HMM results are consistent with the FMS results during peak hours and match the HITS results more closely during off-peak hours. Because trip purposes during peak hours are mainly going home or to work [6,7], these trips are more regular and predictable than those in other periods [5]. In this study, the HMM captures these trips with high accuracy, as does the FMS [49]. During off-peak hours, trip purposes vary; however, this study analyzed only continuous trips by metro using smart cards. Its limited sample size may therefore lead to under-reporting of related trips, as in the HITS [49]. Notably, the morning peak in the HITS and FMS results is earlier than in the HMM result, because work starts earlier in Singapore than in Beijing. Overall, the accuracy of the HMM results is similar to or even higher than that of the HITS, and the results are suitable for in-depth analysis.
Subsequently, individual mobility is derived from the trip chains. The location economic feature data are combined, the comprehensive consumption indicator set of each passenger is formulated, and five principal components are extracted, as shown in Table 3. The first component carries the information on commuting distance and living cost; however, commuting distance has a significant negative relationship with living cost. This result means that as income rises, people commute shorter distances by public transit, and it confirms that public transit is an inferior good for the commuting trip. Catering, entertainment, and shopping consumption each load on a separate component, which shows that they have different relationships with income. Finally, trip frequency and mobility diversity fall in the same component, which means they have the same relationship with income.
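The idea that two indicators can share one component with opposite signs can be sketched with a small principal component computation. The example below standardizes a toy indicator table and finds only the leading component by power iteration; this is a simplified stand-in for a full PCA, and the two columns (commuting distance and living cost) contain made-up numbers.

```python
import math

def standardize(rows):
    """Z-score each column so indicators with different units are comparable."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)] for i in range(m)]

def leading_component(c, iters=1000):
    """Unit-length leading eigenvector of a symmetric matrix by power iteration."""
    m = len(c)
    v = [1.0 / (j + 1) for j in range(m)]  # arbitrary non-uniform starting vector
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical passengers: [commuting distance (km), living cost (arbitrary units)].
indicators = [[20.0, 3.0], [15.0, 4.0], [10.0, 6.0], [5.0, 8.0], [2.0, 9.0]]
loadings = leading_component(correlation_matrix(standardize(indicators)))
```

In this toy table the two loadings have opposite signs, which mirrors the negative relationship between commuting distance and living cost reported for the first component. Extracting all five components would require a full eigendecomposition rather than power iteration.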

Results
Finally, six consumption levels are classified using the k-means method, as shown in Figure 6. From Figure 6a,b, we can claim that passengers in cluster 3 have the highest income among all the metro passengers, followed by cluster 2, based on our assumptions. Furthermore, clusters 1 and 4 are the middle-income groups and clusters 5 and 6 are the low-income groups. Considering the distribution of commuting distance in Figure 6c, commuting distance has a negative relationship with metro passenger income. As for superior consumption, clusters 3, 5, and 6 maintain a low expenditure on catering, entertainment, and shopping. High-income passengers have more options, such as private cars or taxis, for flexible trips like shopping, while low-income passengers are constrained by the ratio of expenditure to income [50]. Therefore, both groups take fewer superior consumption trips via the metro than other passengers, and their expenditure is lower. Interestingly, the other passengers have different consumption preferences for superior goods. Passengers in cluster 2, who have higher income than those in clusters 1 and 4, prefer to take the metro to high-consumption shopping areas, while passengers in cluster 1 prefer to take the metro to expensive eating areas and spend more money on dining out, and passengers in cluster 4 prefer to travel to expensive entertainment areas by metro. Individual trip frequency and mobility diversity by metro have no significant correlation with passenger income, as evident from Figure 6d,h. More specifically, the average trip frequency is about nine for all clusters, excluding cluster 6; this implies that these passengers mainly use the metro for commuting.
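The clustering step can be sketched with a plain implementation of Lloyd's k-means algorithm. The study classifies six consumption levels from the principal component scores; the toy example below uses two clusters and invented two-dimensional points purely to keep the illustration small, and the initialization scheme (random sampling with a fixed seed) is an assumption.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed initialization: k distinct data points
    labels = []
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return labels, centers

# Hypothetical component scores for six passengers forming two obvious groups.
scores = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(scores, k=2)
```

Cluster numbers returned by k-means are arbitrary, so ranking clusters by income level, as done in the text, requires inspecting each cluster's indicator values afterwards.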

Discussion
An in-depth analysis of the home locations of the different income levels is shown in Figure 7. Homes of high-income passengers (clusters 2 and 3) are mainly located in the north of the city center. Homes of low and middle-income passengers are mainly located outside the city center, and some of those in clusters 5 and 6 are located in the northern suburbs. This result is consistent with the observation of Mohamed et al. [10], which also shows that high-income passengers mainly live in downtown areas and low-income passengers mainly live in suburban areas.
Note that home and work locations are fixed for each passenger, and that other-type activities can reflect passengers' mobility diversity more adequately. Therefore, we analyzed the other-type activities for each cluster; their spatial distributions, shown in Figure 8, suggest significant relations with the economic groups. All of them show significant positive spatial autocorrelation (Moran's I > 0, p = 0), but for high-income passengers in clusters 2 and 3, the activity locations have a stronger spatial autocorrelation than those of the other clusters, as indicated by larger Moran's I values, implying that these locations have a higher spatial aggregation. Furthermore, the frequently visited stations are mainly located in the north of the city; this spatial character is similar to the spatial distribution of their homes. For middle-income groups (clusters 1 and 4), the frequently visited stations are mainly located in the Guomao (GM), Wangfujing (WFJ), and Chaoyangmen (CYM) regions. These regions have high consumption levels, especially of catering and entertainment, and therefore these passengers' expenditure on catering and entertainment is high. Low-income groups have numerous frequently visited locations across the entire city region, including the south of the city, and they have a low spatial aggregation. Other-type trips of low and middle-income passengers show that the comprehensive development of land around public transit stations in suburban areas is lower than that around stations in the city center. Therefore, passengers are obliged to take public transit from their suburban homes to the city center for catering, entertainment, or shopping. Consequently, for government decision-makers, comprehensive development of land around suburban stations may afford greater convenience to low and middle-income citizens.
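The Moran's I statistic used above measures whether similar values cluster in space. A minimal sketch of the standard global Moran's I formula follows; the spatial weights used in the study are not given in the text, so the binary adjacency weights and the station visit counts below are assumptions for illustration, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations with an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Hypothetical example: four stations on a line, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], adjacency)    # similar values sit next to each other
alternating = morans_i([1, 5, 1, 5], adjacency)  # neighbours always differ
```

Values above zero indicate spatial aggregation and values below zero indicate dispersion, which is why larger Moran's I values for clusters 2 and 3 are read as stronger aggregation of activity locations.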
Next, Pearson correlation coefficients are calculated between the living consumption indicators and the other consumption indicators for quantitative analysis, as shown in Tables 4 and 5. Overall, commuting distance and trip frequency have a significantly negative correlation with passengers' living consumption expenditure, which can be used to represent their income under our assumption. This result verifies that public transit is an inferior good. Average expenditures for catering, entertainment, and shopping have significantly positive correlations with passengers' living consumption expenditure, while mobility diversity has a negative relationship with living consumption, although this relationship is not significant for housing rental expenditure. This reveals that high-income passengers do not prefer to take the metro to a new place, as they have alternative modes to choose from [50].
Based on group dimensions, cluster 3, as a high-income group, has a lower correlation between income and commuting distance, as marked in bold. This indicates that high-income passengers may have a longer trip to work; this relationship has been indicated in previous studies as well [11,26,27]. Further, the correlation with shopping expenditure is lower or even negative, which indicates that high-income passengers do not prefer to take the metro to shop. Cluster 2, the second-highest income group, has a higher correlation with all types of superior consumption because these passengers have more disposable income to afford such consumption. For the middle-income groups, only one type of superior consumption has a high correlation: for cluster 1 it is shopping, although these passengers generally go to expensive eating areas and spend more money on eating away from home, and for cluster 4 it is entertainment.
In addition, their mobility diversity shows no significant correlation with their income. From the data in the two tables, passengers in cluster 6 have higher income than those in cluster 5. Cluster 5 has a positive correlation with trip frequency, which indicates that these passengers have a strong dependency on public transit. Meanwhile, passengers in cluster 6 have a higher correlation with shopping expenditure than most other groups. However, cluster 5 represents a large share of the studied passengers, indicating that 37.55% of the passengers are low-income and have a strong dependency on the metro for their daily trips. This result is consistent with the survey result for Beijing, which claims that metro passengers are mainly low and middle-income people [51]. From an overall perspective, shopping consumption is strongly related to passengers' income, especially for low and middle-income passengers. In combination with the home location distribution of these passengers, this indicates that comprehensive shopping malls are needed at suburban stations. An improved land-use mix could reduce these other-type trips and afford greater convenience. This would also embody the principles of transit-oriented development (TOD).
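The Pearson correlation coefficients reported in Tables 4 and 5 follow the standard definition, which can be sketched as below. The two indicator lists are invented values, not data from the study, and the sketch omits the significance test that accompanies the reported coefficients.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical passengers: housing rental expenditure against commuting distance (km).
rental_expenditure = [2.0, 4.0, 6.0, 8.0]
commuting_distance = [20.0, 15.0, 10.0, 5.0]
r = pearson_r(rental_expenditure, commuting_distance)
```

A coefficient near -1, as in this toy data, corresponds to the negative relationship between living consumption expenditure and commuting distance described in the text; computing it separately for the passengers of each cluster gives the group-level comparison.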
Based on our analysis and discussion, this model framework can infer passengers' economic attributes from SCD without user profiles, and some results have been verified by previous studies. However, several limitations exist in this study. The first is that the results cannot be validated at the individual level: this study focuses on economic attribute inference using big data without user profiles, and the raw data lack individual information. This leads to the second limitation: the income amount cannot be calibrated, and only the relative income level can be obtained. Therefore, more detailed survey data should be considered for in-depth analysis in the future. Finally, as a preliminary work to infer travelers' economic attributes using urban big data without user profile information, this study formulates the model framework using certain default assumptions, such as the activity location, which is taken to lie within walking distance of the exit station without a transfer to other modes. This limitation can be avoided by taking more data sources into consideration.

Conclusions
Most studies focus on the inference of individual economic attributes using big data with user profile information, like occupation and phone number, which can be used to integrate other economic data. Data without user profile information, like SCD, have not been considered for individual economic attribute inference. This study fills this gap by formulating an M2A framework based on the relationship between individual mobility and economic attributes.
The M2A framework integrates individual mobility characteristics with location features to infer passenger economic attributes. Using this framework, a case study of Beijing is implemented. From the results, we confirm that commuting distance and trip frequency using the metro have a negative correlation with passengers' income. However, some high-income passengers may have a longer trip distance to work, which has also been found in previous studies. High-income passengers mainly live in the city center, while low and middle-income passengers mainly live in suburban areas. However, low and middle-income passengers prefer to shop in the city center, because suburban stations generally lack comprehensive land development. Therefore, improving the land-use mix around suburban stations is needed for TOD. In addition, middle-income passengers and the second-highest income group can afford more types of superior goods, and more expensive ones. Low-income passengers, who make up a larger part of the metro ridership, have a strong dependency on the metro for their daily trips.
Given the limitations of this study, acquiring more data sources, such as SCD from the bus system, bike-sharing system data, or even CDRs, is suggested for future work. These data can provide more detail on travelers' activity characteristics for inferring more accurate economic attributes. Further, long-period data, which can reflect individual travel preferences for various activities, can also effectively improve the results. In addition, more location features, such as work location features, will be considered in an improved framework to obtain more accurate individual economic attributes.