Uncovering Spatio-temporal Travel Patterns Using a Tensor-Based Model from Metro Smart Card Data in Shenzhen, China

Abstract: Individual mobility patterns are an important factor in urban traffic planning and traffic flow forecasting. How to understand the spatio-temporal distribution of passengers deeply and accurately, so as to provide theoretical support for the planning and operation of the metro network, is an urgent issue of wide concern. In this paper, we applied nonnegative CP (NCP) decomposition to uncover the characteristics of travel patterns in the temporal and spatial dimensions in the metro network of Shenzhen City. Utilizing matrix factorization and correlation analysis, we extracted several stable components from the collective mobility and found that the departure and arrival mobility patterns have different characteristics in both the temporal and spatial dimensions. Based on the point of interest (POI) data of Shenzhen City, the functional attributes of the stations were identified, and we found that the spatial distributions of the different patterns differ. We also explored the distribution of travel time, classified according to the spatio-temporal characteristics of the stable patterns. The proposed method can decompose stable travel patterns from the collective mobility, and the results of this study help to better understand different mobility patterns in both the spatial and temporal dimensions.


Introduction
Urban transportation is an important support for the economic development of cities and profoundly affects the quality of life of citizens, the efficiency of logistics, and other transport-related factors. With the increase in population and the continuous expansion of urban areas, urban traffic is growing rapidly, and this massive traffic puts enormous pressure on road networks. Traffic congestion has become more serious due to the imbalance between infrastructure capacity and travel demand. Due to their high capacity, speed and reliability, metro systems can effectively alleviate urban traffic congestion. Therefore, how to understand the spatial and temporal distribution of passengers deeply and accurately, so as to provide theoretical support for the planning and operation of the metro network, is an urgent issue of wide concern.
As an essential part of the public transportation systems in big cities, metro systems can address the massive travel demand of citizens thanks to their low price and punctual arrival. Recently, many works have focused on passenger forecasting [1][2][3], network planning [4] and human travel analysis. We establish a correlation among different days and extract some stable components from the individual mobility. Excluding occasional factors such as weather and activities, the characteristics of the stable mobility patterns can be explored in both the temporal and spatial dimensions. Next, based on the POI data collected in Shenzhen City, the function of each station is set, and we further analyze the spatial distribution characteristics of different travel patterns. Furthermore, we explore the distribution of travel time, classified according to the temporal and spatial distribution characteristics of the three basic stable patterns. The individual records are classified into different patterns and the most likely travel time is analyzed by category.
The remainder of this study is organized as follows: In Section 2, we describe the data sources for the study. In Section 3, we introduce the framework of probabilistic factorization based on NCP decomposition and construct the tensor model. In Section 4, we analyze the spatial and temporal characteristics of the daily collective mobility patterns based on the tensor decomposition results. In Section 5, we use correlation analysis to identify stable basic patterns and analyze their spatio-temporal distribution. Finally, Section 6 summarizes the key findings.

Data Source Collection
This section briefly introduces the metro network, the smart card data collected over three weeks and the point of interest (POI) information of Shenzhen City. Smart card data reflect the travel behavior of urban passengers, while POI information reflects land use in the city.

Smart Card Dataset
Shenzhen City, located in the south of China, is one of the fastest-growing and most densely populated metropolitan cities in the world. The general study area in Shenzhen City is bounded by longitude [113.684206, 114.658294] and latitude [22.243608, 22.862324]. The metro smart card data were collected from the automatic fare collection (AFC) system. The dataset covers millions of passengers over three weeks, from 3 April 2017 to 23 April 2017, and includes the codes of the entry and exit metro stations and the corresponding times of entering and leaving the stations. However, there are some problems such as missing information and invalid or repeated card records. The metro system has a total of 167 stations and eight metro lines (Shenzhen Metro Lines 1-5, Line 7, Line 9 and Line 11), as shown in Figure 1. Furthermore, because the analysis results need to be matched with station locations, the latitude and longitude of each metro station were also collected from an open-source map.

POI Dataset
The travel behavior of passengers is closely linked with the type of land use in the area around the metro stations. Urban POI data refer to all geographical entities that can be abstracted into points without area or volume; they are closely related to people's daily life and reflect the functional attributes of urban areas. POIs are distributed with different densities around arrival stations, and places with more attractive POIs are generally more likely to be travel destinations. In this paper, we extract the POI information of Shenzhen City to identify the characteristics of the urban functional areas, and then use this information to analyze the spatial patterns of travel. At present, application programming interfaces (APIs) such as Baidu Map, Google Maps and Gaode Map can be used to obtain POI information. We use the requests module in Python to obtain the POI information of Shenzhen City provided by the Baidu Map development platform [36]. The platform divides the POI information of cities into 19 categories and 140 subcategories according to default function attributes. As the travel purpose cannot be obtained from the smart card data, the travel behavior considered in this study is the combination of daily metro trips with different purposes. We group the POI types into seven major categories: residential area (R), corporate companies (C), transportation hub (T), leisure and entertainment (L), medical institution (M), education institution (E) and gourmet restaurant (F). The specific contents of the seven categories of POIs are shown in Table 1. Records within each category include category attributes, POI names, and longitude and latitude. Figure 2 shows the seven major categories of POI information on the map.



Sustainability 2020, 12, 1475

Methodology
As the amount of data collected from smart cards in metro systems is huge, and the complexity of the data increases with the number of dimensions, it is difficult for traditional, basic statistical methods to deal with this data source. The tensor-based model applied in this study can extract multi-dimensional features from the original dataset, so that the spatio-temporal characteristics of different travel patterns can be quantified through tensor decomposition. Tensor decomposition is a high-dimensional generalization of singular value decomposition (SVD) and principal component analysis (PCA). Two main types of tensor decomposition are Tucker decomposition and CP decomposition. Tucker decomposition is a higher-order principal component analysis method that decomposes a tensor into a core tensor multiplied by a factor matrix along each dimension. CP decomposition decomposes a high-dimensional tensor into a sum of rank-one component tensors. A rank-one tensor is a special type of tensor: an N-way tensor is rank one if it can be expressed as the outer product of N vectors. In this study, we apply CP decomposition to analyze the spatio-temporal characteristics of urban travel.
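The CP decomposition process can be explained with a third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$, shown in Figure 3, where $I$, $J$ and $K$ are the dimensions along the $p$th way ($p \in \{1, 2, 3\}$). The approximate decomposition of a three-dimensional tensor $\mathcal{X}$ can be defined [32] as follows:

$$\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r \tag{1}$$

where $R$ is a positive integer and $\mathbf{a}_r \in \mathbb{R}^{I}$, $\mathbf{b}_r \in \mathbb{R}^{J}$, $\mathbf{c}_r \in \mathbb{R}^{K}$, $r = 1, \ldots, R$. The symbol $\circ$ denotes the vector outer product, and $\mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$ is a rank-one tensor. The CP decomposition of a tensor is a sum of $R$ rank-one component tensors (we call $R$ the rank of tensor $\mathcal{X}$).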
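To make the sum of rank-one tensors concrete, the following is a minimal NumPy sketch rather than the toolbox routine used in the study. The function name `cp_reconstruct` and the small random factor matrices are illustrative assumptions.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild an I x J x K tensor as a sum of R rank-one tensors.

    Column r of A, B and C plays the role of a_r, b_r and c_r.
    """
    I, R = A.shape
    J, K = B.shape[0], C.shape[0]
    X = np.zeros((I, J, K))
    for r in range(R):
        # Outer product of three vectors gives one rank-one tensor.
        X += np.einsum("i,j,k->ijk", A[:, r], B[:, r], C[:, r])
    return X

# Hypothetical factors with I=4, J=3, K=5 and R=2 components.
rng = np.random.default_rng(0)
A, B, C = rng.random((4, 2)), rng.random((3, 2)), rng.random((5, 2))
X = cp_reconstruct(A, B, C)

# A single entry is the sum over r of a_ir * b_jr * c_kr.
x_123 = sum(A[1, r] * B[2, r] * C[3, r] for r in range(2))
```

The explicit loop mirrors the summation over components; in practice a single `np.einsum("ir,jr,kr->ijk", A, B, C)` call gives the same tensor.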
The element of the tensor $\mathcal{X}$ can be expressed as:

$$x_{ijk} \approx \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} \tag{2}$$

where $i = 1, \ldots, I$, $j = 1, \ldots, J$, $k = 1, \ldots, K$. We use $i$, $j$, $k$ to denote the index along each dimension, and $x_{ijk}$ denotes the value of the corresponding element of the tensor $\mathcal{X}$.
Fibers [32] are the higher-order analogue of matrix rows and columns. A fiber is defined by fixing every index but one. Third-order tensors have column, row, and tube fibers, denoted by $\mathbf{x}_{:jk}$, $\mathbf{x}_{i:k}$ and $\mathbf{x}_{ij:}$, respectively. Slices are two-dimensional sections of a tensor, defined by fixing all but two indices. Third-order tensors have horizontal, lateral, and frontal slices, denoted by $\mathbf{X}_{i::}$, $\mathbf{X}_{:j:}$ and $\mathbf{X}_{::k}$, respectively. The $k$th frontal slice $\mathbf{X}_{::k}$ of a third-order tensor may also be denoted as $\mathbf{X}_{k}$.
The factor matrices refer to the combination of the vectors from the rank-one components, $\mathbf{A} = [\mathbf{a}_1\ \mathbf{a}_2\ \cdots\ \mathbf{a}_R]$, $\mathbf{B} = [\mathbf{b}_1\ \mathbf{b}_2\ \cdots\ \mathbf{b}_R]$ and $\mathbf{C} = [\mathbf{c}_1\ \mathbf{c}_2\ \cdots\ \mathbf{c}_R]$. Matricization is the process of reordering the elements of a multi-dimensional tensor into a matrix. Using the factor matrices, the three matricized versions, one per dimension, are:

$$\mathbf{X}_{(1)} \approx \mathbf{A} (\mathbf{C} \odot \mathbf{B})^{\mathsf{T}} \tag{3}$$

$$\mathbf{X}_{(2)} \approx \mathbf{B} (\mathbf{C} \odot \mathbf{A})^{\mathsf{T}} \tag{4}$$

$$\mathbf{X}_{(3)} \approx \mathbf{C} (\mathbf{B} \odot \mathbf{A})^{\mathsf{T}} \tag{5}$$

where $\odot$ denotes the Khatri-Rao product. Given matrices $\mathbf{A} \in \mathbb{R}^{I \times K}$ and $\mathbf{B} \in \mathbb{R}^{J \times K}$, their Khatri-Rao product is denoted by $\mathbf{A} \odot \mathbf{B}$. The result is a matrix of size $(IJ) \times K$.
The columns of $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ are normalized to length one, with the weights absorbed into a vector $\boldsymbol{\lambda} \in \mathbb{R}^{R}$, so that $\mathcal{X}$ can be approximately decomposed as Equation (6):

$$\mathcal{X} \approx \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r \tag{6}$$

The three-dimensional case can be extended to an N-dimensional tensor [32] and written as:

$$\mathcal{X} \approx \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r^{(1)} \circ \mathbf{a}_r^{(2)} \circ \cdots \circ \mathbf{a}_r^{(N)} \tag{7}$$

However, the elements of the factor matrices obtained from tensor decomposition can be positive or negative. Negative values are valid from a mathematical point of view, but they are difficult to interpret in practical problems. We therefore apply NCP decomposition [35] with alternating Poisson regression to uncover the characteristics of travel patterns in the Shenzhen metro network. This method adds non-negativity constraints to the tensor decomposition process and is appropriate for the non-negative integer counts found in smart card data.
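The relation between matricization and the Khatri-Rao product can be checked numerically. The sketch below assumes the column ordering used in [32] (lower modes vary fastest). The helper names `khatri_rao` and `unfold` are illustrative and are not taken from the MATLAB toolbox.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R) and (J x R) give (I*J x R)."""
    I, R = A.shape
    J, R2 = B.shape
    if R != R2:
        raise ValueError("A and B must have the same number of columns")
    # Row i*J + j of the result holds A[i, r] * B[j, r].
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def unfold(X, mode):
    """Mode-n matricization with the remaining indices in Fortran order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1, order="F")

# Hypothetical exact rank-2 tensor built from random factors.
rng = np.random.default_rng(1)
A, B, C = rng.random((4, 2)), rng.random((3, 2)), rng.random((5, 2))
X = np.einsum("ir,jr,kr->ijk", A, B, C)

X1 = A @ khatri_rao(C, B).T   # mode-1 unfolding from the factors
X2 = B @ khatri_rao(C, A).T   # mode-2 unfolding from the factors
X3 = C @ khatri_rao(B, A).T   # mode-3 unfolding from the factors
```

Because the tensor here is built exactly from its factors, the three products agree with the unfoldings of `X` up to floating-point error. For real data the relation holds only approximately.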
We extracted a total of 167 metro stations from eight metro lines, and the original data collected from 5:00 to 23:00 are divided into 18 one-hour time periods. The passenger volumes were collected at the entry and exit stations of the metro system in each time period. Therefore, the departure station tensor Mo = (departure station, time slot) and the arrival station tensor Md = (arrival station, time slot) are constructed, and each collective mode is $M \in \mathbb{R}_{+}^{167 \times 18}$. The collective mobility is decomposed into a linear combination of several basic modes of movement by NCP decomposition:

$$M \approx \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \tag{8}$$

Equation (8) decomposes the matrix $M \in \mathbb{R}_{+}^{m \times n}$ into $R$ components. Each component $r$ includes one weight $\lambda_r$ and two vectors, $\mathbf{a}_r \in \mathbb{R}_{+}^{m \times 1}$ and $\mathbf{b}_r \in \mathbb{R}_{+}^{n \times 1}$. In this study, the collective mobility patterns $M \in \mathbb{R}_{+}^{167 \times 18}$ are decomposed into several basic patterns. The vector $\mathbf{b}_r \in \mathbb{R}_{+}^{18 \times 1}$ represents the temporal distribution.
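The vector $\mathbf{a}_r \in \mathbb{R}_{+}^{167 \times 1}$ represents the corresponding spatial information, which indicates the probability of arriving at or leaving each metro station. Therefore, we obtain the components representing the basic mobility patterns from the collective mobility patterns through NCP decomposition. Taking the departure station tensor Mo = (departure station, time slot) on a specific day as an example, its mobility patterns can be expressed as $M_o \in \mathbb{R}_{+}^{167 \times 18}$. We define the NCP decomposition of the departure station tensor Mo as Equation (9):

$$M_o \approx \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \tag{9}$$

This can be further expressed as Equation (10):

$$M_o \approx \lambda_1\, \mathbf{a}_1 \circ \mathbf{b}_1 + \lambda_2\, \mathbf{a}_2 \circ \mathbf{b}_2 + \cdots + \lambda_R\, \mathbf{a}_R \circ \mathbf{b}_R \tag{10}$$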
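As an illustration of a nonnegative decomposition of a (station, time slot) count matrix, the sketch below uses the classical multiplicative updates for the Poisson (Kullback-Leibler) objective. This is a stand-in under stated assumptions, not the alternating Poisson regression routine of the MATLAB toolbox [35]. The synthetic counts, the function name `poisson_nmf`, and the choice to normalize columns to sum to one (so that they read as probability distributions) are all assumptions.

```python
import numpy as np

def poisson_nmf(M, R, n_iter=500, seed=0, eps=1e-12):
    """Nonnegative factorization M ~ sum_r lam[r] * outer(A[:, r], B[:, r]).

    Uses multiplicative updates that decrease the Poisson/KL divergence.
    Returns weights lam and column-normalized factors A (space), B (time).
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    A = rng.random((m, R)) + 0.1
    B = rng.random((n, R)) + 0.1
    for _ in range(n_iter):
        ratio = M / (A @ B.T + eps)
        A *= (ratio @ B) / (B.sum(axis=0) + eps)
        ratio = M / (A @ B.T + eps)
        B *= (ratio.T @ A) / (A.sum(axis=0) + eps)
    # Move the scale of each component into a weight and normalize columns.
    scale_a, scale_b = A.sum(axis=0), B.sum(axis=0)
    lam = scale_a * scale_b
    return lam, A / scale_a, B / scale_b

# Synthetic counts for 167 stations and 18 hourly slots (illustrative only).
rng = np.random.default_rng(42)
rate = 50.0 * rng.random((167, 3)) @ rng.random((18, 3)).T
M = rng.poisson(rate).astype(float)

lam, A, B = poisson_nmf(M, R=3)
```

With this normalization, column `r` of `B` is the share of component `r` in each hour and column `r` of `A` is its share at each station, while `lam[r]` carries the number of trips attributed to the component.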

The Spatial-Temporal Characteristics Analysis of Travel Patterns
Using the tensor toolbox [35] in MATLAB, we performed NCP decomposition to explore the number of basic mobility patterns. Through repeated experiments and analysis of the tensor decomposition results, we found that when the number of basic mobility patterns is R = 3, the extracted travel modes show obvious differences in the spatio-temporal dimensions. If we set R = 4 or more, the travel modes extracted from tensor decomposition are unreasonable and do not accord with daily travel regularity. Three basic mobility patterns have also been confirmed by other scholars [31,37] to be appropriate for understanding the spatio-temporal characteristics of travel patterns.
The collective model is decomposed into three components, and the characteristics of the travel behavior of the three components in the temporal dimension are shown in Figure 4a,c,e. Departure-1 indicates the temporal distribution in the morning peak hours of passengers traveling in the Shenzhen metro system on weekdays. Departure-2 indicates the temporal distribution in the evening peak hours on weekdays. Departure-3 indicates the temporal distribution over the daily hours on weekdays. All the results shown in Figure 4 are averages calculated over the three weeks.
According to the decomposition results, the spatial distribution characteristics of travel behavior corresponding to the temporal distribution of the three components can be analyzed simultaneously. In the spatial dimension, we use a heat map to show the spatial distribution over the metro stations. The right part of Figure 4 (panels b, d and f) shows the spatial distribution of travel behavior.
Figure 4. (a) Morning peak hours; (b) spatial distribution for Departure-1; (c) evening peak hours; (d) spatial distribution for Departure-2; (e) daily hours; (f) spatial distribution for Departure-3.
The Shenzhen metro network covers most areas from the city center to the suburbs, and the structure of the network makes travel in Shenzhen City more convenient. The results in Figure 4 show that Departure-1 has an obvious traffic peak during the morning period: the peak hours range between 7:30 and 9:00, and the value reaches its peak at 8:00. At this time, passenger trips are mainly distributed in the surrounding areas and scattered in the downtown areas of the city; Pengzhou, Wuhe, Buji, Qinghu and Minzhi are the main stations. According to the extracted POI data of Shenzhen City, Figure 5 shows that most of the residential areas in Shenzhen are located in the surrounding areas and scattered downtown areas of the city. We find that departures are mainly distributed in residential areas during the morning peak hours on weekdays.
Departure-2 shows peak passenger volume during the evening hours: the peak hours range between 17:00 and 20:00, and the value reaches its maximum at 18:00. At this time, departures are mainly distributed in the downtown area of the city, with Chegongmiao, Shenda, Gaoxinyuan, Convention and Exhibition Center and Huaqiang North being the main stations. According to the extracted POI data of Shenzhen City, Figure 5 shows that companies are mostly distributed around the center of the city. We find that departures are mainly distributed in company areas during the evening peak hours on weekdays. Thus, on weekdays, passengers leave the residential areas around the city during the morning peak hours, and return to the residential areas from the work areas in the city center during the evening peak hours.
Departure-3 shows a significant decline in the proportion of passenger volume in each period, and there is no obvious peak. This travel is mainly distributed in the central area and scattered around the city, and it corresponds to the daily transition mode between Departure-1 and Departure-2.

Analysis of Travel Characteristics Combined with POI Data
We have decomposed the departure station tensor Mo = (departure station, time slot) into several basic patterns in Section 4. However, for the components of the basic patterns, some are stable, and others are occasional due to specific reasons (such as weather and activities). Utilizing correlation analysis, we establish correlation among different days and extract some stable components from the collective mobility to analyze the temporal distribution in stable basic mobility patterns in Section 5.1. Next, the functions of the station are set based on the POI data collected in Shenzhen City, and then we further analyze the spatial distribution characteristics in stable basic mobility patterns in Section 5.2. Furthermore, we classify individual records into different patterns and the travel time is analyzed for each category in Section 5.3.


Analysis of the Correlation Among Daily Travels
After constructing the departure station tensor Mo = (departure station, time slot) and the arrival station tensor Md = (arrival station, time slot), different mobility patterns are explored through nonnegative CP decomposition. Next, in order to compare the basic patterns of daily travel, it is necessary to establish correlations among different days so that we can examine the changes in passenger flow at the stations on different days. For all the stations, we used the average passenger volume on one specific day of the week (from Monday to Sunday) over several weeks, and used correlation coefficients to describe the similarity of each basic pattern. To establish the correlation between day $m$ and day $n$, the tensor decomposition results are written as Equations (11) and (12):

$$M^{m} \approx \sum_{i=1}^{R} \lambda_i^{m}\, \mathbf{a}_i^{m} \circ \mathbf{b}_i^{m} \tag{11}$$

$$M^{n} \approx \sum_{j=1}^{R} \lambda_j^{n}\, \mathbf{a}_j^{n} \circ \mathbf{b}_j^{n} \tag{12}$$

Then, we calculate the correlation coefficient of the spatial distribution of the $i$th component of day $m$ and the $j$th component of day $n$ by:

$$r_{ij}^{mn} = \frac{\sum_{p} \left(a_{ip}^{m} - \bar{a}_{i}^{m}\right)\left(a_{jp}^{n} - \bar{a}_{j}^{n}\right)}{\sqrt{\sum_{p} \left(a_{ip}^{m} - \bar{a}_{i}^{m}\right)^{2}}\ \sqrt{\sum_{p} \left(a_{jp}^{n} - \bar{a}_{j}^{n}\right)^{2}}} \tag{13}$$

In this formula, $a_{ip}^{m}$ denotes the $p$th element of the $i$th component of day $m$, $a_{jp}^{n}$ denotes the $p$th element of the $j$th component of day $n$, and $\bar{a}_{i}^{m}$ and $\bar{a}_{j}^{n}$ are the means of the respective components over all elements $p$. If the correlation coefficient $r_{ij}^{mn}$ is greater than the threshold, the $i$th component of day $m$ and the $j$th component of day $n$ are considered to be the same basic pattern. If, for every day $n$, there exists a $j$ such that $r_{ij}^{mn}$ is greater than the threshold, the $i$th component of day $m$ is considered to be a stable basic pattern. If, for every day $n$, $r_{ij}^{mn}$ is less than the threshold for all $j$, the $i$th component of day $m$ is considered to be an occasional basic pattern.
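The stability rule can be sketched in a few lines of NumPy. The example assumes that the correlation coefficient is the Pearson coefficient and that each day's spatial factors are stored as the columns of a stations-by-components matrix. The names `component_correlation` and `stable_components` and the synthetic factors are illustrative only.

```python
import numpy as np

def component_correlation(A_m, A_n):
    """r[i, j]: Pearson correlation of column i of A_m with column j of A_n."""
    R_m = A_m.shape[1]
    full = np.corrcoef(A_m.T, A_n.T)  # rows are variables (components)
    return full[:R_m, R_m:]

def stable_components(factors, m, threshold=0.8):
    """Components of day m that match some component on every other day."""
    stable = []
    for i in range(factors[m].shape[1]):
        matched_every_day = all(
            component_correlation(factors[m], A_n)[i].max() > threshold
            for n, A_n in factors.items()
            if n != m
        )
        if matched_every_day:
            stable.append(i)
    return stable

# Synthetic spatial factors for two days (167 stations, 3 components).
rng = np.random.default_rng(7)
monday = rng.random((167, 3))
tuesday = np.column_stack([
    monday[:, 1] + 0.01 * rng.standard_normal(167),  # Monday's component 1
    monday[:, 0] + 0.01 * rng.standard_normal(167),  # Monday's component 0
    rng.random(167),                                 # unrelated component
])
factors = {"Mon": monday, "Tue": tuesday}
stable = stable_components(factors, "Mon", threshold=0.8)
```

Matching on the maximum over `j` makes the rule insensitive to the order in which a decomposition returns its components, which generally differs from day to day.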
We decompose the departure station tensor Mo = (departure station, time slot) and the arrival station tensor Md = (arrival station, time slot) constructed from three weeks of Shenzhen Metro smart card data. According to the tensor decomposition results, the same day of the week (from Monday to Sunday) has a similar spatial and temporal distribution across the three weeks. Using the correlation coefficient formula, the correlation coefficients between the components of the average Mo and Md for each day of the week were calculated. Taking Monday as an example, Mo and Md are constructed from the average passenger volume on Mondays over the three weeks. The calculation results are shown in Table A1 and Table A2 of the Appendix. The threshold of the correlation coefficient is set at 0.8, and the positions with a correlation coefficient greater than 0.8 are marked in red in the tables. According to the calculated correlation coefficients, the stable patterns appear almost every day. We extract the data of the stable basic patterns of the departure station tensor Mo and the arrival station tensor Md, and take the average value of each component as the stable departure and arrival mobility patterns. Figure 6 shows the temporal distribution of the stable departure and arrival patterns of metro passengers on weekdays and weekends. We find that the stable basic patterns of weekdays and weekends have different characteristics for departure and arrival.
As can be seen from Figure 6, on weekdays the passenger volume at departure stations reaches its peak at 8:00, where the probability is nearly 0.3. On weekends, the peak also appears at 8:00, but the proportion of trips is greatly reduced, and trips are mainly concentrated in the 7:00-9:00 period. On weekdays, the passenger volume at arrival stations has an obvious evening peak at 18:00, while arrivals on weekends are mainly concentrated between 17:00 and 22:00 and are relatively scattered in time.
There are still some differences in the temporal distribution. During peak hours on weekdays, whether it is departing or arriving at the metro station, urban trips mainly occur at 8:00 am and 6:00 pm. The peak value is relatively high and the duration of the relatively high capacity is short. This means that passenger volume is generated in the morning and evening peak hours on weekdays, and a large number of people take the metro network as the first choice for commute in a short time. However, during the weekends, the peak hours appear slightly later than on weekdays. The peak value is lower and duration of the relatively high capacity is longer. Since passengers are not limited by rigid working timetables on weekends, the change of passenger flow over time is slower than that of weekdays.

Analysis of Travel Spatial Characteristics
Section 5.1 analyzed the temporal distribution of travel. To further understand the stable movement patterns of passengers in the Shenzhen Metro, this section analyzes the spatial distribution of travel. We extracted more than 21,000 points of interest (POIs) through the Baidu Map API and classified them into seven major categories: R, C, L, T, F, M and E. Based on the extracted POI information, one primary function is assigned to each metro station as follows: we first search for all POIs within a radius of 1000 m of each metro station, and then set the category that covers the largest proportion of those POIs as the primary function of the station. For example, the station "ShenZhenBeiZhan" is described as a transportation hub (T) and the station "ShaoNianGong" is assigned the function of leisure and entertainment (L). Furthermore, the 30 metro stations with the highest visiting frequency are identified for each mode based on the tensor decomposition results, and their primary functions are taken as the feature of the mode.
Table 2 shows the spatial distribution of the stable departure and arrival patterns of urban travel on weekdays and weekends. From Table 2, we can see that Weekday-Departure is decomposed into three basic modes. For Mode-1, we extracted the 30 stations with high visiting frequency in the spatial dimension and found that passengers mainly come from residential areas and leisure and entertainment areas. The spatial distribution characteristic of Mode-1 is composed of 40% [R] + 16.7% [L], so the feature of Mode-1 is defined as [R], the category with the highest proportion.
For Mode-2, among the 30 stations with high visiting frequency, ten belong to corporate companies, six to residential areas, five to leisure and entertainment, four to transportation hubs, three to gourmet restaurants, one to medical institutions and one to education institutions. The spatial distribution characteristic of Mode-2 is composed of 33 ... , then the feature of Mode-3 can be identified as [L]. On weekdays, passengers travel from residential areas to corporate areas at the morning peak, and from corporate areas back to residential areas at the evening peak. On weekends, passengers travel from residential areas to entertainment and corporate areas at the morning peak, and from entertainment and corporate areas back to residential areas at the evening peak. Most metro trips do not provide door-to-door service, so passengers may need to transfer to other modes in transportation hub areas.
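The station labelling and mode composition described in this section can be sketched as below. This is a hedged illustration, not the authors' code: the helper names (`haversine_m`, `primary_function`, `mode_composition`), the coordinates and the example POIs are hypothetical, and the great-circle distance is only one reasonable way to implement the 1000 m radius. The Mode-2 station counts are the ones quoted in the text.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def primary_function(station, pois, radius_m=1000):
    """Most frequent POI category within radius_m of the station, or None if no POI is nearby."""
    lat, lon = station
    nearby = [cat for (plat, plon, cat) in pois
              if haversine_m(lat, lon, plat, plon) <= radius_m]
    return Counter(nearby).most_common(1)[0][0] if nearby else None

def mode_composition(station_functions):
    """Share of each primary function among a mode's top stations, largest share first."""
    counts = Counter(station_functions)
    total = sum(counts.values())
    return [(cat, n / total) for cat, n in counts.most_common()]

# Hypothetical station and POIs (lat, lon, category); not taken from the paper's data.
station = (22.540, 114.050)
pois = [
    (22.541, 114.051, "C"),
    (22.542, 114.049, "C"),
    (22.539, 114.050, "R"),
    (22.600, 114.050, "L"),  # several kilometres away, outside the 1000 m radius
]
primary = primary_function(station, pois)

# Mode-2 station counts as quoted in the text: 10 C, 6 R, 5 L, 4 T, 3 F, 1 M, 1 E.
mode_2 = ["C"] * 10 + ["R"] * 6 + ["L"] * 5 + ["T"] * 4 + ["F"] * 3 + ["M"] + ["E"]
composition = mode_composition(mode_2)
feature = composition[0][0]  # category with the highest share
print(primary, feature, [(cat, round(share, 3)) for cat, share in composition])
```

Taking the category with the largest share as the mode feature mirrors how Mode-1 is labelled [R] in the text.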

Analysis of Travel Time Distribution
We extract the records of different modes based on the tensor decomposition results and then analyze their statistical characteristics. The travel time distribution of passengers is a key factor in evaluating the operating efficiency of a metro network. Taking travel time as an example, the daily records are divided into three classes according to their departure/arrival time and the metro station. Records of millions of passengers on a specific weekday (6 April 2017) and a weekend day (8 April 2017) were extracted by class, and the travel time of each passenger was calculated. The travel time distributions on the weekday and the weekend day were then analyzed for each class.
Class-1 corresponds to the morning peak period represented by Mode-1. On weekdays, we extracted the records from 7:00 to 8:00 in which the departure station function is R and the arrival station function is C. On weekends, we extracted the records from 7:00 to 8:00 in which the departure station function is R and the arrival station function is L. Class-2 corresponds to the evening peak period represented by Mode-2. On weekdays, we extracted the records from 17:00 to 18:00 in which the departure station function is C and the arrival station function is R. On weekends, we extracted the records from 17:00 to 18:00 in which the departure station function is L and the arrival station function is R. Class-3 corresponds to the daily period represented by Mode-3. Its travel time distribution is more stable, with no obvious peak time, and its spatial distribution shows no obvious statistical regularity, so we did not perform further analysis for Class-3. Based on this classification, the proportion of trips is calculated for travel time at one-minute intervals. The goodness-of-fit results for four distribution functions are shown in Table 3. The Gaussian mixture model gives the best fit; its probability density is a weighted sum of Gaussian components.
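The class extraction and the one-minute travel time proportions can be sketched as follows. This is an illustrative sketch under several assumptions: the record fields (`entry`, `exit`, `dep_function`, `arr_function`) are hypothetical, the time window is applied to the tap-in time, and the mixture parameters are placeholders rather than the fitted values reported in Table 3. The density function uses the general Gaussian mixture form, a sum over components of weight times a normal density.

```python
from collections import Counter
from datetime import datetime
from math import exp, pi, sqrt

def travel_minutes(entry, exit_):
    """Whole minutes between tap-in and tap-out timestamps."""
    return int((exit_ - entry).total_seconds() // 60)

def in_class(record, start_hour, end_hour, dep_function, arr_function):
    """True if the trip starts in [start_hour, end_hour) and matches both station functions."""
    return (start_hour <= record["entry"].hour < end_hour
            and record["dep_function"] == dep_function
            and record["arr_function"] == arr_function)

def minute_proportions(records):
    """Proportion of trips falling in each one-minute travel time bin."""
    counts = Counter(travel_minutes(r["entry"], r["exit"]) for r in records)
    total = sum(counts.values())
    return {minute: n / total for minute, n in sorted(counts.items())}

def gaussian_mixture_pdf(x, weights, means, stds):
    """Gaussian mixture density: sum of w_k * N(x | mu_k, sigma_k^2) over components."""
    return sum(w * exp(-0.5 * ((x - m) / s) ** 2) / (s * sqrt(2 * pi))
               for w, m, s in zip(weights, means, stds))

def trip(entry, exit_, dep, arr):
    """Build a hypothetical smart card record."""
    return {"entry": entry, "exit": exit_, "dep_function": dep, "arr_function": arr}

# Hypothetical records for the weekday studied in the text (6 April 2017).
records = [
    trip(datetime(2017, 4, 6, 7, 10), datetime(2017, 4, 6, 7, 46), "R", "C"),
    trip(datetime(2017, 4, 6, 7, 30), datetime(2017, 4, 6, 8, 6, 30), "R", "C"),
    trip(datetime(2017, 4, 6, 7, 55), datetime(2017, 4, 6, 8, 20), "R", "C"),
    trip(datetime(2017, 4, 6, 8, 5), datetime(2017, 4, 6, 8, 40), "R", "C"),
    trip(datetime(2017, 4, 6, 17, 30), datetime(2017, 4, 6, 18, 5), "C", "R"),
]

# Weekday Class-1: 7:00 to 8:00, residential (R) to corporate (C).
class_1 = [r for r in records if in_class(r, 7, 8, "R", "C")]
proportions = minute_proportions(class_1)

# Placeholder mixture parameters (minutes); NOT the values fitted in the paper.
weights, means, stds = (0.6, 0.4), (25.0, 40.0), (5.0, 8.0)
density_at_36 = gaussian_mixture_pdf(36, weights, means, stds)
print(proportions, round(density_at_36, 4))
```

In practice the mixture weights, means and standard deviations would be estimated from the empirical proportions, for example by expectation-maximization, and compared with the other candidate distributions.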

Conclusions
The smart card data collected from a metro system were used to analyze the characteristics of urban metro passenger flow, urban spatial functional structure and travel behaviors. Collective mobility is considered as a combination of several basic patterns. Applying tensor decomposition and correlation analysis, stable components are extracted from collective movement data, and each pattern is analyzed according to the primary function of the metro station by using POI information. The main conclusions of this study can be summarized as follows:

(1) The departure and arrival mobility of weekdays and weekends can be decomposed into several stable basic modes by tensor decomposition. For example, it can be decomposed into morning peak hours, evening peak hours and a daily pattern.
(2) Based on the tensor decomposition results and correlation analysis, stable and occasional patterns are distinguished.
(3) On weekdays, passenger travel is concentrated in the morning and evening peak hours. On weekends, however, the peak hours are slightly later than on weekdays, the peak period lasts longer, and passenger travel in the peak period decreases significantly. Since passengers are not limited by rigid work requirements on weekends, mobility is more random and uniform and passenger flow changes more slowly than on weekdays.
(4) The typical stable patterns on weekdays mainly involve travel between residential areas and workplaces. The typical stable patterns on weekends mainly involve travel between residential and entertainment areas.
(5) The travel time of passengers is calculated based on the classification. We find that the travel time distribution is similar between different classes, which is mainly caused by travel between stations with the same function, and the metro travel time with the highest travel probability is within 35-40 min.
Overall, we apply a tensor decomposition perspective to depict the spatial-temporal characteristics of passenger flow on the metro network.
The findings of this study have important reference value for the planning and operation of metro networks and contribute to a better understanding of travel behavior. Identifying the spatial-temporal characteristics of trips can help administrations strengthen the management and operation of stations, so that the service level and safety of the public transport system can be improved. Meanwhile, train departure intervals can be adjusted for each direction, and metro ridership can be improved efficiently.
This study could be extended in the following ways: (1) Research based on metro smart card data does not fully reflect all aspects of urban travel, such as trips by bicycle, private car, taxi and bus, so the spatial-temporal interplay between different transport modes could be studied further in the future; (2) We only used smart card data for three consecutive weeks under normal conditions, so in the next step the spatial-temporal characteristics of travel patterns should be explored under different conditions, such as severe weather, large-scale activities and emergencies; (3) The tensor can be extended to a higher-dimensional model that includes not only spatial-temporal information but also information about transport modes and different types of public transport passengers; (4) Travel behavior is affected by many factors, such as passengers' income level, personal preference and age, so travel behavior characteristics under the influence of multiple factors can be further explored.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.