Comparative Perspective of Human Behavior Patterns to Uncover Ownership Bias among Mobile Phone Users

Abstract: With the rapid spread of mobile devices, call detail records (CDRs) from mobile phones provide more opportunities to incorporate dynamic aspects of human mobility in addressing societal issues. However, it has been increasingly observed that CDR data are not always representative of the population under study because they include device users alone. To understand the discrepancy between the population captured by CDRs and the general population, we profile the principal populations of CDRs by analyzing routines based on time spent at key locations and compare them with those of the general population. We employ a topic model to estimate the typical routines of mobile phone users from CDRs, treating routines as topics. Routines are also extracted from field survey data and compared between the general population and mobile phone users. We found that there are two main population groups of mobile phone users in Dhaka: males engaged in an income-generating activity at a specific location other than home, and females performing household tasks and spending most of their time at home. We determine that CDRs tend to omit students, who form a significant component of the Dhaka population.


Background
The large amount of spatiotemporal data collected from pervasive devices has advanced the understanding of human mobility behavior. This understanding enables policy-makers and governments to incorporate dynamic aspects of human mobility into public policies and city planning. Based on the assumption that these devices are widespread, the spatiotemporal data can be considered representative of the general population. However, this may not be true owing to the characteristics of call detail records (CDRs). In particular, device ownership biases must be taken into account in developing countries because ownership varies across demographic groups [1]. Biases in mobile phone ownership have been found in terms of gender and age group [2]. It has been suggested that using spatiotemporal data to address societal issues requires knowledge of the types of populations that the data represent [1,3]. Without knowing which part of society the data represent, interpretation of the analysis results may be misleading. In addition, the location distribution observed in CDRs is biased spatially and temporally. Since CDRs are updated only when the mobile phone is used, the records are heavily affected by users' calling behavior [4,5]. This nature of CDRs results in generally sparse data that provide only a partial view of actual trajectories [6].
In this study, we examine the discrepancy between the principal population components of CDRs and those of the general population by comparing typical routines. First, we profile the typical routines of the principal population components of mobile phone users by comparing routines extracted from CDRs with the diary survey data of mobile phone users. CDRs are inherently sparse; hence, we interpolate CDRs based on the estimated routines by employing a topic model. This allows us to transform CDRs, which originally include information only for timeslots with call records, into continuous routines of mobile phone users. Then, the obtained results are compared with the routines of the core components of the general population in Dhaka. Thus, we are able to observe how the principal populations in CDRs differ from those in the general population with regard to behavior patterns.
The contributions of our study are as follows:

• The principal population components of mobile phone users are profiled by comparing routines extracted from CDRs and those obtained from field survey data. The sparse CDRs are interpolated based on the predicted routines and interpreted as sequential activities. The ownership bias among mobile phone users is elucidated.
• A novel approach to identifying device domain-specific bias for large-scale spatiotemporal data is proposed. The potential to extend our approach to other areas using other data is discussed.

Related Work
An increasing number of studies investigate the behavior patterns of people through analysis of large-scale spatiotemporal data. Since the number of mobile phone users is very high, CDRs form large-scale spatiotemporal databases. With sequential information on the time and location of individuals, the data enable us to understand the dynamics of human mobility. Mobile phone data can capture quantitative aspects of human mobility, such as the volume and statistical patterns of mobility [7][8][9]. In recent years, the increasing availability of spatiotemporal data has advanced research on mobility patterns and their application in sectors such as transportation [5,10,11], public health [12,13], and urban planning [14]. The properties of human mobility are better represented by incorporating its periodic modulation. References [7,9] contributed to the characterization of human mobility by quantifying the interaction between regularity and randomness in human mobility dynamics. By mining semantically meaningful locations, such as home and the workplace, in anonymized CDRs, Reference [15] determined that the few locations where people spend most of their time are key to understanding human mobility and social patterns. In addition, a correlation has been found between daily activity patterns and the types of areas that are considered to be work locations [16]. A part of human mobility is explained well by taking into account daily travel-activity patterns because human mobility is considered to be driven by the demand to participate in activities. Activity in combination with socio-demographics further elucidates human mobility patterns [17]. Socio-demographic characteristics were also shown to significantly affect the time allocation between activities inside the home and those outside. Reference [18] suggested that human activity-travel behavior could be described by individual spatial behavior, which can be captured by monthly and seasonal variability in activity. In this context, the extraction of typical behavior patterns using a few key locations can help in understanding the major components of a mass population.
Reference [19] described the hidden structure in human behaviors by analyzing data collected from 95 mobile phone users. The study presents their characteristic behavior by extracting the principal components, referred to as eigenbehaviors. It analyzes individual eigenbehaviors and also describes the community affiliations of populations. While that study presents partial behavior traits as the principal components, Reference [20] characterizes behavior patterns as regular temporal transitions between typical states, such as home and the workplace. The Latent Dirichlet Allocation (LDA) topic model is employed to extract location-driven routines. Reference [21] extended the topic model approach to evaluate the similarities and differences in behavior among multiple users by clustering the underlying structure of individual behavior patterns. References [22,23] presented interesting works on the application of the LDA model to large-scale geo-location data to identify latent activity patterns.
While the application of CDRs seems to be a promising means of addressing societal issues, some problems exist in CDR data. One critical problem is representativeness, because the analysis results of mobility data may vary according to the dataset, which may capture different populations [1,24] and different moving processes [25]. For instance, the data of the Oyster card (an electronic ticketing system for public transport passengers in Greater London) capture the mobility of public transport users alone [26]. Likewise, CDRs include mobile phone users alone. Reference [27] found gaps in socio-economic status between mobile phone users and non-users. Applying the analysis results to societal issues may cause problems if a discrepancy exists between the population represented by mobility data and the population under study.
The remainder of this paper is structured as follows. The data used in this study are described in Section 2. In Section 3, the characteristics of typical mobile phone users are described and their typical routines are presented by analyzing the field survey data of mobile phone users. In Section 4, we examine the typical routines of mobile phone users, which are extracted from CDRs. The diary survey data of the general population are analyzed in Section 5. Our conclusions are presented in Section 6.

Mobile Phone Data
We use the CDRs of August 2013 from one of the leading mobile network operators in Bangladesh (hereafter referred to as "the MNO"). The data include the time, antenna location, and duration of calls, and comprise the records of all antennas located in Greater Dhaka in Bangladesh. We randomly sampled the call records of 5000 unique IDs. We did not use the entire dataset because we consider 5000 samples sufficient to extract the typical routine patterns. The selected samples are evenly distributed geographically across the study site.

Diary Survey Data of Mobile Phone Users
To understand the personal attributes and activity of mobile phone users who use the service of the MNO, we conducted a diary survey of these users as part of a field survey, the Survey on Patterns of Activity for Comprehensive Explorations of Mobile Phone Users in Dhaka (SPACE) [28]. The survey was conducted from November 2013 to January 2014 and covers selected areas of Greater Dhaka. SPACE consists of two parts: one-day diary records from mobile phone users, and their personal attributes and activity. The former part collects the time spent on activities of the day along with the call records of the same day. The latter part collects age, gender, occupation, and major routine activity. We employed two-stage stratified sampling according to land use and household income level. The areas covered by CDRs are split into 161 administrative areas, which are classified into three groups according to their dominant type of land use: residential, commercial, or industrial. Of these, 15 areas, consisting of 10 residential, two commercial, and three industrial areas, were randomly selected in proportion to their shares of the total population. From each area, 18 households, each having at least one mobile phone user of the MNO as a household member, were selected from each income group: high, middle, and low. If the slum population in an area was greater than 25%, we sampled that population as part of the low-income group. As a result, we interviewed 922 mobile phone users from 810 households. The SPACE data are not representative of the MNO's mobile phone users as a whole; rather, they are considered to represent the mobile phone users within each income group. We scheduled the survey on both weekdays and weekends to reduce bias. In addition, we visited households in the morning, afternoon, and late evening according to the availability of household members so as to collect data from those who work during the daytime.
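The household count follows directly from the two-stage design described above. As a quick arithmetic check using only the figures stated in the text (the variable names are illustrative and do not come from the survey documentation):

```python
# Stage 1: administrative areas selected, by dominant land use (figures from the text).
areas_by_land_use = {"residential": 10, "commercial": 2, "industrial": 3}
# Stage 2: households selected per income group within each area (from the text).
income_groups = ["high", "middle", "low"]
households_per_group = 18

n_areas = sum(areas_by_land_use.values())
n_households = n_areas * len(income_groups) * households_per_group
print(n_areas, n_households)  # 15 810
```

The 922 interviewed users exceed the 810 households, which is consistent with some households containing more than one user of the MNO.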

Diary Survey Data of the General Population
We utilized another set of diary survey data, Person Trip (PT) data, to understand the personal attributes and activity of the general population in Greater Dhaka. The data include the timing, origin and destination, means of transportation, and purpose of trips for a typical day, and the data structure is similar to that of the diary survey part of the SPACE data. In addition, demographic attributes, such as age, gender, and occupation, were collected by the Japan International Cooperation Agency (JICA) through interviews with 75,000 people residing in Greater Dhaka in 2009. This survey was household-based, and the sampling methodology was chosen to obtain results that are representative of the general population. In the sample, the number of males was slightly greater than that of females (54% vs. 46%). Table 1 presents three key population groups based on their activities: respondents engaged in income-generating activity (38%), household tasks (25%), and education (32%). Furthermore, we note that male respondents comprise the majority of the income-generating group, while females are in the majority in the household tasks group. Additionally, we found that those whose main activity is education constitute almost one-third of the total population in Greater Dhaka. Hence, we focus our analysis on the following key population groups: working males, housewives, and students, and label the rest of the population as others.

Population Composition of Mobile Phone Users
To understand the principal populations of mobile phone users, we examined the SPACE data, which are diary survey data of 922 mobile phone users. Table 2 describes the proportions of male and married mobile phone users classified by income level. The overall proportion of males is greater than that of females. In addition, more than 85% of the users are married. Bangladesh has relatively strong gender-based social norms of behavior. Among women, the labor force participation rate in urban areas is 35%, while that of men is 80% [29]. Assuming that the sex ratio is almost 1, we can roughly estimate that working males account for approximately 40% of the population and working females for less than 20%. Taking into account the large proportion of married people among mobile phone users, we expect gender-specific behavior patterns to be predominant in the SPACE data. That is, married males are generally engaged in an income-generating activity to support their family, whereas females tend to stay at home and perform household tasks while taking care of children and other family members.
Next, the primary activity of mobile phone users was analyzed. Assuming the gender-specific trends mentioned above, the type of activity was classified based on gender (male and female) and economic activity (income-generating and non-income-generating). An income-generating activity is any activity that generates monetary income as a return. For example, salary earners, part-time workers, and self-employed people are classified as persons engaged in an income-generating activity. The remaining people are engaged in a non-income-generating activity, which is any activity that does not generate monetary income as a return. Table 3 shows the distribution of the type of activity by gender. Males are more often engaged in an income-generating activity, while the majority of females are engaged in a non-income-generating activity, particularly household tasks. This indicates that most of the mobile phone users subscribed to the MNO are those who perform responsible roles within the household, i.e., they have money at their disposal. The mobile tariff of this MNO is relatively expensive compared to that of other companies in Bangladesh, and this was considered a factor affecting the user trends. Table 4 shows the proportion of people who are engaged in the typical activity corresponding to their gender. The values in parentheses represent the proportion of married users. The trends were observed to be similar for males and females across all income levels. Therefore, we conclude that, regardless of income level, two types of typical mobile phone users exist: the married male engaged in an income-generating activity, and the married female who mainly performs household tasks. Based on these results, the two typical types of mobile phone users are termed working males and housewives.
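The rough estimate of the working population shares quoted in this section can be reproduced in a few lines. This is only a back-of-the-envelope sketch: the equal split between genders is the stated assumption of a sex ratio of about 1, and the variable names are ours.

```python
# Assumption from the text: sex ratio of about 1, so each gender is ~50% of the population.
share_male = share_female = 0.5
# Urban labor force participation rates cited in the text [29].
participation_male, participation_female = 0.80, 0.35

working_males = share_male * participation_male        # share of the total population
working_females = share_female * participation_female
print(f"{working_males:.1%} {working_females:.1%}")    # 40.0% 17.5%
```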

Location of Main Activity
Considering working males and housewives as the typical mobile phone users, we classified the location of their main activity as either home or outside home. The activities of people need to be linked to location types because, in the next section, the behavior patterns extracted from CDRs are described based on the probability distribution of location types. Table 5 shows the distribution of location types for the main activities of working males and housewives. In the case of working males, 72% of the locations of their main activity are a specific location outside the home. This indicates that well over half of typical male users have a specific location outside the home for their main activity. In the case of housewives, 94% of the locations of their main activity are the home. Owing to this distinct difference in the location of the main activity between the two principal population groups, we found it acceptable to interpret the typical behavior patterns extracted from CDRs based on the analysis results of this section.

Typical Behavior Patterns of Mobile Phone Users
In the previous section, we classified the type of location as home or outside home. For each user, a location reported as home is labeled as Home. Among outside-home locations, a location reported as a workplace is labeled as Work. For users who do not have a workplace, the most frequently reported outside-home location is selected and labeled as Work. Therefore, every user has locations labeled as Home and Work. Using three types of locations, namely Home, Work, and Other, we obtain the location-based behavior patterns of typical mobile phone users from the SPACE data. For housewives and students, we considered the primary location outside home as their Work; e.g., the time students spend on education is counted as Work. In addition to working males and housewives, we examined the behavior pattern of students, who form the third-largest segment of mobile phone users, although the absolute proportion of this segment is much smaller than that of the other two. Mobile phone users classified as students are mostly college students. Figure 1a,b shows the hourly distribution of the probability of being at Home and at Work, respectively. We can observe a distinctive trend for housewives: the probability of their being at home is almost 100% throughout the day. Working males and students have relatively similar probability distributions of being at Work, but this probability is much higher for working males. This higher probability could be attributed to differences between office hours (e.g., from 9 a.m. to 5 p.m.) and school hours in Dhaka. Weekdays and weekends were not differentiated because the type of day in the diary survey of the general population, which is one of the datasets used for comparison, was specified just as a typical day for interviewees.
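The hourly probabilities of being at Home or Work can be computed from labeled diaries as the fraction of respondents present at that location type in each hour. The following is a minimal sketch, assuming a simplified diary format of (start hour, end hour, label) stays; this format and the toy diaries are hypothetical and are not the SPACE data layout.

```python
def hourly_presence(diaries, label):
    """Fraction of respondents at `label` during each hour of the day.

    diaries: one list per respondent of (start_hour, end_hour, location_label)
    stays with integer hours in 0..24 (a hypothetical simplified diary format).
    """
    counts = [0] * 24
    for stays in diaries:
        hours_at_label = set()
        for start, end, loc in stays:
            if loc == label:
                hours_at_label.update(range(start, end))
        for h in hours_at_label:
            counts[h] += 1
    return [c / len(diaries) for c in counts]

# Toy diaries, made up for illustration (not survey data).
diaries = [
    [(0, 9, "Home"), (9, 17, "Work"), (17, 24, "Home")],  # works outside the home
    [(0, 24, "Home")],                                    # at home all day
]
p_home = hourly_presence(diaries, "Home")
p_work = hourly_presence(diaries, "Work")
print(p_home[3], p_home[12], p_work[12])  # 1.0 0.5 0.5
```

Computing the curves separately for each population group would give per-group profiles of the kind plotted in Figure 1.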

Methodology
We draw an analogy between discovering a pattern of daily routine from CDRs and discovering a latent topic from documents. The CDRs of each user are considered a document, and each data point, described by its time stamp and geographical location, is considered a word in the document. The vocabulary in our model thus describes the temporal and geographical distributions. In this study, we extend the classical LDA model by drawing the time stamp as well as the location from the latent topic assignment of each record, and we assume that people have similar daily routines but different main locations (e.g., home, workplace). Hence, as shown in Figure 2, we place the latent variables of time patterns outside the user plate of the model and the latent variables of location inside the user plate. The symbols used in Figure 2 are explained in Table 6. To infer the latent variables, Gibbs sampling [30] is applied, as shown in Algorithm 1.
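One generative process consistent with this description can be written as follows. The notation is an illustrative assumption and may not match the symbols of Table 6: $\theta_u$ denotes the topic mixture of user $u$, $\phi_k$ the time-slot distribution of topic $k$ shared by all users, $\psi_{u,k}$ the location distribution of topic $k$ for user $u$, and $\alpha$, $\beta$, $\gamma$ assumed symmetric Dirichlet priors.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K && \text{(time pattern, outside the user plate)}\\
\theta_u &\sim \mathrm{Dirichlet}(\alpha), & u &= 1,\dots,U\\
\psi_{u,k} &\sim \mathrm{Dirichlet}(\gamma), & k &= 1,\dots,K && \text{(locations, inside the user plate)}\\
z_{u,i} &\sim \mathrm{Categorical}(\theta_u), & i &= 1,\dots,N_u\\
t_{u,i} \mid z_{u,i} &\sim \mathrm{Categorical}(\phi_{z_{u,i}})\\
\ell_{u,i} \mid z_{u,i} &\sim \mathrm{Categorical}(\psi_{u,z_{u,i}})
\end{align*}
```

Here $z_{u,i}$ is the latent topic of the $i$-th record of user $u$, and $t_{u,i}$ and $\ell_{u,i}$ are its observed time slot and location.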

Algorithm 1. Gibbs sampling-based behavior pattern discovery (listing not preserved; its final step calculates the values to be returned)
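To illustrate the kind of inference involved, the following is a minimal collapsed Gibbs sampler for a topic model in which each record's time slot is drawn from a topic-time distribution shared by all users and its location from a per-user topic-location distribution. It is a sketch under our own assumptions, not a reconstruction of Algorithm 1: the function name, the symmetric priors `alpha`, `beta`, and `gamma`, and the per-user location indexing are all hypothetical.

```python
import random

def gibbs_routines(records, n_topics, n_slots, n_locs,
                   alpha=0.5, beta=0.1, gamma=0.1, n_iter=100, seed=0):
    """Collapsed Gibbs sampler for a time-and-location topic model (sketch).

    records: {user: [(time_slot, location_index), ...]}; location indices are
    local to each user, so index 0 of two users may be different places.
    """
    rng = random.Random(seed)
    topics = range(n_topics)
    n_kt = [[0] * n_slots for _ in topics]        # topic-time counts, shared by users
    n_k = [0] * n_topics                          # records per topic
    n_uk = {u: [0] * n_topics for u in records}   # user-topic counts
    n_ukl = {u: [[0] * n_locs for _ in topics] for u in records}  # per-user topic-location
    z = {u: [] for u in records}                  # topic assignment of each record

    def update(u, k, t, l, delta):
        n_kt[k][t] += delta
        n_k[k] += delta
        n_uk[u][k] += delta
        n_ukl[u][k][l] += delta

    for u, recs in records.items():               # random initial assignment
        for t, l in recs:
            k = rng.randrange(n_topics)
            z[u].append(k)
            update(u, k, t, l, +1)

    for _ in range(n_iter):
        for u, recs in records.items():
            for i, (t, l) in enumerate(recs):
                update(u, z[u][i], t, l, -1)      # exclude the current record
                weights = [
                    (n_uk[u][k] + alpha)
                    * (n_kt[k][t] + beta) / (n_k[k] + n_slots * beta)
                    * (n_ukl[u][k][l] + gamma) / (n_uk[u][k] + n_locs * gamma)
                    for k in topics
                ]
                z[u][i] = rng.choices(topics, weights=weights)[0]
                update(u, z[u][i], t, l, +1)

    # Calculate the values to be returned: smoothed estimates from the final counts.
    phi = [[(n_kt[k][t] + beta) / (n_k[k] + n_slots * beta) for t in range(n_slots)]
           for k in topics]
    theta = {u: [(n_uk[u][k] + alpha) / (len(recs) + n_topics * alpha) for k in topics]
             for u, recs in records.items()}
    return phi, theta

# Toy input, made up for illustration: hourly slots, two locations per user.
toy = {
    "u1": [(8, 0), (9, 1), (14, 1), (22, 0)],
    "u2": [(1, 0), (13, 0), (20, 0)],
}
phi, theta = gibbs_routines(toy, n_topics=2, n_slots=24, n_locs=2, n_iter=50)
```

In this sketch, `phi[k]` is the estimated time-of-day profile of topic `k` and `theta[u]` is the topic mixture of user `u`. Keeping the topic-time counts global while the topic-location counts are per user mirrors the assumption that routines are shared but main locations differ between users.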

Extracting Typical Spatiotemporal Calling Behaviors Based on Call Records
We applied our extended LDA model to the CDRs of 5000 unique IDs of mobile phone users using the algorithm presented at the beginning of this section. The time patterns that we discovered are shown in Figure 3. We extracted three principal topics as the three typical calling patterns of mobile phone users. Figure 3 illustrates the topic proportion at each time of day. Topics 1 and 3 depict calling behavior with a dominant topic proportion during the morning hours and during the day, respectively, whereas Topic 2 represents a preference for calling at night rather than during the daytime. A topic is determined by both its spatial and temporal distributions. Therefore, the patterns in Figure 3 are clustered based on the pattern of calls in relation to the pattern of periodic visits to the same location. As discussed earlier, locations that are repeatedly visited, such as home and the workplace, can explain the behavior patterns of people. Thus, we conclude that, to a certain extent, the patterns extracted by our LDA model can be associated with locations that are significant for mobile phone users.

Taking into account the lifestyle of people in Dhaka, where most offices, shops, and entertainment venues are closed at night, we assume that the location having the highest probability after midnight is the home location because the majority of people most likely stay at home or in its vicinity late at night. Populations clustered into Topic 3 exhibit the highest probability of being at home for the longest hours, from 12 a.m. until 12 p.m. This is similar to the pattern of housewives, which was extracted in the previous section. Populations clustered into Topic 2 show a clear difference in the probability of being at home between nighttime and daytime. We assume that this population group is generally engaged in an activity outside the home, similar to the pattern of working males, which was also extracted in the previous section. Likewise, the populations of Topics 1 and 2 spend a large amount of time outside the home. Topic 1 exhibits two peaks: the first at approximately 8 a.m. and the second at 7 p.m. This is not very similar to the pattern of students extracted in the previous section, which also has narrow peaks but in the late morning hours (between 10 a.m. and 12 p.m.) and around 5 p.m. This is probably because the populations clustered into this group include not only students but also other populations that cannot be clustered into Topics 2 and 3.
Comparing the analysis result with the previous sections, we observe that the LDA model can distinguish two predominant population groups: (1) Topic 2 represents the behavior patterns of people who generally spend the majority of their time outside the home and return home at a time that is the latest among all patterns, and these people are most likely to be working males; and (2) Topic 3 represents people who are engaged in an activity related to the home, and these people are most probably housewives.
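The assumption used in this section, that the location with the highest probability after midnight is the home location, can be illustrated on raw call records with a simple frequency count. This is a hedged sketch of the heuristic only, not the model-based estimate used in the study; the function name and the 0 to 5 a.m. window are our assumptions.

```python
from collections import Counter

def infer_home(call_records, night_hours=range(0, 6)):
    """Return the antenna most often used late at night, or None if no night calls.

    call_records: iterable of (hour, antenna_id) for one user. The 0-5 a.m.
    window is an assumed reading of "after midnight", not a value from the text.
    """
    night = Counter(antenna for hour, antenna in call_records if hour in night_hours)
    return night.most_common(1)[0][0] if night else None

# Toy records, made up for illustration.
calls = [(1, "A"), (2, "A"), (3, "B"), (10, "C"), (11, "C"), (15, "C"), (23, "A")]
print(infer_home(calls))  # A
```

Users with no calls in the night window return `None`, which reflects the sparsity problem that motivates interpolating CDRs with estimated routines.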

Typical Behavior Patterns of Principal Population Groups in Dhaka
In this section, we report our findings related to typical behavior patterns across the three principal population groups obtained from the data. The results are summarized in Figure 4a,b. As discussed in earlier sections, previous research has found that the behavior patterns of people can be explained by focusing on important places such as Home and Work. Therefore, we present the distribution of the key population groups considering only (a) Home and (b) Work locations. The results show clear patterns. The probability of being at Home is the highest for housewives among the three principal population groups. We find more similarity in the probability of being at Work between working males and students in the general population than among mobile phone users. In spite of this minor difference, we conclude that the behavior patterns extracted from the two sources of data are generally similar.
ISPRS Int. J. Geo-Inf. 2016, 5, 85

Ownership Bias
Finally, we discuss the ownership bias of mobile phone users by comparing the principal population groups of the general population with those of mobile phone users. Table 7 shows approximate estimates of the proportions of the three principal population groups for the general population and mobile phone users. The "+" mark in an estimate for mobile phone users indicates a possible minimum estimate. The proportions are given as minimums because they were obtained for each income level, and we do not have information on the population share of each level. For instance, the proportions of housewives among mobile phone users for the high, middle, low, and slum levels are 30%, 29%, 27%, and 26%, respectively; hence, the overall proportion of housewives among mobile phone users is at least 26%. In the general population, a sizeable population of students exists, with education as their main activity. However, the corresponding proportion of students among mobile phone users is very small. By contrast, the proportions of male workers and housewives are significant both in the general population and among mobile phone users. Considering that the population pyramid of Bangladesh is wide at the base [31] and students generally belong to relatively younger population groups, we conclude that CDRs are biased because the data seldom include students, who comprise a significant proportion of the general population in Dhaka.
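The reasoning behind the minimum estimate is that the overall proportion is a weighted average of the per-level proportions, and a weighted average can never fall below its smallest term. A short sketch with the housewife proportions from the text; the two sets of population shares are made up for illustration, since the true shares are unknown.

```python
# Proportion of housewives among mobile phone users by income level (from the text).
housewives = {"high": 0.30, "middle": 0.29, "low": 0.27, "slum": 0.26}

lower_bound = min(housewives.values())

def overall(weights):
    """Overall proportion for hypothetical population shares that sum to 1."""
    return sum(weights[g] * housewives[g] for g in housewives)

# Two made-up sets of shares: whatever the weights, the result stays >= the minimum.
equal = overall({g: 0.25 for g in housewives})
skewed = overall({"high": 0.1, "middle": 0.1, "low": 0.3, "slum": 0.5})
print(lower_bound, round(equal, 3), round(skewed, 3))  # 0.26 0.28 0.27
```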

Conclusions
In this study, we proposed a novel approach to elucidate the discrepancy in principal population composition between the general population and mobile phone users by comparing their typical behavior patterns. We profiled the principal populations of mobile phone users using SPACE and diary survey data of mobile phone users. We found that working males and housewives are the two dominant population components among mobile phone users of the MNO. We also succeeded in extracting their behavioral patterns from CDRs by employing a topic model. The analysis revealed two typical behavior patterns: people who spend most of the day engaged in routines outside the home, and people who spend most of their time at home. These findings were consistent with the behavior patterns extracted from the diary survey data of mobile phone users, where we also observed two typical patterns: males engaged in an income-generating activity outside the home during the day, and females who spend the majority of their time at home, mainly performing household tasks.
Comparing the principal populations of mobile phone users with those of the general population, we found that students form a core component of the general population but not a significant one among mobile phone users. The analysis results indicated that CDRs capture the behavior patterns of working males and housewives. Therefore, we suggest that applications of CDRs, particularly those targeting the younger generation, take this bias into account because the data do not necessarily represent this population group. We believe that our findings will be useful for the utilization of CDRs in developing countries, which have limited resources. CDR acquisition incurs no additional cost: the data are generally collected by MNOs for billing purposes and are therefore available wherever a mobile network is present. Our study shows that analyzing CDRs in combination with secondary data has the potential to reveal domain-specific biases, which can be major constraints on the utilization of large-scale domain-specific data such as CDRs. However, the activity patterns of people in this case are not very complex because people's lifestyles are affected by strong social norms. Applying this approach in other areas would require further analysis of the relationship between the features of the principal population groups and their daily routines. Additionally, conducting a large-scale field survey is expensive. For further studies, we intend to use census data because mobile ownership is recommended as a core topic for the Population and Housing Census by Reference [32]. A census is recommended every five years and has been conducted in more than 200 countries in the census round spanning the period from 2005 to 2014 [33]. We consider utilizing such data to lower the cost of data acquisition for application in other study areas.

Figure 1. Hourly probability distribution of being at (a) home and (b) work for three principal population groups.

Figure 2. Graph model of our extended LDA model. Shaded and unshaded nodes denote observed and latent variables, respectively.

Figure 3. Time patterns of three principal topics extracted from CDRs.

As discussed in earlier sections, previous research has found that people's behavior patterns can be explained by focusing on important places such as Home and Work. Therefore, we present the distributions for the key population groups considering only (a) Home and (b) Work locations. The results show clear patterns. The probability of being at Home is the highest for housewives among the three principal population groups. The probability of being at Work is more similar between working males and students in the general population than among mobile phone users. Despite this minor difference, we conclude that the behavior patterns extracted from the two data sources are generally similar.
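As a rough illustration of the quantity discussed above, an hourly probability of being at a key location can be estimated as the fraction of person-hour observations in a group that fall at that location. The sketch below assumes a hypothetical record layout of (group, hour, place) tuples. The toy records are made up for demonstration and are not survey data.

```python
from collections import defaultdict


def hourly_location_probability(records, location):
    """Estimate P(place == location | group, hour) from person-hour records.

    `records` is an iterable of (group, hour, place) tuples, one per
    person and hour. This layout is an assumption made for illustration.
    """
    at_location = defaultdict(int)
    total = defaultdict(int)
    for group, hour, place in records:
        total[(group, hour)] += 1
        if place == location:
            at_location[(group, hour)] += 1
    return {key: at_location[key] / count for key, count in total.items()}


# Made-up toy records: two people per group observed at 10:00.
toy_records = [
    ("housewife", 10, "Home"),
    ("housewife", 10, "Home"),
    ("working male", 10, "Work"),
    ("working male", 10, "Home"),
    ("student", 10, "Other"),
    ("student", 10, "Work"),
]

p_home = hourly_location_probability(toy_records, "Home")
p_work = hourly_location_probability(toy_records, "Work")
```

Computing the same table for each data source makes the curves for the three groups directly comparable hour by hour.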

Figure 4. Hourly probability distribution of being at (a) home and (b) work for the three principal population groups.

Table 1. Percentage distribution of survey respondents and their income status by gender.

Table 2. Proportion of males and married users.

Table 3. Distribution of main activity by gender.

Table 4. Proportion of people engaged in typical activity, classified by gender and income level.

Table 5. Distribution of location for main activity.

Table 6. Explanation of symbols used in Figure 2.

Table 7. Proportion of the three principal population groups.