3.1. Data
To address our research question, we conducted a large-scale online survey in Japan from 16 November to 14 December 2015 to gather information on attitudes toward fully automated vehicles (FAVs), lifestyle, and environmental concerns. The survey was administered through Nikkei Research Inc., and participants were selected randomly so that the sample mirrored the gender and age distribution of the Japanese population. The 246,642 respondents were screened through trap questions to maintain data quality.
To check the representativeness of the sample, we compared the survey results with Japanese Census data, specifically examining the distribution of socioeconomic variables, as shown in
Table A5. Although we observed minor discrepancies in gender and education levels when compared to the census data, our survey results broadly align with Japan’s average demographic trends.
We are aware that respondents in 2015 were less familiar with FAVs than respondents in 2021 would be. We therefore excluded those who answered that they had ‘no awareness’ of FAVs, who accounted for 14.48% of the entire sample (35,715 observations). Our model thus covers only people who were aware of FAV technologies in 2015. Given that FAVs had not been introduced to the market in 2015 and still had not been introduced in 2021, a substantial change in the results, for example, a change in their sign or implications, is unlikely. Accordingly, more attention should be given to the signs and relative magnitudes of the coefficients of the latent constructs.
Finally, we dropped those who selected “don’t know/don’t want to answer” for their individual income (30,156 observations). As a result, we have 180,771 respondents in total. Before the large-scale survey started, a pre-survey was carried out to refine the questionnaire.
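The sample flow described above can be verified with simple arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the function and variable names are illustrative, and it assumes that the two exclusions were applied one after the other to disjoint sets of respondents.

```python
# Sample-flow check using only the counts reported in the text.
TOTAL_RESPONDENTS = 246_642    # respondents in the full survey sample
NO_AWARENESS = 35_715          # answered 'no awareness' of FAVs
INCOME_NOT_REPORTED = 30_156   # "don't know/don't want to answer" on income


def final_sample(total: int, *exclusions: int) -> int:
    """Return the sample size left after subtracting each exclusion count."""
    return total - sum(exclusions)


def share_percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


remaining = final_sample(TOTAL_RESPONDENTS, NO_AWARENESS, INCOME_NOT_REPORTED)
no_awareness_share = share_percent(NO_AWARENESS, TOTAL_RESPONDENTS)

print(remaining)            # final estimation sample
print(no_awareness_share)   # share excluded for 'no awareness', in percent
```

Under these assumptions, the two exclusions reproduce the final sample of 180,771 respondents and the reported 14.48% share.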
For FAV purchasing intention, respondents were asked: “Do you want to add a completely self-driving option that allows you to move around when you purchase a car in the future?”. They then chose one of the following answers: “(1) Purchase for sure, (2) Purchase under certain conditions, (3) Do not purchase, and (4) I don’t know”. Given that FAVs had not been fully introduced to the market in either 2015 or 2021, we assume that people who show an affinity toward FAVs can be potential consumers in the future. We therefore treated those who answered (1) or (2) as ‘potential consumers’. By contrast, people who answered (3) or (4) are reluctant to purchase FAVs, and we did not consider them potential consumers. Accordingly, we coded WTB as 1 if a respondent belonged to the potential consumer group and as 0 otherwise. Our analysis thus allows us to see which factors would shift consumers from (3) and (4) to (1) and (2). Note that the wording of the question makes a clear distinction between “adding” a completely self-driving option to a car and “purchasing” an FAV.
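The WTB coding rule described above can be written as a small function. This is a minimal sketch: the answer codes follow the questionnaire text, while the function name, constant names, and the example responses are hypothetical.

```python
# Answer codes as listed in the questionnaire text.
PURCHASE_FOR_SURE = 1
PURCHASE_UNDER_CONDITIONS = 2
DO_NOT_PURCHASE = 3
DONT_KNOW = 4

VALID_CODES = {PURCHASE_FOR_SURE, PURCHASE_UNDER_CONDITIONS,
               DO_NOT_PURCHASE, DONT_KNOW}
POTENTIAL_CONSUMER_CODES = {PURCHASE_FOR_SURE, PURCHASE_UNDER_CONDITIONS}


def code_wtb(answer: int) -> int:
    """Return 1 for potential consumers (answers 1 or 2), 0 for answers 3 or 4."""
    if answer not in VALID_CODES:
        raise ValueError(f"unexpected answer code: {answer}")
    return 1 if answer in POTENTIAL_CONSUMER_CODES else 0


# Hypothetical responses, for illustration only.
responses = [1, 3, 2, 4, 2]
wtb = [code_wtb(answer) for answer in responses]
print(wtb)
```

Raising an error on unknown codes, rather than silently coding them as 0, keeps miscoded or missing answers from being counted as non-consumers.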
Next, we asked about respondents’ PV for FAVs. Respondents were asked to report their PV regardless of their purchase decision, within a range from 0 JPY to 3.25 million JPY (approximately 0 to 23,725 USD at a conversion rate of 1 JPY to 0.0073 USD). We used a payment card method to measure PV, and we provide the detailed ranges of PV in
Table 1. Because FAVs are a new technology, people may not have a specific PV in mind if PV is asked as an open question. Leaving PV as an open question may therefore increase the variance of responses for three reasons. First, because evaluating PV is not a typical everyday decision, respondents may find it difficult to answer with a concrete number when no examples are provided, which may result in many nonresponses. Second, and relatedly, the number of outliers may increase, and abnormally large or small values may distort the representative values. Third, the answers tend to be concentrated on round numbers (Ministry of Land, Transport, and Infrastructure; we refer to
https://www.mlit.go.jp/kowan/beneki/images/kaigan_hiyoubeneki_06.pdf, accessed on 26 October 2023). Thus, we chose to use categorical but detailed PV questions. Some respondents chose a PV of 0, which indicates that they would add the option if it were free; such an answer does not mean that these respondents are unwilling to purchase AVs.
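The yen-to-dollar conversion quoted for the PV range can be reproduced as follows. This is a minimal sketch using the approximate rate stated in the text; the function name is illustrative.

```python
# Approximate conversion rate quoted in the text (1 JPY = 0.0073 USD).
JPY_TO_USD = 0.0073


def jpy_to_usd(amount_jpy: float, rate: float = JPY_TO_USD) -> float:
    """Convert an amount in JPY to USD, rounded to two decimals."""
    if amount_jpy < 0:
        raise ValueError("amount must be non-negative")
    return round(amount_jpy * rate, 2)


pv_lower_usd = jpy_to_usd(0)          # lower end of the PV range
pv_upper_usd = jpy_to_usd(3_250_000)  # upper end of the PV range
print(pv_lower_usd, pv_upper_usd)
```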
We also included the respondents’ car ownership and car types in our model for two reasons. First, to increase the survey’s internal validity, we wanted to control for individuals who do not know the prices and maintenance costs of cars. We therefore included the ‘car ownership’ variable to control for those who do not own a car and are less likely to be aware of car prices. Second, along with car ownership, we included car types (gasoline, diesel, hybrid, plug-in hybrid (PHEV), fuel-cell (FCV), and electric (EV) vehicles), as car prices differ by car type.
Then, we asked about concern for the environment in the form of ‘importance as a policy’ on a 5-point Likert scale with an additional zero category (no awareness). Based on previous studies, we classified the topics for environmental policy into eight factors, referring to the House of Councilors, The National Diet of Japan, (2015) [
57]. We have 13 questions in total, and the topics cover renewable energy, air pollution, environmental conservation, water pollution, endangered species conservation (biodiversity), reuse and recycling, waste disposal, and CO2 emissions, with questions such as “How important is the policy to you?” The response scale is as follows: (0) no awareness/interest at all; (1) very insignificant; (2) insignificant; (3) neither important nor insignificant; (4) important; (5) very important. The difference between those who answer (0) and the others is therefore whether the person has at least some interest in a given policy or issue. Next, we surveyed the technological merits of and concerns regarding FAVs. Respondents were asked to check all applicable options among 17 options for merits and 12 options for concerns.
We also included sociodemographic variables: income, gender, age, and commuting time.
Table 2 shows descriptive statistics. Overall, we had 180,771 respondents. We divided the sample into three groups: the overall group (Panel (A)), those who would not purchase an FAV (Panel (B)), and those who would purchase an FAV (Panel (C)). Although the sociodemographic variables are broadly similar across the groups, annual income, PV for FAVs, and the EV dummy show higher mean values in Panel (C) than in Panels (A) and (B).
We used factor analysis to select, from all options and questions, those that are used in the estimation. We discuss factor analysis and how we chose the important factors in
Section 3.2. The specific questions are listed in
Table 3, which shows the notation and an explanation for each option. ‘Sources’ in
Table 3 refers to the previous works we drew on when designing the survey questions. The proportions of consumers choosing each option are listed in
Appendix A,
Table A1 and
Table A2.
3.2. Empirical Strategy
We use structural equation modeling (SEM) to assess the factors that are correlated with the WTB and PV for FAVs. SEM is a suitable methodology because it allows us to examine the psychometric factors that are correlated with people’s intentions toward FAVs. SEM can handle a substantial number of endogenous and exogenous variables and can include latent variables in the model. Thus, SEM enables the inclusion of the theory of planned behavior (TPB), which explains people’s behavior based on psychometric intentions through latent variables determined by attitudes [
77]. Thanks to these benefits, SEM has been employed in many research fields that incorporate psychometric modeling, such as psychology, sociology, educational research, political science, and market research. SEM has also been applied in transportation research (examples of previous works using SEM as the main method include [
78,
79,
80,
81]). Our model explains the WTB and PV for automated vehicles based on four latent constructs (nature, pollution, merit, and fear of accidents) and thus focuses on the psychometric intentions of potential consumers, an analysis that SEM makes possible.
Moreover, SEM estimates latent variables and exogenous variables simultaneously and allows for correlations between latents. The alternative is to estimate the latents and exogenous variables sequentially: for example, one can conduct factor analysis to construct the latents in a first step and then enter the latents and exogenous variables into the choice model in a second step. While this strategy is simple, it does not guarantee unbiased estimators for the parameters involved and tends to underestimate standard errors (see, for example, [
82,
83]). Furthermore, sequential estimation does not allow for the interaction of latent variables. As we assume that the latents are correlated and that people’s choice behavior is not ‘sequential’, we choose SEM in this study and use Stata to estimate our model (see [
84] for a discussion of sequential versus simultaneous estimation).
3.2.1. Identifying Latent Constructs
We first identified the latent variables that can be related to WTB and PV for FAVs based on the process used by previous studies (e.g., [
85]), as shown in
Table 3. We chose four categories: fear (fear of FAV technology), merits (advantages and benefits of FAV technology), pollution (concerns about pollution), and nature (concerns about conserving natural environments) as the latent variables.
We conducted an extensive literature review and factor analysis to validate our latent variable construction process, considering both the merits of FAVs and the disadvantages that FAVs could bring. First, the latent variables and statements (questions) in the survey were based, whenever possible, on statements previously used and found to be effective in the literature. Second, we constructed the latent variables according to our research hypothesis, exploratory factor analysis (EFA), and previous works. Using EFA, we explored the latent variables that represent the respondents’ awareness of and attitudes toward issues related to FAVs and the natural environment. As the rotation method, we adopted promax, an oblique rotation, which allows latent variables to be correlated with each other. Previous studies frequently use orthogonal rotation methods, which impose zero correlation between latents. However, the assumption of no correlation has been criticized as unrealistic: in social science, attitudes and perceptions tend to be mutually related [
86]. From the EFA, we obtained four latent variables: fear, merits, pollution, and nature. These latent variables were derived from the indicator variables shown in
Table 3. Cronbach’s alpha values of merit, fear, pollution, and nature were 0.559, 0.734, 0.953, and 0.914, respectively. Cronbach’s alpha is a measure of scale reliability, and values above 0.6 are regarded as acceptable. Only merit did not satisfy this condition, but its Cronbach’s alpha value was not far from 0.6 [
87]. The correlations between indicator variables are shown in
Table A3,
Table A4,
Table A5 and
Table A6 in
Appendix A.
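For reference, Cronbach’s alpha for a construct with k indicators is k/(k − 1) × (1 − sum of the indicator variances / variance of the summed score). The following is a minimal sketch of this standard formula using only the Python standard library; it is not the procedure used for the estimates reported here, and the indicator scores are hypothetical. Only the four reported alpha values and the 0.6 threshold come from the text.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha; items[j] holds indicator j's scores for all respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two indicators are required")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("all indicators need the same number of respondents")
    # Summed score per respondent across the k indicators.
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: three indicators, four respondents.
example_items = [[1, 2, 3, 4], [2, 1, 4, 3], [1, 3, 2, 4]]
example_alpha = cronbach_alpha(example_items)

# Reported alpha values, checked against the 0.6 threshold from the text.
reported = {"merit": 0.559, "fear": 0.734, "pollution": 0.953, "nature": 0.914}
below_threshold = [name for name, alpha in reported.items() if alpha <= 0.6]
print(round(example_alpha, 3), below_threshold)
```

The threshold check confirms that merit is the only construct whose reported alpha does not exceed 0.6.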
Next, using the relationships between latent and indicator variables obtained from the EFA, we conducted a confirmatory factor analysis (CFA) to estimate the coefficients of the latents on the indicators and to calculate the score of each latent variable. In the CFA, we can allow for correlations between the error terms of indicator variables. Suppose that one latent construct is measured by five indicator variables. The error term of each indicator variable captures its unique variance, which is not related to the latent construct. If two specific indicator variables are more similar to each other than to the other three, the two share common variance that is not captured by the latent. In such a situation, allowing the error terms of those two indicators to be correlated can account for this similarity and improve the overall model fit. We decided which error terms should be correlated with each other according to the goodness-of-fit indices and the strength of the correlations between indicator variables.
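The five-indicator example above can be written out in equation form. This is a minimal sketch in generic notation; the symbols (x_j for indicators, \lambda_j for loadings, \eta for the latent construct, \varepsilon_j for error terms, \theta_{12} for the error covariance) are assumptions of the sketch rather than notation used elsewhere in this paper, and indicators 1 and 2 stand for the two similar indicators.

```latex
% Measurement model: one latent construct \eta measured by five indicators
x_j = \lambda_j \eta + \varepsilon_j, \qquad j = 1, \dots, 5,
\qquad \operatorname{Cov}(\eta, \varepsilon_j) = 0 .

% Errors are uncorrelated, except for the two similar indicators (1 and 2)
\operatorname{Cov}(\varepsilon_1, \varepsilon_2) = \theta_{12} \neq 0, \qquad
\operatorname{Cov}(\varepsilon_j, \varepsilon_k) = 0
\quad \text{for all other pairs } j \neq k .

% Implied covariance of the two similar indicators
\operatorname{Cov}(x_1, x_2) = \lambda_1 \lambda_2 \operatorname{Var}(\eta) + \theta_{12} .
```

Without \theta_{12}, the covariance between the two similar indicators would have to be explained by the latent construct alone, which is why freeing this parameter can improve fit.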
Finally, we included the four latent variables obtained from the EFA and CFA processes in our SEM model. These latent variables were used as the explanatory variables for purchasing decisions and PV for FAVs. In addition, we included gender, individual income, age, and commute time as control variables for purchasing decisions and PV for FAVs because these individual characteristics may affect purchasing intention and PV as well as latent awareness and attitudes.
The first latent construct, fear, represents an individual’s concerns about possible accidents, malfunctions, or responsibility issues (i.e., who would be responsible when there is an accident) involving FAVs. Numerous works and experts argue that FAVs will eliminate human errors, thereby creating safer traffic environments. Nevertheless, many members of the public are concerned about potential problems. These concerns were also mentioned in previous works; Petrovic et al. (2020) [
61] mention that rear-end collisions are likely to occur more often in AVs. Ahmed et al. (2020) [
62] argue that the public is still concerned about possible crashes due to malfunctions of AVs and cybersecurity issues. Other works also point out that people are concerned with safety issues [
63]. Due to these concerns, we expect those who are wary of possible accidents to be less willing to purchase FAVs and AVs than those who do not fear them. On the other hand, resolving such issues would then encourage them to purchase FAVs and AVs [
64].
The second latent construct, ‘merit’, captures an individual’s interest in the advantages that AVs/FAVs would bring. These range from enabling people without a license or without long driving experience to travel by car [
68], to allowing drivers to multitask [
65], making driving more comfortable [
66], and general usefulness [
59,
67].
The third and fourth latent variables are related to the environmental awareness of individuals. The third latent construct, ‘pollution’, represents attitudes toward reducing environmental pollution and promoting the reuse and recycling of materials. The fourth is ‘nature’, which captures individuals’ awareness of conserving biodiversity and the natural environment. Studies in the field of transportation show that individuals with high pro-environmental awareness have a higher intention to buy FAVs [
53,
66]. Although most previous studies have focused only on overall pro-environmental attitudes, we divided environmental awareness into pollution-related and conservation-related awareness because each might have a different effect on attitudes toward AVs. The contribution of AVs to the environment is associated with pollution reduction (particularly air pollution) through easing traffic jams rather than with the conservation of natural environments such as animals and forests. Thus, to promote AVs effectively, it is important to know whether both types of awareness, AV-related (pollution) and non-AV-related (nature), affect PV and WTB for AVs.
3.2.2. Structural Equation Modelling
Using the latent constructs, we constructed the SEM models shown in
Figure 2. The rectangles in the diagram symbolize the observed variables, whereas the circles denote latent variables and error terms. Each arrow represents the path from one variable to another, with bidirectional arrows indicating correlations between variables. Each latent variable is measured by its corresponding indicator variables. The four latent variables and the individual characteristics then have paths to our primary outcome variables: WTB and PV.
We have three models in total. First, we investigate the factors that are correlated with WTB (Model (1)) and with PV (Model (2)). Second, we assume that a higher PV is positively correlated with a higher WTB; we therefore add this relationship to Model (1) and assume that all latents and other exogenous variables are correlated with both WTB and PV (Model (3)). Our preferred main model is Model (3), and we use Models (1) and (2) to confirm the findings of Model (3), which allows us to check the robustness of the results. To improve model fit, we assume that some of the error terms associated with indicator variables are correlated. Allowing for correlations between these error terms can improve the model’s ability to explain the data.
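The structural part of the three models can be summarized in equation form. This is a minimal sketch under stated assumptions: the notation is introduced here for illustration only, the equations are written in linear form regardless of how the binary WTB is treated in estimation, and drawing the PV and WTB relationship as a path from PV to WTB is an assumption of the sketch, since the text only states that the two are positively related.

```latex
% Notation assumed for this sketch:
%   \eta_i : vector of the four latent constructs (fear, merit, pollution, nature)
%   z_i    : vector of observed exogenous variables (e.g., gender, income, age, commute time)
%   \zeta  : structural error terms

% Model (1): WTB only
WTB_i = \alpha_1 + \gamma_1^{\top} \eta_i + \beta_1^{\top} z_i + \zeta_{1i}

% Model (2): PV only
PV_i = \alpha_2 + \gamma_2^{\top} \eta_i + \beta_2^{\top} z_i + \zeta_{2i}

% Model (3): both outcomes, with PV entering the WTB equation
PV_i  = \alpha_{PV}  + \gamma_{PV}^{\top}  \eta_i + \beta_{PV}^{\top}  z_i + \zeta_{PV,i}
WTB_i = \alpha_{WTB} + \gamma_{WTB}^{\top} \eta_i + \beta_{WTB}^{\top} z_i
        + \delta \, PV_i + \zeta_{WTB,i}, \qquad \text{expected } \delta > 0
```

In this reading, Models (1) and (2) are the special cases of Model (3) in which each outcome is examined on its own, which is what makes them usable as robustness checks.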