1. Introduction
In 2009, the number of people living in urban areas (3.42 billion) surpassed the number living in rural areas (3.41 billion), and since then the world has become more urban than rural. In 2014, there were 7.2 billion people living on the planet (United Nations, 2014) [1]. It is estimated that by 2017, a majority of people were living in urban areas. The global urban population was expected to grow approximately 1.84% per year between 2015 and 2020, 1.63% per year between 2020 and 2025, and 1.44% per year between 2025 and 2030 (World Health Organization, 2014) [2].
This population trend underlines the importance of managing city resources well, and smart city projects are one efficient response. The use of smart computing technologies makes the critical infrastructure components and services of a city (which include city administration, education, healthcare, public safety, real estate, transportation, and utilities) more intelligent, interconnected, and efficient (Washburn et al., 2010) [3]. The concept has a range of variants in which “smart” is replaced by another adjective, as in “digital city” or “sustainable city”. Mills et al. (2022) [4] also define the smart city from the perspective of big data, artificial intelligence, and other characteristics. Oke et al. (2022) [5,6] found that smart city leaders and followers can help each other to overcome some challenges.
Smart city ranking is a useful performance evaluation method, and many smart city rankings exist. The ranking results give city stakeholders an idea of how each smart city is progressing. The results also help stakeholders make decisions; for example, investors may decide which smart city project to invest in based on a reliable ranking result. Many companies, research institutes, and Non-Government Organizations (NGOs) are working on smart city ranking or evaluation (Albino, Berardi, and Dangelico, 2015) [7]. Their results are typically displayed as a score or ranking index.
This research uses fuzzy logic and machine learning techniques to predict whether a smart city will be classified as a leader or a follower. It starts by summarizing and analyzing the current leader or follower classification of smart cities through fuzzy logic and machine learning techniques. Based on this classification of current smart cities, insightful rules and information are then extracted for predicting future smart cities.
Only a limited number of smart cities were in the prediction list, owing to limitations of the sampling framework, survey budgets, data accessibility, and other factors. More cities should be included in the prediction list in the future. Furthermore, different ranking results use different methodologies. For example, one organization may use survey methodology, while another may use secondary data. These differences lead to “heteroscedastic” results.
Based on the accessible smart city ranking results, a smart city can be classified as either a leader or a follower. Fuzzy logic will be used to summarize the current smart city leaders and followers on the list. This research paper applies several machine learning algorithms to identify smart city leaders and followers by using existing city indicators. The algorithm with the highest test accuracy will be used for additional smart city leader and follower predictions. Smart city progress issues will also be investigated based on the prediction.
In their assessment of the smartest cities in the Gulf States, Woods et al. (2016) [8] define a smart city leader and follower as follows:
Smart City Leader: These cities have differentiated themselves through the clarity, breadth, and inclusiveness of their smart city vision and planning. They are also leading the way in implementing significant projects at both the pilot and increasingly full-scale levels.
Smart City Follower: These are cities that are beginning their smart city journeys. They may have made initial statements of intent and begun limited pilot projects and siloed operations, but they need to develop a more integrated view for city development and/or stronger leadership for their programs.
Thus, the research question is: “What machine learning algorithm can accurately identify smart city leaders and followers based on existing city indicators, and how can this knowledge be used to analyze smart city progress issues?”
The research paper employs a combination of fuzzy logic and machine learning techniques to identify and predict smart city leaders and followers. The authors first use fuzzy logic to label cities as either leaders or followers based on evaluation meta results from various organizations. They then apply machine learning techniques to uncover the key characteristics of each group. Using the Support Vector Machine (SVM) algorithm, the authors use the training data’s performance to predict which cities are likely to become smart city leaders or followers. The proposed prediction framework successfully predicted 30 smart city leaders and 20 followers.
2. Related Work
2.1. Call for Clarity
Amidst the multitude of efforts surrounding the notion of smart cities, Hollands (2008) [9] formulates a critique of the usage of “smart city” as a label. The call for clarification finds fertile soil in the research community, which assesses smart city research to be fragmented, divergent, and lacking unifying cohesion and intellectual exchange (Mora, Bolici, and Deakin, 2017) [10].
Hollands’ (2008) [9] main critique is that the smart city label incorporates a wide range of fields (from IT to business to communities), yet it remains ambiguous how these fields are connected to the smart city notion and to each other. This is exemplified by the way that “smart” can be replaced by a multitude of other adjectives, such as “creative” or “wise” cities, without increasing descriptive clarity. Although Hollands considers this overlap in meaning to be problematic, Moir, Moonen, and Clark (2014) [11] point out that these slight differences may indicate a desire to highlight one of the specific aspects of the smart city concept. They observe that smart cities are but one formulation of the more generic ‘future city’ term, which is used to “convey either environmental, social, economic or governance aims, or a hybrid of some or all of these elements” (p. 4). Additionally, the lack of cohesive understanding may also be due to the different motivations that determine the choice of smart city label. Cities gravitate towards concepts that are most appealing to them in that moment, which may be influenced by factors such as geography and zeitgeist (Eremia, Toma, and Sanduleac, 2017) [12]. For example, after the 1950s, the most popular term in urban development was “sustainable city”, while “digital city” came up in the late 1990s (Eremia et al., 2017) [12]. In 2009–2010, “smart city” became the dominant term, with the number of published documents rising from 132 between 2002 and 2009 to more than 900 in 2010–2012 (Mora et al., 2017) [10].
The current discourse on future cities is distinctive for its global, positive, strategic, integrated, and evidence-led character (Moir et al., 2014) [11]. This is also noted by Hollands, who claims that these labels “link together technological informational, transformations with economic, political and social-cultural change” (Hollands, 2008, p. 305) [9] in a way that is generally positive in nature. With this positive connotation, cities are generally eager to use these labels in an effort to appear more positive as well. Thus, a rhetorical inflation occurs in which the label loses its actual meaning and reference to technological and infrastructural change in favor of marketing-fueled hype. This conflation of labels also occurs with words that might initially appear more neutral, such as “intelligent” or “digital”. These words similarly carry an optimistic assumption regarding urban development (i.e., a harmonious high-tech future) and can have multiple possible meanings (see Komninos (2013) [13] for four possible meanings of intelligent cities). The purpose of Hollands’ paper was to break down the usage of the label and its assumptions, thus creating an opportunity for other researchers to reflect on and seek clarification of the notion of a smart city. For example, Allwinkle and Cruickshank (2011) [14] critically reflect on the concept of “smartness” and other arguments set forth by Hollands. More recently, Kitchin (2015) [15,16] counters Hollands’ arguments, contending that the majority of the smart city literature actually appears to be non-ideological, commonsensical, and pragmatic. Still, he identifies several shortcomings that inhibit the growth of the smart city agenda. The first of these is in line with Hollands’ argument that there is a lack of shared understanding about the concept and initiatives. Kitchin (2015) [15,16] then extends it by claiming an overreliance on canonical and simplified examples and an absence of in-depth empirical case studies and comparative research in the literature.
In 2014, the European Parliament commissioned a report that maps the state of European smart cities. To do this, they first outlined what a smart city seeks to achieve (Manville, Europe, Millard, Institute, and Liebe, 2014, p. 17) [17]:
“A Smart City is quintessentially enabled by the use of technologies (especially ICT) to improve competitiveness and ensure a more sustainable future by symbiotic linkage of networks of people, businesses, technologies, infrastructures, consumption, energy and spaces”.
As such, their working definition is (Manville et al., 2014, p. 17) [17]:
“A Smart City is a city seeking to address public issues via ICT-based solutions on the basis of a multi-stakeholder, municipally based partnership. These solutions are developed and refined through Smart City initiatives, either as discrete projects or (more usually) as a network of overlapping activities”.
2.2. Smart City Characteristics
Since there is no commonly agreed-upon definition, substantial research effort has gone into describing the characteristics of smart cities.
The most prominent scheme distinguishes six conceptually distinct characteristics related to a smart city: (1) smart governance, (2) smart people, (3) smart living, (4) smart mobility, (5) smart economy, and (6) smart environment (Giffinger, Fertner, Kramar, and Meijers, 2007) [18]. The European Parliament follows this scheme in the sense that, in order to qualify as a smart city strategy or initiative, a project must exhibit at least one of these six characteristics. Other schemas approach the matter from different perspectives. For example, Chourabi et al. explored the literature from multiple fields to propose a framework containing eight core components of smart city initiatives: “(1) management and organization, (2) technology, (3) governance, (4) policy, (5) people and communities, (6) the economy, (7) built infrastructure, and (8) the natural environment” (2012, p. 2291) [19]. Interestingly, the authors caution against using these components to rank smart cities. Instead, they highlight these components as a supportive tool to understand and advance smart city strategies and initiatives. A similar approach was undertaken by Joshi, Saxena, Godbole and Shreya (2016) [20], who propose a six-pillar framework “SMELTS”: (1) social, (2) management, (3) economy, (4) legal, (5) technology, and (6) sustainability. In this framework, technology, economy, and legal are said to both influence and be influenced by smart city initiatives more strongly, and these in turn affect the social, management, and sustainability factors in the outer level [21].
2.3. Fuzzy Logic
The core idea behind fuzzy logic is to model the imprecise reasoning that humans use when they make rational decisions, especially in an uncertain and imprecise environment. This is possible due to the human ability to use imprecise, inexact, incomplete, or unreliable knowledge to infer an approximate answer. Thus, fuzzy logic seeks to extend logical reasoning in the sense that if logic is the application of formal principles of reasoning, then fuzzy logic is the application of formal principles of approximate reasoning (Zadeh, 1998) [22]. Fuzzy logic is better equipped to handle the concept of a partial truth, because it views everything, including truth itself, as a matter of degree rather than a binary true or false. This does not mean that “fuzzy logic is fuzzy”; rather, it is a “precise logic of imprecision and approximate reasoning” (Zadeh, 2008) [23]. Its principal facets are that it is logical, fuzzy-set-theoretic, epistemic, and relational (Dzitac, Filip, and Manolescu, 2017) [24]. By providing a mathematical means of representing vagueness, fuzzy logic models or sets are able to recognize, represent, manipulate, interpret, and utilize approximate information. This contrasts with more traditional Western Aristotelian logic systems, which tend to be more binary in approach. Fuzzy logic initially drew mixed reactions, as science and engineering at the time did not consider the unsharpness of class boundaries [25]. Yet, the way that fuzzy logic seeks to formalize the human ability to reason and decide in situations of imperfect information is one of the factors that has enabled it to be applied to many fields, from artificial intelligence and quantum particle physics to control engineering, robotics, and even natural languages.
3. Importance of Smart City Evaluation
As cities vary widely in their economic, geographical, socio-cultural, and historical make-up, smart city efforts require tailored approaches in order to satisfy the requirements of each particular city. Taking this into consideration, Pellicer et al. (2013) [26] take an innovation and development-based standpoint in which they divide current initiatives into those that feature newly formed cities versus efforts that seek to transform existing traditional cities into smart cities.
Within the smart transportation area, Refs. [27,28,29,30,31,32,33] proposed a comprehensive and practical framework for benchmarking cities with specific indicators according to the smartness of their transportation systems. This framework was developed through (1) the formulation of a proper concept of smartness in the context of urban transport systems, which the authors view as one that utilizes self-operative and corrective technologies and systems in its operation and management, (2) the generation of a generic matrix of 66 indicators of smartness based on a systematic literature survey, and (3) the calculation of a composite smartness index (SI) of a city’s transportation system using the smartness indices. They then applied their framework to 26 major cities to illustrate how it might be used to benchmark smart transport cities across the world. This study is illustrative in multiple ways. The first is with regard to the selection of the criteria or indicators used for analysis. The criteria for selection of these cities were to rank within the top 50 of a global infrastructure benchmarking study and to have at least two million inhabitants. Of the 66 indicators identified, only 21 were ultimately included due to a lack of available information on the other indicators. This reveals a concern with benchmarking studies: because of their reliance on secondary information sources (for reasons that are in many cases perfectly practical and sensible), they may be limited in the quality or generalizability of their results. The quality and availability of information are related to the second concern, the weighting of the indicators. The authors ran their analysis with both equal and unequal weights assigned to the sub-systems and concluded that this had a strong influence on the resulting city rankings. A third difficulty concerns the relevancy of the results. The authors note that due to the speed at which technology and information change, the benchmarking study may remain accurate for only a short time period. This is a valid concern and one that applies especially to the smart city field, as smart city initiatives are constantly initiated and terminated [34].
In their study, Giffinger et al. (2007) [18] specifically focused on medium-sized cities in Europe. City development is often discussed in a similar way to how management literature discusses organizations: in broad sweeping terms that pertain more to the larger metropolises and multinationals than to the smaller medium-sized organizations and cities. While size may be an important differentiator, it is not the only or most important characteristic by which these entities differ. Giffinger et al. (2007) [18] observe that medium-sized cities often have fewer resources and less organizing capacity and critical mass than their larger counterparts, forcing them to be more selective in how they compete. Yet, comparisons between cities rely on similar metrics, no matter the size or circumstance. This is not to say that city rankings are identical. On the contrary, rankings are known to produce different results depending on their aims and resources as well as their data collection, processing, and analysis methods. Additionally, not all cities are included in the ranking, often due to issues with data access or quality. Therefore, although city rankings can be a useful tool to assess the attractiveness of urban regions and to identify city strengths and assets, cities are not always able to benefit from them. In an effort to alleviate some of these concerns, Giffinger et al. (2007) [18] based their ranking on a rather comprehensive selection method and sought to apply a more solid methodology that would better reflect the characteristics of medium-sized cities. In addition to their robust methodology, Giffinger et al. [18,35] also contributed to the smart city literature by identifying six characteristics by which smart cities can be understood: smart economy, smart people, smart governance, smart mobility, smart environment, and smart living. These six characteristics can be further described by 33 factors, each of which is further associated with 1–4 indicators for a total of 74 indicators.
4. Identifying Current Leaders and Followers with Machine Learning Algorithms
The process for identifying smart city leaders and followers is as follows. First, we use fuzzy logic to summarize the smart city ranking results and categorize the cities into two groups: smart city leaders and smart city followers. Next, we use all the smart cities and their corresponding indicators as the training data set. Then, we apply several classic machine learning algorithms to this data set. The algorithm with the highest accuracy will be used for future smart city prediction. Figure 1 shows the detailed process.
5. Data Preparation
5.1. Fuzzy Leaders and Followers Classification
This research uses five organizations’ ranking results for classification. These data sources were selected based on data availability, reputation, data quality, newspaper citation, and other factors. Some of them are institutes; some are companies or NGOs. A city may not be listed in all the ranking results. Table 1 displays the details of the smart city ranking sources.
Different organizations rank smart cities differently. This data preparation applies fuzzy logic to make the leader or follower identifiable. Every ranking list is divided into three levels. Table 2 shows the three levels and their relative locations.
All the selected cities are assigned corresponding levels. For example, Tokyo is assigned “RANKING-HIGH”, “RANKING-MEDIUM”, and “RANKING-MEDIUM”.
Essentially, this is a fuzzy set problem. A membership function is used to quantify the grade of membership of an element in $X$ to the fuzzy set:
$$\mu_A : X \rightarrow [0, 1]$$
where $\mu_A$ is the membership function, $X$ represents the universe of discourse, and $A$ is the fuzzy set. A triangular function is used here. It has a lower limit $a$, an upper limit $b$, and a value $m$, where $a < m < b$, as also shown in Figure 2.
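The triangular membership function described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name and the breakpoints 0.25, 0.5, and 0.75 are hypothetical, chosen only to show how a lower limit, a peak, and an upper limit produce a grade of membership between 0 and 1.

```python
def triangular_membership(x: float, a: float, m: float, b: float) -> float:
    """Grade of membership of x in a triangular fuzzy set with a < m < b."""
    if not a < m < b:
        raise ValueError("expected a < m < b")
    if x <= a or x >= b:
        return 0.0                   # outside the support of the set
    if x <= m:
        return (x - a) / (m - a)     # rising edge from a up to the peak m
    return (b - x) / (b - m)         # falling edge from the peak m down to b


# Hypothetical "medium" set on a normalized ranking axis in [0, 1].
a, m, b = 0.25, 0.5, 0.75
for x in (0.25, 0.375, 0.5, 0.625, 0.75):
    print(x, triangular_membership(x, a, m, b))
```

The grade rises linearly from 0 at the lower limit to 1 at the peak and falls back to 0 at the upper limit, so a city positioned near a level boundary belongs to that level only partially.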
5.2. Defuzzification Process
Defuzzification is the process of converting a fuzzified output into a single crisp value with respect to a fuzzy set. There are many defuzzification methods, such as the Center of Sums (COS) Method, Center of Gravity (COG)/Centroid of Area (COA) Method, Center of Area/Bisector of Area (BOA) Method, and Weighted Average Method (Klir and Yuan, 1995) [36]. This research uses the Center of Sums (COS) method, which is one of the most commonly used defuzzification methods. It is defined as follows:
$$x^{*} = \frac{\sum_{i=1}^{N} x_i \sum_{k=1}^{n} \mu_{A_k}(x_i)}{\sum_{i=1}^{N} \sum_{k=1}^{n} \mu_{A_k}(x_i)}$$
where $n$ is the number of fuzzy sets, $N$ is the number of fuzzy variables, and $\mu_{A_k}$ is the membership function for the $k$-th fuzzy set.
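To make the Center of Sums idea concrete, the following sketch evaluates it on a sampled universe of discourse. It is an illustration under stated assumptions: the helper names, the 101-point grid, and the triangle breakpoints are all hypothetical, and the sketch does not attempt to reproduce the value reported for any particular city.

```python
from typing import Callable, Sequence

MembershipFn = Callable[[float], float]


def triangle(a: float, m: float, b: float) -> MembershipFn:
    """Return a triangular membership function with support (a, b) and peak m."""
    def mu(x: float) -> float:
        if x <= a or x >= b:
            return 0.0
        return (x - a) / (m - a) if x <= m else (b - x) / (b - m)
    return mu


def center_of_sums(xs: Sequence[float], sets: Sequence[MembershipFn]) -> float:
    """Discrete Center of Sums: memberships are summed over all sets at each x."""
    weights = [sum(mu(x) for mu in sets) for x in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("no sample point has non-zero membership")
    return sum(x * w for x, w in zip(xs, weights)) / total


# Hypothetical universe of discourse: 101 evenly spaced points on [0, 1].
xs = [i / 100 for i in range(101)]
medium = triangle(0.25, 0.5, 0.75)   # illustrative breakpoints only
high = triangle(0.5, 0.75, 1.0)      # illustrative breakpoints only

# One "high" and two "medium" assignments, counted once each.
crisp = center_of_sums(xs, [high, medium, medium])
print(round(crisp, 3))
```

Because overlapping regions are counted once per set, a level that a city receives from two rankings pulls the crisp value towards that level twice as strongly as a level received once.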
As mentioned before, Tokyo is associated with “RANKING-HIGH”, “RANKING-MEDIUM”, and “RANKING-MEDIUM”.
The center of the area of the first fuzzy set is (0.75 + 1)/2 = 0.875; similarly, the centers of the second and third fuzzy sets are both 0.5.
The calculated defuzzification value is 0.68.
The next step is to assign fuzzy classification results to all the smart cities on the list. The list consists of top-ranking results, which means that it is a ranking of leading smart cities; many other follower smart cities are not on the list. So, based on a Delphi method adjustment, the fuzzy classification criteria are displayed in Table 3.
Based on the fuzzy classification criteria, Tokyo should be classified into the leader group. The classification of all the other cities can be found in the data set, which is available upon request.
5.3. Attribute Selection
After defining fuzzy smart city leaders and followers, smart city attributes are also needed. This research selects smart city meta-data (Index) for modeling. All the meta-data relate to smart city dimensions, such as living quality, sustainability, and others. Different smart city concepts have different smart city dimensions; this research selects the most commonly used ones. All the meta-data are the latest available: most are from 2018, and only a small portion are from 2017 or earlier. Table 4 summarizes all the smart city meta-data (Index).
6. Models Building and Evaluation
Prescreening is used to ensure the modeling quality. During this process, the NETWORKED attribute is removed due to low correlation. Additionally, 30% of the current leader/follower smart cities were removed because of a high proportion of missing data; only 93 smart cities remain in the data set. Some missing values still remain, and moving average smoothing is used as an efficient imputation method for them. This research also uses a scaler from the Python package scikit-learn (via its fit_transform method) to scale all the attribute values into a range between −1 and 1.
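The scaling step can be illustrated without any third-party dependency. The sketch below assumes a linear min-max mapping, which is what scikit-learn's MinMaxScaler does when constructed with feature_range=(-1, 1); the text does not say which scaler was actually used, so this choice, the function name, and the sample values are assumptions.

```python
from typing import List, Sequence


def scale_to_range(values: Sequence[float], low: float = -1.0, high: float = 1.0) -> List[float]:
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column carries no spread; place it at the middle of the range.
        return [(low + high) / 2.0 for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


# Hypothetical index values for one attribute across five cities.
innovation_index = [40.0, 55.0, 70.0, 85.0, 100.0]
print(scale_to_range(innovation_index))
```

Putting every attribute on the same range matters for distance-based and margin-based models such as KNN and SVM, where an attribute with a large numeric range would otherwise dominate the others.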
Four types of supervised learning algorithms were implemented in this research:
- Logistic Regression;
- KNN;
- SVM;
- Neural Network.
To perform machine learning on the smart city data set, we utilized the Scikit-learn and Pandas packages for Python (all the source code will be available on request).
6.1. Logistic Regression
Logistic regression is a classical machine learning classification algorithm that is used to predict the probability of a categorical dependent variable. The problem in this research has two classes, so logistic regression is used for binary classification.
Below is an example of a logistic regression equation:
$$Y = \frac{e^{b_0 + b_1 X}}{1 + e^{b_0 + b_1 X}}$$
where $Y$ is the predicted output, $b_0$ is the bias or intercept term, and $b_1$ is the coefficient for the single input value ($X$).
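As a small worked illustration of the single-input logistic model, the sketch below evaluates the predicted probability and thresholds it at 0.5. The names b0 and b1 for the intercept and coefficient, their numeric values, and the threshold are assumptions made for the example; in practice the coefficients are estimated from the training data.

```python
import math


def predict_probability(x: float, b0: float, b1: float) -> float:
    """Logistic model for one input: e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    # Two algebraically equivalent forms, chosen by sign to avoid overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_class(x: float, b0: float, b1: float, threshold: float = 0.5) -> int:
    """Return 1 when the predicted probability reaches the threshold, else 0."""
    return 1 if predict_probability(x, b0, b1) >= threshold else 0


# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 2.0
print(predict_probability(0.5, b0, b1))  # b0 + b1*x = 0, so the probability is 0.5
print(predict_class(0.9, b0, b1))        # b0 + b1*x > 0, so class 1
```

With more than one attribute, the same form applies with the single product replaced by a weighted sum over all inputs.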
Table 5 shows the logistic regression results (test size = 0.20):
6.2. KNN
The k-Nearest Neighbors algorithm (KNN) is a non-parametric method used for classification and regression (Altman, 1992). The KNN algorithm assumes that similar things exist in close proximity. In other words, similar things are near to each other. For example, for this research classification problem, leader is 0, and follower is 1.
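The proximity idea behind KNN can be shown with a short, dependency-free sketch. The label encoding (leader is 0, follower is 1) follows the text; everything else, including the function name, the choice of Euclidean distance, k = 3, and the two-attribute sample cities, is a hypothetical setup for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

LEADER, FOLLOWER = 0, 1  # label encoding stated in the text


def knn_predict(train: Sequence[Tuple[Sequence[float], int]],
                query: Sequence[float], k: int = 3) -> int:
    """Classify query by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-attribute cities, already scaled to [-1, 1].
train: List[Tuple[List[float], int]] = [
    ([0.8, 0.7], LEADER), ([0.6, 0.9], LEADER), ([0.7, 0.5], LEADER),
    ([-0.6, -0.4], FOLLOWER), ([-0.8, -0.7], FOLLOWER), ([-0.3, -0.9], FOLLOWER),
]
print(knn_predict(train, [0.5, 0.6]))    # the three nearest neighbours are leaders
print(knn_predict(train, [-0.5, -0.5]))  # the three nearest neighbours are followers
```

An odd k avoids tied votes in a two-class problem, which is one reason the number of neighbours is usually treated as a tuning parameter.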
Table 6 shows the KNN results (test size = 0.20):
6.3. SVM
Support vector machine (SVM) is a supervised machine learning algorithm. It is powerful for classification problems. For two-dimensional data, there is more than one possible dividing line that can perfectly discriminate between the two classes. The best is accepted to be the hyperplane that creates the largest separation between the two classes, or the maximum margin.
The SVM can be described as the following equation:
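One common textbook form, offered here as an assumption about the intended equation and not as the authors' exact notation, is the soft-margin problem over training pairs $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$:

$$\min_{w,\, b,\, \xi} \ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i} \xi_i \quad \text{subject to} \quad y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$$

A new point $x$ is then assigned the class $\operatorname{sign}(w^{\top} x + b)$. The symbols $w$, $b$, and $\xi_i$ are introduced only for this sketch; $C$ is the penalty parameter that trades margin width against training errors.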
The SVM is more powerful when it is combined with kernels, which make it a better fit for classifying nonlinear relationships. The kernel projects data into a higher-dimensional space defined by polynomials, Gaussian basis functions, or other functions.
This research uses four different kernel functions: sigmoid, radial basis function (RBF), polynomial, and linear. Table 7 lists the four function names and their kernels.
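For readers who want to see the four kernels as computations, the sketch below writes them in the parameterization that scikit-learn's SVC documents (gamma, coef0, and degree). Whether the paper's kernel table uses exactly these forms and default values is an assumption; the function names and sample vectors are hypothetical.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], z: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x: Sequence[float], z: Sequence[float]) -> float:
    return dot(x, z)


def polynomial_kernel(x: Sequence[float], z: Sequence[float],
                      gamma: float = 1.0, coef0: float = 0.0, degree: int = 3) -> float:
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 1.0) -> float:
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x: Sequence[float], z: Sequence[float],
                   gamma: float = 1.0, coef0: float = 0.0) -> float:
    return math.tanh(gamma * dot(x, z) + coef0)


# Hypothetical attribute vectors for two cities.
u, v = [1.0, 2.0], [3.0, -1.0]
for name, kernel in [("linear", linear_kernel), ("polynomial", polynomial_kernel),
                     ("rbf", rbf_kernel), ("sigmoid", sigmoid_kernel)]:
    print(name, kernel(u, v))
```

Each function returns a similarity score for a pair of points; the SVM only ever needs these pairwise scores, which is what lets it separate classes in the implied higher-dimensional space without constructing that space explicitly.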
Table 8 shows the SVM results (test size = 0.20):
6.4. Artificial Neural Network
An artificial neural network is a collection of connected units or nodes, which are inspired by the biological neural networks that constitute animal brains. In an artificial neural network, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs.
This research uses a multilayer perceptron (MLP), which is a type of feedforward neural network. Only three layers are created (input layer, hidden layer, and output layer). The input layer has 10 nodes, the hidden layer has 30 nodes, and the output layer has one node because this is a binary classification problem. Each hidden node uses the ReLU activation function, and the output node uses the sigmoid activation function. The optimizer is Adam, and the number of training epochs is 50.
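The 10-30-1 architecture can be made tangible with a forward pass written in plain Python. This is a sketch only: the weights are random placeholders and no training (Adam or otherwise) is performed, and the helper names and the constant input vector are hypothetical. The layer sizes and activation functions are the ones stated in the text.

```python
import math
import random
from typing import Callable, List

N_INPUT, N_HIDDEN, N_OUTPUT = 10, 30, 1  # layer sizes stated in the text


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    # Split by sign to avoid overflow in exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> dict:
    """Random placeholder weights; a real model would learn them during training."""
    return {
        "w": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def dense(layer: dict, inputs: List[float], activation: Callable[[float], float]) -> List[float]:
    """One fully connected layer: weighted sum plus bias, then the activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(layer["w"], layer["b"])]


def forward(hidden: dict, output: dict, features: List[float]) -> float:
    """Return the sigmoid output, a value strictly between 0 and 1."""
    return dense(output, dense(hidden, features, relu), sigmoid)[0]


rng = random.Random(0)
hidden = init_layer(N_INPUT, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUT, rng)
n_params = (N_INPUT * N_HIDDEN + N_HIDDEN) + (N_HIDDEN * N_OUTPUT + N_OUTPUT)
print(n_params)  # 361 weights and biases in total
print(forward(hidden, output, [0.1] * N_INPUT))
```

With 361 trainable parameters and fewer than a hundred cities in the data set, a network of this size has little data per parameter, which is worth keeping in mind when comparing its accuracy with that of simpler models.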
The accuracy of this artificial neural network is slightly lower than that of the SVM; the highest rate is only 80% (test size = 0.2).
Figure 3 shows the architecture of this artificial neural network.
7. Summary of Models Results
This research evaluates all four models under different settings, such as test size, value of N, kernel function, and others. Based on the current smart city data set, the SVM with the sigmoid kernel has the highest accuracy for both the 10% and 20% test sizes, so the prediction will apply this model. Table 9 shows the training and testing results of all the machine learning algorithms.
According to the above model results, the SVM (Kernel = Sigmoid, C = 1) has the highest performance score. For all four algorithms, the results can vary if conditions change, and they could also differ with a larger sample size. Based on these results, the SVM (Kernel = Sigmoid, C = 1) will be used for the future prediction task.
8. Smart Cities Leader and Follower Prediction
Prediction Results
Using the selected machine learning model (SVM), 50 cities were predicted as being either a potential smart city leader or follower. Not all candidate cities were used for prediction; many were excluded because of missing data. All the cities are listed in alphabetical order. Table 10 provides an overview of the cities predicted as leaders, while Table 11 lists the cities predicted as followers.
Figure 4 and Figure 5 depict the geographical location of the smart city leaders and followers in the world.
9. Results Validation
To validate the prediction results, we use the F-test to compare the internet infrastructure performance improvement of the predicted leaders and followers. Internet infrastructure is a key factor for smart city projects, and better internet service could raise the standard of such projects. Internet service plays a significant role in digitally transforming the financial, environmental, and other aspects of urban life. The International Data Corporation (IDC) states that smart city development uses smart initiatives combined with leverage technology investments across an entire city, with common platforms increasing efficiency, data being shared across systems, and IT investments tied to smart missions. All these tasks rely heavily on internet service.
To evaluate the internet service improvement, we use the data from Existent Ltd. This company, along with New America’s Open Technology Institute, Google, Princeton University’s PlanetLab, and other supporting partners, released an annual worldwide broadband report for 2019, https://www.cable.co.uk/broadband/speed/worldwide-speed-league/ (accessed on 1 January 2020). This report includes internet service data for 207 countries, such as ranking, mean download speed, distinct IPs tested, and others. It also includes the data for 2018 for comparison purposes. We assume that all the cities in the same country have the same internet service performance. To evaluate the performance improvement, we use the change rate of the average download speed from 2018 to 2019. The formula is below:
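Assuming the conventional relative-change definition, the change rate for a country $c$ can be written as:

$$r_c = \frac{v_{c,2019} - v_{c,2018}}{v_{c,2018}}$$

where $v_{c,2018}$ and $v_{c,2019}$ denote the mean download speeds reported for that country in the two years. The symbols $r_c$ and $v$ are introduced here for illustration only. Under this definition, a value of 0.25 means that the mean download speed grew by 25% from 2018 to 2019.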
The F-test is a classic method for comparing the variances of two populations. The formula is below:
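In its usual two-sample form, sketched here on the assumption that this is the variant applied, the statistic is the ratio of the two sample variances:

$$F = \frac{s_1^{2}}{s_2^{2}}$$

where $s_1^{2}$ and $s_2^{2}$ are the sample variances of the change rates in the two groups of cities. Under the null hypothesis of equal population variances, $F$ follows an F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom, with $n_1$ and $n_2$ the group sizes. These symbols are illustrative and do not come from the original text.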
After removing the most extreme 10% of the data, we conducted an F-test. The F statistic is 2.2605, and the p-value is 0.04225, which is less than α = 0.05. At the 5% significance level, we therefore conclude that there is sufficient evidence that the smart city leader group has a better internet service improvement than the follower group.
10. Discussion
This research used fuzzy logic and machine learning to predict whether smart cities can be categorized as either leaders or followers. The result is useful to both practitioners and theory researchers. Public and private stakeholders (urban planning departments, citizens, and others) could take advantage of it according to their own goals. For example, investors could use the result for further technology investment decisions; policymakers could use the result and insights for urban planning; and employees who like the smart city lifestyle could take the result into consideration when they decide which city to move to.
This result also contributes to theory development. The smart city classification algorithm achieved a high accuracy on the testing data, which suggests that smart city status has a significant relationship with basic elements such as innovation, living quality, globalization, and others. For example, innovation has a significant positive relationship with smart city evaluation results.
The prediction results indicate that, in several cases, more than one leader or follower comes from the same country. For example, Guangzhou and Nanjing are followers that both come from China. Phoenix and Pittsburgh are both leaders from the United States. This points to a potential peer effect among smart cities, similar to peer effects among classmates. This could be further investigated because, if peer effects exist, they could lead to both theoretical and applied urban planning contributions.
Another finding is that most follower locations are close to the coast, while the leader locations show no such relationship with coastal proximity. For example, in the United States, all four followers (Cleveland, Baltimore, Miami, and Houston) are close to the coast, while the leaders are spread nationwide, with some located on the west coast, some on the east coast, and some in the middle.
11. Conclusions and Future Work
The smart city prediction results provide a helpful framework for categorizing smart cities. This study has the following limitations. The highest accuracy is less than 90%, according to the experiments, which means there is room for improvement regarding Type I or Type II errors. It is conceivable that some smart city leaders have been mis-categorized as followers and vice versa. There are several possible reasons for these errors. Firstly, the original feature data could be biased because most of the city sampling behind them is not transparent, so the collected data may not be reliable. Secondly, there may be an issue with multi-collinearity: one feature variable could be linearly predicted from others. For example, livability could be related to safety features, which may adversely affect investment opportunities. Lastly, future experimental design should look into extracting more features, such as technology investments and economic features.
The prediction results are binary, which means that the result is either leader or follower. A possible improvement is to present the results as quantitative values, for which a scoring system should be developed. Current smart city evaluation methods are limited to ranking, expert scoring, focus group analysis, and other qualitative methods. All of these methods are biased because the selection of sample cities is not transparent. Different evaluation methods have different city samples: some samples contain only big cities, while others include only developed cities. Such evaluation results are not reliable. An evaluation paradigm shift from smart city ranking to smart city testing is therefore necessary. For future work, we plan to propose a smart city testing framework. The testing framework should be similar to a quiz: every city could take the quiz and then receive a score. The testing framework would avoid the city selection problem because all cities can be tested. Additionally, the testing scores would be comparable, both against other cities and against a city's own earlier scores.
One direction for future study is to investigate the factors that affect whether a city is a smart city leader or follower. This research shows that the highest accuracy is less than 90%, which means there is considerable room for improvement. These smart city factors can make a difference. Currently, there is not enough in-depth investigation of these factors, either of the factors themselves or of their interactions.
The difference between the smart city leader and follower prediction results should be further analyzed. For example, the current results assume that there is no significant relationship between smart city identification and Gross Domestic Product (GDP). Further hypothesis testing could show whether this assumption should be rejected.
Another insight concerns shifting a smart city follower to a leader. Being a smart city leader means that citizens have higher satisfaction with their urban lives. All stakeholders share the goal of becoming a smart city leader. An actionable and meaningful plan should be further developed and reviewed.