A Scientometric-Based Review of Traffic Signal Control Methods and Experiments Based on Connected Vehicles and Floating Car Data (FCD)

This paper reviews the state of the art in traffic signal control methods that are based on data coming from onboard smartphones or connected vehicles. The review is carried out by applying analytical scientometric tools (topic visualization, co-citation analysis to establish influential journals and references, country analysis based on coauthorship, trending-topics analysis carried out by overlay visualization). The introduction of autonomous and connected vehicles will allow city management organizations to introduce new intersection management systems that rely on real-time positional data coming from instrumented vehicles. Traditional vehicles could also benefit from these new technologies by profiting from better-regulated intersections. Using a scientometric approach, this paper frames the scientific contributions aimed at the field of traffic signal methods and experiments based on connected vehicles and floating car data. The applied scientometric approach reveals trending ideas and concepts and identifies the relevant documents that scientists and professionals can consult in order to develop this field further with the implementation of new traffic signal control systems that can “give the green light” to drivers.


Introduction
Connected and autonomous vehicles and cooperative intelligent transportation systems (C-ITSs) are new technologies that will impact traffic control and management. C-ITSs are based on the sharing of information between drivers and road management. Connected vehicles (CAs) will be able to share speeds and positions, among other useful data. Vehicle speeds and positions can be used to better manage traffic signals in real time, so connected vehicles can become an important part of new C-ITSs. Some of these ideas have been explored in European-funded projects such as SAFESPOT [1], EuroFOT [2] and DRIVE C2X [3].
One of the main tasks that road managers must deal with in traffic control is the management of signalized intersections. Unfortunately, control operations are very often not adjusted in real time and are sometimes implemented with out-of-date fixed-time traffic signal settings. Traffic congestion, which can also be caused by traffic signals, is a serious problem in cities and a major cause of air pollution. For this reason, efforts to solve congestion problems have centered on attempts to shift demand to transit systems [4] and on better road traffic control, by adopting tools such as traffic simulation [5][6][7][8][9][10], dynamic network loading equilibrium and dynamic models [11][12][13][14] and the study of, and attempts to affect, user route choice [15][16][17][18][19].
Many large-scale deployments of systems based on floating car data (FCD) already rely on mobile phones [20][21][22][23] and wireless communications [24,25] combined with global navigation satellite system (GNSS) technologies. Cooperative systems based on smartphones are spontaneously spreading in ordinary use (BlaBlaCar, Uber, etc.).

Materials and Methods
In this section, the general methodology adopted for a systematic literature review based on a keyword search is presented.
The procedure has a defined structure and aims at an objective review of FCD-based adaptive traffic signal control methodologies. It is based on the seven steps shown in Figure 1. Similar procedures are frequently used in the literature [53,54]. In this work, the Scopus database was used. Keyword selection is the most important step in this procedure, since the keyword choice can include or exclude different scientific works. The objective is to create a list of scientific works that contains the most important and influential works in the sector while avoiding the insertion of off-topic documents. To create a thorough list of documents in this sector, the authors performed a manual iterative procedure of trial and error, represented by the backward-pointing arrow between the second and third steps of the procedure in Figure 1. At the end of this procedure, the search included all documents that contain "traffic signal" or "traffic light" in the title and that also contain at least one of the following keywords in the title, keywords or abstract: "floating car data" or "v2x" or "smartphone" or "connected vehicle" or "car communication" or "v2I" or "vehicle infrastructure" or "sensor" or "vanet". This search resulted in a list "A" of 698 documents (425 of which were published between 2015 and 2020). Many influential and relevant documents may have been excluded at this stage because they do not include "traffic signal" or "traffic light" in the title.
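The search rule described above can be written as a single Boolean query. The following is a minimal sketch, assuming the Scopus advanced-search field codes TITLE and TITLE-ABS-KEY; the exact query string used by the authors is not reported, so this is a reconstruction from the prose, and the helper names are hypothetical.

```python
# Search terms are quoted from the text; the query layout is an assumption.
TITLE_TERMS = ["traffic signal", "traffic light"]
TOPIC_TERMS = [
    "floating car data", "v2x", "smartphone", "connected vehicle",
    "car communication", "v2I", "vehicle infrastructure", "sensor", "vanet",
]


def or_group(terms):
    """Join quoted phrases with OR, e.g. '"a" OR "b"'."""
    return " OR ".join(f'"{term}"' for term in terms)


def build_query(title_terms, topic_terms):
    """Title must match one title term AND title/abstract/keywords one topic term."""
    return (
        f"TITLE({or_group(title_terms)}) "
        f"AND TITLE-ABS-KEY({or_group(topic_terms)})"
    )


query = build_query(TITLE_TERMS, TOPIC_TERMS)
print(query)
```

Keeping the two term lists separate makes the trial-and-error iteration between the second and third steps easy to repeat: a keyword can be added or removed and the query regenerated.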
It must be noted that some previous attempted searches with different choices of keywords and logical operators either included many documents not pertinent to the problem of floating car data and traffic signal regulations or included too few documents.
With the above results and without manually excluding documents that are not pertinent to the field, it was possible to elaborate the distribution in time for the documents of list "A" (up to 2019); this distribution is presented in Figure 2 and shows an increasing number of documents published annually (it must be noted that the general number of all indexed documents in Scopus has also been growing year by year, so a more detailed analysis would be necessary to establish whether there is growing scientific attention in this sector).
The four analytical scientometric tools introduced above were applied to the 698 documents belonging to list "A", and results are described in the following subsections:

• Topic visualization carried out by text-mining abstracts and titles;
• Cocitation analysis to establish influential journals and references;
• Country analysis based on coauthorship;
• Trending-topics analysis carried out by overlay visualization.
The software that was used to perform the bibliometric analysis is VOSviewer, a dedicated software tool developed for creating maps based on network data and for visualizing and exploring these maps [55]. The bibliometric analysis carried out with an objective and automatic approach was then complemented with a manual procedure to identify the most influential works that might be useful as a base to investigate this research sector.
The most influential works were then described and clustered.

Topic Visualization Analysis
To create a profile of the most important issues arising in the field and to perform a visualization of main topics, we carried out a text-mining analysis using the VOSviewer suggested procedure to text-mine the keywords of the papers.
VOSviewer applies natural language processing methods to analyze the keywords. We followed the indications and considerations given in [56].
With this procedure, we generated what is called a "co-occurrence network" (Figure 3). In this network, the most relevant terms were highlighted, and a cluster analysis was performed, identifying four clusters that theoretically gather themes sharing a stronger connection. Once the procedure was completed, we gave each cluster a title in agreement with what seems to be its leading topic. We set the VOSviewer minimum number of occurrences of a term to eight. From a total of 1524 keyword terms, 20 met the threshold. The selected terms were grouped in four clusters, as shown in Figure 3 and Table 1. It must be noted that the procedure we applied to extract the terms and to create the clusters is completely automatic: the clusters are created by an algorithm that may put together terms that are not necessarily logically connected. We manually gave a label to each cluster, and some of the terms, thus, might not be connected with the specific topic label we have chosen.
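To make the thresholding step concrete, the following minimal sketch counts keyword occurrences and pairwise co-occurrences across papers and keeps only the terms that reach a minimum number of occurrences. It does not reproduce VOSviewer's layout or clustering algorithm; the function name and the toy data are hypothetical, and the toy threshold is 2 rather than the value of eight used in the analysis.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists, min_occurrences):
    """Return the kept terms and the co-occurrence count of each kept pair."""
    occurrences = Counter()
    for keywords in keyword_lists:
        occurrences.update(set(keywords))  # count each term once per paper
    kept = {term for term, n in occurrences.items() if n >= min_occurrences}

    links = Counter()
    for keywords in keyword_lists:
        present = sorted(set(keywords) & kept)
        links.update(combinations(present, 2))  # one link per pair per paper
    return kept, links


# Toy data, not taken from list "A".
papers = [
    ["traffic signal", "connected vehicle", "vanet"],
    ["traffic signal", "connected vehicle"],
    ["traffic signal", "smartphone"],
]
kept, links = cooccurrence(papers, min_occurrences=2)
print(kept)
print(links)
```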

Cocitation Analysis: Influential Journals, References and Authors
Following the standard procedures suggested by VOSviewer, we performed a cocitation analysis to establish the most influential journals and references. Reference cocitation can allow observers to automatically recognize the shape, qualities and scientific progress of a research field. In the same way, journal cocitation analysis can be a useful instrument for assessing the general layout of a scientific field relative to the current theoretical framework of specific journals. The algorithm considers two references or two journals to be more strongly connected when they are more frequently cocited. Again, a clustering algorithm is applied to discover clusters of connected references and journals. In this procedure, we decided to include only references and journal sources with a minimum of three cocitations.
The results show that, of 13,574 cited references, only 67 meet the threshold. For each of these references, the total strength of the cocitation links with other cited references was calculated, and all 67 references are mapped in Figure 4. Because these 67 references are poorly connected, the largest set of connected references, which consists of 23 items, was then considered. For each of these 23 references, the total strength of the cocitation links with other cited references was calculated, and all 23 cocited references are listed and mapped in Figure 5. Five dispersed clusters surfaced, and the visualization still shows a poor connection among references. Clusters are differentiated by colors. Larger nodes indicate a higher number of citations, and the different clusters show the connection among references. The network of references is not well developed, which seems to suggest that this specific field of research is in an embryonic stage with great potential for growth: new findings are possible, and new scientific works have the potential to create a more developed research network in the future. The largest node is represented by a work from Abdulhai, Pringle and Karakoulas [57], which was not in the original database of 684 works.
This procedure, in fact, is able to find influential documents that might not be present in the database list "A" and that might have received a high number of cocitations in the database.
Among these 67 cocited documents, it is also possible to see general documents that constitute a scientific base for the research field under investigation but do not necessarily specifically pertain to this field of traffic signal control methods and experiments based on CAs and FCD.
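The reference cocitation steps described above (thresholding, total link strength and extraction of the largest connected set) can be sketched as follows. This is an illustration under simplifying assumptions, not VOSviewer's implementation: the threshold is applied to the number of citing documents per reference, and all function names and the toy reference lists are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cocitation_network(reference_lists, min_citations):
    """Keep references cited often enough and count how often pairs are cocited."""
    cited = Counter(ref for refs in reference_lists for ref in set(refs))
    kept = {ref for ref, n in cited.items() if n >= min_citations}
    links = Counter()
    for refs in reference_lists:
        links.update(combinations(sorted(set(refs) & kept), 2))
    return kept, links


def total_link_strength(links):
    """Sum the weights of all links attached to each reference."""
    strength = Counter()
    for (a, b), weight in links.items():
        strength[a] += weight
        strength[b] += weight
    return strength


def largest_connected_set(nodes, links):
    """Return the largest group of references joined by cocitation links."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, best = set(), set()
    for start in nodes:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            for nxt in neighbours[stack.pop()] - component:
                component.add(nxt)
                stack.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy reference lists of six citing papers (identifiers are placeholders).
citing_papers = [
    ["R1", "R2", "R3"], ["R1", "R2"], ["R1", "R2", "R3"],
    ["R4", "R5"], ["R4", "R5"], ["R6"],
]
kept_refs, ref_links = cocitation_network(citing_papers, min_citations=2)
strength = total_link_strength(ref_links)
core = largest_connected_set(kept_refs, ref_links)
print(strength, core)
```

In the toy data, five references pass the threshold but only three form the largest connected set, which mirrors on a small scale the reduction from 67 to 23 references reported above.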
In order to consider documents that are influential for the investigated field of research, the 67 documents were manually checked to exclude those that are not pertinent; the 15 documents listed in Table 2 were considered influential and will be added to the final list of influential documents (if they are not already listed).
Regarding the analysis of influential journals, we also used VOSviewer to elaborate the journal cocitation network. By establishing a cocitation threshold of 20 citations for a source (default value in VOSviewer), 48 sources were selected from 7641 sources and were grouped in four clusters. Results are shown in Figure 6, where the distance between the nodes reflects the cocitation frequency: a small distance between two nodes indicates a high cocitation frequency between the two sources. By manually elaborating the resulting data and merging source duplications due to slight differences in the source name, it was possible to compile Table 3, where the most influential sources are ordered with regard to the connection strengths evidenced by the network analysis.
The cocitation analysis of authors was carried out by establishing a minimum threshold of 20 citations for an author; in this way, 195 authors were selected. The resulting network is visualized in Figure 7, where the size of a node shows the total link strength of that node as an index of relevance in the network and the proximity of two nodes shows the connection in terms of cocitations. It must be noted that the author cocitation network of Figure 7 is well connected, while the reference network reported in Figures 4 and 5 is poorly connected; this means that the main authors in the field are connected to each other in terms of citations, while the specific research papers with high relevance are not.
This discrepancy suggests that the most important works have been published without cross-referencing, which is possibly an indication of a new field of research that might be in an embryonic state. Table 4 shows the list of most influential authors based on the connection strengths evidenced by the network analysis.

Country Analysis Based on Coauthorship
The dispersed nature of the field of traffic signal control methods and experiments based on CAs and FCD does not make it easy for a single paper or author to establish a definitive result. For this reason, in this section, we investigate potential collaborations among researchers of different countries by using country coauthorship analysis. An assessment of research exchanges and development between countries and of single-country contributions was conducted by setting the minimum number of documents from a given country to 10. Among 79 countries, 18 meet the threshold; among these 18 countries, 17 are connected and shown in Figure 8, where the size of the circles represents the number of documents for each given country. In Table 5, the 17 most active countries are listed in order of citations.

Trending Topics: Overlay Visualization Analysis
We used the overlay visualization feature of VOSviewer to generate a map of the newest vs. oldest topics. We proceeded with a text-mining analysis of all title and abstract words, analogous to the one carried out in the section on topic visualization, but with a threshold of 20 occurrences for each term; then, for the resulting 148 terms, a relevance score was calculated. Based on this score, only the top 60% of most relevant terms were selected (this is a standard feature of VOSviewer and was adopted in preceding scientific papers, so we adopted this value to conform to what appears to be a de facto standard), resulting in a total of 89 terms that are mapped in Figure 9. In order to identify trending terms, the terms were evaluated on the basis of the publication year of the papers from which they are mined. For every term, an average publication year was obtained, and this value is visualized with a blue color for older topics and a red color for trending topics. It must be noted that with this more selective choice of topics, the following three terms emerged as the most relevant: adaptive traffic signal control, smartphone and connected vehicle. Among the trending topics, the following terms were identified as the most strongly trending: connected vehicle, autonomous vehicle, internet, accuracy, traffic system, travel time, experiment, VANET, emission and v2I. These words are just the outcome of a standard VOSviewer overlay visualization analysis and show a tendency of words to appear or disappear among titles and abstracts over time. The trending topics are trending among the selected papers, and emerging terms such as internet and connected vehicles can be explained by the growing connectivity that is a recent characteristic of new connected traffic signal systems.
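The quantity mapped to color in the overlay can be illustrated with a minimal sketch that averages, for each term, the publication years of the papers in which it appears. The function name and toy data are hypothetical, and VOSviewer's relevance score is not reproduced. The last lines check the term count: 60% of 148 is 88.8, which is consistent with the 89 terms reported if one assumes rounding to the nearest integer.

```python
from collections import defaultdict


def average_publication_year(documents):
    """documents: iterable of (year, terms); returns the mean year per term."""
    years = defaultdict(list)
    for year, terms in documents:
        for term in set(terms):  # count each term once per document
            years[term].append(year)
    return {term: sum(values) / len(values) for term, values in years.items()}


# Toy data, not taken from list "A".
docs = [
    (2010, ["sensor", "traffic signal"]),
    (2018, ["connected vehicle", "traffic signal"]),
    (2020, ["connected vehicle"]),
]
avg_year = average_publication_year(docs)
for term, year in sorted(avg_year.items(), key=lambda item: item[1]):
    print(f"{term}: {year:.1f}")  # older terms first, trending terms last

# Assumed rounding rule for the "top 60%" selection.
selected_terms = round(0.60 * 148)
print(selected_terms)
```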

The Most Influential Documents
The final list "A" of 698 documents that was analyzed in the previous sections still contained some documents that were not exactly centered on the investigated field of study.
Moreover, the cocitation analysis described above revealed a very sparse network that was of little use in establishing the most influential documents. For this reason, and in an attempt to establish the most influential papers in the field, the following procedure was applied in order to include important documents that may have been excluded from list "A":
- The top 20 documents of list "A" in order of received citations (among those considered as belonging to the investigated field) were manually selected. In this way, we created a list "A20" of influential documents.
- Highly cited documents among those cited by the more recent papers of list "A" were considered in the following way: a list "A20-restricted-to-2015-2020" was extracted from list "A" in the same way as list "A20", but with only the documents published between 2015 and 2020. Then, all documents cited by the 20 documents in the list "A20-restricted-to-2015-2020" were added to a list "C" consisting of 736 documents. The list "C" was then ordered by citations, and all documents were manually examined by reading the titles, abstracts and, in some cases, the full papers. The papers with more than 71 citations (all manually considered as belonging to the investigated field) were added to the list "A20". The number 71 was used as a reference since it is the number of citations of the least-cited document in the list "A20".
With this procedure, a list "A31" was created containing 31 documents that can be considered among the most influential in the sector. It must be noted that some general scope papers have been left in the list since they have been considered relevant. Moreover, papers on the following topics have also been left inside the list:

• The use of floating car data as a means to establish traffic signal timings;
• The use of green light optimized speed advisory (GLOSA).
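The selection logic of this section can be summarized in a short sketch. The manual relevance check is represented here by a hypothetical `in_field` flag, the record layout and function names are assumptions, and the toy example uses a top-2 list instead of a top-20 list so that the threshold rule (71 citations in the toy data, as in the text) is easy to follow.

```python
def top_cited(documents, n):
    """Return the n most-cited documents."""
    return sorted(documents, key=lambda d: d["citations"], reverse=True)[:n]


def extend_with_cited(core, candidates):
    """Add candidates cited more often than the least-cited core document."""
    threshold = min(d["citations"] for d in core)
    known = {d["id"] for d in core}
    extra = [
        d for d in candidates
        if d["citations"] > threshold and d["id"] not in known
    ]
    return core + extra, threshold


# Toy stand-ins for list "A" and list "C".
list_a = [
    {"id": "a1", "citations": 300, "in_field": True},
    {"id": "a2", "citations": 71, "in_field": True},
    {"id": "a3", "citations": 500, "in_field": False},
    {"id": "a4", "citations": 10, "in_field": True},
]
list_c = [
    {"id": "c1", "citations": 120, "in_field": True},
    {"id": "c2", "citations": 71, "in_field": True},
    {"id": "a1", "citations": 300, "in_field": True},
    {"id": "c3", "citations": 900, "in_field": False},
]

core = top_cited([d for d in list_a if d["in_field"]], n=2)
final, threshold = extend_with_cited(core, [d for d in list_c if d["in_field"]])
print(threshold, [d["id"] for d in final])
```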

Most Relevant Papers on Floating Car Data and Traffic Signal Control
Based on an in-depth analysis of the 31 papers included in the "A31" list, 15 papers were selected by examining the works most focused on floating car data and automatic control of traffic lights and ordering them according to the number of citations received. The main insights taken from the 15 selected papers are shown below, highlighting the innovative contribution of each work compared to the state of the art.
Gradinescu et al. [46] present an adaptive traffic light platform based on short-range wireless communication between vehicles. Thanks to fixed controller nodes deployed in intersections, the platform determines the optimum values for the traffic light phases. For the system validation, the authors developed an integrated simulator that comprises a realistic mobility model for vehicles and a wireless network simulator. Simulating the effect of the adaptive traffic light system on the two most important signalized intersections in Bucharest, they found a significant improvement in traffic fluency compared to the existing pretimed traffic lights. This result implies a reduction in the total average delay, as well as in fuel consumption and pollutant emissions, improving environmental sustainability.
Feng et al. [65] proposed a real-time adaptive signal phase allocation algorithm using connected vehicle data. The algorithm optimizes the phase sequence and duration by solving a two-level optimization problem based on the minimization of total vehicle delay and minimization of queue length. In order to take into account the low penetration rate of connected vehicles, the authors developed an algorithm that estimates the states of unequipped vehicles based on connected vehicle data. As in the paper of Gradinescu et al. [46], the algorithms were validated in a simulation environment. In particular, a real-world intersection was modeled in VISSIM. Simulation results showed that the proposed control algorithm outperforms actuated control by reducing total delay in a high-penetration-rate case as well as in a low-penetration-rate case.
Wen [76] proposes a new framework for dynamic and automatic traffic light control systems aimed at alleviating traffic congestion. The system consists of hardware and software components, including RFID tags and a backend server, and it is based on the transmission of radio signals. Once the vehicle data have been acquired, maximum flow, interarrival time and average car speed are calculated and used as the input parameters of the traffic light control simulation model built in the server. The system can automatically infer and provide different alternatives for a variety of traffic situations and then set the red or green light duration via a traffic light control interface. The simulation results demonstrate the efficiency of the traffic system in an urban area, with a reduction in the average waiting times.
Mandava et al. [78] developed arterial velocity planning algorithms maximizing the probability of drivers having a green light when approaching signalized intersections. The algorithms minimize the acceleration/deceleration rates of vehicles, ensuring that a single vehicle never exceeds the speed limit, passing through intersections without coming to a stop. The results show a great reduction in vehicle fuel consumption and emissions from these velocity profiles compared with those from a typical velocity profile of vehicles without velocity planning.
In the paper from Yousef et al. [81], an adaptive traffic signal time manipulation algorithm is proposed on single and multiple road intersections by using a wireless sensor network (WSN). A traffic system communication algorithm (TSCA) and a traffic signal time manipulation algorithm (TSTMA) are implemented to provide the system with adaptive and efficient traffic estimation represented by the dynamic change in the traffic signals' flow sequence and traffic variation. The results of simulated scenarios and a real implementation show the efficiency of the system in solving traffic congestion by reducing the average waiting time and the average queue length both for isolated intersections and multiple intersections.
Tielert et al. [82] present a traffic-light-to-vehicle communication (TLVC) system that has the potential to reduce the environmental impact of traffic flow by helping drivers to avoid braking and accelerating maneuvers at traffic lights. They used a detailed emission model to identify key factors influencing TLVC and evaluate the level of detail required for the different simulation components. The results of a sensitivity analysis identify the gear choice and the distance from the traffic light at which vehicles are informed as key influencing factors. The authors generated the driving cycles using a microscopic traffic simulator with a real-world inner-city road network, in which they integrated the speed-adaptation algorithm as well as a communication model. The work highlights that the driver's behavior, e.g., gear choice and compliance, plays an important role in the environmental benefit of TLVC.
Zhou et al. [83] propose an adaptive traffic light control algorithm to minimize the average waiting times at intersections and optimize the traffic throughput. Three main steps compose the algorithm: the vehicle detection, the green light sequence determination and the light length determination. The algorithm performances are affected by the traffic information detected for each vehicle traversing the intersection (e.g., distance, speed, vehicle's length). The simulation results show that the proposed algorithm can achieve higher throughput and lower average waiting time compared to a fixed-time control algorithm and an actuated control algorithm.
Ban et al. [84] studied how to estimate real-time queue lengths at signalized intersections using intersection travel times collected from mobile traffic sensors. The estimation methodology is based on the assumption that critical pattern changes of intersection travel times or delays indicate signal timing or queue length changes. By detecting these critical points, the real-time queue length can be estimated with a reverse thinking process. The model and algorithm were tested in a field experiment and in a simulation environment.
Koukoumidis et al. [85] evaluated a novel software service (SignalGuru) that, by collecting mobile phone probes, detects and predicts traffic signal schedules. The data collected allow drivers to adjust their speed so as to arrive at the signalized intersection when the signal ahead turns green. To use SignalGuru, the phones must be mounted on the car windshield in order to detect current traffic signals with their cameras, collaboratively communicate and learn traffic signal schedule patterns and predict their future schedule. Results from two applications of SignalGuru in Cambridge (MA, USA) and Singapore show that traffic signal schedules can be predicted accurately.
Katsaros et al. [86] present a green light optimized speed advisory (GLOSA) system to reduce traffic congestion by decreasing the average stop time behind traffic lights. The authors intended to evaluate the impacts of GLOSA on both fuel consumption and CO2 emissions. For the implementation of the proposed system, the authors used an integrated simulation tool based on VSimRTI to simulate different penetration rates of GLOSA-equipped vehicles and different traffic densities. The simulation results show that the higher the GLOSA penetration rate is, the greater the performance is, with a maximum of 80% reduction in stop time and up to 7% reduction in fuel consumption in a high-traffic-density scenario.
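The basic idea behind a speed advisory can be illustrated with a minimal sketch. This is a generic constant-speed illustration and not the algorithm of [86] or of the other reviewed works: it ignores queues, acceleration limits and communication delays, and the function name, parameters and units are hypothetical. Given the distance to the stop line and the next green window, it returns the highest speed within the allowed range that lets the vehicle arrive during green, or None if no such speed exists.

```python
def glosa_speed(distance_m, green_start_s, green_end_s, v_min, v_max):
    """Advisory speed in m/s to reach the stop line during green, else None.

    green_start_s and green_end_s are seconds from now; a non-positive
    green_start_s means that the signal is already green.
    """
    if green_end_s <= 0 or distance_m <= 0:
        return None
    # Slowest speed that still arrives before the green phase ends.
    slowest = max(v_min, distance_m / green_end_s)
    # Fastest speed that does not arrive before the green phase starts.
    if green_start_s <= 0:
        fastest = v_max
    else:
        fastest = min(v_max, distance_m / green_start_s)
    return fastest if slowest <= fastest else None


# 200 m from the stop line, green from 20 s to 40 s, speed range 5-14 m/s.
print(glosa_speed(200.0, 20.0, 40.0, 5.0, 14.0))
# Already green but ending in 10 s: would require more than v_max.
print(glosa_speed(200.0, 0.0, 10.0, 5.0, 14.0))
```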
A smart city framework is proposed by Barba et al. [91], in which an intelligent traffic light (ITL) system is used to provide warning messages to the vehicles and to inform drivers about traffic and weather conditions of the road network. The authors evaluated the performance of the vehicles of a VANET in a smart city using information managed by the ITL system in some crossroads of the city. The evaluation of the performances of the proposed system was performed by the network simulator NCTUns 6.0. The results show the effectiveness of this framework in terms of reduction in the braking distance and the driver's reaction time.
Pandit et al. [92] employed the oldest job first (OJF) algorithm for the optimization of traffic signal control in intersections. The authors used vehicular ad hoc networks (VANETs) to collect and aggregate real-time speed and position information of individual vehicles. The algorithm formulates the vehicular traffic signal control problem as a job scheduling problem on processors, with jobs corresponding to platoons of vehicles. The results show that the OJF algorithm led to a large reduction in delay, with the resulting delay being at most twice that of an optimal offline schedule with perfect knowledge of the arrivals.
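The job-scheduling view can be illustrated with a heavily simplified sketch in which the intersection is treated as a single processor and each platoon is a job served in order of arrival, so that the oldest waiting job is always served first. This is not the algorithm of [92], which handles conflicting and non-conflicting movements and platoon formation; the tuple layout and function name are hypothetical.

```python
def oldest_job_first(platoons):
    """platoons: list of (arrival_s, approach, service_s) tuples.

    Returns (approach, start_s, end_s, delay_s) for each platoon, serving the
    platoon that arrived earliest whenever the intersection becomes free.
    """
    schedule, clock = [], 0.0
    for arrival, approach, service in sorted(platoons):
        start = max(clock, arrival)  # wait for the intersection or the platoon
        clock = start + service
        schedule.append((approach, start, clock, start - arrival))
    return schedule


# Toy example: three platoons on two approaches.
jobs = [(4.0, "EW", 6.0), (0.0, "NS", 10.0), (5.0, "NS", 8.0)]
plan = oldest_job_first(jobs)
for approach, start, end, delay in plan:
    print(approach, start, end, delay)
```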
A traffic signal control algorithm is proposed by Goodall et al. [93] to achieve three main objectives: (1) to match or improve the performance of a state-of-the-practice actuated coordinated system; (2) to respond to real-time demands only, thereby eliminating the need for manual timing plan updates to adjust for traffic growth or fluctuations; (3) to never manage records of individual vehicles, protecting driver privacy. The authors developed a predictive microscopic simulation algorithm (PMSA) adopting microscopic traffic simulation to calculate the objective function delay directly from the vehicle's simulated behavior. The results show that the proposed algorithm maintains or improves performance compared with that of a state-of-the-practice coordinated actuated timing plan optimized by Synchro at low and midlevel volumes; however, performance becomes worse under saturated and oversaturated conditions.
Collotta et al. [95] propose a novel traffic light control system for isolated intersections able to reduce the average waiting time of vehicles while managing the phase sequence. The system is mainly composed of a wireless sensor network (WSN) for real-time traffic data acquisition, a phase sorting module for calculating the phase execution order based on the number of enqueued cars and a fuzzy logic controller for calculating the appropriate green time duration of the relevant phase. To assess the performance of the proposed system, the authors performed several simulations using MATLAB for the evaluation of the fuzzy logic controllers and TRUETIME for simulating the IEEE 802.15.4 WSN infrastructure. The obtained results highlight that the proposed system outperforms related works in terms of reduction in the waiting times in the queues, especially under heavy traffic volumes. The feasibility of the system with real components was proven through implementation on a microcontroller, which provided experimental results in agreement with the simulated ones.
In 2014, He et al. [97] addressed the conflicting issues between actuated coordination and multimodal priority control. In order to enable multiple requests for priority messages to a traffic signal controller of priority-eligible vehicles, such as emergency vehicles, transit buses, commercial trucks and pedestrians, the authors formulated a request-based mixed-integer linear program (MILP). The proposed control method was compared with state-of-practice transit signal priority (TSP) under the optimized signal timing plans using the microscopic simulation technique. The simulations highlighted that the proposed control model is able to reduce average bus delay, average pedestrian delay and average passenger car delay, especially for highly congested conditions with a high frequency of transit vehicle priority requests.

Conclusions
The aim of this work was to frame the field of traffic signal methods and experiments based on connected vehicles and floating car data using a scientometric, and thus objective, approach to highlight the most important issues in the sector. The cocitation analysis revealed a sparse network with few connections among different scientific works. This could be indicative of a research field that is not established and still in development. In addition, the fast development of technologies that could impact this field may have contributed to this lack of connections among different works.
Moreover, the growing number of papers that are published in the field could simply be the consequence of a parallel and wider field of research developing at an increasing speed: connected and automated vehicles (CAVs).
By reading all the gathered works, it was possible to note that most papers are based only on simulation, leaving an unfilled gap with the real world. We are aware of only one experiment in the field [98] (using 100% real connected vehicles in a controlled environment). The experimentation and implementation of new traffic signal control solutions based on connected vehicles and floating car data (FCD) are clearly at an embryonic state, which opens the possibility for an explosion of new experiments and new solutions. Using a scientometric approach, this paper has framed almost all the scientific contributions aimed at the field of traffic signal methods and experiments based on connected vehicles and floating car data.
The applied scientometric approach has revealed trending ideas and concepts and allows the reader to identify the relevant documents that can be consulted and considered as starting milestones on which this field can be further developed with the implementation of new traffic signal control systems that can "give the green light" to drivers.
The scientometric analysis carried out in this paper was implemented by applying an objective bibliometric procedure. The considerations of this section are only partially supported by that analysis; they reflect the opinion of the authors and are thus not fully backed by an objective methodology.
Finally, it should be noted that the scientometric analysis was carried out using only the Scopus database, which allows optimal use of the VOSviewer software. This is possible mainly through the procedures for exporting records, such as keywords, which are one of the main factors to be analyzed by applying natural language processing methods. However, the authors intend to integrate the data with other available databases, such as WoS, and report their analyses in a future paper.

Conflicts of Interest:
The authors declare no conflict of interest.