Utilizing an Adaptive Neuro-Fuzzy Inference System (ANFIS) for Overcrowding Level Risk Assessment in Railway Stations

Abstract: The railway network plays a significant role (both economically and socially) in reducing urban traffic congestion. It also accelerates decarbonization in cities, societies, and built environments. To ensure the safe and secure operation of stations and capture the real-time risk status, it is imperative to consider a dynamic and smart method for managing risk factors in stations. In this research, a framework to develop an intelligent system for managing risk is suggested. The adaptive neuro-fuzzy inference system (ANFIS) is proposed as a powerful, intelligently selected model to improve risk management and manage uncertainties in risk variables. The objective of this study is twofold. First, we review current methods applied to predict the risk level in the passenger flow. Second, we develop smart risk assessment and management measures (or indicators) to improve our understanding of the safety of railway stations in real time. Two parameters are selected as inputs for the risk level relating to overcrowding: the transfer efficiency and the retention rate of the platform. This study is the world's first to establish a hybrid artificial intelligence (AI) model that has the potential to manage risk uncertainties and learns through artificial neural networks (ANNs) by integrated training processes. The prediction result shows very high accuracy in predicting the risk level, and demonstrates the capability of the AI model to learn, to make predictions, and to capture risk level values in real time. Such risk information is extremely critical for decision-making processes in managing safety and risks, especially when uncertain disruptions occur (e.g., COVID-19, disasters, etc.). The novel insights stemming from this study will lead to more effective and efficient risk management for single and clustered railway station facilities towards safer, smarter, and more resilient transportation systems.


Introduction
In recent years, there has been a global increase in the demand for rail transport, and rail usage is expected to continue to increase worldwide [1]. Passenger streams in railway systems are growing dramatically in many cities around the world with the rapid development of rail transit. As a result, stations face enormous pressure from passenger congestion and high levels of overcrowding at peak times. The metro systems in Beijing and Shanghai provide a daily transport service for more than nine million passengers, and statistics indicate that the annual usage almost doubled from 2011 to 2015 [2,3]. In the UK, the number of rail passenger journeys has more than doubled over [...] the transfer distance and transfer time.
The proposed method is beneficial to station design and layout, crowd management, emergency planning, route scheduling, and the improvement of health and safety in similar environments, which reflects positively on the industry and raises service satisfaction. Specifically, the novelty of this study lies in the following points. First, unlike most models that rely on simulation data, our model uses real data collected from stations for the prediction of overcrowding. The data include important factors, such as peak time and walking velocity, which varies for different equipment (e.g., stairs and escalators), and are used to calculate the transfer efficiency. Second, a learning-based method has been introduced for crowd management in stations, particularly on platforms, which are a hot spot in terminals. Third, a smart framework has been presented, which can be generalised to many locations and fields for managing overcrowding. Other contributions of this study are as follows:
• Many models consider the total number of passengers entering a station. However, overcrowding may occur only on one platform. In our study, we consider the number of passengers on a single platform, which is determined by the passengers waiting and the flow from the station to the platform through the escalators or other channels. Additionally, we assume that the delay of one train can affect only one platform. The information fed to the model can be captured from images in the station that include the spatial and temporal dependence of the crowd.
• The ANFIS model used in this study can learn and specify a threshold for the risk level depending on location and crowd standards from accurate real-time data.
• Estimating the risk level is an important outcome of our prediction, unlike traditional prediction studies that focus only on station-level or route-level forecasting (time series).

Literature Review
Many studies have presented the risks of overcrowding from different perspectives. Fuzzy logic (FL)-based methods have been utilised for modelling uncertainties in overcrowding dynamics. A fuzzy modelling method has been proposed to simulate and analyse factors such as evacuation time against varying crowd velocities [18]. Takagi-Sugeno-type fuzzy rules have been used in the suggested fuzzy model to convert physical laws of pedestrian motion into a fuzzy inference system (FIS) [19]. Moreover, it was demonstrated that high congestion in railway stations affects the quality of service to travellers, potentially damaging the image of the business, or worse, interrupting operational continuity because of the growth of the dwell time [20,21]. Furthermore, high levels of passenger density have been connected to many psychological and social issues, including perceptions of risk to personal safety and security [22][23][24][25]. Congestion risk management is increasingly important for the railway industry to protect passengers, the public, and employees while enhancing safety and decreasing maintenance costs [14]. Additionally, from a financial perspective, overcrowding affects the rail business by causing significant revenue losses to operators and being a constraint to the government's rail passenger growth objectives and management of demand [26]. Passenger flow is based on nonlinear relationships in which changes occur naturally. To overcome the limitations of models that assume linearity in passenger forecasting, researchers have used various methods, such as statistics-based approaches, machine learning, hybrid models, and deep learning. Related works are shown in (Table 1). Table 1. Review of flow prediction methods from previous studies.

Reference | Method | Task | Data Source
[27] | Deep learning | Safety prediction of rail transit system | Datasets based on the Beijing metro in China
[28] | K-means and the entropy method | The risk of passenger flow congestion | 3rd ring road corridor in Beijing (the bus passenger)
[29] | Deep residual neural network framework | Forecast metro station crowd flows | Records collected by automated fare collection (AFC) in Suzhou, China
[30] | Interpretative Structural Modeling (ISM) | Urban rail transit operation safety evaluation | Beijing metro
[31] | Failure mode and effects criticality analysis (FMECA) | Risk evaluation of railway rolling stock failures | Data collected from the literature, the company's maintenance management software system called EQUINOX, and the UK's railway performance management software DATASYS BUGLE
[32] | Analysed the whole process of safety risk assessment | The concept of the safety risk assessment of large-scale railway stations | Beijing South railway station
[33] | Multilevel fuzzy comprehensive evaluation model | Operation safety of the large high-speed railway station | Guangzhou South railway station in China
[34] | Data envelopment analysis (DEA) | Design the risk evaluation index system of crowd agglomeration and stampedes | 13 subway stations in Beijing
[35] | Bayesian network | Safety risk interactions in rail stations | Risk events were collected from 62 stations worldwide and 241 from China
[36] | Quantitative statistics for safety risk analysis | Metro incident management | Shanghai metro operation incident database (MOID)
[37] | Fuzzy analytical hierarchy | Analysis of operation risk factors of urban rail transit | Experts' judgement information
[17] | Risk evaluation model | Safety of passengers and improving the operation efficiency in metro stations | A practical application in Beijing's GuoMao metro station
[38] | Multi-Criteria Decision-Making (MCDM) model | Train derailment risk response strategy | Yangyangcheng station in Shenmu city in China
[39] | Nonparametric regression model | Predicting the transfer passenger flow to forecast pedestrian congestions | Transfer passenger flow for one month in the XIDAN transfer station
[40] | Grey clustering and dynamic identification method | Passenger flow congestion risk in rail transit hub station | Real-time data of the AFC (Automatic Fare Collection) system from Dongzhimen rail transit hub station in China
[41] | Combines failure mode and effect analysis (FMEA) and pessimistic-optimistic fuzzy information axiom (POFIA) considering acceptable risk coefficient (ARC) | Evaluate the risk of railway dangerous goods transportation system | The historical data of Chinese railway dangerous goods transportation system (RDNGTS) accidents from 1986 to 2017
[42] | Nonparametric regression model | Forecasting traffic flow incidents | Collected incident-free data in the freeway in San Francisco Bay
[43] | Online spatio-temporal | Crowd flow prediction | Dataset of Sydney trains network and the weather data from the Australian Bureau of Meteorology

The literature review suggests that artificial intelligence (AI) methods for flow prediction have become a new trend and have gained a lot of academic and industrial interest. According to such literature, we have entered the era of big data [59][60][61][62]. The new technology will play an important role in the future of flow prediction and in dealing with the flow forecast issues of intelligent transportation systems. Studies have been utilising a wide range of data; however, the acquisition of data is a problem for researchers. More datasets are necessary for training to improve prediction performance. Recently, traffic prediction (both numerical and image based) has greatly developed to cover all transportation types, including vehicles, buses, taxis, and trains, with more studies using road data.
Moreover, it is linked to many aspects, such as delays, activity patterns, weather, fuel consumption, pollution, city congestion, driving speed, traffic and public flow, and pedestrian flow characteristics. Therefore, it is a challenge to focus on one specific prediction, such as congestion in railway stations, from a risk and safety management perspective. Prior studies provide an excellent source of information for studying the passenger flow risk. Nevertheless, congestion in stations is extremely hard to mitigate because of congestion features such as randomness, diffusivity, and uncertainty. In the last decade, studies have used road traffic safety and time series models as evaluation methods for urban road traffic. However, few studies have focused on passenger safety in rail stations. Recent deep learning models show high performance; however, the modelling has been approached from the perspective of computer and information technologies rather than crowd risk and management analysis. Moreover, they often share parallel structures and concepts. Passenger flow has not been completely addressed or is often overlooked, and studies are more focused on vehicles and city congestion. Analysis of the flow will provide knowledge for designing future stations and correcting any deficiency in trains or facilities provided to passengers. This would support the effective planning of essential services, such as transfer services at railway terminals. Many factors must be considered in such cases, including walking speed, crowd density, free-flow speed, and risk level. However, there is still a great need for extensive studies from the perspective of safety and risk. Some of the aspects and methodologies that should be considered while studying flow prediction are listed as follows:
• The risk and consequences of unwanted crowd events, such as stampedes, falls, or panic, must be considered.
• Real data is essential and will reduce error compared to simulation data.
• Time series models usually neglect many factors and specific locations, such as the platforms in railway stations.
• Applying AI to existing systems in railway stations, such as closed-circuit television (CCTV), will be beneficial and cost efficient to operators for timely flow prediction.
• Certain passenger categories, such as the disabled, families, and the elderly, need to be considered to make stations accessible to all.
In this study, we propose a method that utilises data from many digital sources in a station. A large volume of data is collected daily in railway stations, e.g., smart card data, smartphone applications, sensors, Bluetooth, and video surveillance [63][64][65][66][67]. To monitor the crowd, it is necessary to locate and follow passengers and members of the public whilst they are in the station. This requires extracting crowd information [19] and making a robust prediction of the passenger flow to support the decision maker and assist in managing the operations in the stations. The crowd in the station is related to many characteristics: health and safety, security, ridership, service quality and satisfaction, cost and maintenance, station operation, dwell time, risk assessment, and passenger preferences and travel decisions.

Intelligent System for Managing Risks (ISFMR)
We propose a method to advance dynamic risk management in railway stations. Our aim is to minimize risk through automated processes by taking into consideration all the factors in the system and how they work mutually to provide an acceptable level of safety and security in real-time.
Accordingly, we use a smart method to create the Intelligent System for Managing Risks (ISFMR) framework. This is expected to increase safety and security and to assess and manage risks effectively, thus increasing performance in railway stations, which in turn is reflected across the whole rail system. To create a smart process for risk management, timely data are required. Gathering online data offers the benefits of advanced technology in both the collection and analysis of the data. Stations generate a large volume of data on safety and risk history. Various types of data can be captured, such as video, sensor signals, ticketing system data, and numerical documents. The complete structure of the proposed framework is shown in Figure 1. As illustrated, the inputs represent the internal and external pieces of information gathered from various attributes and used for crowd flow prediction in stations.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 8 of 30

Overcrowding at Railway Stations
It is a priority for municipalities worldwide to support citizens' mobility within the urban environment. Moreover, over 75% of the world's population is expected to be living in cities by 2050. The British rail network is likely to boost passenger km by 50%, in accordance with government targets set out in the Ten-Year Plan for Transport [5]. However, absorbing increasing loads in public transportation is a challenge, especially during peak commuting hours, when overcrowding creates high levels of discomfort and makes train stations unsafe. The industry has some solutions, such as discouraging peak-time travel by means of fare differentiation. However, this has had no observable effect on travellers' journeys [8,68]. Crowding at railway stations remains one of the main causes of concern for passengers, as evidenced by complaints to operators. It has been noted that crowding increases risks to the health and/or safety of those affected [12,69,70]. Any high congestion in stations could lead to dangerous situations and abnormal consequences. This is because of the dynamic and free-flowing nature of the system. Clogging, flow, narrow paths, and congestion may lead to crowd accidents such as stampedes. It is necessary to monitor the condition, movement flow, and behaviour of pedestrians in stations to anticipate the overcrowding risk in real time during operations, and to predict areas where congestion may occur in any situation so that safety measures can be implemented, according to Helbing et al. [71]. Many studies in the literature present the danger of overcrowding from different perspectives. Additionally, regarding health and safety and the spread of viruses such as COVID-19, it has been suggested that individuals avoid overcrowded areas such as public transportation. Thus, it may now be even more imperative to control and prevent overcrowding risks quickly and in real time in places such as railway stations and platforms [72][73][74].
Related studies for this work are outlined below. Fuzzy logic (FL)-based methods have been utilised for modelling uncertainties in overcrowding dynamics. A fuzzy modelling method has been proposed to simulate and analyse possible factors, such as evacuation time against varying velocities of the crowd [18]. Takagi-Sugeno-type fuzzy rules have been used in the suggested fuzzy model to convert physical laws of pedestrian motion into a fuzzy inference system (FIS) [73]. Moreover, high congestion in railway stations affects the quality of service for travellers, and the damage may extend to the business image or, worse, interrupt operational continuity because of the growth of the dwell time [20,74,75]. Furthermore, high levels of passenger density have been connected to many psychological and social issues, including perceptions of risk to personal safety and security [12,76], increased anxiety [24], stress, feelings of exhaustion, and the possibility of ill health [22,25,77]. Congestion risk management is increasingly important for the railway industry in order to protect passengers, the public, and employees while enhancing safety and decreasing maintenance costs [14]. In addition, from a financial perspective, overcrowded situations affect the rail business, causing significant lost revenue to operators and constraining the government's rail passenger growth objectives and management of demand [26]. Furthermore, it has been indicated that overcrowding should be considered a probable threat to the health of both the rail business and passengers. Finally, reducing rail passenger crowding would improve passenger numbers, passenger experience, safety and security, satisfaction, and the long-term sustainability of stations [12]. In this paper, we propose a method that can utilise data from many digital sources in the station. A large volume of data is collected daily in railway stations, such as smart card data, smartphone applications, sensors, Bluetooth, and video surveillance [63][64][65][66]. To monitor the crowd, it is necessary to locate and follow passengers and members of the public whilst they are in the station. This requires the extraction of information from crowds [19].

Development of the ANFIS Model and the Framework
ANFIS combines the computational power of an ANN with the high-level reasoning capability of an FIS [78], and was developed in [79]. In other words, by inserting the FIS into the framework of adaptive networks, we obtain the ANFIS architecture [80,81]. Owing to the effective learning and reasoning competence of the ANFIS model, it has received growing interest from academics in numerous engineering and scientific fields. In the literature, different models implement different learning structures for many applications and surveys [82][83][84][85]. The ANFIS model is utilised in problems connected to parameter identification. This is achieved through a hybrid learning rule that combines back-propagation gradient descent with a least-squares method. It is a multilayer feed-forward network that employs an ANN and a fuzzy logic system (FLS) to map inputs into an output (see Figure 2). Therefore, it is an FLS implemented in the structure of an adaptive ANN. Furthermore, ANFIS can construct a network realisation of IF/THEN rules [80]. One of the earliest works in the risk identification literature was completed by [86]. Moreover, a back-propagation ANN was used for the assessment of risk factors in highway management corporations in China [87]. The output is predicted through an adaptive network, which is a multilayer feed-forward ANN with adaptive nodes. Learning rules are specified for the parameters of the adaptive nodes, and the adjustment of the parameters depends on error values [83,88].
In this work, we use the ANFIS as a smart tool for risk prediction and train it on station risk levels, as it has been indicated that the model can assess risk within a short computational time [89]. We have developed the model for predicting the overcrowding risk, but it can be applied to other risk factors with different indicators. In addition, the suggested tool is used to assess the ISFMR by using overcrowding risk in stations, which supports future smart risk management.
The ANFIS is a hybrid method that has several features of interest. For instance, the ANN can identify patterns in advance by adapting to the environment through its learning ability. At the same time, the FLS can incorporate expert judgment and perform the decision-making process [90]. The ability to handle historical data as well as expert knowledge, whilst being flexible enough to adapt to unusual risky circumstances, is an important feature of the ANFIS model. Moreover, the model has been employed in emergency management and in analyses of external environments and levels of threat [91]. The system is described as smart because its parameters and fuzzy rules are estimated intelligently by the ANN [92]. The method uses supervised learning. It consists of five layers with two or more ANFIS inputs (X, Y), which are transformed into fuzzy inputs in Layer 1 (the adaptive node or fuzzification layer), as shown in Figure 3. Then, the process generates membership functions (MFs) for each of the inputs. There are many MF types, for instance, triangular MF, generalized bell MF, and Gaussian MF. In this study, various MFs have been examined [59,93,94].
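The three MF families named above can be sketched in Python with NumPy. This is only an illustrative sketch: the function names, the parameterisation (feet and peak for the triangular MF, width/slope/centre for the bell MF, spread and centre for the Gaussian MF), and any numerical values are assumptions chosen for illustration, not the settings used in this study.

```python
import numpy as np


def triangular_mf(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b (assumes a < b < c)."""
    x = np.asarray(x, dtype=float)
    rising = (x - a) / (b - a)    # left slope, reaches 1 at x = b
    falling = (c - x) / (c - b)   # right slope, reaches 0 at x = c
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)


def gbell_mf(x, a, b, c):
    """Generalized bell MF: 1 / (1 + |(x - c) / a|^(2b))."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))


def gaussian_mf(x, sigma, c):
    """Gaussian MF centred at c with spread sigma."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)
```

Each function returns a membership degree in [0, 1] and accepts either a scalar or an array, so an input such as a normalised transfer efficiency index could be fuzzified for a whole batch of observations at once.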
Defuzzification of the fuzzy output is used in Mamdani-type inference systems, whereas a weighted average is used in Sugeno-type inference systems [95]. The ANFIS is a Sugeno-type inference system and is a hybrid method that combines the parallel computation and learning ability of the ANN with the features of the FLS [96]. For fuzzy identification, the Sugeno-type fuzzy system is the most commonly used. To determine the relation between input and output in the ANFIS model, two algorithms are used, namely error backpropagation and hybrid learning. The ANFIS uses the learning ability of the ANN to determine the input and output spaces. One of the limitations of this system is the time needed for training. In the Sugeno model, the output of each fuzzy rule is a linear function of the inputs (Takagi and Sugeno, 1985) [97]. Two forms of this relationship can be considered: first order and zero order. In this model, two inputs and one output are used, as shown in Figure 4. For a first-order Sugeno fuzzy model [97] with two rules, the rules would be:
1. Rule 1: If x is $A_1$ and y is $B_1$, THEN $f_1 = c_1 x + q_1 y + r_1$
2. Rule 2: If x is $A_2$ and y is $B_2$, THEN $f_2 = c_2 x + q_2 y + r_2$
where $c_i$, $q_i$, and $r_i$ (i = 1, 2) are the linear (consequent) parameters of the Sugeno fuzzy model that determine the relation between input and output, x and y are numerical inputs, and $A_i$ and $B_i$ are fuzzy sets (linguistic labels). A brief overview of the ANFIS architecture is presented in Figure 4, and the layers are discussed in more detail in the following section. All computations can be presented in diagrammatic form. ANFIS typically has five layers of neurons, excluding the input layer, and the nodes within each layer belong to the same function family.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 11 of 30 learning ability of ANN for determining input and output spaces. One of the limitations of this system is the time needed for training. In the Sugeno membership function, the output of the fuzzy rules are in a linear function form (Takagi and Sugeno, 1985), [97]. This assumes that two forms of the first order and zero-order have been considered for this linear relationship (fuzzy inference system). Thus, in this model, two inputs and one output are used, as shown in Figure 4. The linear relation between these inputs (Sugeno fuzzy model) [97] would be: 1. Rule 1: If x is A1 and y is B1, THEN f1 = c1x+q1 y+ R1 2. Rule 2: If x is A2 and y is B2, THEN f2 = c2x+q2 y+ R2 where ci, qi, and ri, ( = 1,2) are linear parameters of the Sugeno fuzzy model for determining the relation between input and output, and x and y are numerical inputs, while A and B are numerical variables. The system is formed by five layers with a brief explanation of the ANFIS architecture presented in Figure 4, and more detail of the layers discussed in the following section. All computations can be presented in diagrammatic form. ANFIS typically has five layers of neurons, and each layer is from the same function family. The ANFIS structure also has five layers excluding input layer. The first layer designates how much each numerical input relates to the different fuzzy set. Let the MF of fuzzy sets Ai, be Bi, then the output of this layer can be calculated by the next equation (Equation (1)) as follows: As an example, for the MF, a bell function (gbellmf) where {a, b, c} is the parameter set (premise parameters), Equation (2) would be: The outputs of the previous layer constitute the inputs of the next one. 
The next layer (layer of rules) operates logic operations "AND" and "OR", which lead to the output value (called firing strength), which can be calculated by Equation (3) as follows: where is the value of multiplication of the output of Ai node and that of Bi node, which shows the ignition strength of the rule [98,99]. The next layer (Third layer, layer of normalisation) for each of the outputs of Layer 2 is divided between all of the outputs of that rule, as in Equation (4), as follows: In the next layer (Layer 4, the clarification layer), the input of each rule for the calculation of The first layer designates how much each numerical input relates to the different fuzzy set. Let the MF of fuzzy sets Ai, be Bi, then the output of this layer can be calculated by the next equation (Equation (1)) as follows: As an example, for the MF, a bell function (gbellmf) where {a, b, c} is the parameter set (premise parameters), Equation (2) would be: The outputs of the previous layer constitute the inputs of the next one. The next layer (layer of rules) operates logic operations "AND" and "OR", which lead to the output value (called firing strength), which can be calculated by Equation (3) as follows: where w i is the value of multiplication of the output of Ai node and that of Bi node, which shows the ignition strength of the rule [98,99]. The next layer (Third layer, layer of normalisation) for each of the outputs of Layer 2 is divided between all of the outputs of that rule, as in Equation (4), as follows: In the next layer (Layer 4, the clarification layer), the input of each rule for the calculation of model output is computed by Equation (5), as follows: In the last layer (The fifth, the output layer), where the products of the previous neurons are summed with each other and lastly, by defuzzification, fuzzy outputs are transformed into numerical products. 
Equation (6) expresses how this transformation is calculated:
$$f(x, y) = \frac{w_1(x, y)\, f_1(x, y) + w_2(x, y)\, f_2(x, y)}{w_1(x, y) + w_2(x, y)} \tag{6}$$
or, leaving the arguments out (Equation (7)):
$$f = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2} \tag{7}$$
This can be separated into phases by first defining the normalised firing strengths (Equation (8)):
$$\bar{w}_i = \frac{w_i}{w_1 + w_2} \tag{8}$$
and then f can be written as Equation (9):
$$f = \bar{w}_1 f_1 + \bar{w}_2 f_2 \tag{9}$$
Once the premise parameters are fixed, the overall output is a linear combination of the consequent parameters. In symbols, the output f can be written as Equation (10):
$$f = (\bar{w}_1 x) c_{11} + (\bar{w}_1 y) c_{12} + \bar{w}_1 c_{10} + (\bar{w}_2 x) c_{21} + (\bar{w}_2 y) c_{22} + \bar{w}_2 c_{20} \tag{10}$$
which is linear in the consequent parameters $c_{ij}$ (i = 1, 2; j = 0, 1, 2), where $c_{i1}$, $c_{i2}$, and $c_{i0}$ are the coefficients of x, of y, and the constant term of rule i, respectively. A hybrid algorithm adjusts the consequent parameters $c_{ij}$ in a forward pass and the premise parameters $\{a_i, b_i, c_i\}$ in a backward pass. In Layer 5 (the output value layer), the node sums the output values of each node of the previous layer [80,100].
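The five layers described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names (`gbell`, `anfis_forward`) and all parameter values are hypothetical, the product is used as the "AND" operator, and only the forward pass is shown (no training).

```python
def gbell(x, a, b, c):
    """Generalised bell MF: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """Forward pass of a two-input, first-order Sugeno ANFIS.

    premise_a, premise_b: one (a, b, c) tuple per rule for inputs x and y.
    consequents: one (c_i, q_i, r_i) tuple per rule.
    Returns the crisp output f and the normalised firing strengths.
    """
    # Layer 1: membership degree of each input in each fuzzy set.
    mu_a = [gbell(x, *p) for p in premise_a]
    mu_b = [gbell(y, *p) for p in premise_b]
    # Layer 2: firing strength of each rule (product as the AND operator).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalise the firing strengths so that they sum to one.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: each rule's linear output, weighted by its normalised strength.
    rule_out = [wb * (c * x + q * y + r)
                for wb, (c, q, r) in zip(w_bar, consequents)]
    # Layer 5: sum the weighted rule outputs into one numerical value.
    return sum(rule_out), w_bar


# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
PREMISE_A = [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)]
PREMISE_B = [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)]
CONSEQUENTS = [(0.2, 0.3, 0.0), (0.4, 0.5, 0.1)]

f, w_bar = anfis_forward(0.5, 0.5, PREMISE_A, PREMISE_B, CONSEQUENTS)
```

With these values both rules fire equally at x = y = 0.5, so the output is the average of the two rule outputs. Because the output is linear in the consequent tuples once the premise parameters are fixed, doubling every consequent doubles f, which is the property the forward pass of the hybrid algorithm exploits.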

Explanation of the ANFIS Model and the Framework
The input of the model is based on data extracted from Beijing's Guomao metro station. Two parameters (the retention rate index and the transfer efficiency index) have been selected as inputs for predicting the overcrowding risk level in the station, and all relevant risk factors have been considered for the input data [17]. In this research, we do not calculate the indices; however, the most important relations are presented in the following discussion. To calculate the transfer efficiency in the station, which is an essential index of the station state, we require the transfer time (in seconds) and the transfer distance. In an ideal situation, the passenger flow and movement on the platform form a periodically repeated process associated with the departure and arrival of trains. For the evaluation and analysis of this process, the departure and arrival of trains at the platform are treated as prompt actions that finish quickly. In practice, assume that a train denoted $j_{n-1}$ departs from platform i with some of the targeted passengers on board. The passengers left on the platform are then at the minimum density level. Passengers start gathering again until they reach the maximum density level just before the arrival of the next train $j_n$. This process is repeated, and the density level of passengers increases or decreases depending on train schedules and boarding (see Figure 5). It has been suggested that the transfer efficiency index in the station channel location is affected by the amount of equipment in the channel (such as stairs and escalators) [17,101,102]. With v denoting the walking velocity and d the distance between equipment, the transfer efficiency is calculated from v, d, and the transfer time. Regarding the second index, the retention rate captures the number of passengers left on the platform waiting for the next train because of limited space on the current train.
The density of these passengers reaches a maximum level that depends on the size of the platform and on the standard used to estimate how many people can stand in one square metre. Passengers on the platform are grouped into three periods, based on the departure and arrival of trains and assuming that these events occur quickly: the departure of train $j_{n-1}$, the arrival of train $j_n$, and the departure of train $j_n$. The retention rate $\eta_R$ of the terminal is therefore used as the indicator that represents the proportion of stranded passengers. For the ith platform in the Jth passenger period, $N_{ri}^{J}$ is the number of passengers who wait on the platform but cannot board the current train and therefore plan to wait for the next one. The number of stranded passengers is constrained by the total number of waiting passengers $N_{wi}^{J}$ and the number of boarding passengers $N_{bi}^{J}$: for every specific terminal, it is the maximum number of passengers left after boarding. The number of waiting passengers is in turn made up of $N_{ri}^{J-1}$, the people stranded in the previous period, $N_{ii}^{J}$, the passengers who check in, and $N_{ti}^{J}$, the passengers who transfer to the same line in the station in period J. To calculate the number of boarding passengers $N_{bi}^{J}$, several factors should be considered, such as the loading rate $\beta$ and the fixed passenger capacity of the train $c_0$.

To execute the ANFIS model to manage risk factors in the station, we evaluate the performance of the suggested system for overcrowding. The density level of passengers waiting at the platform can be calculated from the area of the platform and other factors, such as human factors (e.g., passenger belongings). Existing data from the Chinese metro (see Table 2) have been used for the model as a parameter set. Part of these data is used to train the system, and the remaining data are used to test the model. The ANFIS technique combines the concepts of the ANN with the FIS to design a dynamic method, referred to as soft computing.
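The passenger bookkeeping behind the retention rate, described earlier in this section, can be sketched as follows. This is an illustrative reading of the text, not the authors' formulas: it assumes that the retention rate is the ratio of stranded to waiting passengers, that the stranded count is the non-negative remainder after boarding, and that the boarding capacity of each train is supplied directly (the text relates it to the loading rate and the fixed train capacity, but that relation is not reproduced here). The function name and all numbers are hypothetical.

```python
def simulate_platform(check_ins, transfers, boarding_capacity, stranded_before=0):
    """Track waiting, boarding, and stranded passengers over train periods.

    check_ins[J]: passengers who check in during period J.
    transfers[J]: passengers who transfer to this line during period J.
    boarding_capacity[J]: passengers the train of period J can take
        (assumed to be given; hypothetical input).
    Returns one dict per period.
    """
    history = []
    stranded = stranded_before
    for n_in, n_tr, capacity in zip(check_ins, transfers, boarding_capacity):
        # Waiting = stranded from the previous period + check-ins + transfers.
        waiting = stranded + n_in + n_tr
        # Assumption: as many passengers board as the train can take.
        boarding = min(waiting, capacity)
        # Stranded = passengers left after boarding, never negative.
        stranded = max(waiting - boarding, 0)
        # Assumption: retention rate = stranded / waiting.
        retention = stranded / waiting if waiting else 0.0
        history.append({"waiting": waiting, "boarding": boarding,
                        "stranded": stranded, "retention": retention})
    return history


# Hypothetical peak-hour example with three consecutive trains.
history = simulate_platform(check_ins=[300, 400, 200],
                            transfers=[100, 150, 50],
                            boarding_capacity=[350, 350, 350])
```

In this example the second period inherits 50 stranded passengers from the first, so 600 passengers wait for a train that can take 350, and the retention rate rises accordingly before easing in the third period.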
The ANFIS model has a number of main features that distinguish it from other systems and methods of risk management, as follows:
• Monitoring the station to provide early warning of overcrowding risk.
• Flexibility in reacting to overcrowding risk.
• A comparative learning technique for station resources in response to the overcrowding risk.
• It can serve as an intelligent system within the perspective of the ISFMR.
• It can be based on on-hand data from digital sources, which can be gathered in real time.
• Small scale, which can be generalised to more than just overcrowding risk levels.
Using actual data collected from the Guomao station of the Beijing metro in China during peak hours, the model proposes a risk scale rather than a probability of risk, developed in the form of a four-choice range (very unlikely, unlikely, likely, very likely). The risk case selected is congestion in the station. This will be evaluated utilising the following indexes, with each level grouped to an equivalent degree of risk:
• Stranded passengers (platform).
• Transfer efficiency (channel).
The risk level for congestion is divided into sublevels that fluctuate between the levels of the index effect. If the number of warning levels is not clearly defined (either too high or too low), each group may cover a wide range, leading to an unreliable level of risk management and inaccurate responses to real conditions. Owing to its importance, the level of risk is classified into four levels with distinct colours for ease of reference. The data relate to the stranded ratio and the transfer efficiency in the station; therefore, we need to convert the feature dimension of the inputs so that it is based on the probability of the risk level in the suggested form. Then, we obtain the covariance matrix, and the patterns and colours of the risk level are noted, as presented in Tables 3 and 4. The scale maintains the same range of values for each of the inputs of the model [17,90,103].
Table 3. Scale of the level of the risk probability (green: low risk; yellow: moderate risk; red: high risk).

Result | Description (Retention Rate of the Platform) | Description (Transfer Efficiency) | Risk Level Index
Very unlikely | Full capacity, with a chance to ride without being stranded | Passengers can move easily | D
Unlikely | A few stand and wait for the next train | A few lines before the lift and stairs | C
Likely | Some are waiting for the next train | Some lines before the lift and stairs | B
Very likely | Highly crowded and no chance to ride | Very slow movement | A

We select the overcrowding index to apply the ANFIS model and to demonstrate the application of AI in the ISFMR. The overcrowding index for risk management is obtained from the collected data. Then, the input data are compared with the threshold level to evaluate their status in real time, which assists the decision-maker in responding to issues at an early stage. In the case of an unexpectedly high number of passengers, various train loops can be arranged for conveying the passengers. In the station, passenger paths or marketing plans can be modified, depending on the capacity and the number of passengers. This process can be applied to other risks; in fact, it can be applied to overcrowding problems in other contexts. The transfer rate has been used as an indicator, which relates to the density and walking velocity of passengers at a critical point in the station, such as stairs, channels, lifts, ramps, or gates [104–106]. We have assumed different relationships between the passenger density and velocity, such as linear, logarithmic, and triangular functions [107]. Velocity is linked to the density of travellers, and when the density reaches 2.86–5 people per m², overcrowding might occur [108–111]. Finally, the essential density for movement is 0.92 people/m² and the maximal moving velocity is 1.14 m/s [105]. It has been shown that the passenger velocity and density are challenging to estimate within a station setting; therefore, we have selected different indices to evaluate the risk level. Other factors that may impact the risk level are standards, operation management measures, and the size of the area [112–114].
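The comparison of an input index with threshold levels, mentioned above, can be illustrated with a small lookup. This is only a sketch: it assumes the index has already been converted to a risk scale between 0 and 1 on which larger values mean more risk, and the three cut points are hypothetical equal-width thresholds, not values taken from this study. The level letters follow the table above.

```python
from bisect import bisect_right

# Risk levels ordered from lowest to highest risk, as in the table above.
LEVELS = [("D", "very unlikely"), ("C", "unlikely"),
          ("B", "likely"), ("A", "very likely")]

# Hypothetical equal-width cut points on a 0-1 risk scale (an assumption).
THRESHOLDS = (0.25, 0.50, 0.75)


def risk_level(index_value, thresholds=THRESHOLDS):
    """Map a risk index in [0, 1] to one of the four risk levels."""
    if not 0.0 <= index_value <= 1.0:
        raise ValueError("index_value must lie in [0, 1]")
    # bisect_right counts how many thresholds the value has reached,
    # which is the position of the matching level in LEVELS.
    return LEVELS[bisect_right(thresholds, index_value)]
```

For example, `risk_level(0.1)` falls in level D (very unlikely) and `risk_level(0.8)` in level A (very likely); a value that sits exactly on a cut point is assigned to the higher level.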

Results and Analysis
Data were collected during the peak hour 07:30-08:30, with a super peak occurring at 07:45-08:15, and were used to calculate the retention rate. Moreover, to obtain the stranded ratio, we have considered the loading factor (1.05-1.02) for our calculations. Next, the index for the transfer efficiency from channels was calculated; this included the factors of walking velocity (1.5 m/s) and passing time (107 s). The total transfer time, including the time required to cover the distance to the platform, was 152 s. The data of the transfer efficiency show that after 08:00, the congestion remained at a steady rate, and passenger movement was smooth. The congestion risk level fluctuated between very likely and unlikely, as shown in Figures 6 and 7.
For further analysis and to explore the data, we trained regression models using supervised machine learning, as the ISFMR framework proposes, to advance dynamic risk management in railway stations. Optimizable regression learner models were compared, and cross-validation was applied to protect against overfitting. Moreover, to minimize the model mean squared error (MSE), the model with the optimized hyperparameters (best point hyperparameters) was employed. Regression analysis is one of the most important statistical methods for examining the relationship between predictor variables and responses. Four hyperparameter-optimized models were fitted: regression trees, a Gaussian process regression (GPR) model, support vector machines (SVM), and ensembles of regression trees. The GPR model shows the best fit, with the minimum MSE (see Table 5). Figure 8 displays the MSE at each iteration and the optimization results for each model; the minimum MSE plot, which updates as the optimization runs, is shown in Figure 9.

The Neuro-Fuzzy Designer was used as the tool for modelling, designing, training, and testing the Sugeno-type fuzzy inference systems. The precalculated indices and risk matrices were utilised for training the system. We selected grid partitioning, the default scheme for FIS structure generation with ANFIS, as shown in Figure 10a. Part of the chosen data was used to train the model, and the remainder was used for assessment and validation of the system. Moreover, for rule generation, subtractive clustering was used, with the range of influence, squash factor, acceptance ratio, and rejection ratio set at 0.5, 1.25, 0.5, and 0.15, respectively (see Figure 10b). The MFs can vary in number and are assigned to each input in the ANFIS model.

We have also estimated the risk indices in two segments. Figure 11 shows the latent class estimation results, assuming homogeneity between the indices and the total risk. This indicates the importance of platform design and of punctual train schedules and loads in meeting the waiting passenger volume. A low risk level indicates a safe platform and smooth passenger flow; however, other platforms or locations may be overcrowded at the same time.
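The roles of the four subtractive clustering settings quoted above (range of influence 0.5, squash factor 1.25, acceptance ratio 0.5, rejection ratio 0.15) can be seen in the following simplified sketch of the subtractive clustering procedure. It is not the Neuro-Fuzzy Designer implementation: it assumes the data are already scaled to the unit square, uses a single radius for all dimensions, and the function name and sample points are hypothetical.

```python
import math


def subtractive_clustering(points, radius=0.5, squash=1.25,
                           accept=0.5, reject=0.15):
    """Pick cluster centres among the (non-empty) data points."""
    alpha = 4.0 / radius ** 2            # neighbourhood for the potential
    beta = 4.0 / (squash * radius) ** 2  # wider neighbourhood for reduction

    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    # Potential of a point: how many other points lie close to it.
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        ratio = potential[k] / first_peak
        if ratio > accept:
            accepted = True              # clearly a new centre
        elif ratio < reject:
            break                        # nothing significant is left
        else:
            # Grey zone: accept only if far enough from existing centres.
            d_min = min(math.sqrt(sq_dist(points[k], c)) for c in centers)
            accepted = d_min / radius + ratio >= 1.0
        if not accepted:
            potential[k] = 0.0           # discard and try the next best
            continue
        centers.append(points[k])
        peak = potential[k]
        # Reduce the potential of points near the new centre.
        potential = [p - peak * math.exp(-beta * sq_dist(pt, points[k]))
                     for p, pt in zip(potential, points)]
    return centers


# Hypothetical sample: two tight groups of points in the unit square.
POINTS = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.08, 0.10), (0.10, 0.08),
          (0.90, 0.90), (0.92, 0.90), (0.90, 0.92), (0.88, 0.90), (0.90, 0.88)]
centers = subtractive_clustering(POINTS)
```

Each accepted centre would become one fuzzy rule, so a smaller range of influence or a lower acceptance ratio tends to produce more rules. On the sample above, the procedure selects the middle point of each group.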
The trained FIS includes 16 rules (clusters), as presented in Figure 12. Each input has 16 built-in Gaussian MFs. During training in ANFIS, sets of processed data were used to perform 150 cycles of learning. To capture the relationship between the input variables (risk indices) and the output variable (risk level), several fuzzy sets were assigned to the output data using the Sugeno-type FIS method, which is illustrated in Figure 13.

The ANFIS models are evaluated based on their performance on the training and testing sets. Various experiments were conducted with different input MF types; 80% of the data were used to train the ANFIS, and the rest were used to validate the accuracy of the model (test dataset). A random subset was used for testing, and RMSE values were used to identify the best membership functions and to better reflect the differences in model performance [111]. To compare the RMSE values for the training and testing processes, the experiments cover different input MFs with a linear output MF type. Moreover, the FIS can be trained with either backpropagation alone or a hybrid optimizer, which combines backpropagation with a least-squares-type method, so that the fuzzy system learns from the data. There is only one output, and the grid partition method is used to establish the rule base relating the input and output variables. During training with 150 epochs, the dsigmf and psigmf MFs performed most effectively, with the minimum RMSE under a linear output MF type and the hybrid training method (see Table 6). The ANFIS models show significant performance variations against the evaluation criteria in terms of error, type of membership function, and output method. From all the RMSE values, it appears that the ANFIS models are sensitive to different MFs and datasets, so the model can learn the risk in stations and provide real-time protection. The type of input MF, the data, and the type of output MF are significant for the design of the ANFIS architecture. In the future, more varied and longer training and testing datasets will be beneficial for extensive research that covers all the dynamic changes in crowds and all the different possibilities of risk over a long time, in order to enhance system performance. Moreover, it has been shown that increasing the number of epochs for a training dataset does not necessarily improve model performance significantly. The motivation behind our design of an adaptive learner model was to support the framework (Section 3: intelligent system for managing risks (ISFMR)) throughout the learning activity in real time, capturing the risk of crowding through a predictor of the passenger flow in the station, and to show that the learning application is functional and can serve as a reasoning tool for managing dynamic risks. By including the ANFIS output in the system, the risk level for overcrowding can be predicted. In addition, the graph of the training error is presented in Figure 14. The model was set up to train for 150 epochs; as the number of epochs increases, the error diminishes and its fluctuations settle to a steady state. Clearly, the small amount of training data is a significant performance variable. Moreover, one benefit is that the intended ANFIS example does not require time-consuming iterative training.
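The evaluation protocol described above (a random 80%/20% split and the RMSE as the comparison metric) can be sketched as follows. The helper names and the fixed random seed are assumptions made for illustration; the study itself used the Neuro-Fuzzy Designer, not this code.

```python
import math
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Randomly split samples into a training part and a testing part."""
    shuffled = list(samples)
    # A seeded generator keeps the random subset reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical usage with ten placeholder samples.
train, test = train_test_split(range(10))
```

A model would be fitted on `train` only, and `rmse` would then be reported separately for the training and testing parts, so that a large gap between the two values reveals overfitting.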
The trend of errors of the trained fuzzy system, an example of the increase in the RMSE with training, and the output of the ANFIS paradigm are shown in Figure 15. This shows that the predicted short-term risk levels for both the training data and the FIS output are in excellent agreement with the exact results. The surface plot shows that the three-dimensional (3D) surface poses a challenge to the evaluation of the two indices, which have a significant impact on the expected overcrowding prediction (see Figure 16). In general, it can be inferred from the 3D diagram that the risk level increases with the increase in the index probability, which is a logical conclusion. For other risk indices, the same procedure can be followed.
Appl. Sci. 2020, 10, x FOR PEER REVIEW  23 of 30 short-term risk level for both the training data and the FIS output are in excellent agreement with the exact results. The surface plot shows that the three-dimensional (3D) surface poses a challenge to the evaluation of the two indices, which have a significant impact on the expected overcrowding prediction (see Figure 16). In general, it can be inferred from the 3D diagram that the risk level increases with the increase in the index probability, which is a logical conclusion. For other risk indices, the same procedure can be followed.  A 3D surface of all possible values of inputs (indices) of the ANFIS model and the output corresponding to these values has been presented. Variations between the results for different MFs are evident from the 3D surface plots. The response of the 3D surface of risk level versus the indices shows deferent minimal errors depending on the MFs selected. It can be inferred from these figures that the high flow from both indices results in a high overcrowding risk level. The increase in the number of passengers on the platform or in other areas of the station will result in an increasing density, which is predictable. The ability of the prediction model is improved by increasing the size of the training and testing datasets. We can provide many 3D models for training with different MFs to create greater clarification and a more general comprehension of risk factors which need to be  This model is implemented using the selected simple-data sample obtained from the literature to reach conclusions about the effectiveness of the method. The results show the surface and counters of unsafe index levels and the impact of the index of the input obtained in risk level results. Regarding the selection of the best MF, different scenarios are considered, and the scenario with the smallest error is accepted. 
Regarding the performance of the designed FIS, different RMSEs are observed that indicate a perfect fit to the data.
A 3D surface of all possible values of inputs (indices) of the ANFIS model and the output corresponding to these values has been presented. Variations between the results for different MFs are evident from the 3D surface plots. The response of the 3D surface of risk level versus the indices shows deferent minimal errors depending on the MFs selected. It can be inferred from these figures that the high flow from both indices results in a high overcrowding risk level. The increase in the number of passengers on the platform or in other areas of the station will result in an increasing density, which is predictable. The ability of the prediction model is improved by increasing the size of the training and testing datasets. We can provide many 3D models for training with different MFs to create greater clarification and a more general comprehension of risk factors which need to be managed and of risk indicators in railway stations. The results of the observed data are compared with the ANFIS, as shown in Figure 17, which displays the assessment results from the ANFIS.
The above analyses show that the proposed ANFIS model can predict the risk and represent its statistical features with reasonable accuracy. In the proposed framework, dynamic risk can be managed in real time, which supports decision making and reflects the speed of reaction needed to manage overcrowding risks. Such a framework can use the transfer efficiency and retention rate of the platform as input parameters to predict flow safety performance. The proposed model can also provide good estimates of safety performance, especially for estimating the number of staff needed and for planning platform and machine provision in advance at the project planning stage of railway stations. Prediction is now more important than at any time in the past, as crowding has contributed to the spread of COVID-19 and has thus affected safety, security and health concepts and strategies.

Conclusions
Station overcrowding indices are introduced to obtain more reliable results for risk management in real time, so that hazardous events can be identified and managed systematically. Overcrowding risk prediction can be used to help analyse risk levels and ensure that a high level of safety is achieved in stations, where passenger density is one of the most critical indicators for safety and quality of service. It can be concluded that the suggested system has the potential to provide very realistic vulnerability assessments of overcrowding in stations. Certain features, such as the relationship between risk indices, are highlighted. Furthermore, the proposed method can help create novel approaches to ISFMR modelling for stations and support risk management sciences based on risk indices. The evolution of AI technology and big data makes this process more intuitive and timely for decision makers working in railway systems, and at stations in particular, contributing to more efficient outcomes than conventional approaches. The data extracted from Beijing's Guomao metro station for the two parameters (retention rate index and transfer efficiency index) have been analysed using supervised machine learning, and four optimizable models were trained: regression trees, Gaussian process regression (GPR), support vector machines (SVM), and ensembles of regression trees. The prediction results indicate that the GPR model predicts the risk level with high accuracy, with RMSE = 0.0064 and a coefficient of determination of R² = 1. The hyperparameter-optimized models fit the data closely, which reflects their acceptable performance and high modelling accuracy in estimating the level of risk.
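For reference, the two fit statistics quoted above have the following standard definitions. The notation is introduced here for illustration and is not taken from the paper: $y_i$ is the observed risk level, $\hat{y}_i$ the predicted risk level, $\bar{y}$ the mean of the observed values, and $n$ the number of samples.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}},
\qquad
R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}
```

Under these definitions, a non-zero RMSE implies that $R^{2}$ is strictly less than 1, so a reported value of $R^{2} = 1$ alongside RMSE = 0.0064 should be read as rounded to the displayed precision.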
The models strengthen the proposed framework for predicting risks in real time, and they support decision makers by reducing human error and learning from the history of the flow in the stations. A dynamic risk management model for railway stations using ANFIS has been developed, which can handle live risk indices in real time. The ANFIS is one of the most successful techniques in terms of its ability to predict risk in railway stations. The selected model appears robust for predicting risks and aiding decision making for dynamic risk management in relation to the danger of overcrowding in railway stations, and it is a step toward smart future railway stations. The model has the flexibility to deal with a significant number of inputs in future research, which is not an easy task. The ANFIS appears to be more reasonable and appropriate because of its smoothness in calculating the risk levels. In addition, risk can be evaluated effectively from a knowledge base built from data drawn from various sources and indices. Furthermore, the model shows faster learning and higher prediction accuracy. Risk indices are essential because they reflect the actual condition of the station and give early indications of risk factors such as overcrowding. The dynamic risk management model can define the risk level and aid decision makers by providing convenient and reliable results based on real-time data. In practice, the ANFIS prediction model provides a decision-support system for managing risk that can be captured through indicators or indices, with rules created from the inputs. The necessary inputs can be gathered from devices (such as sensor-based systems or CCTV), and the estimate can be calculated by the ANFIS system running on servers at the railway stations. Predictions of risk based on selected indices can then be presented to decision makers through a web-based program.
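The data flow described above (device readings in, risk estimate out, result presented to decision makers) might be organised along the following lines. This is only a sketch under stated assumptions: the class names, the band thresholds, and `placeholder_predictor` are hypothetical. The placeholder stands in for a trained ANFIS model and does not represent the relationship found in this study.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Reading:
    """One time-stamped pair of input indices, e.g. derived from sensors or CCTV."""
    timestamp: str
    transfer_efficiency: float
    retention_rate: float


@dataclass
class Alert:
    """Risk estimate prepared for display to decision makers."""
    timestamp: str
    risk_level: float
    band: str


# Hypothetical bands for a risk level in [0, 1]; thresholds are not from the study.
BANDS = [(0.7, "high"), (0.4, "medium")]


def to_band(risk_level: float) -> str:
    """Map a numeric risk level to the first band whose threshold it reaches."""
    for threshold, name in BANDS:
        if risk_level >= threshold:
            return name
    return "low"


def assess(readings: Iterable[Reading],
           predictor: Callable[[float, float], float]) -> List[Alert]:
    """Run the predictor on each reading and attach a band to the result."""
    alerts = []
    for reading in readings:
        level = predictor(reading.transfer_efficiency, reading.retention_rate)
        alerts.append(Alert(reading.timestamp, level, to_band(level)))
    return alerts


def placeholder_predictor(transfer_efficiency: float, retention_rate: float) -> float:
    """Stand-in for a trained model; the averaging has no physical meaning."""
    return (transfer_efficiency + retention_rate) / 2


alerts = assess(
    [Reading("08:00", 0.2, 0.2), Reading("08:05", 0.5, 0.5), Reading("08:10", 0.9, 0.8)],
    placeholder_predictor,
)
for alert in alerts:
    print(alert.timestamp, f"{alert.risk_level:.2f}", alert.band)
```

Passing the predictor in as a function keeps the data-collection and presentation steps independent of the model, so a retrained model could be swapped in without changing the rest of the pipeline.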
It is expected that training the ANFIS model on more data will yield more accurate results. Decision makers can then act on real-time knowledge of the congestion risk levels in the station, which need to be kept under control. This model promotes and improves safety relating to overcrowding risks and helps prevent those risks from escalating. Monitoring the selected overcrowding indicators (input variables) can help prevent harmful events at many hot spots in the station, such as the platforms, ticket gates, and tunnels. From the framework of the model and the relationship between the inputs and results of the system, it is possible to generalise the method to predict other risk factors and to simplify management by applying new rules. Further research will be needed to increase the usability and reliability of the proposed framework, using larger-scale data and integrated data sources to detect traffic flow risks in the stations. Finally, we hope that this study promotes further research on flow prediction in railway stations with the aim of improving decision making and public safety.