An Adaptable Train-to-Ground Communication Architecture Based on the 5G Technological Enabler SDN

Abstract: Railway communications are closely impacted by the evolution and availability of new wireless communication technologies. Traditionally, the critical nature of railway services, the long lifecycle of rolling stock, and their certification processes challenge the adoption of the latest communication technologies. A current railway telecom trend to solve this problem is to design a flexible and adaptable communication architecture that decouples the railway services at the application layer from the access technologies underneath, such as 5G and beyond. One of the enablers of this decoupling is software-defined networking (SDN), which is included in the 5G architecture, due to its ability to programmatically and dynamically control the network behavior via open interfaces and to abstract lower-level functionalities. In this paper, we design a novel railway train-to-ground (T2G) communication architecture based on the 5G technological enabler SDN and on the transport-level redundancy technique multipath TCP (MPTCP). The goal is to provide an adaptable and multitechnology communication service while enhancing the network performance of current systems. MPTCP offers end-to-end (E2E) redundancy by aggregating multiple access technologies, and SDN introduces path diversity to offer resilient and reliable communication. We carry out simulation studies to compare the performance of the legacy communication architecture with our novel approach. The results demonstrate a clear improvement in the failover response time while maintaining and even improving the overall uplink and downlink data rates.


Introduction
The railway sector is governed by a regulatory framework aiming for European interoperability. This framework favors interoperability at the cost of adaptability to new advances in communication technologies, which causes a remarkable mismatch between new railway communication services and the fast development of new wireless communication techniques.
In order to provide flexibility and adaptability to the railway network architecture [1], the application level should be independent of the access technology (2G, 5G, etc.) so that both can evolve at their own pace. To solve this problem, the railway sector is moving toward integrating a new adaptable communication layer that decouples upper-level functionalities from the underlying access technologies. This concept has been named an adaptable communication system (ACS), which is understood as a set of functions and techniques that enable the aforementioned decoupling.
While solving the adaptability challenge, the envisaged communication architectures should achieve an appropriate level of reliability [2] due to the critical nature of supported services [3].
Those services have very demanding constraints that require novel techniques to assure correct system operation. These constraints are linked to short-lifecycle signaling messages, which are extremely sensitive to latency, and to the continuous handover procedures, which produce high packet loss rates. Moreover, the evolution towards the IP era has introduced the challenge of providing performance indicators [4] comparable to those found in traditional dedicated systems, such as Global System for Mobile Communications-Railway (GSM-R).
To summarize, with the purpose of achieving the reliability of legacy dedicated circuits, and complying with the aforementioned constraints, it is crucial to use resilient architectures that apply spatial and temporal redundancy techniques [5]. However, redundancy without path diversity does not provide the resiliency needed to face correlated communication errors. This fact motivates the necessity of novel traffic engineering techniques to reserve network resources and make data travel through disjoint paths.
Figure 1 depicts the communication architecture of a traditional railway network. It consists of two parts: the access network, which connects the onboard equipment (OBE) and the control center equipment (CCE), and the multiprotocol label switching (MPLS) core, which is inspired by a real network deployed by the Italian railway infrastructure manager Rete Ferroviaria Italiana (RFI) [6].
In contrast to this architecture, this paper proposes a detailed architecture for an ACS that is based on the 5G technological enabler SDN and on the transport-level redundancy technique multipath TCP (MPTCP) [7]. The ACS is responsible for providing resiliency mechanisms and communication interfaces that are independent of any access technology. The proposal can be divided into two functional blocks. On the one hand, end-hosts are equipped with MPTCP and multiple network interfaces in order to achieve end-to-end (E2E) spatial and temporal redundancy. This procedure reduces the latency of the communication because all the packets are duplicated and sent simultaneously, so there is no need to wait for retransmissions. On the other hand, SDN is used in the network, where a centralized SDN controller runs a novel path-computing application to make forwarding decisions. Thanks to this application, the network is capable of establishing disjoint paths and auto-reconfiguring in case of failure, which provides path diversity and improves the resiliency of the architecture.
Then, the design is modeled in a simulation platform to measure the performance of this new architecture.
The structure of this document is as follows. Section 2 addresses related work in the field of adaptable and reliable mobile communication architectures. After that, Section 3 gives a detailed explanation of the proposed ACS architecture. Section 4 describes the simulation scenario that has been configured and the platform where tests have been carried out. Then, Section 5 shows the results of the simulation, and finally, Section 6 explains the main conclusions of this work.

Related Work
Previous works have evaluated different options to provide a reliable train-to-ground (T2G) communication through the use of resilient network architectures.
Authors in [8] divide railway user traffic into high and low priority and use relay links to send only high-priority data through redundant long-term evolution (LTE) channels. The weakness of this system lies in its lack of flexibility, because data classification is static and cannot be modified on demand.
Authors in [9] propose a wireless mesh network (WMN) where the T2G communication is fully based on WiFi ad hoc links. In this network, the train transmits data simultaneously to many trackside nodes, and packets are then forwarded between nodes, forming a chain until they reach the control center. This configuration improves the resiliency of the wireless channel and reduces the delay by avoiding continuous associations with the access points. Nevertheless, this solution is focused on improving the wireless access network and does not provide an E2E redundant solution that also covers the core network.
In the literature, MPTCP is also used for resiliency purposes thanks to its ability to apply specific "add and drop" subflow policies to realize efficient multitechnology seamless handovers and provide E2E redundancy. The authors in [10] deploy a heterogeneous architecture using several public networks to transmit railway signaling data. However, MPTCP only guarantees E2E redundancy and does not address path diversity. In [5], an MPTCP extension that merges spatial and temporal redundancy is combined with MPLS in order to assure the required E2E path diversity and avoid correlated communication errors. However, the distributed control of MPLS does not provide enough flexibility to dynamically modify routes and reconfigure the network in case of failure. Moreover, the granularity to manage traffic flows is limited to IP source and destination addresses, whereas SDN makes it feasible to manage traffic flows in accordance with many other parameters.
Authors in [11] propose an architecture for T2G communication based on an SDN-controlled mobile backhaul network. This architecture can efficiently handle mobility management and provide dynamic quality of service (QoS) for different onboard services. SDN equipment is placed on the ground and also inside the train to centralize the management of the wireless channel. However, they use an in-band control plane, because user and control traffic are sent through the same wireless channel. This means that a failure of the wireless link causes a communication breakdown between the SDN equipment and the controller. Consequently, the adaptability and dynamic features offered by SDN are lost.
To the best of our knowledge, our proposed architecture is the first proposal combining MPTCP and SDN that provides an adaptable, multitechnology, and reliable T2G communication service. MPTCP in both communication endpoints offers E2E redundancy while SDN in the access and core networks contributes to the path diversity and flexibility.

Proposed ACS Architecture
This section describes the proposed T2G network architecture. Its goal is to obtain flexible, adaptable, and resilient communication between the train and the ground equipment in order to overcome the limitations of legacy routing. The key novelty of this architecture can be summarized in three aspects. First, to the best of our knowledge, there is no previous proposal that includes SDN and MPTCP working simultaneously. Second, MPTCP is configured to use a novel redundant scheduler, which provides spatial redundancy by duplicating every packet and sending it through different network interfaces at the same time. Third, as standalone MPTCP does not assure the use of disjoint paths, a locally customized SDN forwarding application is employed to improve path diversity. This application is able to compute and set redundant disjoint data paths (active and backup) between two given SDN switches. Moreover, the failover response time is minimized because the backup path is configured beforehand, so that the switching time is reduced to the time it takes to install a flow rule.
Furthermore, the ACS can be logically divided into three architectural components depicted in Figure 2: core network, access network, and ACS functions.

Core Network
In this architecture, the core network is a dedicated network managed by the railway operator. According to the SDN paradigm, there is a clear separation between control and data planes. The SDN controller is the main element of the control plane, which in this case is out-of-band because there is a point-to-point connection from the controller to each SDN-based device of the network, namely the OpenFlow (OF) switches and the OF access points (OF-APs) (see Figure 2). Additionally, OF is set as the southbound interface, and the forwarding is based on an SDN path-computing application that is explained later.
The data plane consists of several OF switches forming a mesh topology. These switches connect the access network and the control center network, which is a private network formed by the CCE that supports railway services (e.g., telemetry).

Access Network
The access network is also a dedicated network under the management of the railway operator. The control plane is the same as in the core network, except for the forwarding, which is based on L2 matching flow rules (e.g., source and destination media access control (MAC) addresses). The data plane is divided into two subnets: WiFi and LTE. These subnets provide wireless connectivity to the OBE, which is in charge of offering T2G connectivity services to the applications running in the train.
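As an illustration of L2 matching, the following minimal sketch models a flow table keyed on the source and destination MAC addresses. The class names, the MAC addresses, and the port number are hypothetical and are not taken from the simulated network; a real OF switch would hold such rules in its flow table after receiving them from the controller.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class L2FlowRule:
    """Hypothetical L2 rule: match (source MAC, destination MAC), output to a port."""
    src_mac: str
    dst_mac: str
    out_port: int


class L2FlowTable:
    """Toy flow table; a stand-in for the table of an OF switch or OF-AP."""

    def __init__(self) -> None:
        self._rules: Dict[Tuple[str, str], int] = {}

    def install(self, rule: L2FlowRule) -> None:
        # MAC addresses are compared case-insensitively.
        self._rules[(rule.src_mac.lower(), rule.dst_mac.lower())] = rule.out_port

    def lookup(self, src_mac: str, dst_mac: str) -> Optional[int]:
        # Returns the output port, or None on a table miss.
        return self._rules.get((src_mac.lower(), dst_mac.lower()))


# Illustrative addresses only: OBE interface -> CCE interface via port 3.
table = L2FlowTable()
table.install(L2FlowRule("00:00:00:00:00:01", "00:00:00:00:00:02", out_port=3))
```

Because the match is directional, the reverse direction (CCE to OBE) would need its own rule in this sketch.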

WiFi
This network is fully SDN-based, having several OF-APs that are connected to an OF switch that concentrates all the traffic coming from the OF-APs. The use of additional aggregation OF switches depends on the number of deployed OF-APs.

LTE
This access network also supports SDN. It consists of the evolved packet core (EPC) entities and some eNodeBs (eNBs) that are connected to each other through independent OF switches. The EPC entities and the eNBs have no OF support, which means that the controller cannot modify their behavior. As in the previous case, the number of OF switches grows with the number of EPC entities and eNBs that are deployed. Therefore, the simplest topology would comprise two OF switches, one for the interconnection of EPC entities and the other one to aggregate the traffic from the eNBs.

ACS Functions
The ACS functionality relies on a locally developed SDN path-computing application and the MPTCP transport layer. The application runs on top of the controller, and it is focused on reducing the delay of the communication and providing path diversity. Meanwhile, MPTCP runs in the OBE and the CCE with a scheduler that uses all the available network interfaces to obtain a redundant T2G wireless communication.

Multipath TCP
Both sides of the communication, the OBE and the CCE, run MPTCP with the redundant scheduler [12], which was developed by our research team and added to the Linux kernel (v 3.17.0) in 2015. This scheduler provides E2E redundancy by duplicating every packet and sending it through all available network interfaces, as represented by the solid red (LTE) and dashed green (WiFi) lines in Figure 3. The advantage of this operating mode is that the same information is sent over every interface, so the receiver does not need to wait for retransmissions when a packet is lost on one of them, and the overall delay of the communication is reduced. Moreover, thanks to the redundancy offered by MPTCP, the latency of the vertical handover procedure decreases.
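The effect of the redundant scheduler on delay can be sketched with a toy model in which each packet is sent on every path and the receiver keeps the first copy that arrives. This is only an illustration of the principle, not the kernel scheduler of [12]; the function name and all delay values are hypothetical.

```python
from typing import Dict, List, Optional


def redundant_delivery(
    per_path_delay_ms: Dict[str, List[Optional[float]]]
) -> List[Optional[float]]:
    """For each packet, return the delay of the first copy to arrive.

    Each path carries a copy of every packet; None marks a copy lost on that
    path. A packet is lost (None) only if every copy is lost.
    """
    n_packets = len(next(iter(per_path_delay_ms.values())))
    delivered: List[Optional[float]] = []
    for seq in range(n_packets):
        arrivals = [
            delays[seq]
            for delays in per_path_delay_ms.values()
            if delays[seq] is not None
        ]
        delivered.append(min(arrivals) if arrivals else None)
    return delivered


# Hypothetical one-way delays in ms for four packets on two interfaces.
delays = {
    "lte": [40.0, None, 42.0, 41.0],
    "wifi": [12.0, 15.0, None, None],
}
first_copy = redundant_delivery(delays)
```

In this example, the losses on one interface are masked by the copy sent on the other, so no packet has to wait for a retransmission.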

SDN Path-Computing Application
In the control plane, the SDN controller runs the Disjoint Path-Computing (DisPaC) application, which is responsible for the forwarding in the core network. DisPaC is based on the path computation element (PCE) described in [13]. The PCE calculates disjoint paths between two switches (the edge switches in Figure 3), and the service manager installs the corresponding flow rules in the OF switches. In this case, the application has been customized to preinstall the flow rules of the precomputed backup path, so that it is ready to take over when a failure occurs. Five parameters are necessary to configure each service: source edge switch and port, destination edge switch and port, and VLAN ID (to differentiate services when there is more than one). The behavior of the application depends on the paths available between the two edge switches:

• If there are two disjoint paths, one path is set to active (dark color in Figure 3) and the other one to backup (light color in Figure 3). The flow rules corresponding to the backup path are also installed, so the traffic is forwarded through this path as well. Edge switches are responsible for discarding the traffic from the backup path to prevent duplicated packets from being delivered out of the network. If the active path fails, the application detects it and activates the backup path, that is, it changes a flow rule on each edge switch. When the active path recovers, the traffic is handed back to it. If the backup path fails while the active one is down, the ongoing service ends.
• If there are two different paths, but they are not fully disjoint, the application establishes them and works exactly as in the previous case.
• Finally, if there is a single path, it is established and used. When any link in the path fails, the service stops.
Consequently, because the backup path is precomputed and ready to work, the switching time between the active and backup paths after a failure is reduced to the time it takes to install one flow rule, i.e., to send an OF Flow-Mod message and modify the flow table. This means that the failover response time decreases and the reconfiguration is faster. Moreover, the use of disjoint paths provides path diversity, which is reflected in the resiliency of the network.
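The behavior described above can be sketched as follows. This is a simplified illustration and not the DisPaC implementation: it finds a link-disjoint pair greedily (shortest path, then a second shortest path with the first path's links removed), which can miss a disjoint pair that a dedicated algorithm would find, and it does not cover the partially disjoint case. The five-switch mesh, the switch names, and the Service class are hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Graph = Dict[str, Set[str]]
Path = List[str]


def shortest_path(graph: Graph, src: str, dst: str) -> Optional[Path]:
    """Breadth-first search; returns a hop-count shortest path or None."""
    parent: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path: Path = []
            cur: Optional[str] = node
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for nxt in sorted(graph[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def path_links(path: Path) -> Set[FrozenSet[str]]:
    """Undirected links traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def disjoint_pair(graph: Graph, src: str, dst: str) -> Tuple[Optional[Path], Optional[Path]]:
    """Greedy active/backup computation (a simplification, see lead-in)."""
    active = shortest_path(graph, src, dst)
    if active is None:
        return None, None
    pruned = {node: set(neigh) for node, neigh in graph.items()}
    for a, b in zip(active, active[1:]):
        pruned[a].discard(b)
        pruned[b].discard(a)
    return active, shortest_path(pruned, src, dst)


class Service:
    """Tracks failed links and selects the path the edge switches forward from."""

    def __init__(self, active: Optional[Path], backup: Optional[Path]) -> None:
        self.active, self.backup = active, backup
        self.failed: Set[FrozenSet[str]] = set()

    def link_down(self, a: str, b: str) -> None:
        self.failed.add(frozenset((a, b)))

    def link_up(self, a: str, b: str) -> None:
        self.failed.discard(frozenset((a, b)))

    def _is_up(self, path: Optional[Path]) -> bool:
        return path is not None and not (path_links(path) & self.failed)

    def forwarding_path(self) -> Optional[Path]:
        # Prefer the active path, fall back to the backup, else the service ends.
        if self._is_up(self.active):
            return self.active
        if self._is_up(self.backup):
            return self.backup
        return None


# Hypothetical five-switch mesh; s1 and s5 play the role of edge switches.
mesh: Graph = {
    "s1": {"s2", "s3"},
    "s2": {"s1", "s3", "s5"},
    "s3": {"s1", "s2", "s4"},
    "s4": {"s3", "s5"},
    "s5": {"s2", "s4"},
}
active, backup = disjoint_pair(mesh, "s1", "s5")
service = Service(active, backup)
```

Since both paths are known before any failure, selecting the backup in this sketch is a constant-time decision, which mirrors the idea that the failover cost is limited to updating a flow rule.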

Simulation Scenario
This section details the simulation scenario that is configured to test the previously presented communication architecture and measure several performance indicators such as delay and data rate.

Proof of Concept Scenario
A proof of concept (PoC) scenario is designed (see Figure 4) according to the aforementioned proposed architecture.
In the control plane, the Open Network Operating System (ONOS) is used as the open-source SDN controller. DisPaC is programmed over the northbound interface of ONOS, but it could be migrated to any other controller.
In the data plane, the core network consists of five OF switches forming a mesh topology. Two of them are the exchange point of WiFi and LTE access networks, and another one is directly connected to the CCE. In the access network, there are six OF-APs with a separation of 600 m between them and a single eNB located in the center between two APs, so the total trajectory is 3 km.
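The geometry of the access network follows from simple arithmetic, sketched below. The placement of the first OF-AP at position 0 and the choice of the two central APs for locating the eNB are assumptions made for illustration; the text only states that the eNB lies midway between two APs.

```python
AP_COUNT = 6          # number of OF-APs along the track
AP_SPACING_M = 600    # separation between consecutive OF-APs

# Assumption: the first OF-AP sits at the start of the trajectory (0 m).
ap_positions_m = [i * AP_SPACING_M for i in range(AP_COUNT)]

# Six APs leave five gaps of 600 m each.
track_length_m = ap_positions_m[-1] - ap_positions_m[0]

# Assumption: the eNB is midway between the two central APs.
mid = AP_COUNT // 2
enb_position_m = (ap_positions_m[mid - 1] + ap_positions_m[mid]) / 2
```

Under these assumptions the track length evaluates to 3000 m, which agrees with the 3 km trajectory stated above.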

Simulation Framework
The tests are carried out in an open-source simulation framework called OpenNet [14] that runs on top of Mininet and ns-3. It offers a Python programming interface to create the topologies. Mininet provides SDN support by emulating OF switches that run Open vSwitch (OVS) and by enabling the connection with an external controller. In turn, ns-3 offers WiFi and LTE channel models, as well as logical entities like OF-APs, eNBs, and the EPC (S/P-GW and MME). The ns-3 simulator relies on a Friis propagation model to represent the 802.11 and LTE outdoor wireless channels. The integration of SDN in WiFi and LTE networks is different. WiFi access points (APs) are fully SDN-based, and they can be managed from the controller, while SDN support in the LTE network lies in the interconnection between traditional EPC entities and eNBs. This means that those entities are treated as end-hosts by the SDN controller, so it cannot install any flow rule to manage their operation.
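For reference, the Friis free-space model relates received power to distance and wavelength as P_r = P_t G_t G_r (lambda / (4 pi d))^2. The sketch below evaluates it in decibels. The transmit power, the 2.4 GHz carrier, the unity antenna gains, and the 300 m distance (half the AP spacing) are assumptions chosen for illustration and are not parameters reported for the simulation.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def friis_rx_power_dbm(
    tx_power_dbm: float,
    distance_m: float,
    frequency_hz: float,
    tx_gain_dbi: float = 0.0,
    rx_gain_dbi: float = 0.0,
) -> float:
    """Received power under the Friis free-space model, in dBm."""
    wavelength_m = SPEED_OF_LIGHT_M_S / frequency_hz
    # Free-space path loss: 20 * log10(4 * pi * d / lambda).
    path_loss_db = 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - path_loss_db


# Assumed values: 20 dBm transmitter, 2.4 GHz carrier, train midway between APs.
rx_dbm = friis_rx_power_dbm(tx_power_dbm=20.0, distance_m=300.0, frequency_hz=2.4e9)
```

In this model, doubling the distance lowers the received power by about 6 dB, since the loss grows with the square of the distance.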

Use Cases
Two use cases are considered to test different configurations of the communication network and obtain a relative comparison of delay and data rate measurements. Table 1 shows a summary of the configuration of each use case. The goal of these use cases is to measure the performance of the novel communication architecture. In consequence, use case A is configured according to a legacy architecture that can nowadays be found in any railway network, such as the one depicted in Figure 1. However, in order to compare equivalent mechanisms, we set a layer-2 network with switches running Spanning Tree Protocol (STP) instead of a more complex layer-3 network based on routers executing MPLS. Alternatively, the configuration of use case B represents the proposed architecture according to Section 3.

Applications
For the scope of this paper, we choose train monitoring and video surveillance applications because they cover telemetry and closed-circuit television (CCTV) services, as reported by [15].

Train Monitoring
In this application, the OBE periodically informs the CCE about the current state of process variables according to a publisher-subscriber paradigm. It works on top of Message Queuing Telemetry Transport (MQTT) [16], a machine-to-machine (M2M) oriented protocol that is specified for low-latency communications and which runs over TCP. In the train, the OBE collects data from the sensors and publishes it on different topics in the CCE. Each topic represents a variable, for example, the temperature of a component. Then, the CCE's subscriber entity receives all the information published on the topics of interest-the ones that it is subscribed to. This application produces an uplink overall data rate of 20 kbps per variable.

Video Surveillance
The video surveillance application is based on a common CCTV service, and it is modeled with the iperf3 tool. In terms of traffic characterization, it is a real-time high-quality (HQ) video stream with a resolution of 1920 × 1080 pixels (Full HD) and coded in H.265, producing an uplink data rate of around 2.5 Mbps per camera.
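The per-variable and per-camera rates of the two applications can be combined to estimate the aggregate uplink demand of a train. The sketch below is a simple sum; the numbers of monitored variables and cameras are hypothetical and are not taken from the simulated scenario.

```python
TELEMETRY_KBPS_PER_VARIABLE = 20.0   # train monitoring, per variable
VIDEO_MBPS_PER_CAMERA = 2.5          # Full HD H.265 stream, per camera


def uplink_demand_mbps(n_variables: int, n_cameras: int) -> float:
    """Aggregate uplink demand in Mbps for the two modeled applications."""
    telemetry_mbps = n_variables * TELEMETRY_KBPS_PER_VARIABLE / 1000.0
    video_mbps = n_cameras * VIDEO_MBPS_PER_CAMERA
    return telemetry_mbps + video_mbps


# Hypothetical train: 50 monitored variables and 4 cameras.
demand = uplink_demand_mbps(n_variables=50, n_cameras=4)
```

With these illustrative counts, the video streams dominate the demand, while telemetry contributes a comparatively small share.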

Measurement Techniques
Delay and data rate measurement tests are carried out over the different use cases. Delay is measured by attaching timestamps to the train monitoring messages, which makes it possible to quantify the E2E latency, including the delay introduced by the transport and application levels. Notably, this method supports measurements over MPTCP because the MQTT protocol maintains the same TCP connection during the whole dialogue. Similarly, the iperf3 traffic generator is used to measure the instantaneous data rate of the communication, as it allows us to set up TCP flows that take advantage of the MPTCP subsystem.
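The timestamp-based delay measurement can be sketched as follows. The payload layout (a JSON object with a "sent_s" field), the topic name, and the helper functions are hypothetical; the sketch also assumes that sender and receiver share a common clock, which holds when both run on the same simulation host but would require clock synchronization in a real deployment.

```python
import json
import time
from typing import Optional


def build_message(topic: str, value: float, now_s: Optional[float] = None) -> bytes:
    """Publisher side: embed the send timestamp in the payload."""
    sent_s = time.time() if now_s is None else now_s
    return json.dumps({"topic": topic, "value": value, "sent_s": sent_s}).encode("utf-8")


def e2e_delay_ms(payload: bytes, now_s: Optional[float] = None) -> float:
    """Subscriber side: E2E delay is the receive time minus the embedded timestamp."""
    received_s = time.time() if now_s is None else now_s
    message = json.loads(payload.decode("utf-8"))
    return (received_s - message["sent_s"]) * 1000.0


# Fixed timestamps are injected here so that the example is reproducible.
payload = build_message("train/axle/temperature", 71.5, now_s=100.000)
delay_ms = e2e_delay_ms(payload, now_s=100.045)
```

Because the timestamp travels inside the application payload, the measured value includes the time spent in the transport and application layers, not only in the network.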

Results
This section presents the results obtained from the aforementioned use case simulations. The objective is to compare the performance of the proposed architecture (case B) with a legacy network (case A) when there is a failure. The failure consists of a core network link going down about 150 s after the beginning of the simulation.
As shown in Figure 5, when the failure happens, case B experiences a negligible additional latency of 1.8 ms, while in case A the connection stops for about 50 s. In addition, Figure 6 depicts how the overall data rate increases in uplink and downlink when the SDN path-computing application (DisPaC) and MPTCP are applied. There is a 6.15% increase in the uplink data rate and a 0.43% increase in the downlink data rate. We consider that the relative improvement obtained from this comparative analysis is independent of the simulation platform and that the comparative values could be extrapolated to a real scenario.
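The percentage increments can be read as a relative change between the two use cases. The expression below is the usual definition and is given as an assumption about how the figures were computed; the symbols R_A and R_B, denoting the mean overall data rate measured in use cases A and B in a given direction, are introduced here only for illustration.

```latex
\Delta R = \frac{R_B - R_A}{R_A} \times 100\,\%,
\qquad \Delta R_{\mathrm{uplink}} = 6.15\,\%,
\qquad \Delta R_{\mathrm{downlink}} = 0.43\,\%
```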

Conclusions
The critical nature of railway T2G communications and the continuous evolution of radio access technologies motivate the need for a reliable and adaptable communication architecture that maintains the performance of legacy systems while offering enhanced functionalities for future services.
The architecture proposed in this paper relies on a 5G technological enabler, SDN, and on MPTCP to provide path diversity and E2E redundancy in order to contribute to a technology-independent and resilient communication service. SDN is a key enabler for addressing network flexibility and adaptability, due to its centralized control and its ability to deal with failures at runtime.
According to the results, the combination of MPTCP and SDN improves the T2G communication performance indicators compared to a legacy approach. The E2E latency remains below 60 ms (for reference, the maximum E2E delay of a user data block in GSM-R is 0.5 s [4]), the available bandwidth increases due to the provision of multiple access technologies, and the failover response time is reduced thanks to the SDN path-computing application DisPaC. DisPaC provides disjoint paths and automatic network reconfiguration with a minimal impact on performance, which improves the resiliency of the network.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: