3.1. The Securebox Architecture
Security in a TCMS falls within the domain of Industrial Control System (ICS) security, which is covered by the IEC 62443 standard [20]. ICS security differs from IT security in that a failure of ICS security can cause physical damage, leading to casualties and property loss. IT security has been the subject of extensive research and best-practice development. Next-generation firewalls [21], machine-learning-based Intrusion Detection/Prevention Systems (IDS/IPS) [22], mobile Virtual Private Networks (VPN) [23], the move from GSM-R to LTE-R [24], quantum cryptography [25], etc., can be implemented to secure the IT network.
However, ICS security raises specific concerns that must be addressed, especially in the railway system. First, a TCMS needs reliable wireless communication with uninterrupted, low-latency network performance. A VPN is not an option, because it adds latency and because train mobility requires the connection to be re-established each time the train moves into a different network coverage area. GSM-R can provide higher data rates according to the eMLPP features in the circuit-switched digital model connection between the train and train control. However, it is prone to man-in-the-middle (MITM) attacks, as the protocol has no end-to-end encryption. LTE-R shares this disadvantage even though it provides a higher data rate: GSM-R and LTE-R only secure data between the mobile station (MS) and the base station (BS). Moreover, LTE-R has not been deployed in all rural areas along the railway, which makes it unsuitable for many TCMSs. Advanced firewalls and IDS/IPS can reactively protect the IT network with higher detection accuracy. However, they only inspect incoming data at a single node, not the data in transit across the network, and it is not feasible to install these devices at every node or along the railway network. For data security, advanced encryption research takes the form of asymmetric and quantum cryptography. However, their higher computational cost does not suit the limited computational resources of the existing ICS environment in a TCMS.
According to the ANSI/ISA 99 standard [26], the ICS security reference model is mapped onto levels 3 to 1. Level 3 covers operation management, which is the TCMS; level 2 is supervisory control, in the form of the monitoring function; and level 1 is the basic control of the on-board VCU in the train. Thus, we need a specific strategy to cope with the ICS security requirements [27]. This research proposes a Securebox architecture implemented in a Securebox HSM. To cover the scientific and industrial requirements, the architecture was developed with the following considerations:
Modular: the Securebox works as an add-on for an existing TCMS. Under this consideration, the TCMS and the Securebox HSMs were developed and installed independently, without adding processing load to the TCMS or disturbing the train installation and operation.
Open and independent: the Securebox was developed according to an open communication system standard and is independent of the operating system. The application can therefore run on any operating system and any data network.
Secure: the security aspect was developed in terms of end-to-end data security. The cryptography algorithm encrypts plaintext data from the onboard train system, and the encrypted data are decrypted at the data server. The implementation uses AES, a standard algorithm for modern cryptography, as its main cryptography algorithm.
As it would be a critical dedicated device, the development must meet the railway reliability standards; the applicable standards are EN 50155 [28] and EN 50121-4 [29], covering signaling and telecommunication apparatus installed inside the railway environment.
The Securebox architecture is shown in Figure 3. It is composed of four main functions: network management, buffer management, data management, and security management. The main functions are described below, from the bottom to the top.
1. Network management
The network management function is responsible for successful communication between the Securebox HSMs and the data server. It handles the messages passing through the communication channel regardless of the transmission media. This function is responsible for network addressing, the logical link control protocol, and the medium access control protocol. It also handles any need for virtual or logical links on the network. The virtual connection can take the form of a peer-to-peer, client-server, or virtual network that provides a secure communication channel between a mobile VCU and the data server.
2. Buffer management
The buffer management function buffers the messages received from the VCU in the train until they have been synchronized and transmitted to the data server. In the TCP/IP protocol suite, this data buffering function is implemented in the application layer. An embedded synchronization algorithm maintains data transmission from the Securebox HSMs to the data server.
3. Data management
The data management function was developed because the HSMs will be installed in a moving train. If the train passes through a poorly connected region, the HSMs might be unable to contact the server and hence would not send the data. Therefore, data management is responsible for the synchronization and data parsing functions. Synchronization preserves data integrity by ensuring that the data are successfully transmitted to the data server even if the train passes through a poorly connected region. Data parsing removes any unneeded characters from the data and splits the data at the data separator. The data separator is a character that marks the end of one data entry and the beginning of the next.
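The parsing task described above can be sketched in Go as follows. This is a minimal illustration rather than the Securebox code: the separator character (`;` in the usage example), the choice to treat non-printable characters as "not needed", and the function name `parseEntries` are all assumptions, since the text does not specify them.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// parseEntries cleans a raw payload and splits it into data entries.
// The separator is passed in because the paper does not name the character used.
func parseEntries(raw string, sep rune) []string {
	// Assumption: "characters that are not needed" means non-printable ones
	// (NUL bytes, carriage returns, line feeds, and so on).
	cleaned := strings.Map(func(r rune) rune {
		if r == sep || unicode.IsPrint(r) {
			return r
		}
		return -1 // a negative value drops the character
	}, raw)

	var entries []string
	for _, field := range strings.Split(cleaned, string(sep)) {
		field = strings.TrimSpace(field)
		if field != "" { // skip the empty piece after a trailing separator
			entries = append(entries, field)
		}
	}
	return entries
}

func main() {
	raw := "1650000000;12.5;\x00 0.8;\r\n1;"
	fmt.Println(parseEntries(raw, ';')) // prints [1650000000 12.5 0.8 1]
}
```

Passing the separator as a parameter keeps the sketch independent of the actual VCU file layout.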
4. Security management
Cryptography is the function responsible for data confidentiality and authenticity. It preserves confidentiality by encrypting the plaintext data from the VCU before they are sent to the data server. The data are turned into unreadable ciphertext so that an attacker cannot read the plain data without the cryptographic key. User access control was implemented at the user and hardware levels to meet the authenticity requirement. Security management must also ensure the security of the operating system and its applications.
3.2. The Securebox HSM Design
To fulfill the modular requirement, the Securebox HSM was designed to work independently, enhancing the existing process without adding processing load to the existing device. In testing, the existing onboard train VCU was a Programmable Logic Controller (PLC)-based Selectron CPU 833-TG, which was already set up to send data using the File Transfer Protocol (FTP). Thus, the developed HSM must act as an FTP server for the VCU and as a client for the data server. In this scheme, the Securebox HSM must maintain all data transmission even if bad or blank connection spots occur along the railway. The HSM then sends the data to the data center via the Hypertext Transfer Protocol (HTTP), using one TCP session (handshake) for each packet.
The Securebox HSMs were developed based on the x86 computer architecture [30] and the standard TCP/IP protocol [31] for communication. The x86 computer architecture was chosen because major TCMS devices have been developed to support various computer platforms. Likewise, the TCP/IP protocol standard was chosen because almost all TCMS devices support it. The architecture is implemented independently for each function layer. The HSMs were set up in two modes: a train-side HSM in the onboard train system and a server-side HSM. The workflow of the Securebox HSMs is depicted as a flowchart in Figure 4.
In the train-side HSM, the VCU sends two kinds of files, *.txt and *.dds, to the Securebox. The *.txt file contains routine plaintext data, which are sent periodically to the Securebox (in this case, every 5 s). The *.dds file contains plaintext data that are sent on request or when triggered by a registered event. The Data Flow Diagram (DFD) for a TCMS that implements the Securebox architecture can be seen in Figure 5 for DFD level 0, in Figure 6 for DFD level 1, and in Figure 7 for DFD level 2.
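As a rough illustration of how the train-side HSM could tell the two file kinds apart, the Go sketch below routes an incoming file by its extension. The type and function names are hypothetical, and the case-insensitive comparison is an assumption; files with any other extension are simply marked as not processed further, matching the validation scenarios in Section 3.4.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// fileKind labels what the HSM should do with an uploaded file.
type fileKind int

const (
	kindIgnored  fileKind = iota // any other file: stored, but not processed further
	kindPeriodic                 // *.txt: routine data sent every 5 s
	kindEvent                    // *.dds: data sent on request or on a registered event
)

// classify inspects only the file extension (compared case-insensitively,
// which is an assumption of this sketch).
func classify(name string) fileKind {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".txt":
		return kindPeriodic
	case ".dds":
		return kindEvent
	default:
		return kindIgnored
	}
}

func main() {
	for _, name := range []string{"log_0001.txt", "trip.dds", "notes.pdf"} {
		fmt.Println(name, classify(name))
	}
}
```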
3.3. The Secure TCMS Setup
The train-side Securebox HSM was installed on a diesel-electric train produced by the Indonesian Rolling Stock Industry, and the server-side Securebox HSM was installed in the colocation server in the Telkom University data center. The hardware was a certified railway computer with an x86 64-bit i7-7600U processor and 16 GB of DDR4 RAM. This platform had sufficient computing power to run the Securebox application and could withstand harsh railway conditions. The prototype was connected to the VCU using a railway-standard Ethernet cable and the TCP/IP protocol. The Securebox HSMs ran two algorithms, 3DES and AES, in Cipher Block Chaining (CBC) block cipher mode. The testing was intended to measure the performance of our architecture compared to the existing state of the art [7]. However, in this test, 3DES was run on an x86 computing platform, not in the IoT-based TCMS, because the Securebox must comply with the EN 50155, EN 50121-4, and IEC 61373 device standards.
The VCU collected real-time data from sensor devices in the train and sent the data every 5 s as a *.txt file to the train-side Securebox HSM. The HSM then connected to the public network through a 4G-LTE modem on the train. The test was done from Bandung to Singaparna station on Java Island in Indonesia. The trip took almost 1 h 15 min and passed through urban, suburban, and rural conditions, which made it possible to evaluate the performance of the secure TCMS. However, the *.dds data were sent just once after the trip finished, because they were an aggregate of all sensor data along the trip. The longer the journey, the greater the amount of *.dds data to be sent, which is much greater than the *.txt data. For integrity and quality of service purposes, these data were set to be sent only once, after the trip finished.
The secure TCMS setup was carried out at two sites: an onboard train system setup and a data server setup. The setup details follow.
1. Onboard train system setup
In the onboard train system setup, the sensor network of the train device sensors was connected to a Remote I/O Module (RIOM) in each train carriage and then to one VCU located in the train control room. The VCU data output was then transmitted to an HSM via Ethernet protocols. In the HSM, the Securebox application was installed and programmed according to the Securebox architecture. The application handled the functional architecture as described below.
a. Network management
The networking function used two Network Interface Cards (NICs). The first NIC faced the VCU, and the second faced the 4G modem device. It used the Ethernet protocol, which is broadly used in TCMSs. A Shielded Twisted Pair cable was chosen as the Ethernet transmission medium, as it has proved to be more resistant to electromagnetic interference than the wireless media used by wireless protocols. Thus, it complied with the existing infrastructure in the TCMS and did not interfere with the TCMS operation.
b. Buffer management
The data received from the VCU were buffered in an FTP server. After a file had been processed by data management and security management, the data were sent to the data server via HTTP at a specific decryption port address. When the synchronization function encounters an unsuccessful HTTP transmission, the data are kept in the buffer and sent once the connection is available again. This function is also responsible for maintaining data integrity by applying a hash function to every datum.
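The per-datum integrity check described above could look like the following Go sketch. The paper does not name the hash function, so SHA-256 is an assumption here, as are the in-memory buffer and the names `entry`, `buffer`, `Put`, and `Verify`; a device on a train would more plausibly persist the buffer to disk.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entry is one buffered datum together with its integrity digest.
type entry struct {
	Payload []byte
	Digest  string // hex-encoded SHA-256 of Payload (assumed hash function)
}

// buffer is a simple in-memory FIFO used only for illustration.
type buffer struct {
	entries []entry
}

// Put copies the payload into the buffer and records its digest.
func (b *buffer) Put(payload []byte) entry {
	sum := sha256.Sum256(payload)
	e := entry{
		Payload: append([]byte(nil), payload...),
		Digest:  hex.EncodeToString(sum[:]),
	}
	b.entries = append(b.entries, e)
	return e
}

// Len returns the number of buffered entries.
func (b *buffer) Len() int { return len(b.entries) }

// Verify reports whether the payload still matches its recorded digest.
func (e entry) Verify() bool {
	sum := sha256.Sum256(e.Payload)
	return hex.EncodeToString(sum[:]) == e.Digest
}

func main() {
	var b buffer
	e := b.Put([]byte("1650000000;1;0;1"))
	fmt.Println(b.Len(), e.Verify()) // prints 1 true

	e.Payload[0] ^= 0xFF    // simulate corruption of the buffered datum
	fmt.Println(e.Verify()) // prints false
}
```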
c. Data management
Securebox acted as an FTP server for the VCU. The synchronization and parsing functions that process each received file were developed in the Go language. Go was designed for multicore computers; therefore, it can maximize the performance of a multicore CPU [32]. The synchronization function manages the connection to the data server and interacts with the data buffer. If the transmission succeeds, the data are deleted from the buffer; if the connection is lost, the data remain in the buffer until the connection is restored and the transmission succeeds. The data from the VCU were sent via HTTP, using one TCP session (handshake) for each packet. The parsing function separated the data according to the specific sensor data recorded in the VCU.
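The keep-until-delivered behavior of the synchronization function can be sketched as below. This is an illustration under stated assumptions, not the authors' code: `flush`, `sendFunc`, `httpSender`, and `flakySender` are hypothetical names, the content type and timeout are placeholders, and disabling HTTP keep-alives is only one possible reading of "one TCP session for each packet".

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// sendFunc transmits one payload and returns an error if it was not delivered.
type sendFunc func(payload []byte) error

// httpSender posts each payload over its own TCP connection (keep-alives off).
// The URL is supplied by the caller; the content type is a placeholder.
func httpSender(url string) sendFunc {
	client := &http.Client{
		Timeout:   5 * time.Second,
		Transport: &http.Transport{DisableKeepAlives: true},
	}
	return func(payload []byte) error {
		resp, err := client.Post(url, "application/octet-stream", bytes.NewReader(payload))
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("data server replied %s", resp.Status)
		}
		return nil
	}
}

// flakySender simulates network coverage: it fails while *online is false.
func flakySender(online *bool) sendFunc {
	return func([]byte) error {
		if !*online {
			return errors.New("no connection")
		}
		return nil
	}
}

// flush sends buffered payloads in order. A payload leaves the buffer only
// after a successful send; on the first failure the rest stay for a retry.
func flush(pending [][]byte, send sendFunc) ([][]byte, error) {
	for i, p := range pending {
		if err := send(p); err != nil {
			return pending[i:], err
		}
	}
	return nil, nil
}

func main() {
	online := false
	send := flakySender(&online) // httpSender(url) would be used against a real server
	pending := [][]byte{[]byte("record-1"), []byte("record-2")}

	pending, err := flush(pending, send)
	fmt.Println(len(pending), err) // prints 2 no connection

	online = true // the train re-enters coverage
	pending, err = flush(pending, send)
	fmt.Println(len(pending), err) // prints 0 <nil>
}
```

Separating the transport (`sendFunc`) from the retry logic lets the blank-spot behavior be exercised without a live network.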
d. Security management
The Securebox implemented the AES and 3DES symmetric cryptography algorithms, which were implemented in the Go language. AES was used as the main cryptography algorithm, and 3DES was used for comparison, based on the research in [7]. A symmetric algorithm was used because it is lightweight and requires no key exchange during operation: the key is set up once, during the Securebox registration process at the data server. Symmetric-key cryptography also offers a broader choice of algorithms. In this implementation, AES in Cipher Block Chaining (CBC) mode is used, as it is a standard algorithm for modern cryptography [33]. When the application is configured as a train-side HSM, it works in encryption mode; when it is configured as a server-side HSM, it works in decryption mode.
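The two modes can be illustrated with Go's standard `crypto/aes` and `crypto/cipher` packages. The sketch below is an assumption-laden example rather than the Securebox implementation: the 256-bit key size, PKCS#7 padding, the random IV prepended to the ciphertext, and the function names are choices made here, because the paper does not state them.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"io"
)

// pad applies PKCS#7 padding so the data fill whole AES blocks (assumed scheme).
func pad(data []byte, blockSize int) []byte {
	n := blockSize - len(data)%blockSize
	return append(append([]byte(nil), data...), bytes.Repeat([]byte{byte(n)}, n)...)
}

// unpad validates and removes PKCS#7 padding.
func unpad(data []byte, blockSize int) ([]byte, error) {
	if len(data) == 0 || len(data)%blockSize != 0 {
		return nil, errors.New("invalid padded length")
	}
	n := int(data[len(data)-1])
	if n == 0 || n > blockSize {
		return nil, errors.New("invalid padding")
	}
	for _, b := range data[len(data)-n:] {
		if int(b) != n {
			return nil, errors.New("invalid padding")
		}
	}
	return data[:len(data)-n], nil
}

// encrypt is the train-side mode. The output layout is IV || ciphertext.
func encrypt(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	padded := pad(plaintext, aes.BlockSize)
	out := make([]byte, aes.BlockSize+len(padded))
	iv := out[:aes.BlockSize]
	if _, err := io.ReadFull(rand.Reader, iv); err != nil {
		return nil, err
	}
	cipher.NewCBCEncrypter(block, iv).CryptBlocks(out[aes.BlockSize:], padded)
	return out, nil
}

// decrypt is the server-side mode. It expects IV || ciphertext.
func decrypt(key, data []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	if len(data) < 2*aes.BlockSize || len(data)%aes.BlockSize != 0 {
		return nil, errors.New("ciphertext has invalid length")
	}
	iv, ct := data[:aes.BlockSize], data[aes.BlockSize:]
	plain := make([]byte, len(ct))
	cipher.NewCBCDecrypter(block, iv).CryptBlocks(plain, ct)
	return unpad(plain, aes.BlockSize)
}

func main() {
	key := bytes.Repeat([]byte{0x42}, 32) // demo key only; a real key comes from registration
	ct, err := encrypt(key, []byte("1650000000;1;0;1"))
	if err != nil {
		panic(err)
	}
	pt, err := decrypt(key, ct)
	fmt.Println(len(ct), string(pt), err)
}
```

CBC on its own provides confidentiality only; in this design any integrity check would have to come from elsewhere, such as the per-datum hash kept by buffer management.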
2. Data server setup
The data server setup consisted of a server-side HSM and a data server. The security management function in the server-side HSM decrypts the ciphertext to obtain the plaintext. The plaintext data are then saved in a database according to the data parsing position. The database is crucial because it manages the data in the data server, and its contents are displayed visually by the monitoring function. The data in the database will be processed further for analytical purposes. The data server manages the plaintext received from the server-side HSM and provides the database function and the monitoring function.
a. Database function
The database function serves as a repository for the plaintext data from the server-side HSM. In this test, PostgreSQL was used, as it is a high-reliability database platform that runs on almost all operating systems [34]. The database consists of one table. The table contains the Epoch time, which records when the data were generated by the VCU, and 12 parameters, v1 to v12, which hold the data parameters generated by the VCU. The data are generated as binary numbers. Here, the data are saved in plaintext since the data server handles the security aspect. Further analytical processing of the data is not discussed in this paper and will be covered in a further report.
b. Monitoring function
The monitoring function displays all sensor status data in the TCMS. To receive the data, the monitoring function connects to the database. The connection can take the form of a direct database query or an API. For security reasons, an API is preferable since it lessens the probability of SQL injection attacks [35]. An API can be either PULL or PUSH [36]. The monitoring was done on a web-based platform. To connect to the database, a pull API was developed, through which the monitoring application can request data from the database according to its needs and according to the data served by the API endpoint.
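A pull endpoint of this kind might be sketched in Go as follows. Everything specific here is hypothetical: the `/latest` path, the JSON field names, the `store` interface, and the in-memory `memoryStore` that stands in for PostgreSQL. The point of the sketch is that the client can only ask for what the endpoint serves and never supplies SQL text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// reading mirrors one table row: the epoch time plus the parameters v1..v12.
type reading struct {
	Epoch  int64    `json:"epoch"`
	Values []string `json:"values"`
}

// store hides the database behind a fixed lookup, so request input never
// reaches a query string.
type store interface {
	Latest() (reading, error)
}

// memoryStore is an in-memory stand-in for the database.
type memoryStore struct{ last reading }

func (m memoryStore) Latest() (reading, error) { return m.last, nil }

// latestHandler serves the most recent reading as JSON on GET requests.
func latestHandler(s store) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		rd, err := s.Latest()
		if err != nil {
			http.Error(w, "lookup failed", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(rd)
	})
}

// query performs one in-process request against the handler, standing in for
// the monitoring application pulling data over the network.
func query(h http.Handler, method string) (int, reading, error) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, "/latest", nil))
	var rd reading
	if rec.Code != http.StatusOK {
		return rec.Code, rd, nil
	}
	err := json.NewDecoder(rec.Body).Decode(&rd)
	return rec.Code, rd, err
}

func main() {
	s := memoryStore{last: reading{Epoch: 1650000000, Values: []string{"1", "0", "1"}}}
	code, rd, err := query(latestHandler(s), http.MethodGet)
	fmt.Println(code, rd.Epoch, rd.Values, err) // prints 200 1650000000 [1 0 1] <nil>
}
```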
3.4. Validation
1. Blackbox Testing
In blackbox testing, the system was analyzed based on application details, i.e., the functions that exist in the application. This test did not inspect or test the source code of the program. It analyzed whether the function flow of the system suits the business processes of the Securebox architecture, which are:
Data connection. The data connection function was tested in three scenarios using an internet connection and a TCP connection. The result can be seen in Table 1.
Parsing data. The parsing data function validation used several scenarios: incoming data to Securebox HSMs, the data separation based on file extension, the contents of the *.txt file parsed by the parsing module, and the contents of the *.txt file entering the temporary buffer. The validation result is shown in Table 2.
Data synchronization. The data synchronization function was validated in three scenarios: the HSM and data server contact, sending synchronization messages from HSM, and replying to synchronization messages. These scenarios are very important due to the possibility of a lost connection along the train trip. The synchronization validation output is shown in Table 3.
Data encryption and transmission. The data encryption and transmission functions were evaluated in several scenarios: detecting the *.dds file, encrypting the *.dds file, sending the encrypted *.dds file, loading the content of the *.txt file, encrypting the content of the *.txt file, and sending the encrypted *.txt file. The result is shown in Table 4.
Data receiving and decryption. The data receiving and decryption function runs in the data server. Three scenarios were used to evaluate the output: receiving the ciphertext (encrypted data), decrypting the ciphertext, and detecting the plaintext output. The validation result is shown in Table 5.
From the blackbox testing, the secure TCMS was successfully implemented and met all the software development life cycle requirements. In this behavioral test, all the required functions in the prototype worked and were integrated properly in all testing scenarios. The Securebox HSM connected to the VCU, secured all of its data, and transmitted them securely to the data server. The secured data were also successfully received and stored in the data server database.
2. Whitebox Testing
In whitebox testing, several scenarios were deployed to verify the actual output of the system modules’ program code.
Data sent from the VCU to the train-side Securebox HSM. This was tested by exercising the specific program code that supports the HSM FTP server. The scenarios were: a *.dds file sent from the VCU to HSM buffer management, a *.txt file sent from the VCU to the HSM buffer management and parsing functions, and other files sent to the HSM file folder but not processed further. The result is shown in Table 6.
Establish connection. This was tested by observing the specific program code that supports secured connection testing from the onboard train system to the data server. The result is shown in Table 7.
Data synchronization. Data synchronization testing was done by testing the specific program code that supports the integrity functionality. The testing used four scenarios: sending synchronization data to the data server while the connection was available, sending synchronization data with no connection available, sending synchronization data after the connection had been lost for 1 min, and sending synchronization data after the connection had been lost for 5 min. The scenarios and results are available in Table 8.
Encrypt and transmit data. The encryption and transmission testing was done in the train-side HSM to ensure that the program code met the confidentiality requirement. Several scenarios were tested: encrypting the *.dds file, encrypting the *.txt file, and sending data when the onboard train system was connected to the 4G network. The result is shown in Table 9.
Data decryption. The decryption testing was done in the server-side HSM, right after the data were received. The testing analyzed the program code to meet the required output for the decryption of the *.txt and *.dds files. The output can be seen in Table 10.
From the whitebox testing, the program code of the Securebox application gave all the expected results. The whitebox analysis showed that all modules and their program code were checked and analyzed and produced the expected output. No malfunction or incorrect program code that could produce a system error was found.