Trusted Time-Based Verification Model for Automatic Man-in-the-Middle Attack Detection in Cybersecurity

Due to the prevalence and constantly increasing risk of cyber-attacks, new and evolving security mechanisms are required to protect information and networks and to ensure the basic security principles of confidentiality, integrity, and availability, referred to as the CIA triad. While confidentiality and integrity can be achieved using Secure Sockets Layer (SSL)/Transport Layer Security (TLS) certificates, these depend on the correct authentication of servers, which could be compromised by man-in-the-middle (MITM) attacks. Many existing solutions have practical limitations due to their operational complexity and deployment costs, as well as the adversaries they face. We propose a novel scheme to detect MITM attacks with minimal intervention and workload for the network and systems. Our proposed model applies a novel inferencing scheme for detecting true anomalies in transmission time at a trusted time server (TTS) using time-based verification of sent and received messages. The key contribution of this paper is the ability to automatically detect MITM attacks with trusted verification of the transmission time using a learning-based inferencing algorithm. When used in conjunction with existing systems, such as intrusion detection systems (IDS), which require extensive configuration and incur network resource costs, it can provide a robust solution that addresses these practical limitations and reduces costs by providing additional assurance.


Introduction
A Digital Certificate (DC), also known as a Secure Sockets Layer (SSL) certificate, has been used extensively in the cybersecurity domain and operates based on the public key infrastructure (PKI) with public key cryptography. It provides authentication, privacy, confidentiality, encryption, and digital signatures. It uses a private key for signing and a public key for verification, along with the identification (ID) of the certificate authority (CA) and of the user who requested the DC. While DCs generally work well in providing authentication and ensuring the integrity of digital transactions, they are abused for malware diffusion, cyber espionage, and sabotage of targeted or well-known sites. For example, state-sponsored hackers may have a great interest in demonstrating their capability by attacking a trusted CA, as highlighted by the Comodo certificate hack. A well-known CA, Comodo, issued fraudulent DCs for nine websites, including Google, Microsoft, Skype, Yahoo, and other major sites [1]. An Iranian hacker, suspected of being state-sponsored, hacked a registration authority (RA), a reseller of Comodo certificates, and claimed responsibility by posting about the incident at pastebin.com in 2011 [2]. This shows that even hardened security measures and reputable DC providers can be vulnerable and are not free from man-in-the-middle (MITM) attacks.
While a digital signature (DS) provides authenticity and integrity, it does not provide confidentiality, as it relies on a public key that is accessible to any user for decryption. PKI with symmetric key distribution is used to provide confidentiality. Multifactor authentication (MFA) has been introduced to harden security, with increased protection and prevention against any adversary attempting to gain access to a system. One-time passwords (OTP), timestamps, and nonces are used to prevent replay attacks. Despite these security measures and technologies, MITM attacks are known to be one of the most common and dangerous intrusions.
MITM attackers eavesdrop on the communication between two targets, and the attack is hard to detect. In particular, within an intranet environment, an adversary can freely browse or intercept confidential information. To prevent the attack, an intrusion detection/prevention system (IDS/IPS) can be used. However, several studies in the literature concur that such third-party solutions exhibit practical limitations due to persistent system complexity, escalating deployment costs, and several inconveniences, including business productivity losses due to high false alarm rates [3,4]. To address these issues with the aim of ensuring secure information communications, we propose a novel inference scheme using trusted time-based verification for automatically detecting MITM attacks.
The rest of the paper is organized as follows. Section 2 highlights selected related works and the gaps in the literature. Section 3 presents the research background describing the research objectives. In Section 4, we propose a novel approach to detect MITM attacks based on a transmission time-based verification approach. Section 5 describes the implementation of our model with the algorithms developed for this study. Finally, we provide the conclusions and future works in Section 6.

Related Works
Several studies using time-based verification for authentication have been reported in the literature. In the past, approaches that combine various pieces of information, such as a personal identification number and a verifier, were adopted to generate an authentication code within a single time interval [5]. To address their privacy drawbacks, more recently, the time-based one-time password (TOTP), as an extension of the HMAC-based OTP, was employed across a wide range of applications, such as remote virtual private network access, Wi-Fi logon, transaction-oriented web applications, and internet banking [6][7][8]. To enforce MFA, a mobile app on a smart device is used to display a QR code which contains two time-based codes. These codes are then used for login verification, in addition to the existing login mechanism, by converting the OTP to app-based authentication [9,10]. These approaches require manual intervention.
Network Time Protocol (NTP) has been used effectively to provide time synchronization between symmetric peers in a network. Starting from a basic pre-shared key scheme, authentication approaches have been explored, including the introduction of an autokey authentication protocol (RFC 5906) using PKI mechanisms [11]. An NTP Working Group released a draft for network time security for NTP using Transport Layer Security (TLS) and authenticated encryption with associated data (AEAD) for cryptographic security [12]. However, deploying such solutions is complex and expensive, leading to several practical limitations. Overall, existing approaches are quite inconvenient and lack automatic and seamless detection of MITM attacks. This forms the prime motivation of our research study.
While existing systems such as IDS/IPS may require complex network configuration for packet inspection and result in overhead, the proposed solution can be implemented with minimal impact on existing protocols. Existing approaches to the detection of such attacks have limitations, such as high implementation costs, whereas the proposed approach does not require complex procedures and is inexpensive. The proposed inference procedure does not require modifications to existing protocols other than the minimal addition of a message prompting subscription to the feature. When the proposed solution is used along with an existing IDS/IPS, it will help security operators reduce false positives (FPs) because they are provided with additional assurance from the solution. An average security professional can only deal with a limited number of issues per day, whereas current IDSs in typical company networks can generate hundreds of alarms.
There have been developments in IP piracy prevention that can be applied to hardware Trojan attacks. Yu Bi et al. simulated camouflaging gates, polymorphic gates, and power regulators to demonstrate the high efficiency of silicon nanowire field effect transistors (FET) and graphene symmetric tunneling FETs (SymFET) in these applications. They evaluated the unique properties of the new devices and their ability to protect circuit designs and counter IP piracy [13].

Research Background
In today's digital business world, secured and trusted information communication in a network is important, and this is offered by the SSL/TLS protocol security mechanisms. However, SSL/TLS relies on the correct authentication of the server using digital certificates, which could be tampered with by an adversary using several approaches that lead to MITM attacks. There are several real-life instances of illegitimately generated digital certificates. Furthermore, a hacker could intercept a conversation between the sender and receiver of a message to access the information for impersonation or tampering of contents. We studied this problem to arrive at a robust and easy-to-deploy solution, which is the primary focus of this paper.
We considered the abovementioned practical limitations in order to arrive at the overarching question, given below, that guided this research: Can we design an automatic trusted model that adopts an intelligent mechanism to employ time-based verification of sent and received messages to detect MITM attacks?
In order to answer the above research question, we describe a typical problem scenario with a proposed high-level solution approach given below: When a user requests the issuance of a DC from a CA, the user may select the proposed option of using a timestamp check at a trusted time server (TTS), which will collect and verify the transmission time between when the message is sent by the sender and when it is received by the receiver. The outcome of the verification is to advise the user of any suspicious MITM activity based on the difference between the timestamps and preset thresholds, which can be maintained by ongoing updates. The automatic inference scheme is developed using a self-adaptive learning approach.

Proposed MITM Detection Model
We propose the use of time-based verification of transmission along with an inferencing database at a TTS to automatically and effectively detect suspicious activities from MITM attacks. We present our proposed model here by modeling the problem scenario and the types of MITM attacks detected by the model.

Modeling the Problem Scenario
Let us consider sender A (Alice) who transmits a message to receiver B (Bob) as shown in Figure 1. We assume that the information about the message, such as source and destination IP addresses, message ID, etc., can be simultaneously sent to a TTS which is trustworthy and records the sent timestamp, $T_S$. When the message arrives at the receiver, B sends the message information to the same TTS, which also records the received timestamp, $T_R$. The TTS then calculates the difference between the sent and received times, which is denoted as $T_D$. This information is used to determine whether the message has been tampered with by comparing it with a predefined threshold value for tolerance along with appropriate learning algorithms within the Inference Database. This threshold table is adjusted continuously using the round-trip time (RTT) obtained from the sender, who will ping before and/or after the message transmission. If required, the receiver may send the ping results to the TTS to check the network status. When suspicious activity has been identified as an attack, the TTS adds the case to the existing database to record the intruder's details for future use. The TTS also derives the network status from the RTT within the network and maintains it as separate data. All these data can be used to create a situation determination table, which will be used to determine a suspected intrusion.
We consider a possible scenario of MITM intrusion and describe a high-level overview of our proposed MITM detection model, as shown in Figure 1. The steps involved in our model are explained below:
Steps 1, 2, and 3: When A sends a message to B, the information about the message is also sent to a TTS. The server records the timestamp and responds to A with an acknowledgment.
Steps 4 and 5: When the message is received by the recipient, B also sends the information about the message to the TTS. The server records the timestamp and responds to B with an acknowledgment.
Steps 6 and 7: The TTS compares the sent and received timestamps against a preset threshold table to see whether the message has been delayed for longer than the expected transmission time. If it detects a suspicious event by checking the situation determination database, it sends an alert notification to the respective parties, A and B, about the possibility of an MITM attack. However, in order to avoid any false notification, the Inference Database is continuously updated with threshold table values using intelligent learning approaches. Sender A pings receiver B before and/or after the message transmission to obtain the response time, e.g., the RTT, which will be used for updating the threshold table. If there was an MITM attack, there must be a sufficient time delay between the sent and received timestamps to constitute a true anomaly, thereby confirming the inference.
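The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation (which is in Java in Appendix B): the class name `TrustedTimeServer`, its methods, and the single fixed `threshold_ms` are hypothetical simplifications, and the learning-based threshold table is reduced to one number.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class TrustedTimeServer:
    """Toy stand-in for the TTS; every name here is hypothetical."""
    threshold_ms: float                           # tolerated transmission time
    clock: Callable[[], float] = time.monotonic   # seconds; injectable for tests
    sent: Dict[str, float] = field(default_factory=dict)

    def record_sent(self, message_id: str) -> str:
        # Steps 1-3: store T_S for this message and acknowledge to A.
        self.sent[message_id] = self.clock()
        return "ACK"

    def record_received(self, message_id: str) -> Optional[float]:
        # Steps 4-5: take T_R and return T_D = T_R - T_S in milliseconds.
        t_s = self.sent.get(message_id)
        if t_s is None:
            return None  # no matching sent record for this message ID
        return (self.clock() - t_s) * 1000.0

    def is_suspicious(self, t_d_ms: float) -> bool:
        # Steps 6-7: flag the message when the delay exceeds the threshold.
        return t_d_ms > self.threshold_ms
```

Injecting the clock keeps the sketch testable; a real TTS would rely on its own synchronized time source rather than on timestamps supplied by the hosts.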

Model Intelligence to Counter Evasive MITM Attacks
As shown in Figure 1, an MITM adversary (Eve) may try to emulate arbitrary latency so that additional time is gained to intervene in the delivery of the message. Once Eve inserts a synchronous pass-through switch, she can slowly increase the latency by merely storing and forwarding until both the sender and receiver are accustomed to the delay, perceiving only the slightest degradation in latency, just as the adversary intended. To address this type of evasive MITM attack, our model uses the inference engine to detect an abnormally delayed response time. The TTS will infer the likelihood of the latency by continuously monitoring the network, as well as by learning from previous experiences and historical data.
An MITM adversary may adopt an evasive approach that involves manipulating timestamps. When the message profiles from A and B have arrived at the TTS, our model is designed to check the validity of the timestamps, as these might have been replayed or might have become invalid with elapsed time. To verify this, tools such as Casper [14] can be employed to validate the timestamps using the concept of age-stamps, where each timestamp specifies how long ago it was created.
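As a rough illustration of the age-stamp idea, the following sketch accepts a message profile only if its reported age is within a tolerated window and its message ID has not been seen before. It does not use or model Casper; the class `FreshnessChecker`, the `max_age_ms` parameter, and the replay check by message ID are all assumptions made for this example.

```python
from typing import Set


class FreshnessChecker:
    """Hypothetical age-stamp check: reject stale or repeated profiles."""

    def __init__(self, max_age_ms: float) -> None:
        self.max_age_ms = max_age_ms
        self._seen: Set[str] = set()

    def accept(self, message_id: str, age_ms: float) -> bool:
        # An age-stamp states how long ago the timestamp was created.
        if not 0.0 <= age_ms <= self.max_age_ms:
            return False  # negative or too old: treat as invalid
        if message_id in self._seen:
            return False  # same ID presented again: treat as a replay
        self._seen.add(message_id)
        return True
```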
Another evasive technique is one in which the adversary, such as Eve in Figure 1, monitors the communication channel and redirects all communication to fake sites to collect private information, such as the bank account details of Alice or Bob when they log into their bank's website. This can be avoided by using a TLS certificate, with additional checks on the use of HTTPS and further hardening with the proposed solution.
Our model makes use of intelligent learning approaches to counter the various evasive approaches that an MITM adversary may adopt. Furthermore, false inferences are avoided even with genuine network delays between A and B. For this, synchronization between the TTS and the network (including A and B) is important and can be achieved by employing network time security (NTS) in the network time protocol (NTP) [12]. Particularly when A and B are geographically far from each other and their communication must travel long distances, a false alarm may be triggered as a result of the delay from passing through various heterogeneous networks, such as cellular networks. This type of issue can be inferred by the TTS based on previous learning experiences gained in the network.

Implementation of Time-Based Verification Model
This section describes the model implementation and algorithms.

Model Implementation
When a DC issuance is requested, the user can select an option to use TTS-based timestamp verification using a new file type or extension. There are various file types and extensions that are used for DS between a server and a client, such as the personal information exchange format (.pfx), also known as public key cryptography standards #12 (PKCS#12, .p12), to export a certificate and its private key; a certificate signing request (.csr) to submit a request to a CA; a Base64-encoded X.509 certificate (.cer or .crt) for a single certificate; a certificate revocation list (.crl) to identify revoked certificates; a certificate trust list (.stl); privacy-enhanced electronic mail (.pem); and a private key (.key) [15]. Our model implementation activates a new file type or extension, such as (.tbv), that can be used for time-based verification. For instance, when this type or extension is selected during a DC setup process between a server and a client, a TTS is appointed in the same way as a CA is selected, e.g., using (.stl). This way, as an option, all communications between the server and the client will send the message profile to the TTS to check for MITM intervention.
To minimize the overhead on the network protocols, the TTS will only notify the related parties of suspicious activity when it has identified a likely attack. To improve the accuracy, the TTS is modeled to use an inferencing algorithm to determine a threshold customized for each network. Accordingly, the network will not be disrupted by an additional inquiry from the TTS unless it is necessary for further verification. Further clarifications, such as checking the timestamp validity of messages from A and B, may occasionally be required. Figure 2 depicts the message profile exchanged between the hosts and the TTS. The message profile details include the sent and received times transmitted by the sender and receiver hosts to the TTS. The TTS responds with timestamps and the inferred results to indicate whether the message is suspicious or not.

Inference Engine Using Threshold Table
The inference engine of our model creates and makes use of a threshold table in the Inference Database to compare and assess a suspicious intrusion using learning-based inference rules.Several information details and parameters serve as inputs into the inference engine as follows:

• Ping results to and from A and B;
• Previous data and learning experiences;
• Network types (local, national, international, wired, wireless, cellular, metro, wide area network, etc.), which can be determined by the source and destination IP addresses, which relate to geographic location;
• Current network status, e.g., normal, congested, abnormal, etc.
Based on these inputs, a threshold value is decided with a level of margin, e.g., 1500 ms ± 10%, as an outcome, which will be compared with the value of $T_D$ (the time difference between $T_S$ and $T_R$). Table 1 shows an example threshold table created and referred to by the inference engine and TTS of our model to make an inference about an intrusion. The physical distance is calculated from distance vector (DV) routing parameters, such as the number of hops using the Bellman-Ford algorithm, and the logical distance is determined using well-established methods, such as link state (LS) cost-based routing using the Dijkstra algorithm and the hybrid Bellman-Ford-Dijkstra algorithm [16]. These algorithms aid in the reduction of convergence delays for link failure recovery [17]. The status of normal, congested, abnormal, and so on can be determined by a comparison with previous traffic data. Network diagnostic commands could be adopted, e.g., 'netstat -s', to display protocol statistics, including user datagram protocol (UDP), transmission control protocol (TCP), internet control message protocol (ICMP), and internet group management protocol (IGMP). The threshold margin is determined by averaged data learned previously. The margin indicates the possible latency learned from previous experiences and historical data. The final threshold data can be averaged with $P_d$ and $L_d$, along with other variables which are used to calculate the likelihood of an MITM intrusion by the inferencing algorithm. The network type is grouped based on the distance between the source and destination nodes. The outcome of the inferencing process will be an upper value of the standard deviation (SD) probability which has been predefined by previous learnings from sample data. For example, 160 ms in Table 1 indicates the optimal response time for network type B with distance vector-based protocols satisfying a 95% likelihood level. In other words, any delay longer than this value can be regarded as a sign of unusual activity and potential intrusion, which would require further clarification by the inference engine to detect MITM attacks correctly.
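To make the threshold-with-margin idea concrete, the sketch below computes the tolerance band for a value such as 1500 ms ± 10% and, separately, an upper limit learned from past delays as the mean plus k standard deviations. The function names, the sample history, and the choice k = 2 are assumptions for illustration only; the paper does not specify how its SD-based upper value is derived.

```python
import statistics
from typing import Sequence, Tuple


def margin_band(threshold_ms: float, margin_pct: float) -> Tuple[float, float]:
    """Return the (lower, upper) limits of a threshold with a +/- margin."""
    delta = threshold_ms * margin_pct / 100.0
    return threshold_ms - delta, threshold_ms + delta


def learned_upper_limit(history_ms: Sequence[float], k: float = 2.0) -> float:
    """Upper limit as mean + k * SD of past delays (k is an assumed choice)."""
    mu = statistics.mean(history_ms)
    sigma = statistics.pstdev(history_ms)  # population SD of the history
    return mu + k * sigma


def exceeds(t_d_ms: float, upper_ms: float) -> bool:
    """True when the observed time difference is above the upper limit."""
    return t_d_ms > upper_ms
```

For the 1500 ms ± 10% example, `margin_band(1500.0, 10.0)` gives limits of 1350 ms and 1650 ms, so only a time difference above 1650 ms would be passed on for further clarification.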

Time-Based Verification and MITM Detection Algorithm
In Figure 3, we depict the workflow for the implementation of our time-based verification model for MITM detection. The detailed workflow is given in Figure A1 in Appendix A. Algorithm 1 provides the proposed algorithm to detect MITM attacks by comparing timestamps of current activity, previous learnings, and historical data stored in the Inference Database. The detailed algorithm implementation using Java is given in Appendix B. The algorithm was developed in detail for the high-level solution of our proposed MITM detection model discussed earlier with reference to Figure 1. The algorithm compares values from the threshold table of tolerant response time in order to determine a suspected intrusion effectively.
Table 2 provides the definitions of the notations (in alphabetical order) used in Algorithm 1. Here, we describe the algorithm.
A pings B over a network of type $N_T$ before the message transmission begins to obtain the response time (RTT). The algorithm calculates $T_{P1}$. Then, A sends a message to B, and all the required information about the message, including the unique message identifier $M_{ID}$, is sent to the TTS. $N_T$ can be local, national, international, wired, wireless, cellular, metro, or wide area network. The TTS records the timestamp in the Inference Database $db_I$ when it receives the message from A, and it sends an acknowledgment confirmation to A. The algorithm generates a timestamp ($T_S$) when A sends a message to B. It then records $M_{ID}$, $T_S$, and $T_{P1}$ in the Inference Database $db_I$.
When B receives the message, it pings A, $T_{P2}$ is calculated, and the information about the message is also sent to the TTS. The algorithm generates the timestamp of message receipt, $T_R$. It updates $T_R$ and $T_{P2}$ in the Inference Database $db_I$ based on $M_{ID}$.
The algorithm then proceeds to calculate the time difference $T_D$. It also gets $T_{TH}$, which is a threshold value derived from the threshold table of tolerant response time, with a margin ($T_M$) level and network status ($N_S$) based on $N_T$. $T_{TH}$ and $T_M$ are calculated from the average values of the physical distance (DV) and logical distance (LS) based on previous learnings. The mean ($\mu$) and standard deviation ($\sigma$) are calculated continuously and updated. In addition, to provide intelligent inferences, the values of $\mu$ and $\sigma$ are calculated with context-dependent weights associated with the calculated values for $T_P$, $T_{TH}$, and $N_S$. The weights, a% for $T_P$ and b% for $T_{TH}$ and $N_S$, would typically range between 0.25 and 1.
The algorithm finally performs various comparisons and checks for allowable threshold values of $T_D$, $T_P$, $\mu$, and $\sigma$, with and without the level of margin $T_M$. If the values lead to the inference of an anomaly, indicating that the message delivery took much longer than the expected transmission time, the algorithm suspects an MITM attack and the TTS sends an alert notification to A and B. The Inference Database is also updated with all the recorded values, which is useful for forming previous learnings.
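The final comparison stage can be sketched as a two-stage Boolean check, mirroring the stepwise verification reported with Table 4. This is a simplified Python reading, not the Java implementation from Appendix B: the function name `suspect_mitm` and its parameters are hypothetical, the SD-derived limits are passed in as given values because their derivation is not reproduced here, and treating a failed first stage as "not suspicious" is an assumption.

```python
from typing import Sequence


def suspect_mitm(t_d: float, t_p: float, t_m_pct: float,
                 sd_limits: Sequence[float]) -> bool:
    """Return True when the time difference T_D points to an anomaly."""
    # Stage 1: T_D must exceed the ping RTT, without and with the margin T_M.
    exceeds_ping = t_d > t_p and t_d > t_p + t_p * t_m_pct / 100.0
    if not exceeds_ping:
        return False  # assumed outcome: no further comparison is needed
    # Stage 2: T_D must exceed every SD-derived limit to be flagged.
    return all(t_d > limit for limit in sd_limits)
```

With the values from the stepwise verification (ping RTT 40, margin 5%, limits 61.13, 63.67, and 66.85), a time difference of 50 passes the first stage but fails the second, so it is not flagged, whereas a time difference of 100 passes both and is flagged.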

Results and Analysis
We conducted a simulated experimental test using threshold values of tolerant response time from Table 1 and additional test data in order to verify and evaluate our proposed MITM detection. The detailed implementation of the algorithm is given in Appendix B. Table 3 provides the values of the outcomes obtained for network type A; the values are either calculated or retrieved from the threshold table in the database using our proposed inference engine. Based on the values of network type A, there are two cases to discuss, as follows:

(1) Nonsuspicious outcome of network type A, as shown in Table 4: the SD value was calculated and used to run the algorithm program as in Appendix B, and the results show that the variables are within the thresholds as previously defined.
Result: The TTS will mark this message as not suspicious and will not send any notification to A and B.

(2) Suspicious outcome of network type A, as shown in Table 4: the SD value is outside the predefined threshold values, and the results show that the algorithm works properly, as it compares the data received with the predefined threshold data. This is shown in Appendix B.
Result: The TTS will mark this message as suspicious and send the necessary notifications to A and B.

Performance Measures
As a quality measure, the performance of our proposed algorithm requires continuous monitoring and automatic updates of the database. To achieve this, the following standard performance measures are used to attain the desired level of accuracy for detecting MITM attacks:

• Overall Accuracy: Percentage of correct inferences made. The overall accuracy is calculated as follows:

$$\text{Overall Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

A True Positive (TP) is the most commonly used performance measure to evaluate the effectiveness of the detection model, as it is an indicator that the model can detect a suspected MITM and then raise an alarm before the attack happens. During real-time deployment, the higher the TP measure, the more accurate the model.
A False Negative (FN) is an indicator that the model does not detect the MITM attack when the attack exists. In real-time deployments, very low FN values indicate a better performance of the model.
A False Positive (FP) is another commonly used performance measure to evaluate the level of tolerance and context-dependent inference exhibited by the model. It is an indicator of false detection when the network is conducting normal activities but appears to have an MITM attack due to unforeseen latency or network congestion. In real-time deployments, a low FP measure indicates a high level of learning and intelligence exhibited by the model.
A True Negative (TN) is an indicator that the model infers no suspected MITM activity when there is no MITM attack taking place. This measure also contributes to the overall accuracy and robustness of the model.
The True Positive Rate is an accuracy measure to determine the ability of the model to detect MITM intrusions when MITM attacks exist. It is also referred to as the sensitivity. On the other hand, the False Positive Rate is used to evaluate the misjudgment ratio, which is also an important performance measure. It equals 1 − specificity, where the specificity is the True Negative Rate. In real-time deployments, a high TP rate achieved along with a low FP rate indicates an effective model. Finally, the overall accuracy measure indicates the performance of the model in terms of correctness and robustness.
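These measures follow directly from the four confusion-matrix counts. The sketch below shows the standard definitions; the class name `ConfusionCounts` and the counts in the usage note are made up for illustration and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of true/false positives and negatives (illustrative helper)."""
    tp: int
    fp: int
    tn: int
    fn: int

    def accuracy(self) -> float:
        # (TP + TN) / (TP + TN + FP + FN)
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    def true_positive_rate(self) -> float:
        # Sensitivity: share of real attacks that were detected.
        return self.tp / (self.tp + self.fn)

    def false_positive_rate(self) -> float:
        # 1 - specificity: share of normal traffic wrongly flagged.
        return self.fp / (self.fp + self.tn)
```

For hypothetical counts of 40 TP, 5 FP, 45 TN, and 10 FN, the overall accuracy is 0.85, the TP rate is 0.8, and the FP rate is 0.1.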
Another quality measure of the model, based on a performance graphing method, is the receiver operating characteristic (ROC). It is a plot of the True Positive Rate on the Y-axis against the False Positive Rate on the X-axis. It can be used to evaluate different threshold schemes and compare the relative performance among different schemes to classify malware attacks. The overall accuracy measure is based on one specific cutoff point and hence can vary for different cutoff points. On the other hand, ROC considers all cutoff points and plots the sensitivity (TP rate) against 1 − specificity (FP rate). The area under the ROC curve (AUC) summarizes the relative trade-offs between true positives and false positives. Hence, the AUC was used to compare the performance of our model under different threshold schemes for malware classification. For instance, Figure 4 shows that the AUC under B is greater than that of A; thus, B exhibits a better performance than A.
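Comparing two threshold schemes by AUC can be sketched with the trapezoidal rule over (FP rate, TP rate) points. The function name `auc_trapezoid` and the two point sets are hypothetical; they are not the curves from Figure 4.

```python
from typing import Iterable, Tuple


def auc_trapezoid(points: Iterable[Tuple[float, float]]) -> float:
    """Area under a ROC curve given (fp_rate, tp_rate) points."""
    pts = sorted(points)  # order by FP rate along the X-axis
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Trapezoid between two neighbouring cutoff points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up ROC points for two threshold schemes.
scheme_a = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
scheme_b = [(0.0, 0.0), (0.1, 0.8), (1.0, 1.0)]
```

Here scheme A lies on the diagonal (AUC 0.5) and scheme B bends towards the top-left corner (AUC 0.85), so B would be preferred under this measure.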

Conclusions and Future Works
Despite the application of secure SSL/TLS authentication techniques, MITM attacks are still prevalent. Security mechanisms such as SSL, TLS, HTTPS, and certificate authorities are being enhanced for the provision of the CIA triad. However, evasive MITM attacks could even intercept certificates and forge them. In addition, existing solutions for the detection of MITM attacks have several practical limitations, including high implementation costs.
In this paper, we propose a trusted and effective model for detecting MITM attacks automatically using transmission time-based verification along with a novel learning-based inference scheme. The advantage of our model is that it requires neither complex system configurations nor expensive security implementations. Our inference algorithm works in the TTS without any need to modify existing protocols. By using a threshold table for tolerance in the Inference Database, a self-learning mechanism was established. Furthermore, the performance measures show our real-time deployment considerations for monitoring the effectiveness of the model.
The threshold table proposed as an example may not be accurate and could affect the accuracy of the detection model if applied as is. Future work will involve developing the threshold table for real-life implementation. Another interesting line of future work would be to locate an MITM adversary in the network. This could be possible by calculating the distances between each node and the TTS and locating the intersection. For example, if an adversary is found at an unexpected location, such as a foreign country's IP address, the network operator may manually revise the policy to route traffic around the point at which the adversary taps into the conversation.
When the TTS has built a knowledge base large enough to compare with other servers in heterogeneous networks, it will be possible to show the conversation paths using data visualization techniques. These paths can be classified with various credit levels, such as green, yellow, and red, so that network operators can use the information in their network topology configuration as additional attributes to the link costs or distances.
To measure accuracy and performance and to determine usability and efficiency, the solution prototype needs to be benchmarked against other solutions, such as IDS.

Figure 1. High-level overview of proposed man-in-the-middle (MITM) detection model.

Figure 2. Message profile notification exchanged between sender/receiver hosts and the trusted time server (TTS).

Figure 3. Workflow of time-based verification and MITM detection.

Nonsuspicious case (Timedifference = 50):
→ Compare ((50 > 40) AND (50 > (40 + 40 * 5/100)))
→ Compare ((50 > 40) AND (50 > 42)) → return true
→ The Timedifference is greater than the PingRTT and the algorithm will perform the next Boolean comparison as below:
→ Compare ((50 > 61.13) AND (50 > 63.67) AND (50 > 66.85)) → return false
Verification: As the above Boolean comparison returns false, the time difference is within the time limits based on the standard deviation of the average Ping RTT and Threshold time of Network type A. The time difference does not lead to the inference of an anomaly.

Suspicious case (Timedifference = 100):
→ Compare ((100 > 40) AND (100 > (40 + 40 * 5/100)))
→ Compare ((100 > 40) AND (100 > 42)) → return true
→ The Timedifference is greater than the PingRTT and the algorithm will perform the next Boolean comparison as below:
→ Compare ((100 > 61.13) AND (100 > 63.67) AND (100 > 66.85)) → return true
Verification: As the above Boolean comparison returns true, the time difference is outside the time limits based on the standard deviation of the average Ping RTT and Threshold time of Network type A. The time difference does lead to the inference of an anomaly.

Figure 4. Comparison of performance measures using receiver operating characteristic (ROC) plots.

Table 1. Threshold of tolerant response time.

Table 2. Definitions of notations.
$T_S$: Timestamp recorded by the TTS when machine A sends a message to machine B.
$T_{TH}$: Threshold time. A threshold value that is decided based on the table of thresholds of tolerant response time, with the level of margin and network status based on the network type.
$T_M$: Time margin. Level of margin, expressed as a percentage, which will be compared with the timestamp difference value $T_D$; it is determined by averaged data learned previously.

Table 3. Experimental outcome for a network type A.

Table 4. Stepwise verification of the algorithm for suspicious and nonsuspicious cases.