2. Advanced Persistent Threat
2.1. Characteristics of Advanced Persistent Threats
- Advanced: the enemy is familiar with the tools and techniques of intrusion, able to develop custom exploits.
- Persistent: the enemy intends to fulfil a purpose, receive orders, and attack specific goals.
- Threat: the enemy is coordinated, supported, and motivated.
2.2. APT Attack Process
2.3. Methods and Techniques
- Social engineering: Getting a user to compromise information systems. This technique targets people with privileged access, manipulating them through control and persuasion into divulging personal information that enables a malicious attack, instead of relying on random attacks against systems.
- Spear-phishing: This technique targets a specific organisation in order to collect user credentials, financial information, or other confidential information.
- Watering hole: This technique is similar to spear-phishing in cyber espionage. The attacks are adapted to the victims. In order to do this, attackers try to obtain information about the victim, such as his or her personal interests.
2.4. Attribution Problem
- China: Chinese cyberattacks have been observed to focus on industrial espionage aimed at stealing intellectual property. APT1 has been the most persistent cyberthreat from this actor.
- United States: This actor could have perpetrated the most sophisticated cyberattacks. The attacks have been harmful and have used advanced technologies, which implies considerable resources for their development. The APT campaigns have mainly been used to enforce geopolitical interests. One example is the world-famous operation Stuxnet, which targeted SCADA (Supervisory Control and Data Acquisition) systems to cause substantial damage to the nuclear program of Iran.
- Russia: This actor is very active in terms of state-sponsored APT activity. Its groups have been involved in high-profile intrusions and, because of this, have been the subject of intense investigations. Recently, Microsoft detected spear-phishing attacks from APT28 that targeted employees of the German government. This group has attempted to gain access to employee credentials and infect sites with malware.
- Iran: In the Middle East, this actor has the greatest attack capacity, with several incidents attributed to the country and perpetrated by diverse groups. Experts have monitored APT33 operations because this group has recently upgraded its infrastructure. The main targets of this group have been the aviation industry and energy companies with connections to petrochemical production. The latest malware campaigns have been targeted at organisations in the United States, the Middle East, and Asia.
- Israel: This actor has been identified as a possible co-author of the Stuxnet attack. The high capability of the intelligence services of this country is publicly known; one example is Unit 8200 of the Israeli army, the equivalent of the US intelligence agency NSA. The Duqu 2.0 attack was sponsored by this actor and has infected numerous systems in several countries in recent years. This malware used zero-day vulnerabilities, and different techniques were used to take over the computers on the network and to send data to the command and control (C&C) servers.
3. Machine Learning
3.1. Techniques and Algorithms
3.1.1. Supervised Learning
- Artificial neural networks (ANN) are brain-inspired computational models made up of artificial neurons (nodes), linked by many interconnections (artificial synapses), that are capable of performing specific calculations on their inputs. An ANN is composed of three or more layers: an input layer, one or more hidden layers, and an output layer. An ANN is capable of creating non-linear models of the relationships between input attributes and class labels. The main characteristics of ANN are adapting from experience, learning capability, generalisation capability, data organisation, fault tolerance, distributed storage, and facilitated prototyping. These algorithms are useful for speech and pattern recognition, climate forecasting, and disease diagnosis; more generally, they solve classification and regression problems.
- Support vector machine (SVM) is one of the most accurate and robust ML algorithms. This classifier works by identifying a separating hyperplane between two classes of labelled data in a training set. The SVM classifier builds on several concepts, e.g., non-linearity and the use of kernels, separability, and margins or risk minimisation. Non-linearity and kernel usage are among the pioneering discoveries in the field of ML; kernels allow a non-linear problem to be transformed into a linear one. Several types of separating hyperplanes can be realised using a kernel, such as radial basis function (RBF), polynomial, linear, or sigmoid. Risk minimisation can be applied to cases that do not fit into the traditional SVM architecture, such as problems with missing data or unlabelled data [33,34,35].
- Decision tree (DT) models are accurate, stable, and straightforward to interpret. Their construction is based on decision rules that are represented in the form of a tree. These models can represent non-linear relationships for problem-solving. Decision trees and random forests stand out because they are more precise and elaborate; their predictive capacity is higher because of these characteristics, but their computational performance is low. The most commonly used algorithms for building decision trees are CART (Classification and Regression Tree), ID3 (Iterative Dichotomiser 3), and CHAID (Chi-Squared Automatic Interaction Detector) [33,34,36].
- Bayesian networks (BN) are probabilistic graphical models used to describe and analyse multivariate distributions. The variables can be continuous or discrete; when all variables are discrete, the joint distribution is expressed as a series of sums and products. In the graphical representation of a BN, each node represents an observable variable or state, and the edges symbolise the conditional dependencies between the nodes. BNs have been used in different areas, for example, the Microsoft Windows system, NASA mission control, and bioinformatics applications [34,37,38].
- k-nearest neighbour (k-NN) can be used for both regression and classification problems. Owing to the simplicity, effectiveness, and intuitiveness of the concept, this model is used to identify the nearest neighbours of a given data point based on a distance measure [39,40]. The assumption is that similar elements are closer together. Closeness is quantified by a distance measure, which can be a simple Euclidean distance between two points. The classification decision is sensitive to the choice of k, especially in small data sets with outliers. Numerous families of distance measures exist, and the following can be highlighted: Minkowski, Inner product, Square Chord, Shannon entropy, and Vicissitude.
- Hidden Markov model (HMM) is a stochastic model of discrete events and a variation of the Markov chain, a chain of linked states or events in which the next state depends only on the current state of the system. An HMM is used to analyse features or observations to predict the most likely state sequence; the hidden states represent an unobserved attribute of the process. HMMs have been used to solve problems of financial analysis, genetic sequencing, image processing, and natural language processing [34,42].
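As an illustration of the k-NN description above, the following minimal sketch classifies a point by majority vote among its k nearest neighbours under the Euclidean distance. It is not taken from any of the cited works; the function names, the two features, and the toy data are assumptions made only for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (data sent, connections per minute) -> label
train = [
    ((1.0, 2.0), "benign"),
    ((1.5, 1.8), "benign"),
    ((1.2, 2.2), "benign"),
    ((8.0, 9.0), "suspicious"),
    ((8.5, 9.5), "suspicious"),
    ((9.0, 8.7), "suspicious"),
]

print(knn_predict(train, (1.1, 2.1), k=3))
print(knn_predict(train, (8.2, 9.1), k=3))
```

With an even k or more than two classes, ties in the vote are possible; this sketch simply returns the first label that reaches the highest count, which is one of several reasonable choices.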
3.1.2. Unsupervised Learning
- Principal component analysis (PCA) is a dimension reduction procedure. This statistical method is useful when there are a large number of variables, each of which has more or less importance. PCA generates a score matrix T in which the correlation between variables is displayed in a maximum of two or three dimensions. This procedure maps a set of interrelated variables onto a smaller set of linearly uncorrelated variables while retaining as much of the variance in the original data set as possible. Some examples of applications of this method are feature extraction, social science, medicine, and genomics.
- k-means is a clustering algorithm. This technique consists of partitioning the input data into k clusters, where the user defines the number k in advance. Each data point in the input set is unlabelled. The interpretation of each of the k groups is that the mean value of the group is representative of all elements in that group. Alternatively, each of the k groups could represent a type of input data. This algorithm uses a distance measure, for example the Euclidean distance, to compute the distance between two points. k-means can also be used in intrusion detection systems (IDS).
- Fuzzy c-means is a soft clustering algorithm. This method starts from a chosen number of clusters and a random initialisation; each data point is then assigned a degree of membership in every cluster. The memberships and cluster centres are updated iteratively to minimise the membership-weighted distance between data points and cluster centres.
- Hierarchical clustering is used to cluster data points when the data is unlabelled. This method can be classified into two categories: divisive and agglomerative. In the divisive approach, the data points are initially considered as one large cluster, which is then divided into smaller clusters. In the agglomerative approach, each data point is initially considered as an individual cluster, and clusters are then merged step by step.
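The k-means procedure described above can be sketched as an alternation between assigning each point to its nearest centroid and recomputing each centroid as the mean of its group. The sketch below is a plain illustration under stated assumptions: the random choice of initial centroids, the fixed seed, the iteration limit, and the toy points are all choices made for the example rather than details from the source.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster `points` (tuples of floats) into k groups using Euclidean distance."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initial centroids are random data points
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of the points assigned to it.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments


# Hypothetical unlabelled data forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The cluster indices themselves are arbitrary; only the grouping is meaningful, so a result should be read as "these points belong together" rather than as a class label.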
3.2. Role of Machine Learning in Cybersecurity Applications for APT Detection
- Detection: These are the tools that allow the detection of abnormal behaviour to generate alerts in real-time, and to facilitate decision-making.
- Protection: Detect vulnerabilities to install security fixes automatically.
- Prediction: Techniques and algorithms to predict attacks and develop anti-malware techniques.
- Termination: Automatically eliminating the threat.
Spam and Phishing Detection
3.3. Approaches Used for APT Detection
- Threat detection: The network traffic is scanned by eight detection modules to find techniques used by APT. The output of this phase consists of alerts, known as events.
- Alert correlation: The events generated by detection modules are correlated, and the output can be two types of alerts.
- Attack prediction: A machine learning-based prediction module is used to detect APT techniques.
- Network traffic: The traffic flow is collected, processed, and analysed by a recognition method using machine learning algorithms.
- Correlation event: Through specific rules given by an administrator, the events generated in the previous phase are collected to be evaluated.
- Voting service: The previous information is analysed, and an alert is generated if an APT attack is detected.
- APT system architecture: Network and information system data were collected to be analysed.
- Big data processing technology: A Hadoop cluster was used to improve the analysis of an APT attack.
- APT analysis technology: Malicious attacks were detected from vulnerabilities and suspicious connections with anomalous behaviour.
- APT detection algorithm: This method used the tool Mahout because it can process big data, and the k-NN algorithm can be used for the detection. This model was divided into four phases: retrieve, reuse, revise, and retain.
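The correlation step that several of the approaches above share can be sketched as grouping detection events by host and raising an alert when enough distinct techniques appear close together in time. This is a minimal illustration, not the implementation of any cited system: the `Event` fields, the technique names, the one-hour window, and the threshold of two techniques are assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the start of the capture (hypothetical unit)
    host: str         # internal host the event refers to
    technique: str    # name of the detection module that fired (hypothetical names)


def correlate(events, window=3600.0, min_techniques=2):
    """Return {host: techniques} for hosts where at least `min_techniques`
    distinct techniques were observed within `window` seconds of each other."""
    by_host = defaultdict(list)
    for event in events:
        by_host[event.host].append(event)

    alerts = {}
    for host, host_events in by_host.items():
        host_events.sort(key=lambda e: e.timestamp)
        for i, first in enumerate(host_events):
            # Techniques seen from this event up to the end of its time window.
            techniques = {
                e.technique
                for e in host_events[i:]
                if e.timestamp - first.timestamp <= window
            }
            if len(techniques) >= min_techniques:
                alerts[host] = sorted(techniques)
                break
    return alerts


events = [
    Event(0.0, "10.0.0.5", "malicious_domain"),
    Event(600.0, "10.0.0.5", "tor_connection"),
    Event(0.0, "10.0.0.9", "malicious_domain"),
    Event(10000.0, "10.0.0.9", "tor_connection"),  # outside the one-hour window
    Event(50.0, "10.0.0.7", "malicious_domain"),
]

print(correlate(events))
```

In a fuller pipeline, the correlated alerts would be the input to the prediction or voting stage; here the rule is fixed by hand, in the spirit of the administrator-defined rules mentioned above.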
4. Advanced Persistent Threat Life Cycle Analysis
4.1. Three Stage Attack
- Initial compromise (IC): In this stage, attackers attempt to gain access to the target system. The most commonly used techniques in this phase are spear-phishing (e.g., an email with a malicious attachment or a link to a compromised server), watering-hole (malicious code on a regularly visited website), server-side attacks (exploiting vulnerabilities on servers or obtaining credentials by brute force), and infected storage media (compromised USB, CD, or DVD).
- Lateral movement (LM): Attackers attempt to compromise other services on the target system or network. The objective is to obtain and use legitimate credentials that will allow them to persist in the system. Some of the LM techniques used are standard operating system tools (e.g., RDP, PsExec, and PowerShell) and the exploitation of vulnerabilities (e.g., zero-day exploits).
- Command and control (C&C): When the system has been compromised, it is necessary to establish an external connection to exfiltrate data. Attackers use services such as HTTP, HTTPS, or FTP. They can also use remote connection tools such as VNC (Virtual Network Computing) or RDP.
4.2. Four Stage Attack
- Information collection: In this initial stage, reconnaissance of the network is carried out using scanning or social engineering tools.
- Intrusion: In this stage, spear-phishing techniques, malicious email attachments, or backdoors are used to obtain access privileges.
- Latent expansion: The attacker attempts to maintain control in order to obtain data that will allow the attacker to continue with the expansion within the network.
- Information theft: The attacker establishes a connection to a server, and the stolen data is transferred. Encryption techniques can be used to camouflage the extracted data.
- Initial compromise: The techniques used are social engineering and spear-phishing.
- C&C: A communication channel is established between a compromised server and the target.
- Lateral movement: Attackers seek to collect internal information and move between hosts with critical vulnerabilities.
- Attack achievement: The attack is completed, and the theft of sensitive information begins.
4.3. Five Stage Attack
- Delivery: Spear-phishing is used to send emails to recipients within the network.
- Exploit: The vulnerabilities of the services, system or applications are exploited.
- Installation: In this stage, it is possible to install malware such as RAT (Remote Access Tool).
- Command and control: The attacker has remote access to a compromised host or server.
- Actions: The actions carried out consist of gaining access to other hosts or servers on the same network to extract confidential information.
- Reconnaissance: The target is selected, and publicly available information related to the target is sought.
- Incursion: The attacker obtains access to the network through stolen credentials with techniques such as SQL injection or with the use of malware.
- Discovery: The attacker searches for confidential data in the system.
- Capture: The attacker installs an undetectable rootkit to collect confidential data for an extended period.
- Exfiltration: The collected data is sent to the C&C servers.
4.4. Six Stage Attack Model
- Information Gathering: The objective of this stage is to gather information on the structure of the organisation through public social network profiles.
- Point of entry: Social engineering, spear-phishing and zero-day exploit are the most used techniques for the victim to allow the attacker to gain access to the computer.
- Command and control server: The attacker establishes a connection from the compromised host to the C&C server to maintain the connection. Secure Sockets Layer (SSL) encryption is the method usually used to send traffic to the C&C server.
- Lateral movement: The attacker can move through the network to find a vulnerable host when access has been gained.
- Data of interest: Critical information on hosts or servers is identified.
- External server: The data of interest is transmitted to the C&C servers of the attackers.
- Reconnaissance and weaponization: This is a preparation stage to study and collect technical information from the target organisation. Some techniques used are social engineering and open-source intelligence (OSINT).
- Delivery: The attackers send the exploits to the targets directly or indirectly; for example, a direct technique is spear-phishing and an indirect one is a watering-hole attack.
- Initial intrusion: The information obtained in the previous stage (such as credentials) allows attackers to gain access to the target, execute malicious code, and exploit vulnerabilities.
- Command and control: The attackers establish a mechanism to take control of the compromised hosts; for this, the attackers use social networking sites, Tor anonymity networks, or remote access tools.
- Lateral movement: When attackers have established a connection to their C&C servers, they move around the network of the organisation looking for useful information to gain access to other systems.
- Data exfiltration: Attackers send critical encrypted information to servers.
4.5. Seven Stage Attack Model
- Research: The attackers seek publicly available information about the victim.
- Preparation: The attackers prepare an initial attack to exploit the vulnerabilities using network scanning to create custom exploits.
- Intrusion: The attackers launch the first attack which usually consists of spear-phishing.
- Conquering the network: Remote access tools or backdoors to control the system are used when the attacker has compromised at least one host.
- Hiding presence: The attacker seeks to remain hidden in the network for a long time. The attack can have periods of inactivity.
- Gathering data: The attacker looks for data of interest and masks it as legitimate traffic to be slowly extracted.
- Maintaining access: The attacker can modify or create exploits, remote access tools and C&C servers, to obtain prolonged access to the network.
- Reconnaissance: The attacker performs a preliminary reconnaissance of the network of the organisation, using spear-phishing techniques, port scanning, and social engineering.
- Weaponization: The attacker builds a payload that is sent to the victim. It usually consists of an exploit that delivers a RAT or trojan.
- Delivery: The payload created is sent to the victim through email, websites, or removable devices.
- Exploitation: The attacker executes the exploit that has been sent to the victim.
- Installation: A Trojan and/or remote access trojan (RAT) is installed when the attacker gains access to the system.
- Command and control: The remote access software connects to C&C of the attacker.
- Actions and objectives: The attacker performs data exfiltration compromising the integrity and availability of the data. This stage can last weeks, months or even years.
4.6. Eight Stage Attack Model
- Initial recon: Initial recognition of the target.
- Initial compromise: Describes the methods used for the first intrusion of the target, e.g., spear-phishing.
- Establish foothold: Consists of ensuring control of the target from outside the network, for example, C&C servers.
- Escalate privileges: The attacker looks for credentials that permit access to more resources within the system.
- Internal recon: In this stage, the attacker collects all the possible information about the victim.
- Move laterally: The attacker can connect and share resources using legitimate credentials.
- Maintain presence: The attacker performs actions to remain for an extended period within the network without being detected.
- Complete mission: The information of interest is compressed to be sent to the C&C servers.
4.7. Eleven Stage Attack Model
- Initial access: Consists of the initial contact with the target to search for patient zero.
- Persistence: The attacker seeks to maintain long-term access to the target.
- Privilege escalation: To obtain privileges in the network, it is necessary to install malware or gain access to confidential data.
- Discovery: Consists of obtaining relevant information from the target, such as system location or usernames.
- Lateral movement: Refers to how the attacker moves within the network to search for important vulnerable information or services.
- Collection: Collecting relevant information.
- Exfiltration: Extracting the collected data.
The following stages achieve the objective of the attack, and can be executed in parallel with the previous seven stages.
- Execution: The execution of malware through remote connections that are carried out between the initial access stage and lateral movement.
- Defence evasion: Consists of not being detected by the defence and detection mechanisms, for example, firewall or logs.
- Credential access: Refers to accessing the compromised system with valid credentials.
- Command and control: Consists of creating a C&C channel to communicate the attacker servers with the compromised systems of the target.
5. A Novel Proposal
- Target discovery: This stage consists of the passive exploration of the network of the organisation to obtain approximate details of the IT structure to be attacked. To achieve this goal, the attacker can use port scanning techniques (e.g., Nmap), search for services indexed on the Internet (web surveillance cameras, servers, or SCADA systems, with tools such as Shodan), review public profiles of employees on social networks, and use OSINT reconnaissance tools (e.g., SpiderFoot).
  The techniques used to identify the resources of an organisation are difficult to detect with ML techniques because these attacks are usually made passively. A passive attack does not modify or interfere with communication but rather listens to or monitors the information that is transmitted. Information that can be found on the Internet can be collected for sale on the darknet; these attacks can require the use of multiple specialised tools over a long period.
  Therefore, it is recommended to close unused ports; use firewalls, IDS, and secure private virtual connections (VLAN and VPN); create password policies; and raise user awareness within the organisation.
- Exploitation toolset: The objective of this stage is to gain access to the target network through the vulnerabilities detected in the previous stage, or by tricking an employee of the organisation. The process starts with devising a method to reach the target. For this, the attacker uses techniques such as the different forms of spear-phishing, valid accounts, or replication through USB. Later, the attacker exploits the detected vulnerability using scripting, PowerShell, and user execution; then, remote management tools are used to establish a connection with the target network.
  To prevent an employee from being attacked, it is recommended to avoid using personal devices within the network of an organisation and to avoid opening suspicious files. In addition, ML techniques allow for the creation of automated solutions to detect possible attacks at an early stage. For example, a module can be created that scans email for malicious links or malicious files. Another solution would be to scan network traffic for remote connection packets from unauthorised servers, analyse logs to detect anomalous activity within the network, and, finally, keep software updated.
  The implementation of these ML solutions requires a training dataset with the normal flow of the organisation and another dataset with anomalous network flow. Then, the ML algorithm that provides the best accuracy must be chosen. Finally, tests must be performed in a controlled environment. The ML algorithms that have given the best results have been k-NN and SVM. During initial training or retraining of the algorithm, datasets with flows from other attack techniques can be added to improve detection.
- Internal intrusion: When the attacker has compromised the first host on the network of the organisation, the next objective is to escalate privileges to access confidential and critical information. For this, the attacker must be able to maintain persistence over an extended period, since this stage is the longest one. Persistence on a network can be achieved through redundant access, account manipulation, or a web shell. Access to credentials can be obtained through brute force techniques, account manipulation, forced authentication, or credential dumping. Another essential step performed by the attacker is the evasion of defence systems (e.g., IDS, IPS, and firewall); this can be done through proxy connections and the obfuscation of files or information.
  Some solutions are to use ML techniques to analyse logs generated by IDS/IPS for possible APT attack patterns (e.g., failed access to SSH, FTP, or telnet services) and to analyse system logs (e.g., unauthorised program installations, directories and files with coded names, and unknown hosts on the network). Some ML algorithms that can be used are k-means, naive Bayes (NB), and SVM.
- Set data extraction channels: This stage consists of establishing a connection with the C&C server of the attacker to send all the collected information, usually compressed, encrypted, and in packets of limited size. The data are usually sent during hours of lower network bandwidth usage. The attacker can use fast-flux techniques to make the connections. Data can be stored on a host within the network and sent to the C&C server when the objective is completed, or sent in small packets at different times.
  Some techniques for data collection are automated collection, email collection, and man in the browser. Data extraction can be automated and can use different media (e.g., alternative protocols, network medium, and physical medium). The tools used in C&C servers are domain generation protocols, remote access tools, and multilayer encryption.
  To detect the sending of data to C&C servers, ML techniques can be used to search for hosts with encrypted data, connections with random IP addresses and DNS names, and encrypted data flows to unknown and unauthorised servers. In this stage, k-NN and k-means algorithms can be used for APT detection.
- Eliminate footprints: When the attacker has completed the mission, the next step is to remove all possible attack traces on the network and compromised systems; these traces can be, for example, logs, compressed files, installed software, or malware. If the attacker has reached this stage, the organisation may not know that it has been compromised and attacked with an APT. Therefore, it would be difficult to check how much information the attacker has extracted and how long the attacker has remained within the network. For this reason, the attack must be identified early.
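The data extraction stage above mentions connections to random-looking DNS names as a signal. One feature that could feed a k-NN or k-means detector for this purpose is the character entropy of a queried domain label, since randomly generated names tend to have higher entropy than ordinary ones. The sketch below is an assumption-laden illustration: the function names, the choice of three features, and the example domains are hypothetical, and entropy alone should not be read as a reliable indicator.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy, in bits per character, of the characters in `text`."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain):
    """Map a domain name to a small numeric feature vector.

    The features (entropy, length, and digit ratio of the first label)
    are a hypothetical choice made for this example.
    """
    label = domain.split(".")[0]
    digits = sum(ch.isdigit() for ch in label)
    digit_ratio = digits / len(label) if label else 0.0
    return (shannon_entropy(label), len(label), digit_ratio)


# Hypothetical queries: an ordinary name and a random-looking one.
for name in ("example.org", "xk2q9vz7rj1m.example"):
    print(name, domain_features(name))
```

Feature vectors of this kind would still need to be combined with traffic features and evaluated on labelled data from the organisation, as described for the training process in the exploitation stage.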
6. Conclusions and Future Work
Conflicts of Interest
- Swisscom. Targeted Attacks Cyber Security Report 2019; Technical report; Swisscom (Switzerland) Ltd. Group Security: Bern, Switzerland, 2019. [Google Scholar]
- Chen, P.; Desmet, L.; Huygens, C. A Study on Advanced Persistent Threats. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin, Germany, 2014; Volume 8735 LNCS, pp. 63–72. [Google Scholar]
- Fireeye. M-Trends 2019: Fireeye Mandiant Services Special Report; Technical report; Fireeye: Milpitas, CA, USA, 2019. [Google Scholar]
- Lemay, A.; Calvet, J.; Menet, F.; Fernandez, J.M. Survey of publicly available reports on advanced persistent threat actors. Comput. Secur. 2018, 72, 26–59. [Google Scholar] [CrossRef]
- Bai, T.; Bian, H.; Daya, A.A.; Salahuddin, M.A.; Limam, N.; Boutaba, R. A Machine Learning Approach for RDP-based Lateral Movement Detection. In Proceedings of the 2019 IEEE 44th Conference Local Computer Networks, Osnabrueck, Germany, 14–17 October 2019; pp. 242–245. [Google Scholar]
- Ghafir, I.; Hammoudeh, M.; Prenosil, V.; Han, L.; Hegarty, R.; Rabie, K.; Aparicio-Navarro, F.J. Detection of advanced persistent threat using machine-learning correlation analysis. Futur. Gener. Comput. Syst. 2018, 89, 349–359. [Google Scholar] [CrossRef][Green Version]
- Zhang, R.; Huo, Y.; Liu, J.; Weng, F. Constructing APT Attack Scenarios Based on Intrusion Kill Chain and Fuzzy Clustering. Secur. Commun. Netw. 2017, 2017, 1–9. [Google Scholar] [CrossRef][Green Version]
- Threat Intelligence Team, M.L. APT36 Jumps on the Coronavirus Bandwagon, Delivers Crimson RAT. Available online: https://blog.malwarebytes.com/threat-analysis/2020/03/apt36-jumps-on-the-coronavirus-bandwagon-delivers-crimson-rat/ (accessed on 16 March 2020).
- Jeun, I.; Lee, Y.; Won, D. A Practical Study on Advanced Persistent Threats. Commun. Multimed. Secur. 2012, 8735, 144–152. [Google Scholar]
- Falliere, N.; Murchu, L.O.; Chien, E. W32. stuxnet dossier. White Pap. Symantec Corp., Secur. Response 2011, 5, 29. [Google Scholar]
- FireEye. Follow the money: Dissecting the Operations of the Cyber Crime Group FIN6; Technical Report; FireEye: Milpitas, CA, USA, 2016. [Google Scholar]
- Coopers, Pricewaterhouse. Operation Cloud Hopper; Technical report; PwC UK Cyber Security and Data privacy: London, UK, 2017. [Google Scholar]
- FireEye. Double Dragon: APT41, a Dual Espionage and Cyber Crime Operation; Technical report; FireEye: Milpitas, CA, USA, 2019. [Google Scholar]
- Mandiant. APT1 Exposing One of China’s Cyber Espionage Units; Technical report; Mandiant: Alexandria, VA, USA, 2013. [Google Scholar]
- Krombholz, K.; Hobel, H.; Huber, M.; Weippl, E. Advanced social engineering attacks. J. Inf. Secur. Appl. 2015, 22, 113–122. [Google Scholar] [CrossRef]
- Aleroud, A.; Zhou, L. Phishing environments, techniques, and countermeasures: A survey. Comput. Secur. 2017, 68, 160–196. [Google Scholar] [CrossRef]
- Symantec. Internet Security Threat Report; Technical Report 2; Symantec: Tempe, AZ, USA, 2016. [Google Scholar]
- Tanaka, Y.; Akiyama, M.; Goto, A. Analysis of malware download sites by focusing on time series variation of malware. J. Comput. Sci. 2017, 22, 301–313. [Google Scholar] [CrossRef]
- Paganini, P. Turla APT Group’s Espionage Campaigns Now Employs Adobe Flash Installer and Ingenious Social Engineering. Available online: https://www.cyberdefensemagazine.com/turla-apt-groups-espionage-campaigns-now-employs-adobe-flash-installer-and-ingenious-social-engineering/ (accessed on 20 August 2019).
- ThaiCERT. Threat Group Cards: A Threat Actor Encyclopedia. Available online: https://www.thaicert.or.th/downloads/files/A_Threat_Actor_Encyclopedia.pdf (accessed on 24 June 2019).
- Paganini, P. Iran-Linked APT33 Updates Infrastructure Following Its Public Disclosure. Available online: https://securityaffairs.co/wordpress/87784/apt/apt33-updates-infrastructure.html (accessed on 21 November 2019).
- Adams, C. Learning the lessons of WannaCry. Comput. Fraud Secur. 2018, 2018, 6–9. [Google Scholar] [CrossRef]
- Cordey, S. Trend Analysis: The Israeli Unit 8200—An OSINT-based study; Technical Report; Center for Security Studies (CSS), ETH Zürich: Zürich, Switzerland, 2019. [Google Scholar]
- Kasperky Lab. The Duqu 2.0-Technical Details (V2.1); Technical Report; Kasperky Lab: Moscow, Russia, 2015. [Google Scholar]
- Kaspersky Lab. Targeted Cyberattacks LOGBOOK; Kasperky Lab: Moscow, Russia, 2019. [Google Scholar]
- Dua, S.; Du, X. Data Mining and Machine Learning in Cybersecurity; Auerbach Publications: London, UK, 2011. [Google Scholar]
- Kaviani, S.; Sohn, I. Influence of random topology in artificial neural networks: A survey. ICT Express 2020. [Google Scholar] [CrossRef]
- Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20. [Google Scholar] [CrossRef]
- Da Silva, I.N.; Hernane Spatti, D.; Andrade Flauzino, R.; Liboni, L.H.B.; dos Reis Alves, S.F. Artificial Neural Networks; Springer International Publishing: Cham, Switzerland, 2017; pp. 1–307. [Google Scholar] [CrossRef]
- Dahl, G.E.; Dong, Y.; Li, D.; Acero, A. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. IEEE Trans. Audio. Speech. Lang. Process. 2012, 20, 30–42. [Google Scholar] [CrossRef][Green Version]
- Valverde Ramírez, M.C.; de Campos Velho, H.F.; Ferreira, N.J. Artificial neural network technique for rainfall forecasting applied to the São Paulo region. J. Hydrol. 2005, 301, 146–162. [Google Scholar] [CrossRef]
- Erkaymaz, O.; Ozer, M.; Perc, M. Performance of small-world feedforward neural networks for the diagnosis of diabetes. Appl. Math. Comput. 2017, 311, 22–28.
- Chu, W.L.; Lin, C.J.; Chang, K.N. Detection and Classification of Advanced Persistent Threats and Attacks Using the Support Vector Machine. Appl. Sci. 2019, 9, 4579.
- Joshi, A.V. Machine Learning and Artificial Intelligence; Springer International Publishing: Cham, Switzerland, 2020; Volume 64, pp. 49A–60A.
- Martínez Torres, J.; Iglesias Comesaña, C.; García-Nieto, P.J. Review: Machine learning techniques applied to cybersecurity. Int. J. Mach. Learn. Cybern. 2019, 10, 2823–2836.
- Alloghani, M.; Al-Jumeily, D.; Hussain, A.; Mustafina, J.; Baker, T.; Aljaaf, A.J. Implementation of Machine Learning and Data Mining to Improve Cybersecurity and Limit Vulnerabilities to Cyber Attacks. In Nature-Inspired Computation in Data Mining and Machine Learning; Yang, X.S., He, X.S., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 47–76.
- Cleophas, T.J.; Zwinderman, A.H. Modern Bayesian Statistics in Clinical Research; Springer International Publishing: Cham, Switzerland, 2018.
- von Davier, M.; Lee, Y.S. Handbook of Diagnostic Classification Models; Methodology of Educational Measurement and Assessment, Springer International Publishing: Cham, Switzerland, 2019; p. 646.
- Gou, J.; Ma, H.; Ou, W.; Zeng, S.; Rao, Y.; Yang, H. A generalized mean distance-based k-nearest neighbor classifier. Expert Syst. Appl. 2019, 115, 356–372.
- Pan, Y.; Pan, Z.; Wang, Y.; Wang, W. A new fast search algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based check strategy. Knowl.-Based Syst. 2020, 189, 105088.
- Abu Alfeilat, H.A.; Hassanat, A.B.; Lasassmeh, O.; Tarawneh, A.S.; Alhasanat, M.B.; Eyal Salman, H.S.; Prasath, V.S. Effects of Distance Measure Choice on K-Nearest Neighbor Classifier Performance: A Review. Big Data 2019, 7, 221–248.
- Awad, M.; Khanna, R. Hidden Markov Model. In Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers; Apress: Berkeley, CA, USA, 2015; pp. 81–104.
- Portugal, I.; Alencar, P.; Cowan, D. The use of machine learning algorithms in recommender systems: A systematic review. Expert Syst. Appl. 2017, 97, 205–227.
- Olivieri, A.C. Principal Component Analysis. In Introduction to Multivariate Calibration: A Practical Approach; Springer International Publishing: Cham, Switzerland, 2018; pp. 57–71.
- Joshi, V.B.; Raval, M.S.; Gupta, D.; Rege, P.P.; Parulkar, S.K. A multiple reversible watermarking technique for fingerprint authentication. Multimed. Syst. 2016, 22, 367–378.
- Wang, D.; Xu, J. Principal Component Analysis in the local differential privacy model. Theor. Comput. Sci. 2020, 809, 296–312.
- Yang, L.; Deng, M. Based on k-Means and Fuzzy k-Means Algorithm Classification of Precipitation. In Proceedings of the 2010 International Symposium on Computational Intelligence and Design, Hangzhou, China, 29–31 October 2010; Volume 1, pp. 218–221.
- Ahuja, R.; Chug, A.; Gupta, S.; Ahuja, P.; Kohli, S. Classification and Clustering Algorithms of Machine Learning with their Applications. In Nature-Inspired Computation in Data Mining and Machine Learning; Yang, X.S., He, X.S., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 225–248.
- Guan, Z.; Bian, L.; Shang, T.; Liu, J. When Machine Learning meets Security Issues: A survey. 2018 IEEE Int. Conf. Intell. Saf. Robot. 2018, 158–165.
- Geluvaraj, B.; Satwik, P.M.; Ashok Kumar, T.A. The Future of Cybersecurity: Major Role of Artificial Intelligence, Machine Learning, and Deep Learning in Cyberspace. In Lecture Notes on Data Engineering and Communications Technologies; Springer Singapore: Singapore, 2019; Volume 15, pp. 739–747.
- Mohanty, S.; Vyas, S. Cybersecurity and AI. In How to Compete in the Age of Artificial Intelligence; Apress: Berkeley, CA, USA, 2018; pp. 143–153.
- OWASP. Unvalidated Redirects and Forwards. 2019. Available online: https://cheatsheetseries.owasp.org/cheatsheets/Unvalidated_Redirects_and_Forwards_Cheat_Sheet.html (accessed on 19 September 2019).
- Paganini, P. Phishers Continue to Abuse Adobe and Google Open Redirects. Available online: https://securityaffairs.co/wordpress/91877/cyber-crime/adobe-google-open-redirects.html (accessed on 11 October 2019).
- Bhadane, A.; Mane, S.B. Detecting lateral spear phishing attacks in organisations. IET Inf. Secur. 2019, 13, 133–140.
- Lamprakis, P.; Dargenio, R.; Gugelmann, D.; Lenders, V.; Happe, M.; Vanbever, L. Unsupervised Detection of APT C&C Channels using Web Request Graphs. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin, Germany, 2017; Volume 10327 LNCS, pp. 366–387.
- Zhao, G.; Xu, K.; Xu, L.; Wu, B. Detecting APT Malware Infections Based on Malicious DNS and Traffic Analysis. IEEE Access 2015, 3, 1132–1142.
- Buczak, A.L.; Guven, E. A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection. IEEE Commun. Surv. Tutorials 2016, 18, 1153–1176.
- Liang, F.; Hatcher, W.G.; Liao, W.; Gao, W.; Yu, W. Machine Learning for Security and the Internet of Things: The Good, the Bad, and the Ugly. IEEE Access 2019, 7, 158126–158147.
- Su, Y.; Li, M.; Tang, C.; Shen, R. A Framework of APT Detection Based on Dynamic Analysis. In Proceedings of the 2015 4th National Conference on Electrical, Electronics and Computer Engineering, Xi’an, China, 12–13 December 2015; pp. 1047–1053.
- Giura, P.; Wang, W. A Context-Based Detection Framework for Advanced Persistent Threats. In Proceedings of the 2012 International Conference on Cyber Security, Washington, DC, USA, 14–16 December 2012; pp. 69–74.
- Wang, X.; Zheng, K.; Niu, X.; Wu, B.; Wu, C. Detection of command and control in advanced persistent threat based on independent access. In Proceedings of the 2016 IEEE International Conference on Communications (ICC), Kuala Lumpur, Malaysia, 22–27 May 2016; pp. 1–6.
- Aparicio-Navarro, F.J.; Kyriakopoulos, K.G.; Ghafir, I.; Lambotharan, S.; Chambers, J.A.; Technology, F. Multi-Stage Attack Detection Using Contextual Information; Loughborough University: Loughborough, UK, 2018; pp. 920–925.
- Brogi, G.; Tong, V.V.T. TerminAPTor: Highlighting advanced persistent threats through information flow tracking. In Proceedings of the 2016 8th IFIP International Conference on New Technologies, Mobility and Security (NTMS), Larnaca, Cyprus, 21–23 November 2016.
- Quintero-Bonilla, S.; del Rey, A.M. Proposed models for advanced persistent threat detection: A review. Adv. Intell. Syst. Comput. 2020, 1004, 141–148.
- Sharma, P.K.; Moon, S.Y.; Moon, D.; Park, J.H. DFA-AD: A distributed framework architecture for the detection of advanced persistent threats. Clust. Comput. 2017, 20, 597–609.
- Siddiqui, S.; Khan, M.S.; Ferens, K.; Kinsner, W. Detecting Advanced Persistent Threats using Fractal Dimension based Machine Learning Classification. In Proceedings of the 2016 ACM on International Workshop on Security And Privacy Analytics, New Orleans, LA, USA, 11 March 2016; pp. 64–69.
- Shenwen, L.; Yingbo, L.; Xiongjie, D. Study and research of APT detection technology based on big data processing architecture. In Proceedings of the 2015 IEEE 5th International Conference on Electronics Information and Emergency Communication, Beijing, China, 14–16 May 2015; pp. 313–316.
- Ussath, M.; Jaeger, D.; Cheng, F.; Meinel, C. Advanced persistent threats: Behind the scenes. In Proceedings of the 2016 Annual Conference on Information Science and Systems (CISS), Princeton, NJ, USA, 16–18 March 2016; pp. 181–186.
- Sexton, J.; Storlie, C.; Neil, J. Attack chain detection. Stat. Anal. Data Min. ASA Data Sci. J. 2015, 8, 353–363.
- Ghafir, I.; Prenosil, V. Proposed Approach for Targeted Attacks Detection. Lect. Notes Electr. Eng. 2016, 362, 73–80.
- Trend Micro. The Custom Defense Against Targeted Attacks; Technical report; Trend Micro: Tokyo, Japan, 2013.
- Vukalovic, J.; Delija, D. Advanced Persistent Threats-detection and defense. In Proceedings of the 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 25–29 May 2015; pp. 1324–1330.
- Lockheed Martin. Cyber Kill Chain; Lockheed Martin: Bethesda, MD, USA, 2009.
|Feature||APT Attacks||Common Malware Attacks|
|Definition||An APT is a sophisticated, targeted and highly organised attack (e.g., Stuxnet)||Malware is malicious software used to attack and disable any system (e.g., ransomware)|
|Attacker||Government actors and organised criminal groups||A cracker (a hacker engaged in illegal activities)|
|Target||Diplomatic organisations, the information technology industry and other sectors||Any personal or business computer|
|Purpose||Exfiltrate confidential data or cause damage to a specific target||Personal recognition|
|Attack life cycle||Maintains persistence for as long as possible by various means||Ends when it is detected by security measures (e.g., anti-virus software)|
|Discovery Date||First Known Sample||Name||State||Targeted Platform|
|Authors||Algorithm||Approach||Approach Detail||APT Life Cycle Used||Detection Accuracy|
|Ghafir et al. ||DT, SVM, k-NN and ensemble learning||MLAPT||Phases:|
|Sharma et al. ||Genetic programming, classification and regression tree, dynamic Bayesian game model and SVM||DFA-AD||Phases:|
|Siddiqui et al. ||k-NN and correlation fractal dimension||Fractal-based anomaly||Steps: (1) combined packet capture (pcap files); (2) feature vector extraction; (3) anomaly classification with ML algorithms||Non-specified||93.58% (FD), 92.83% (k-NN)|
|Shenwen et al. ||k-NN||Detection based on Big Data||Phases:|
|Bai et al. ||LR, GNB, DT, RF and LB||RDP-based LM detection||Steps: (1) preprocessing of dataset; (2) apply ML techniques||1 phase||99.99% (LB)|
|Chu et al. ||PCA, SVM, NB, DT and MLP||Early discovery of APT attack||Steps:|
|Zhang et al. ||Fuzzy clustering||APT attack scenarios||Steps: (1) attack event classification; (2) attack scenario mining||IKC model (4 phases)||Non-specified|
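As a rough illustration of the "feature vector extraction" and "anomaly classification" steps that several of the k-NN-based approaches in the table above list, the following minimal Python sketch labels a traffic feature vector by majority vote among its nearest neighbours. It is not the implementation of any of the cited authors: the two features (packets per second and mean payload size), the training samples and the function name `knn_classify` are hypothetical and chosen only to keep the example self-contained.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Sort the labelled samples by Euclidean distance to the query vector
    # and keep the k closest ones.
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # Count the labels of those neighbours and return the most frequent one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors: (packets per second, mean payload size in bytes).
# In a real pipeline these would be extracted from captured traffic.
TRAIN = [
    ((10.0, 200.0), "normal"),
    ((12.0, 220.0), "normal"),
    ((11.0, 210.0), "normal"),
    ((90.0, 900.0), "anomalous"),
    ((95.0, 950.0), "anomalous"),
    ((85.0, 880.0), "anomalous"),
]

print(knn_classify(TRAIN, (88.0, 910.0)))  # prints "anomalous"
print(knn_classify(TRAIN, (11.5, 205.0)))  # prints "normal"
```

In practice the features would usually be normalised before computing distances, since a feature with a large numeric range would otherwise dominate the Euclidean metric.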
|3 Stages ||4 Stages ||4 Stages ||5 Stages ||5 Stages ||6 Stages ||6 Stages ||7 Stages ||7 Stages ||8 Stages ||11 Stages |
|Initial Compromise||Information Collection||Initial Compromise||Reconnaissance||Delivery||Intelligence gathering||Reconnaissance and weaponization||Research||Reconnaissance||Initial recon||Initial access|
|Intrusion phase||Incursion||Initial Compromise||Delivery||Preparation||Weaponization||Initial compromise||Persistence|
|Initial intrusion||Intrusion||Delivery||Privilege Escalation|
|Lateral movement||Lateral expansion||C&C||Discovery||Exploit||C&C||C&C||Conquering network||Exploitation||Establish foothold||Discovery|
|Lateral movement||Capture||Installation||Lateral movement||Lateral movement||Hiding presence||Installation||Escalate privileges||Lateral movement|
|Assets/Data discovery||Internal recon|
|Command and control||Information theft phase||Attack achievement||Ex-filtration||C&C||Data ex-filtration||Data ex-filtration||Gathering data||C&C||Maintain presence||Collection|
|Actions||Maintaining access||Actions on objective||Complete mission||Exfiltration|
|Stages executed in parallel: Execution, Defence evasion, Credential access, and Command & control|
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).