A Deep Learning-Based Phishing Detection System Using CNN, LSTM, and LSTM-CNN

Abstract: In terms of the Internet and communication, security is the fundamental challenge. There are numerous ways to harm the security of internet users; the most common is phishing, a type of attack that aims to steal or misuse a user’s personal information, including account information, identity, passwords, and credit card details. Phishers gather information about users by mimicking original websites so closely that the copies are indistinguishable to the eye. Sensitive information about the users may be accessed, and they might be subject to financial harm or identity theft. Therefore, there is a strong need to develop a system that efficiently detects phishing websites. Three distinct deep learning-based techniques are proposed in this paper to identify phishing websites: long short-term memory (LSTM) and convolutional neural network (CNN) for comparison, and lastly an LSTM–CNN-based approach. Experimental findings demonstrate the accuracy of the suggested techniques, i.e., 99.2%, 97.6%, and 96.8% for CNN, LSTM–CNN, and LSTM, respectively. Of the three, the CNN-based system demonstrated the best phishing detection performance.


Introduction
Life has become faster and more accessible because of the evolution of communication technologies and digitalization, especially during the lockdown due to the COVID-19 pandemic, when transactions and daily needs, such as shopping, had to be procured online rather than in person. To fulfil daily needs online, a user can simply open a smart device and search for the desired website, such as a pharmacy, shopping store, learning platform, or bookstore. On the other hand, the growth of e-services expands attackers' opportunities to gain or misuse users' information such as their names, phone numbers, identification, and credit card information. As a result, users face a variety of online threats and cyber-attacks every day. Phishing takes different forms: it can be delivered via electronic mail (e-mail), SMS (Short Message Service), or URL (Uniform Resource Locator), to name a few. Phishing can compromise all types of data sources, including personal information and online accounts, and allow attackers to access and modify connected systems [1].
In some cases, hackers stop phishing once they have stolen enough information for financial gain, while other hackers seek more information by logging into specific companies' systems to mount further malicious attacks against their employees. Consequently, hackers use different and new techniques to fool users, such as sending URLs that look like a banking or shopping website; when the user opens the URL and conducts transactions, their information is exposed to the attacker.
A convolutional neural network (CNN) is a kind of neural network that requires large, labeled data for training. CNNs play a significant role in many problems such as image classification, object recognition, phishing detection, and diagnosis of medical diseases. Input, convolution, pooling, and fully connected layers are the main layers needed to construct a CNN, as shown in Figure 2. Accelerating the learning process has led CNNs to accomplish strong results on many problems [17].
The LSTM-CNN architecture combines the CNN and LSTM methods, as shown in Figure 3, in order to make use of the benefits of both and accomplish excellent performance. Since CNN and LSTM show high performance in classification, detection, and recognition tasks [17], using these three methods for the phishing detection task is promising. As a result, we were motivated to find an effective solution to phishing websites using deep learning. This paper used an empirical method to investigate the performance of the three techniques, LSTM, CNN, and LSTM-CNN, in phishing detection. The goal of this paper was to classify whether a URL was phishing or legitimate by using LSTM, CNN, and LSTM-CNN.
To determine whether URLs are phishing or legitimate, we suggest a phishing detection system based on deep learning techniques. The suggested approach is useful for deep learning-based detection and classification systems in the fields of information security and cybersecurity. By classifying phishing URLs with the aim of preventing financial losses and cybercrimes, our work contributes evidence on the efficacy of LSTM, CNN, and LSTM-CNN.
The following points state the contribution of the proposed work:
• An examination of the methods currently used to identify phishing websites.
• Analysis and use of three state-of-the-art deep learning methods, LSTM, CNN, and LSTM-CNN, to predict phishing URLs.
• Presentation of an efficient deep learning architecture based on CNN, chosen for its capacity to identify patterns, extract features, and classify URLs automatically and accurately.
• Comparison and evaluation of the suggested LSTM, CNN, and LSTM-CNN models.
• Consideration of a dataset with 30 features after a feature selection process.
• Highlighting several restrictions based on the conclusions of earlier investigations and suggestion of potential fixes for these issues.
The remainder of this paper is structured as follows: A literature review is presented in Section 2. Section 3 discusses our proposed solution along with its methodology. Section 4 contains experimental results and a discussion. Section 5 is a comparison of existing works. Section 6 focuses on the conclusion and future work.

Literature Review
The phishing website problem is complex and is a challenge in itself, because no definitive solution exists to put an end to all the threats effectively. To identify phishing websites, deep learning-based phishing website detection solutions have arisen. Moreover, deep learning has become more promising in cyber security. In this section, several previous works that use deep learning approaches for phishing website detection are shown in Table 1.

Long Short-Term Memory (LSTM)
Yang et al. [19] presented a new method for detecting phishing attacks that adopts the LSTM deep learning method and optimizes the training of the model with the combined characteristics of a recurrent neural network (RNN). The main advantages of using LSTM are its ability to incorporate large volumes of data and its capacity to automatically learn complex features. This addresses a problem that is complex for other machine learning methods. The datasets used were from Yahoo and PhishTank. This work showed an accuracy of 99.1%.

Convolutional Neural Network (CNN)
A deep learning-based model proposed in [20] utilized a character-level CNN to detect phishing URLs. The study implemented a phishing detection system that uses a CNN at the character level to learn the URL's sequential information; max-pooling was then applied to determine important features, which were fed to fully connected layers for classification. To train the network, the stochastic gradient descent (SGD) algorithm was used. The results show that the suggested model attained an accuracy of 95.02% on the given dataset. Furthermore, the model's accuracy on benchmark datasets was 98.58%, 95.46%, and 95.22%, which was better than that of the existing phishing URL models, based on various machine learning and deep learning algorithms, against which it was compared.
Shweta et al. [21] presented a phishing detection system using deep learning techniques to prevent phishing attacks. The dataset contained 37,175 phishing URLs and 36,400 legitimate ones. The study was conducted by applying a CNN. The advantage of this system is that no feature engineering is required, since the CNN extracts features from the URLs automatically through its hidden layers. In the framework, the input text is passed through an embedding layer, and the resulting matrix is passed to the CNN. The proposed system achieved an accuracy of 98.00%.

A relative detection method was suggested in [22], which allowed for the identification of two-dimensional code phishing attempts. Information was gathered from the FlickrLogos-32 dataset, a publicly accessible logo dataset with 32 unique logo brands. The study enhanced the traditional approach by combining an improved feature pyramid network (FPN) with a Faster R-CNN logo identification technique. The system comprises three main logo processes: extraction, recognition, and identification. Logo extraction refers to extracting logo images from the two-dimensional code. Based on the retrieved logos, the identification and recognition of the logos were performed using Faster R-CNN. The final step in the identification process involves assessing the logo's consistency between the actually identified object and its described identity. In comparison to other logo recognition and phishing detection methods, the findings demonstrated the method's effectiveness in logo recognition, which may be used for two-dimensional code phishing attack detection.
HTMLPhish is a deep learning-based platform for data-driven, end-to-end automatic phishing web page classification, proposed by Chidimma et al. [23]. The dataset includes more than 50,000 HTML documents, and the full dataset of HTML contents was presented in a real-world distribution. The data were acquired from HTML documents using a web crawler. HTMLPhish employed CNNs to learn the semantic dependencies in the textual contents of HTML documents in order to learn the relevant feature representations. Additionally, the authors applied convolutions to a combination of the character and word embedding matrices to ensure that new words in the test HTML documents were effectively incorporated. This technique could analyze context features from HTML pages without intensive manual feature engineering. The results showed that HTMLPhish obtained over 93% accuracy, which indicates a good result.
Due to internet users' exposure to cyber threats and security flaws, artificial intelligence-based algorithms using machine learning and deep learning techniques have been developed [24]. The authors aimed to construct a phishing detection system to counter cyberattacks using a CNN with n-gram features. The system extracts these features from URLs, and the study determined which n-gram feature extraction technique is more effective and which parameter works best. The best results were achieved with single characters. Using 70 characters in model training takes 34 s to train one epoch and 0.008 s to classify a URL. On the high-risk URL dataset, the system reached an accuracy of around 88.90%.
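To make the notion of character n-gram features concrete, the following minimal sketch extracts contiguous character n-grams from a URL. It is an illustration only and not the implementation used in [24]; the function name `char_ngrams` and the example URL are hypothetical.

```python
def char_ngrams(url, n):
    """Return all contiguous character n-grams of a URL, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A string of length L has L - n + 1 n-grams (none if it is shorter than n).
    return [url[i:i + n] for i in range(len(url) - n + 1)]


# Hypothetical URL, used only for illustration.
url = "http://example.test/login"
unigrams = char_ngrams(url, 1)  # single characters
bigrams = char_ngrams(url, 2)   # overlapping character pairs
```

With n = 1 the features are simply the individual characters of the URL, which is the setting reported above as giving the best results.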
Texception is a deep learning architecture [25] that predicts whether an input URL is a phishing link or not. Texception differs from classical approaches in that it uses two levels of information from the URL, character-level and word-level, and depends less on manually crafted features. Texception grows wider or deeper through different parallel convolutional layers. For new URLs using the Microsoft SmartScreen service dataset, Texception generalizes better. The results on production data showed that the true positive rate increased by 126.7% at a false-positive rate of 0.01%.
Improved cyber defense and effective phishing detection are required to cope with the increased exposure to various cyberattacks owing to the fast growth of phishing websites. Yerima et al. [26] used a 1D CNN-based model, exploiting the CNN's capability to differentiate legitimate sites from phishing ones. According to the authors, the model was evaluated on a website dataset including 4898 phishing and 6157 legitimate websites. The model is used to detect unseen phishing websites. Furthermore, the model achieved a phishing detection rate of 98.2% and an F1-score of 0.976.

Integration of LSTM and CNN
Quang et al. [27] concentrated on analyzing the performance of different deep learning algorithms in detecting phishing websites to aid organizations in choosing and adopting suitable solutions based on their technological needs. The data contains 11,055 phishing and benign URLs. They utilized various deep learning algorithms, which comprised DNN, CNN, gated recurrent unit (GRU), and LSTM. In order to find the optimal parameters to achieve good accuracy, the model was tested on different architectures for each of the deep learning algorithms. The results demonstrated that a deep learning algorithm gains the best measure of overall performance metrics.
Electronics 2023, 12, 232
Image classification and natural language processing can both benefit from deep learning approaches. Adebowale et al. [28] proposed an intelligent phishing detection system (IPDS) to explore the potential of distinguishing phishing URLs from unique legitimate URLs. IPDS builds a hybrid classification model using LSTM and CNN. The dataset comprised around one million legitimate and phishing URLs collected from PhishTank and Common Crawl. To build the IPDS, the LSTM and CNN classifier used over 10,000 images and one million URLs for training. The sensitivity of IPDS was determined by several factors such as split issues, number of misclassifications, and the type of feature. IPDS achieved a classification accuracy of 93.28%.
The detection rules of many phishing detection techniques are computationally expensive and difficult to update in response to changes in attack trends. Zhang et al. [29] proposed PhishTrim, a lightweight phishing URL detection method based on deep representation learning. The skip-gram pretraining model was used to obtain the URLs' initial embedding representation. Furthermore, Bi-LSTM was used to extract context dependency and learn the deep representation of URLs. The local n-gram features were extracted via CNN, and the PhishTrim dataset was used.
As a result of the increase in electronic shopping (e-shopping) and electronic banking (e-banking), hackers can steal users' personal information and critical details through different ways by passing themselves off as trusted websites. To protect users from such cases, Yazhmozhi et al. [30] proposed an anti-phishing system based on LSTM and CNN. The dataset comprised nearly 200,000 URLs taken from PhishTank, VirusTotal, and by using Yandex search API. The proposed system performs well, with 97% precision and 96% accuracy. The model can be used in web browsers since it is deployed with a simple UI.
A comprehensive review of the literature shows that phishing detection is a challenging research task, since phishers are rapidly developing efficient ways to bypass current detectors. Research on phishing detection approaches can be categorized by input type, such as URL, email, visual screenshot, logo, and HTML content. For URL input, most studies have shown that URL features such as URL length, characters, frequency of keywords, and frequency of suspicious symbols are informative on datasets collected from VirusTotal, PhishTank, OpenPhish, and other open phishing platforms. The results of these studies showed accuracy reaching 90% and more using deep learning-based methodologies, mainly DNN, CNN, and LSTM. On the other hand, some studies use small datasets, which affects the accuracy of the proposed systems. Furthermore, some studies used the same deep learning method for feature extraction and classification yet obtained different accuracies; in addition, the training time was long. Hence, there is a need for a system that can detect phishing URLs efficiently and effectively. Deep learning has attracted increased interest recently due to its performance and its ability to learn features automatically without manual feature engineering. On this basis, we used LSTM, CNN, and LSTM-CNN to detect phishing URLs and to compare their performance. To the best of our knowledge, no previous work uses these three DL methods and compares their results. The dataset used in this work contains 20,000 URLs including 9800 phishing ones [31]. The primary difference between our approach and the previously cited deep learning-based ones is that we extracted the most discriminative features from the dataset and proposed a lightweight CNN-based model for the accurate detection of phishing websites, which improved phishing detection performance.

Methodology
Detecting phishing URLs is an important aspect of cybersecurity. Many phishing URLs appear legitimate to users because of the complex formulation of URLs by attackers. As a result, attackers can gain access to users' personal information, which can be misused. This paper proposes a system for detecting phishing URLs. In order to detect phishing URLs and show the robustness of the system, it was implemented using three different techniques. The following sections describe the methodology used, dataset preparation, deep learning approaches, and the models' training and testing details.

Proposed System
In this section, the important details of the models' configuration are discussed. The framework consists of four stages, as shown in Figure 4. The first stage concerns the features of the URLs, which are obtained from the dataset [31]. The second stage is pre-processing, in which we detected null values, scaled the feature values, and used SelectKBest to select the features that contribute most to the target variable. The third stage is the training of three different deep learning models, namely LSTM, CNN, and LSTM-CNN. The fourth stage is the classification of the webpage URLs as legitimate or phishing, which is evaluated using a number of indicators that measure how well each model detects phishing websites.

Dataset Preparation and Preprocessing
Data collection plays an essential role in research validity and reliability. In our approach, we made use of appropriate and consistent data so that the system's training is robust. The dataset of URL features contains 20,000 records with 80 features; because this is a large number of features, the SelectKBest method was used to retain the 30 best features. In the data preprocessing stage, the dataset was checked for null values and each feature was scaled to a given range using the MinMaxScaler method. The preprocessed dataset was then used separately in the experiments with LSTM, CNN, and LSTM-CNN.
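The preprocessing steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code used in this work: the text names scikit-learn's SelectKBest and MinMaxScaler but does not state the score function or the order of the steps, so the sketch scales first and uses absolute Pearson correlation with the label as a stand-in score. All function names and the toy data are hypothetical.

```python
import math


def min_max_scale(column, low=0.0, high=1.0):
    """Scale a list of numbers linearly into the range [low, high]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: map every value to the lower bound
        return [low for _ in column]
    return [low + (v - lo) * (high - low) / (hi - lo) for v in column]


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def select_k_best(rows, labels, k):
    """Return the sorted indices of the k highest-scoring feature columns."""
    columns = list(zip(*rows))
    scores = [abs_correlation(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def preprocess(rows, labels, k):
    """Check for nulls, min-max scale every feature, then keep k features."""
    if any(v is None for row in rows for v in row):
        raise ValueError("dataset contains null values")
    scaled_cols = [min_max_scale(list(col)) for col in zip(*rows)]
    scaled_rows = [list(r) for r in zip(*scaled_cols)]
    keep = select_k_best(scaled_rows, labels, k)
    return [[row[j] for j in keep] for row in scaled_rows], keep


# Toy data (hypothetical): 4 records, 3 features, binary labels.
rows = [[1.0, 10.0, 5.0], [2.0, 10.0, 3.0], [3.0, 10.0, 6.0], [4.0, 10.0, 7.0]]
labels = [0, 0, 1, 1]
selected, kept = preprocess(rows, labels, k=2)
```

In the toy data the second feature is constant and therefore scores zero, so it is the one dropped. For the dataset described above, the corresponding call would use k = 30.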

Training and Testing
The dataset was divided into 80% for training and 20% for testing. The distribution of training and testing sets is shown in Table 2. One of the aspects affecting the effectiveness of deep learning algorithms is the selection of hyperparameters during training. Hyperparameter values can be optimized to improve the accuracy of phishing website detection models. These parameters comprise the number of layers, the number of neurons in each layer, the batch size, the learning rate, the dropout rate, the number of epochs, the type of activation function, and the type of optimizer [32]. Choosing appropriate parameter values enhances the performance of the LSTM, CNN, and LSTM-CNN models, so each parameter was set to the value that gave the best performance. One of the main parameters of the system is the number of epochs, i.e., the number of training iterations over the dataset after the deep learning model is built and compiled; its value was set to 50. The parameters are stated in Table 3.
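As a rough illustration of the 80/20 division described above, the sketch below shuffles a list of records and splits it. Whether the original split was shuffled or stratified is not stated in the text, so the shuffling, the fixed seed, and the function name are assumptions; the integer placeholders stand in for the 20,000 URL records.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records standing in for the 20,000 rows of the dataset.
records = list(range(20000))
train, test = train_test_split(records)
```

With 20,000 records, a 20% test fraction gives 4,000 test records and 16,000 training records.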

Deep Learning Approaches
Deep learning, a subfield of machine learning, has gained great attention over the previous decade. Recent advances in processing power and increased data storage capacities have greatly aided the ability to apply deep learning approaches. Deep learning models have produced excellent results using large datasets for a variety of challenges, including image processing, natural language processing, and machine translation. Moreover, the challenge of phishing URL classification has also been undertaken using deep learning systems, with encouraging results [33]. Different classification methods are applied to detect phishing websites and then evaluated by different performance metrics. The models examined in this study are LSTM, CNN, and LSTM-CNN. Convolutional layers are defined by their ability to learn internal representations and retrieve meaningful knowledge from data; LSTM networks, on the other hand, are efficient at detecting both short- and long-term dependencies. Based on the experimental results, the CNN model shows the strongest performance. Each of the three models is explained below.

• Long short-term memory (LSTM): Long short-term memory is an adaptive type of recurrent neural network (RNN) in which each neuron is replaced by a memory cell that holds an internal state in addition to the conventional neuron. An LSTM layer comprises recurrently linked memory blocks; each block contains one or more memory cells with recurrent connections. As a result, a typical LSTM cell has an input gate that controls data input from outside the cell, a forget gate that determines whether the data in the internal state is kept or forgotten, and an output gate that prohibits or enables the inner state's ability to be viewed from the outside [34]. LSTM has been shown to be an effective strategy for detecting phishing URLs [35,36]. The workflow of LSTM for classifying a URL starts after loading, preprocessing, and splitting the dataset. The LSTM model starts with the input layer, which uses a 79-length vector, followed by the LSTM layer, which includes 128 neurons and acts as the model's memory subset. Following the LSTM layer, the dense layer (an output layer with a sigmoid function) assists in providing the labels.
• Convolutional neural network (CNN): CNN is a discriminative architecture that works effectively at processing grid-based two-dimensional data, including images and videos. In terms of time delay, the CNN outperforms the neural network (NN). The weights are shared in a temporal dimension in the CNN, which reduces calculation time. The standard NN's generic matrix multiplication is thus replaced by convolution in the CNN. As a result, the CNN technique minimizes the number of weights, lowering the network's complexity [34]. The workflow of the CNN for classifying a URL starts by fetching the labeled training data of the URLs, which are then divided into train and test sets at random. After the training and test data were prepared, the model was trained by creating the architecture of the CNN, including the input, output, and intermediate layers. After each convolution, we incorporated a max-pool layer to capture the essential elements from each convolution and convert them into a feature vector. Next, we added dropout regularization to ensure that the model did not overfit. Finally, a sigmoid function is applied to the output of this layer to classify the URL.
• LSTM-CNN: The model consists of CNN layers that extract features from input data and LSTM layers that predict sequences [37]. Furthermore, a study [38] found that combining a 1D convolution layer and an LSTM layer improves the accuracy of malicious URL identification when compared to models that exclusively use LSTM layers. As a result, when constructing the system, we chose a 1D CNN and LSTM architecture to train on the URL features. The workflow of LSTM-CNN is shown in Figure 3: after preprocessing, the dataset is split into train and test sets, followed by data normalization before feeding into the model; the data are then passed through the CNN and LSTM layers, in addition to the dense layer to avoid overfitting of the dataset; finally, a sigmoid function is applied to the output of this layer to classify the URL.
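The data flow of the LSTM-CNN model described above (1D convolution, max pooling, LSTM, sigmoid output) can be illustrated with a toy forward pass in plain Python. This is a sketch under strong simplifying assumptions, not the architecture trained in this work: it uses a single convolution filter, a scalar LSTM state instead of 128 units, random untrained weights, and a 30-value input vector; all names and sizes are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(seq, kernel, bias=0.0):
    """Valid 1D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(seq[i + j] * kernel[j] for j in range(k)) + bias)
            for i in range(len(seq) - k + 1)]


def max_pool1d(seq, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def lstm_step(x, h, c, w):
    """One LSTM step with scalar input, hidden state h, and cell state c.

    w maps each gate name to a tuple (input weight, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h + b

    i = sigmoid(pre("input"))       # how much new information enters the cell
    f = sigmoid(pre("forget"))      # how much of the old cell state is kept
    o = sigmoid(pre("output"))      # how much of the cell state is exposed
    g = math.tanh(pre("candidate"))  # candidate value for the cell state
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def lstm_cnn_forward(features, kernel, lstm_w, dense_w, dense_b):
    """Convolution -> max pooling -> LSTM over pooled values -> sigmoid."""
    pooled = max_pool1d(conv1d(features, kernel))
    h = c = 0.0
    for x in pooled:
        h, c = lstm_step(x, h, c, lstm_w)
    return sigmoid(dense_w * h + dense_b)


rng = random.Random(0)
features = [rng.random() for _ in range(30)]      # one scaled feature vector
kernel = [rng.uniform(-1, 1) for _ in range(3)]   # untrained filter weights
lstm_w = {name: tuple(rng.uniform(-1, 1) for _ in range(3))
          for name in ("input", "forget", "output", "candidate")}
probability = lstm_cnn_forward(features, kernel, lstm_w, dense_w=1.0, dense_b=0.0)
```

The returned value lies strictly between 0 and 1 and would be thresholded (for example at 0.5) to label a URL as phishing or legitimate; with untrained weights it carries no meaning beyond showing the shape of the computation.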

Evaluation and Results
This section evaluates the proposed system and presents the results.

Evaluation Metrics
This section summarizes the metrics used to measure the results of the deep learning approaches. Generally, the performance of machine learning prediction algorithms is evaluated using the results of the classification algorithm. In this study, the prediction outcomes were examined using metrics including precision, recall, the confusion matrix, and accuracy to assess the system [39].
Precision: The precision of the prediction algorithm is the proportion of webpages classified as phishing that are actually phishing webpages.

F1-Score: The harmonic mean of a classifier's precision and recall, which combines the two into a single metric.
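The metrics above can be computed directly from the four counts of a binary confusion matrix, as in the following minimal sketch. The standard definitions of precision, recall, accuracy, and F1-score are used; the function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, accuracy, and F1 from confusion counts.

    tp: phishing predicted as phishing    fp: legitimate predicted as phishing
    fn: phishing predicted as legitimate  tn: legitimate predicted as legitimate
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
```

For these example counts, precision is 90/100, recall is 90/95, accuracy is 185/200, and the F1-score is 180/195.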

Results
For the experimental results, we calculated the accuracy, precision, recall, and F1 score of the prediction algorithms. The proposed system was evaluated primarily on prediction accuracy, which is one of the most common performance measures for prediction models. The prediction accuracy of the approaches presented in this paper can be found in Section 3. We used a dataset of 20,000 URL records with 80 features. In the preprocessing stage, we detected null values and scaled the features; then, after selecting 30 features using SelectKBest, we trained the LSTM, CNN, and LSTM-CNN classifiers on these features.
The three proposed methods showed good results, which are shown in Table 4 and also reflect the choice of parameters. After implementing, training, and testing the LSTM, CNN, and LSTM-CNN techniques, the CNN algorithm gave the best phishing detection results, with the highest accuracy at 99.2%, followed by the LSTM-CNN algorithm at 97.6% and LSTM at 96.8%, as illustrated in Figure 5. CNN outperforms the other two models in terms of accuracy and other performance metrics for several reasons. First, CNN can perform well on text classification problems, while LSTM is suited to sequential data, since it learns texts and the relations between tokens very well. Moreover, CNN takes less time and is more effective than the LSTM-based approach. In addition, it requires fewer parameters for training compared to LSTM, which reduces the complexity of the model. Additionally, CNN runs one order of magnitude faster than both LSTM and LSTM-CNN. Finally, the computations in CNNs can occur in parallel, in contrast to LSTM, which captures the dependency across time sequences in the input vector and therefore processes them step by step.
Figure 6 shows the confusion matrix of the LSTM model. The percentage of predicted values is shown on the x-axis, and the percentage of true values is shown on the y-axis. The LSTM algorithm predicted 1912 (true positive) samples correctly, with 80 (false positive) misclassifications.
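The remark above that a CNN requires fewer trainable parameters than an LSTM can be illustrated with the standard per-layer parameter-count formulas. The sketch below is only indicative: the 128 LSTM units come from the model description, but the one-value-per-step input, the filter count, and the kernel size are hypothetical, and the counts cover a single layer each rather than the full models.

```python
def lstm_param_count(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def conv1d_param_count(filters, kernel_size, in_channels):
    """One kernel of size kernel_size x in_channels plus a bias per filter."""
    return filters * (kernel_size * in_channels) + filters


# 128 LSTM units as described for the LSTM model; input_dim=1 is an assumption.
lstm_params = lstm_param_count(units=128, input_dim=1)
# Hypothetical convolution layer: 32 filters of width 3 over one input channel.
conv_params = conv1d_param_count(filters=32, kernel_size=3, in_channels=1)
```

Under these assumptions the LSTM layer has 66,560 parameters and the convolution layer has 128, which shows the direction of the comparison even though the actual layer sizes of the trained models may differ.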
Considering these outcomes, we conclude that the CNN algorithm outperforms the LSTM-CNN and LSTM algorithms in the detection of phishing.

Comparison with Existing Approaches
It is important to shed light on previous works that have used approaches and methodologies similar to ours. The proposed CNN architecture provides excellent results compared to LSTM-CNN and LSTM. Furthermore, we compare our proposed model with existing techniques that have used CNN and LSTM in Table 5. The comparison is based on the proposed methodology, the dataset used, the advantages and disadvantages, and the system accuracy of the existing works.

Limitations
After testing and evaluating our proposed system, we found that it outperforms existing methodologies and shows excellent results. However, the proposed system has some shortcomings. The model does not check the status of the website's URL, i.e., whether the website is active or not, which impacts the results. To overcome this limitation, it might be necessary to speed up the training process and improve feature engineering, which would then allow us to verify the website's state and improve the accuracy of the training process.

Conclusions and Future Work
The improvement of technologies has had a significant impact on increasing online purchases and transactions, which make our day-to-day tasks easier. On the other hand, online transactions can lead to unauthorized access to the sensitive information of users, individuals, or enterprises. Security is the most important aspect of protecting users from phishers who steal information while they are communicating through internet applications. Phishing is a well-known attack that obtains users' information through a URL that looks identical to that of the actual webpage. Detecting phishing attacks plays a significant role in preventing attackers from gaining access to users' information. As the number of victims is growing, owing mainly to inefficient adoption of security technology, an intelligent technique is needed to protect users from cyber-attacks. With its rapid development, deep learning has proven a valuable advance over traditional signature-based and classic machine learning-based solutions due to its high performance and end-to-end problem-solving. In this work, the LSTM, CNN, and LSTM-CNN algorithms were proposed to detect and classify the URLs of websites as either phishing or legitimate. Based on the evaluation of the proposed system, the detection of phishing websites achieved excellent results. The proposed deep learning algorithms, applied to the same dataset, varied in their performance. The CNN algorithm outperformed LSTM-CNN and LSTM in terms of accuracy, reaching 99.2%, while LSTM-CNN and LSTM achieved accuracies of 97.6% and 96.8%, respectively. In the future, we aim to enhance the training process by reducing training time and improving feature engineering in order to verify websites' states and improve the overall accuracy of the training process. Furthermore, we intend to present an approach that considers the webpage context as well as the URL in order to detect phishing websites.