A Customized VGG19 Network with Concatenation of Deep and Handcrafted Features for Brain Tumor Detection

Abstract: Brain tumor (BT) is a brain abnormality that arises for various reasons. An unrecognized and untreated BT increases the morbidity and mortality rates. The clinical-level assessment of BT is normally performed using bio-imaging techniques, and MRI-assisted brain screening is one of the most widely used. The proposed work aims to develop a deep learning architecture (DLA) to support the automated detection of BT using two-dimensional MRI slices. This work proposes the following DLAs to detect the BT: (i) pre-trained DLAs, such as AlexNet, VGG16, VGG19, ResNet50 and ResNet101, with a deep-features-based SoftMax classifier; (ii) pre-trained DLAs with deep-features-based classification using decision tree (DT), k-nearest neighbor (KNN), SVM-linear and SVM-RBF; and (iii) a customized VGG19 network with serially fused deep features and handcrafted features to improve the BT detection accuracy. The experimental investigation was executed separately using Flair, T2 and T1C modality MRI slices, and a ten-fold cross validation was implemented to substantiate the performance of the proposed DLA. The results of this work confirm that the VGG19 with SVM-RBF helped to attain better classification accuracy with Flair (>99%), T2 (>98%), T1C (>97%) and clinical images (>98%).


Introduction
The brain is one of the primary organs in humans; it processes the physiological signals arriving from the other sensory organs and takes the necessary control measures. The normal operation of the brain is badly affected if any infection or disease arises, and an unnoticed and untreated abnormality may lead to various difficulties, including death [1,2].
The normal state of the brain may be disrupted for different reasons, such as birth defects, a head injury due to an accident or uncontrolled cell growth (UCG) in a central brain section [3,4]. Such an irregularity causes various problems in the physiological system, and an untreated brain abnormality can also lead to several major illnesses. A brain abnormality due to UCG is a major threat, and the untreated growth will lead to brain cancer, which is one of the most rapidly increasing cancer burdens globally. The work of Louis et al. [5] discusses the ranking/classification of brain tumors (BTs) as per the 2016 report of the World Health Organization (WHO).
Recently, a substantial number of awareness programs have been initiated to protect people from such abnormalities. However, because of different unavoidable causes, such as contemporary lifestyles, food habits, hereditary factors and age, many individuals still suffer from a developed BT [6,7]. If the BT is detected at an early stage, a promising treatment can be employed to heal or manage the cell growth. The clinical-level detection of BT is performed with (i) single/multi-channel EEG signals and (ii) brain imaging techniques. The image-assisted technique provides more meaningful information than the signal-assisted technique. Hence, in most clinical-level detection, imaging procedures such as computed tomography (CT) and magnetic resonance imaging (MRI) are widely used to record and check brain abnormalities using three-dimensional (3D) and 2D images. Compared to CT, MRI is widely preferred due to its varied modalities, and the BT is more clearly visible in a brain MRI than in a CT. Hence, MRIs are largely preferred to evaluate various brain abnormalities, including the BT. BT evaluation with modalities such as Flair, T2 and T1C offers enhanced tumor visibility compared to the T1 and diffusion-weighted (DW) modalities [8][9][10][11][12].
In the literature, a significant number of conventional and modern BT detection procedures have been proposed and implemented by researchers using a chosen machine learning (ML) or deep learning (DL) technique [13][14][15][16][17][18]. The chief aim of the existing automated and semi-automated disease evaluation procedures is to develop an accurate disease detection system to assist the doctor during the diagnosis and treatment-planning process. Most of the recent disease diagnosis systems implement the DL technique due to its superior detection accuracy. Talo et al. [19] implemented a transfer-learning-based deep learning architecture (DLA) to detect tumors using 2D MRI slices and achieved a classification accuracy of >98%. Further, Talo et al. [20] presented a detailed analysis of the existing DLAs in the literature and confirmed that ResNet50 offers a better classification accuracy (>95%) during the brain tumor detection process. Amin et al. [21] implemented a brain tumor evaluation procedure using the BRATS2013, 2015 and clinical databases and achieved an accuracy of >98%. Sharif et al. [22] implemented an enhanced binomial thresholding and multi-features selection-based technique to classify brain tumors and attained an enhanced result. Fabelo et al. [23] implemented a DLA to detect glioblastoma using hyperspectral 3D and 2D brain images. Sajid et al. [24] implemented a DL-based brain tumor detection procedure and attained better values of sensitivity and specificity. The higher-order-spectra feature-based detection and classification of the abnormal section in a brain MRI is discussed by Acharya et al. [25]. Further, a considerable number of approaches have been proposed by researchers to improve the detection accuracy on brain MRI images ranging from benchmark datasets to clinical images [26][27][28][29][30][31].
The work in this paper aimed to evaluate the performance of existing DLAs, such as AlexNet, VGG16, VGG19, ResNet50 and ResNet101 [20], for detecting the BT using 2D brain MRI slices. Initial detection was performed with a transfer learning procedure using the attained deep features and the SoftMax classifier. After finding the most suitable DLA for the considered task, the deep and handcrafted features were concatenated to enhance the classification accuracy with the SoftMax classifier. Further, a detailed comparative study with other classifiers, such as random forest (RF), decision tree (DT), k-nearest neighbor (KNN), SVM-linear and SVM-RBF, was also performed to attain better classification accuracy.
For the experimental investigation, 2D brain MRI slices of the Flair, T2 and T1C modalities with dimensions 227 × 227 × 1 were considered, and the essential images for this task were collected from the benchmark datasets, such as BRATS (without skull section) [32,33] and TCIA (with skull section) [34][35][36][37], to train, test and validate the DLAs. Further, a clinical-level dataset [38] with the skull was considered to validate the DLA. This clinical-level dataset was already used to test machine-learning systems [2][3][4]. In this work, the performance of the proposed system was confirmed by computing the accuracy, precision, sensitivity, specificity, F1-Score and negative predictive value (NPV). This work also implemented a ten-fold cross validation, and the performance of the proposed DLA was confirmed based on the average value.
The remaining parts of this work are set out as follows: Section 2 presents the materials and methods. Section 3 presents the details of experimental investigation, and conclusion of this work is discussed in Section 4.

Materials and Methods
In the literature, a number of conventional and customized DLAs have been proposed to detect abnormalities in medical images [17,39,40,41]. Developing a new DLA from scratch is complex and requires considerable effort to build, train, test and validate the architecture for a chosen problem. Hence, most of the earlier works adapt the proven DLAs existing in the literature to solve a disease detection problem. Furthermore, selecting and implementing a particular architecture requires prior knowledge about its structure, complexity in implementation, initial tuning and validation procedures [17,39,40,41,42].
This work initially considers the existing DLAs discussed in [20] to detect brain tumors from the considered MRI database. This work employs the transfer learning concept to train, test and validate the adopted DLAs using the SoftMax classifier. The DLA which offers the best classification accuracy is then selected, and its performance is further enhanced using the proposed method. The initial experimental outcome of this research confirmed that the VGG19 offers better classification accuracy than the alternatives, and hence the conventional and customized VGG19 are considered in this research to attain better tumor detection accuracy.
After selecting the VGG19 to solve the considered image examination problem, its performance enhancement is attempted using the following approaches: (i) replacing the SoftMax classifier with DT, KNN, SVM-Linear and SVM-RBF classifiers, and (ii) enhancing the outcome of VGG19 using a new feature vector obtained by fusing the handcrafted and deep features. Earlier research confirms that if the feature vector is enhanced, the pre-trained DLA will offer better detection accuracy compared to the conventional DLA [1,29,30]. Figure 1 depicts the customized VGG19 proposed and implemented in this research. The pre-trained DLA provides a one-dimensional deep-feature vector, and the handcrafted texture features are attained using the contourlet transform (COT) (38 features), curvelet transform (CUT) (121 features) and discrete wavelet transform (DWT) (40 features) [43]. The existing deep and handcrafted features are sorted and serially combined based on principal component analysis (PCA), and these features are then used to train, test and validate the classifier unit, which separates the given image into normal/tumor classes [1].
Appl. Sci. 2020, 10, x FOR PEER REVIEW

Image Collection and Processing
In every medical evaluation procedure, the performance of the developed diagnosis system depends mainly on the database chosen for the problem to be solved. To solve the brain tumor detection problem, the most commonly considered images are obtained from the well-known benchmark images of the Multimodal Brain Tumor Segmentation Challenge (BRATS) [32,33]. Figure 2 depicts the image dataset considered in this work along with the available MRI modalities. BRATS does not include the DW modality and TCIA does not include the T1C modality; these are therefore denoted as not available (NA) in Figure 2. In this work, the 2D MRI slices of Flair, T2 and T1C are considered for the assessment [2,3,4,6]. The Cancer Imaging Archive (TCIA) also provides clinical-grade medical images for research purposes [34][35][36][37]. In this work, the glioma images associated with the skull section are chosen for the assessment. Furthermore, the clinical-grade MRIs collected from Proscans Ltd are considered to validate the proposed DLA [2,3,4,38]. In this work, data augmentation is implemented to increase the number of BRATS and TCIA images with the help of image-flip and image-rotate (90° left/right) operations. This procedure helped to achieve a considerable number of test images for both the normal and tumor classes. Table 1 presents the details of the test image datasets and the modalities considered in this work. This table also lists the images considered for training and testing the classifier unit.
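The flip and rotate operations used for augmentation can be sketched in a few lines of Python. This is a minimal illustration rather than the Matlab implementation used in this work: the function names are hypothetical, the slice is represented as a nested list of pixel values, and a left-right flip is assumed because the flip axis is not stated in the text.

```python
def flip_horizontal(image):
    """Mirror a 2D slice left-to-right (assumed flip axis)."""
    return [row[::-1] for row in image]


def rotate_left(image):
    """Rotate a 2D slice by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def rotate_right(image):
    """Rotate a 2D slice by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original slice plus its flipped and rotated copies."""
    return [image, flip_horizontal(image), rotate_left(image), rotate_right(image)]


# Tiny 2 x 3 stand-in for an MRI slice (hypothetical pixel values).
slice_2d = [[1, 2, 3],
            [4, 5, 6]]
augmented = augment(slice_2d)
```

Under these assumptions, each source slice yields four images, so the augmented set is four times the size of the original one.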

Handcrafted Feature Extraction
In ML and DL techniques, feature extraction is the principal procedure which helps to extract meaningful information from the image based on its shape and texture values. Based on these features, the implemented classifier units are trained, tested and validated. In the literature, a substantial number of feature extraction techniques are implemented for a class of RGB/gray-scale pictures [44][45][46][47][48][49]. The implemented COT, CUT and DWT helped to get a total of 199 features, which were then fused with the deep-feature vector to increase its feature dimension. The features extracted with the conventional scheme are called the handcrafted features, and every approach provides a 1D feature vector. The feature vectors of the chosen procedures are represented as $FV_1 = 1 \times 1 \times 38$, $FV_2 = 1 \times 1 \times 121$ and $FV_3 = 1 \times 1 \times 40$, and the handcrafted-feature vector is $FV_h = FV_1 + FV_2 + FV_3 = 1 \times 1 \times 199$. Other details on the feature extraction and selection can be found in [1,30].
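As a rough illustration of how a wavelet-based handcrafted vector can be built and serially joined, the sketch below computes a single-level 2D Haar transform and summarizes each sub-band by its mean and energy. Everything here is an assumption made for illustration: the Haar wavelet, the two statistics per sub-band and all function names are hypothetical, and the 40 DWT features used in this work are not specified in the text. Only the vector lengths 38, 121 and 40 come from the paper.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of an image with even height and width.

    Returns four sub-bands: the approximation and three detail bands.
    """
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2.0)  # local average
            r_row.append((p + q - s - t) / 2.0)  # top row minus bottom row
            c_row.append((p - q + s - t) / 2.0)  # left column minus right column
            d_row.append((p - q - s + t) / 2.0)  # diagonal difference
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail


def band_statistics(band):
    """Mean and energy of one sub-band (illustrative texture descriptors)."""
    values = [v for row in band for v in row]
    return [sum(values) / len(values), sum(v * v for v in values)]


def dwt_features(image):
    """Concatenate the statistics of all four sub-bands into one 1D vector."""
    features = []
    for band in haar_dwt2(image):
        features.extend(band_statistics(band))
    return features


def serial_concat(*vectors):
    """Join 1D feature vectors end to end."""
    fused = []
    for vector in vectors:
        fused.extend(vector)
    return fused


# Hypothetical 4 x 4 image patch.
patch = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
patch_features = dwt_features(patch)

# Placeholder vectors with the lengths reported in the text (38, 121, 40).
fv_h = serial_concat([0.0] * 38, [0.0] * 121, [0.0] * 40)
```

The last line only checks the bookkeeping: joining vectors of length 38, 121 and 40 gives the 199-element handcrafted vector.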

Feature Selection and Concatenation
The major objective of this research was to obtain a single feature vector by fusing $FV_h$ with the deep features (DF) of the considered DLA. In the literature, features are fused using either a serial or a parallel process; in this work, serial concatenation is adopted due to its simplicity, and the $FV_h$ to be combined with the DF is selected and sorted based on PCA. Normally, PCA transfers n vectors $(p_1, p_2, \ldots, p_n)$ of a D-dimensional space into a D'-dimensional space with values $(p'_1, p'_2, \ldots, p'_n)$, where D and D' are positive integers with D' ≤ D.
The new feature of PCA can be represented in the standard form $p'_i = \sum_{k=1}^{D'} R_{k,i} S_k$, where $S_k$ = eigenvectors and $R_{k,i}$ = principal components [1,26,29,30]. After implementing the feature concatenation, the final feature vector will be (1 × 1 × 199) + (1 × 1 × 1024) = 1 × 1 × 1223.
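The PCA mapping and the serial concatenation can be illustrated with a small dependency-free Python sketch. It is not the procedure used in this work, whose exact PCA-based selection and sorting rule is not detailed here: the power-iteration solver, the toy data and all function names are assumptions. The eigenvectors play the role of $S_k$ and the projected values the role of $R_{k,i}$.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (D x D) of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[a] * row[b] for row in centered) / (n - 1)
             for b in range(d)] for a in range(d)]


def mat_vec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def top_eigenvectors(cov, count, iterations=300):
    """Leading eigenvectors via power iteration with deflation."""
    d = len(cov)
    work = [row[:] for row in cov]
    eigenvectors = []
    for _ in range(count):
        v = [1.0 / math.sqrt(d)] * d
        for _ in range(iterations):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, mat_vec(work, v)))
        eigenvectors.append(v)
        # Deflate: remove the component just found before the next search.
        for a in range(d):
            for b in range(d):
                work[a][b] -= eigenvalue * v[a] * v[b]
    return eigenvectors


def pca_project(rows, d_out):
    """Map n samples from D dimensions to d_out <= D principal components."""
    centered = center(rows)
    axes = top_eigenvectors(covariance(centered), d_out)
    return [[sum(x * s for x, s in zip(row, axis)) for axis in axes]
            for row in centered]


def serial_fuse(deep, handcrafted):
    """Serial concatenation: append the handcrafted vector to the deep one."""
    return list(deep) + list(handcrafted)


random.seed(0)
# Hypothetical toy data: 30 samples, 5 features with decreasing spread.
scales = [5.0, 3.0, 1.0, 0.5, 0.1]
samples = [[random.gauss(0.0, s) for s in scales] for _ in range(30)]
projected = pca_project(samples, 2)

# Placeholder vectors with the dimensions reported in the text.
fused = serial_fuse([0.0] * 1024, [0.0] * 199)
```

The toy projection reduces D = 5 to D' = 2, while the final line reproduces only the dimension count of the fused vector, 1024 + 199 = 1223.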

Classification
The general performance of the DLA relies mainly on the classifier employed to separate the considered images into normal/tumor classes. The traditional DLA uses SoftMax, which provides reasonable accuracy with the transfer learning approach. Further, the performance of the DLA can be enhanced by employing appropriate classifiers [50][51][52][53].
In this work, the SoftMax classifier is replaced with other classifiers, discussed below:
• Decision tree: DT is one of the better-known methodologies used to categorize linear and non-linear information with a sequence of tests, which expands like a tree. The DT uses attribute test conditions as the root and internal nodes, and the class labels form the terminal nodes. Once a DT has been built, categorization is accomplished by the decisions taken at each branch of the tree. Other particulars of DT can be found in [43][44][45][46].
• K-nearest neighbor: KNN is a well-known technique often considered to classify medical images based on an existing feature set. In this work, KNN is considered to classify the brain MRIs of varied modality. During the classification task, the KNN evaluates the distance between the new features and each training feature and finds the nearest neighbors. The earlier works on the KNN can be found in [43][44][45][46].
• Support vector machine: The SVM classifier uses a hyperplane to label the dataset based on features gathered during the training stage. SVM is one of the most frequently used methods to categorize MRI images. Radial basis function-based SVM (SVM-RBF) is used to sort the 2D MRI slices with the selected features. In SVM-RBF, the kernel value is controlled by a scaling parameter "σ", and this value is varied from 0.2 to 1.9 with a step size of 0.1. Furthermore, the SVM with a linear kernel (SVM-Linear) is also adopted to classify the MRI database [43].
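Two of the ingredients described above, the σ sweep for the RBF kernel and the nearest-neighbor vote, can be sketched as follows. This is a hedged illustration and not the classifier code used in this work: the function names and toy feature vectors are hypothetical, the kernel form exp(-||x - y||² / (2σ²)) is one common convention assumed here, and k = 3 is an arbitrary choice.

```python
import math
from collections import Counter


def squared_distance(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel; the 2 * sigma^2 scaling is an assumed convention."""
    return math.exp(-squared_distance(x, y) / (2.0 * sigma ** 2))


# Sigma values 0.2, 0.3, ..., 1.9 (step 0.1), as stated in the text.
SIGMA_GRID = [round(0.2 + 0.1 * i, 1) for i in range(18)]


def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training vectors."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: squared_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors standing in for the fused features.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.0],
           [0.9, 1.0], [1.0, 0.8], [1.1, 1.0]]
train_y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
prediction = knn_predict(train_x, train_y, [0.95, 0.9])
```

In a full experiment, each value in `SIGMA_GRID` would be used to train one SVM-RBF model and the best-performing σ would be retained; that training loop is omitted here.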

Performance Measures and Validation
The performance of a classifier is normally evaluated by computing the essential performance values. In this work, the classifier performance is assessed based on a chosen set of performance values. The initial assessment computes the essential measures, such as true-positive (TP), true-negative (TN), false-positive (FP) and false-negative (FN) values [43][44][45,54].
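The measures named in this work (accuracy, precision, sensitivity, specificity, F1-Score and NPV) follow from the TP, TN, FP and FN counts by their standard definitions, and the ten-fold validation reports their average over the folds. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the confusion counts are made-up examples rather than results from this work, and all denominators are assumed to be non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard measures computed from the confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1_score": 2 * tp / (2 * tp + fp + fn),
        "npv": tn / (tn + fn),
    }


def fold_indices(n_samples, n_folds=10):
    """Split sample indices 0..n-1 into n_folds nearly equal test folds."""
    indices = list(range(n_samples))
    return [indices[i::n_folds] for i in range(n_folds)]


def average_over_folds(per_fold):
    """Average every measure over the folds."""
    return {key: sum(m[key] for m in per_fold) / len(per_fold)
            for key in per_fold[0]}


# Hypothetical (TP, TN, FP, FN) counts for two folds, not results from the paper.
fold_counts = [(48, 47, 3, 2), (49, 46, 4, 1)]
per_fold = [classification_metrics(*counts) for counts in fold_counts]
summary = average_over_folds(per_fold)

# Index layout for a ten-fold split of 25 samples.
folds = fold_indices(25)
```

Each fold in `folds` would serve once as the test set while the remaining indices form the training set; only the index bookkeeping is shown.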

Experimental Outcome and Discussions
This part of the work presents the experimental results and discussions. This work was executed on a workstation with an Intel i5 processor, 8 GB RAM and 2 GB VRAM in a Matlab environment. During this work, the following initial values are assigned for every DLA: epoch size = 55, iteration size = 1200, iterations per epoch = 110, updating frequency = five iterations, learning error rate = 1e-5, stopping criteria = best validation or maximum iteration.
Initially, the DLAs such as AlexNet, VGG16, VGG19, ResNet50 and ResNet101 are considered to examine and classify the dataset into normal/tumor classes using the SoftMax classifier trained and tested with the deep features. First, AlexNet is implemented to solve the classification problem, and the attained results are depicted in Figure 3. Figure 3a,b present the accuracy and the loss value attained during the training and testing, respectively. A similar procedure is repeated with the other DLAs, and the corresponding results are presented in Table 2. The results in this table confirm that the performance values attained with the VGG19 DLA are better for all the MRI modalities, such as Flair, T2 and T1C. This result confirms that, for the chosen dataset, the VGG19 offered better results compared to the alternatives, and hence, the VGG19 is considered in this work for further enhancement to attain a better accuracy using the proposed methodology.
The performance of the chosen VGG19 is then verified by replacing the SoftMax classifier with other approaches, such as DT, KNN, SVM-Linear and SVM-RBF, and the obtained results are presented in Table 3. This table confirms that the pre-trained VGG19 offered a classification accuracy of 96.70% for the Flair modality (SVM-RBF), 96.10% for the T2 modality (DT) and 94.60% for the T1C modality MRI slices. These results confirm that, if the SoftMax is replaced with a chosen classifier unit, the detection accuracy can be improved.
The performance of the traditional VGG19 is further improved by combining the deep features with the handcrafted features ($FV_h$). During this process, a serial concatenation procedure is implemented to combine the deep features of dimension 1 × 1 × 1024 with the $FV_h$ of dimension 1 × 1 × 199 to attain a new feature vector of size 1 × 1 × 1223. This feature set is then considered to train, test and validate the classifier implemented in the VGG19 network.
The sample results attained with various layers of the VGG19 are presented in Figure 4, and the area under the curve (AUC) of 98.5% attained using the VGG19 with SVM-RBF classifier is presented in Figure 5.
Table 4 shows the overall performance measures achieved after a 10-fold cross validation for the BRATS, TCIA and clinical brain MRI datasets, and averages of the performance measures are considered for the validation. The results shown in this table confirm that the proposed network helped to achieve better classification accuracy for all the considered datasets irrespective of the modalities of the test images. Figure 6 depicts the sample classification result attained with the proposed VGG19 with SVM-RBF classifier for the TCIA database. The TCIA database consists of brain MRI slices with the skull section, and hence, the average detection accuracy attained for the TCIA database is lower than that attained with the BRATS database. Figure 7 depicts the overall results attained with the customized VGG19 for the considered test images, and these results confirm that the VGG19 works well on the considered image dataset. Further, this architecture helped to attain a classification accuracy of 98.17%. This outcome confirmed that the proposed DLA is clinically significant, and in future, it can be considered to detect tumors in clinical-grade brain MRI slices. In a real-time implementation, this DLA can be used as an assisting tool for the doctor to make decisions during the brain tumor detection and treatment-planning processes.
The performance of the proposed DLA is then compared with other methods existing in the literature, and the results are presented in Table 5. These results confirm that the customized VGG19 offers enhanced outcomes for the BRATS, TCIA and clinical-level images compared to the results of existing methods. This study focused on implementing a DLA to segregate MRI slices into normal/tumor classes; after the segregation, the MRI slices with the tumor were further examined by the doctor. In future, the outcome of the proposed system along with the clinically collected data can be used to develop a computerized model to track ependymal tumor dissemination.
The future scope of the proposed research includes:
(i) Enhancing the handcrafted feature vector by considering the additional texture and shape features.
(ii) Adjusting the fully-connected and drop-out layers to improve the categorization accuracy.
(iii) Improving the feature-concatenation technique to attain better results.
(iv) Implementing the proposed VGG19 DLA to classify the tumors into low/high grade gliomas.
(v) Developing a neural-network model for ependymal tumor dissemination.

Conclusions
The main objective of the proposed research was to identify and improve a suitable deep-learning architecture which would help to achieve a better detection of brain tumors from 2D MRIs. With an experimental investigation, this work identified that the VGG19 helped to attain better results compared to AlexNet, VGG16, ResNet50 and ResNet101. The proposed work implemented the following techniques to improve the detection accuracy of the VGG19: (i) replacing the SoftMax classifier with well-known classifiers, such as decision tree, k-nearest neighbor, SVM-linear and SVM-RBF, and (ii) improving the performance of the pre-trained VGG19 by implementing a feature fusion technique. In this work, the customized VGG19 was developed using the handcrafted features of dimension 1 × 1 × 199 and deep features of dimension 1 × 1 × 1024; these features were sorted based on the PCA and fused using the serial concatenation technique. The final feature vector of size 1 × 1 × 1223 was then considered to enhance the classification accuracy. In this work, the brain images of the BRATS, TCIA and clinical datasets were considered for the examination, and the proposed VGG19 with SVM-RBF classifier helped to attain better results on Flair, T2 and T1C modality images. The performance was confirmed with ten-fold cross validation, with classification accuracies of >99%, >98% and >97% for the modalities Flair, T2 and T1C, respectively.