Lung Cancer Prediction Using Robust Machine Learning and Image Enhancement Methods on Extracted Gray-Level Co-Occurrence Matrix Features

Abstract: In the present era, cancer is the leading cause of death in both men and women worldwide, with low survival rates due to inefficient diagnostic techniques. Recently, researchers have been devising methods to improve prediction performance. In medical image processing, image enhancement can further improve prediction performance. This study aimed to improve lung cancer image quality by employing various image enhancement methods, such as image adjustment, gamma correction, contrast stretching, thresholding, and histogram equalization. We extracted gray-level co-occurrence matrix (GLCM) features from the enhanced images, and applied and optimized robust machine learning classification algorithms, such as the decision tree (DT), naïve Bayes, and support vector machine (SVM) with Gaussian, radial basis function (RBF), and polynomial kernels. Without image enhancement, the highest performance was obtained using SVM with polynomial and RBF kernels, with an accuracy of 99.89%. The image enhancement methods, such as image adjustment, contrast stretching at threshold (0.02, 0.98), and gamma correction at a gamma value of 0.9, improved the prediction performance of our analysis on 945 images provided by the Lung Cancer Alliance MRI dataset, yielding 100% accuracy and an AUC of 1.00 using SVM with RBF and polynomial kernels. The results revealed that the proposed methodology can help improve lung cancer prediction for further diagnosis and prognosis by expert radiologists, with the aim of decreasing the mortality rate.


Introduction
Every year, the American Cancer Society estimates the number of new cancer cases and deaths in the United States from population-based cancer data. For 2021, about 1,898,160 new cancer cases and 608,570 cancer deaths were projected in the United States [1]; among lung cancers, about 85% are non-small cell lung cancer (NSCLC) [2,3]. NSCLC can be treated with radiofrequency (RF) ablation and stereotactic body radiotherapy (SBRT). Lung cancer has two main types, i.e., NSCLC and small cell lung carcinoma (SCLC). The two types spread in different ways and require different treatment accordingly. NSCLC spreads slowly, whereas SCLC, which is related to smoking, propagates rapidly, forms tumors, and spreads throughout the body. Deaths from SCLC are proportional to the number of cigarettes smoked [4].
This study is specifically designed to propose automated tools that improve lung cancer prediction. Knowledge of the tumor type and stage can further improve the early detection of NSCLC. Because cancer cases are usually detected late, the five-year survival rate for NSCLC is 16%. Chemotherapy is the standard therapy for SCLC, and the response rate for SCLC patients is 60%. However, recurrence occurs within a few months, resulting in an overall survival of 6% for SCLC. Over the past few decades, the survival rates for these two types of cancer have changed.
The diagnosis of lung cancer is mainly based on findings from traditional chest radiography [5], bronchoscopy, computed tomography (CT) [6], magnetic resonance (MR) [7], positron emission tomography (PET), and biopsy. Due to late diagnosis, lung cancer prognosis remains poor, and advanced metastasis is usually present at the time of presentation. The 5-year survival rate for lung cancer confined to the lungs is approximately 54%, but only 4% for inoperable, advanced lung cancer. Treatment modalities depend on the extent and type of the cancer, and include surgery, radiotherapy, and chemotherapy. CT is a very sensitive tool for diagnosing lung cancer, assessing the extent of tumor growth, and monitoring disease progression.
The main objective of this study is to utilize different image enhancement methods to improve lung cancer prediction by improving the image quality, and to investigate which enhancement method is more robust. To the best of our knowledge, these methods have not been utilized on this dataset. We utilized image enhancement methods, such as image adjustment, gamma correction, thresholding, and contrast stretching, to improve input image quality and remove noise, and compared the results with those obtained without image enhancement. The image enhancement methods improved the prediction performance on the Lung Cancer Alliance dataset. After applying the image enhancement methods, the texture features based on GLCM were computed. Previously, researchers utilized different image enhancement methods, such as histogram equalization (HE), to improve the image appearance. For low-contrast foreground and background images, HE increases contrast and decreases intensity [8]. The image adjustment method was used by [9] to improve the resolution in a 3-dimensional glasses viewing system. Moreover, contrast adjustment was utilized by [10] to enhance MRI images of visual attenuation. Similarly, different authors [11] utilized enhancement methods, including gamma correction, to remove haze, external noise, etc. Section 1 is the introduction, Section 2 explains the material and methods utilized in our study, Section 3 describes the results and discussions, and Section 4 details the conclusion along with limitations and future recommendations.

Dataset
We utilized the public dataset [12] provided by the Lung Cancer Alliance (LCA). The LCA is a national nonprofit organization that provides patient advocacy and support. Its web-based database repository is made available to researchers. The database images are stored in the Digital Imaging and Communications in Medicine (DICOM) format. There were 76 patients, with a total of 945 images, including 377 from NSCLC and 568 from SCLC.
The same dataset is utilized and detailed in [13]. We utilized 10-fold cross-validation, which minimizes the chances of overfitting [14].

Data Augmentation
To avoid overfitting, researchers use various image data augmentation methods to increase the amount of data, comprising geometric transformations, photometric shifting, and primitive data augmentation methods. The geometric transformations include rotation, flipping, cropping, shearing, and translation.
Flipping: Vyas et al. (2018) [1] proposed the flipping method, which reflects an image around its horizontal axis, its vertical axis, or both. This method helps increase the number of images in the dataset without requiring other artificial processing techniques.
Rotation: Sifre and Mallat (2013) [2] proposed another geometric data augmentation method known as rotation, which rotates the image around its axis to the right or left by an angle between 1° and 359°.
Shearing: Vyas et al. (2018) [1] proposed the shearing method, which shifts the original image along the x and y directions. In this case, the shape of the existing object in the image is changed.
Cropping: Image cropping, also known as scaling or zooming, was utilized by Sifre and Mallat (2013) [2]. It magnifies the image using cropping.
Translation: Using the translation method [1], the object is moved from one position to another in the image. The region left empty after translation is filled with black or white, which preserves the image data, or it is filled with Gaussian noise or random values. The translation can take place in the X direction, the Y direction, or both.
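As a rough illustration of the geometric transformations described above, the following sketch implements flipping and translation on a small image stored as a list of rows. It is a minimal example under stated assumptions, not the augmentation pipeline used in this study: the function names, the zero fill value, and the toy image are all hypothetical.

```python
def flip_horizontal(image):
    """Reflect the image left to right (around its vertical axis)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reflect the image top to bottom (around its horizontal axis)."""
    return [row[:] for row in image[::-1]]


def translate(image, dx, dy, fill=0):
    """Shift the image by dx columns and dy rows.

    Pixels moved outside the frame are dropped, and the uncovered
    region is filled with `fill` (black by default, an assumption).
    """
    rows, cols = len(image), len(image[0])
    shifted = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                shifted[r2][c2] = image[r][c]
    return shifted


# Hypothetical 2 x 3 toy image.
image = [[1, 2, 3],
         [4, 5, 6]]
mirrored = flip_horizontal(image)
upside_down = flip_vertical(image)
moved_right = translate(image, dx=1, dy=0)
```

Each function returns a new image and leaves the input unchanged, so several augmented copies can be generated from one original.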

Image Enhancement
For machine learning, feature extraction is utilized to extract the most relevant information from images, which is then fed as input to different machine learning algorithms. However, applying image enhancement methods before computing the features can further improve the prediction performance. Good-quality images with improved visual effects can also be helpful for diagnostic systems. In this study, we utilized different image enhancement methods to improve the image quality before computing features, and fed the features to machine learning algorithms.
Figure 1 shows the workflow for predicting lung cancer. First, we applied image enhancement methods, such as gamma correction at various gamma values, image adjustment, thresholding, contrast stretching, and histogram equalization. We then computed the GLCM texture features, and fed these features into machine learning classification algorithms.

Image Adjustment
With image adjustment, a grayscale image is converted into a new image. The output image is adjusted relative to the high and low intensities of the input image. The contrast of the grayscale image is enhanced in this way, and the result is used for further diagnostic procedures [15].
Here, the lower range is denoted by 'l', and the higher range is represented by 'h'; 'n' denotes the input, and 'o' is the output, where c and d denote prevailing lower and higher pixel values, respectively.
The image quality is governed by gamma, which controls how bright the output is: a value of gamma > 1 lowers the output intensities, so the image becomes darker [16]. A medical diagnostic system using contrast enhancement was designed by [17].
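The adjustment equation itself did not survive in this text, so the sketch below uses a commonly used form of intensity adjustment as an assumption: input values are clipped to a range, normalized, raised to the power gamma, and rescaled to an output range. The function name, parameter names, and the toy image are hypothetical and are not taken from the study.

```python
def adjust_intensity(image, low_in, high_in,
                     low_out=0.0, high_out=1.0, gamma=1.0):
    """Map intensities from [low_in, high_in] to [low_out, high_out].

    `image` is a list of rows of floats in [0, 1]. Values outside the
    input range are clipped first. gamma > 1 darkens the result and
    gamma < 1 brightens it.
    """
    if high_in <= low_in:
        raise ValueError("high_in must be greater than low_in")
    span_in = high_in - low_in
    span_out = high_out - low_out
    adjusted = []
    for row in image:
        new_row = []
        for value in row:
            clipped = min(max(value, low_in), high_in)
            normalized = (clipped - low_in) / span_in
            new_row.append(low_out + span_out * normalized ** gamma)
        adjusted.append(new_row)
    return adjusted


# Hypothetical low-contrast image occupying only [0.2, 0.8].
image = [[0.2, 0.4],
         [0.6, 0.8]]
stretched = adjust_intensity(image, low_in=0.2, high_in=0.8)
darker = adjust_intensity(image, low_in=0.2, high_in=0.8, gamma=2.0)
```

With gamma = 1 the mapping is linear, so the example image is spread over the full [0, 1] range; with gamma = 2 the mid-range values fall below their linear counterparts.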

Gamma Correction
This method is a nonlinear operation for encoding and decoding brightness in a way that matches human visual perception. The transformation is mathematically represented as follows:

$$T(l) = l_{\max}\left(\frac{l}{l_{\max}}\right)^{\gamma} \qquad (1)$$

Here, $l$ denotes the intensity level, and $l_{\max}$ denotes the maximum intensity level of the image [18]. Most image-capturing devices do not record exact luminance and introduce nonlinearity; thus, gamma correction is required to improve the luminance. The power function in Equation (1) compensates for this nonlinearity to properly reproduce the actual luminance. The gamma transformation curves for different values of gamma are plotted in Figure 1a. A value of γ < 1 raises the brightness, whereas a value of γ > 1 darkens the image. For γ = 1, the gamma correction curve reduces to the identity and, hence, does not alter the image. The luminance dynamics for lower gray-levels greatly increase, whereas the dynamics for moderate gray-levels slightly increase, and the dynamic ranges for higher gray-levels are compressed. In this study, we applied the gamma values γ = [0.03, 0.4, 0.5, 0.1, 1, 2, 4] according to the scientific requirements [19][20][21].
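The effect of gamma on brightness can be checked with a short sketch. It assumes the usual power-law form of gamma correction, in which each intensity is normalized by the maximum level, raised to the power gamma, and rescaled; the function name, the 8-bit maximum of 255, and the sample pixel values are assumptions made for illustration.

```python
def gamma_correct(image, gamma, max_level=255):
    """Apply power-law gamma correction to a list of rows of intensities.

    Each pixel p becomes max_level * (p / max_level) ** gamma.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [[max_level * (p / max_level) ** gamma for p in row]
            for row in image]


# Hypothetical single-row image with dark, mid, and bright pixels.
image = [[0, 64, 128, 255]]
brighter = gamma_correct(image, 0.5)   # gamma < 1 raises mid-tones
darker = gamma_correct(image, 2.0)     # gamma > 1 lowers mid-tones
unchanged = gamma_correct(image, 1.0)  # gamma = 1 is the identity
```

The end points 0 and the maximum level are fixed for every gamma, so only the distribution of intermediate intensities changes.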

Contrast Stretching
In image enhancement, contrast stretching is also known as normalization. The image quality is enhanced by stretching the range of intensity values. To enhance the images, we specified the lower and upper pixel value limits as (0.02, 0.98) and (0.05, 0.95) by considering the lowest and highest pixel values in the images. Let a be the lower limit and b the upper limit, and let the existing lowest and highest pixel values be denoted by c and d, respectively. Then, each pixel, p, is scaled according to the following equation:

$$p_{\text{out}} = (p - c)\left(\frac{b - a}{d - c}\right) + a$$

Outliers with very low or very high values can severely affect the value of c or d. To avoid this, a histogram of the image is computed first, and c and d are then selected as the 5th and 95th percentiles of the histogram.
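The scaling step with percentile-based limits can be sketched as follows. This is an illustrative version under assumptions: the percentile lookup uses a simple nearest-index rule on the sorted pixel values, the output range defaults to 0 to 255, and all names and the toy image are hypothetical.

```python
def percentile_value(sorted_values, fraction):
    """Pick the value at the given fraction (0 to 1) of a sorted list."""
    index = round(fraction * (len(sorted_values) - 1))
    return sorted_values[index]


def stretch_contrast(image, low_frac=0.02, high_frac=0.98, a=0.0, b=255.0):
    """Linearly rescale pixels so that [c, d] maps onto [a, b].

    c and d are taken at the low_frac and high_frac percentiles, which
    limits the influence of outliers. Results are clamped to [a, b].
    """
    flat = sorted(p for row in image for p in row)
    c = percentile_value(flat, low_frac)
    d = percentile_value(flat, high_frac)
    if d == c:
        # A constant image has no contrast to stretch.
        return [row[:] for row in image]
    return [[min(max((p - c) / (d - c) * (b - a) + a, a), b) for p in row]
            for row in image]


# Hypothetical image that uses only the 50 to 150 part of the range.
image = [[50, 75],
         [100, 150]]
stretched = stretch_contrast(image)
```

For an image this small the 2nd and 98th percentiles coincide with the minimum and maximum, so the example reduces to plain min-max stretching.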

Thresholding
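In the first step, Otsu's thresholding technique [22] is used to remove the noisy part of the image. In 1975, Otsu proposed this method to determine the optimum threshold. The Otsu value is based on discriminant analysis, which maximizes the separability of the classes in gray-level images [23]. Otsu defined the between-class variance as the sum of the sigma functions of the classes, according to the equations below:

$$\sigma_B^2(t) = \sigma_0 + \sigma_1, \qquad \sigma_0 = \omega_0(\mu_0 - \mu_T)^2, \qquad \sigma_1 = \omega_1(\mu_1 - \mu_T)^2$$

The input image mean intensity is denoted by $\mu_T$, and $\omega_i$ is the probability of class $i$. For bi-level thresholding at threshold $t$ on an image with $L$ gray-levels and gray-level probabilities $p_j$, the mean level ($\mu_i$) of the two classes is obtained using the equations below [24]:

$$\mu_0 = \sum_{j=0}^{t-1} \frac{j\,p_j}{\omega_0}, \qquad \mu_1 = \sum_{j=t}^{L-1} \frac{j\,p_j}{\omega_1}$$

The optimal threshold value, which maximizes the between-class variance function, can be computed using the following:

$$t^{*} = \arg\max_{0 < t < L} \sigma_B^2(t)$$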
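A compact way to see how the optimal threshold is found is to scan every candidate threshold and keep the one with the largest between-class variance. The sketch below follows the standard histogram-based formulation of Otsu's method; the convention that class 0 holds the levels below t and class 1 the levels from t upward, as well as the function name and the sample data, are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    `pixels` is a flat list of integer gray-levels in [0, levels).
    Class 0 contains levels 0..t-1 and class 1 contains levels t..levels-1.
    """
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))
    mean_total = total_sum / total

    best_t, best_variance = 0, -1.0
    count0, sum0 = 0, 0  # running pixel count and intensity sum of class 0
    for t in range(1, levels):
        count0 += histogram[t - 1]
        sum0 += (t - 1) * histogram[t - 1]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so no separation is possible
        w0, w1 = count0 / total, count1 / total
        mu0, mu1 = sum0 / count0, (total_sum - sum0) / count1
        variance = w0 * (mu0 - mean_total) ** 2 + w1 * (mu1 - mean_total) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t


# Hypothetical bimodal data: a dark cluster and a bright cluster.
pixels = [10] * 50 + [12] * 50 + [200] * 60 + [210] * 40
threshold = otsu_threshold(pixels)
foreground = [p for p in pixels if p >= threshold]
```

For clearly bimodal data such as this, the selected threshold falls between the two clusters, so thresholding separates them exactly.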

Feature Extraction
In medical image diagnostic systems, the next important step is to compute the most relevant features, a task to which researchers are paying much attention [17]. In medical imaging problems, the data are collected without sacrificing the result quality [25][26][27][28][29][30][31]. To capture the most relevant properties, we computed GLCM features.

Gray-Level Co-Occurrence Matrix (GLCM)
In machine learning (ML) techniques, the most important step is to compute the most relevant features, which capture the maximal hidden information present in the data of interest. The GLCM features are computed from an input image by counting the transitions between the gray-levels of pixel pairs. GLCM features were originally proposed by [32] in 1973 and characterize texture using diverse quantities derived from second-order image statistics. Two steps are required to compute the GLCM features. First, the pair-wise spatial co-occurrences of image pixels separated by a distance, d, and a direction angle, θ, are counted; this defines a spatial relationship between a reference pixel and its neighboring pixel. In the second step, the GLCM features are computed as scalar quantities, each of which represents a different aspect of the image. The co-occurrence step produces the gray-level co-occurrence matrix, which contains the gray-level pixel combinations found in the image of interest or in a specific portion of it [32]. The resultant GLCM is of size M × M, where M denotes the number of gray-levels in the image. To compute the GLCM, we utilized the distances d = {1, 2, 3, 4} and the direction angles θ = {0°, 45°, 90°, 135°}. The probability P(i, j, d, θ) denotes the probability that two pixels separated by distance d in direction θ have gray-levels i and j [33][34][35]. The GLCM-based texture features were composed of contrast, sum of square variance, cluster shade [36], correlation [37], and two values of homogeneity [37][38][39]. GLCM features have been successfully utilized in the classification of breast tissues [40] and in many other medical imaging problems [36,[41][42][43], as detailed in [44][45][46].
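The two steps can be made concrete with a small sketch that builds a normalized co-occurrence matrix for one distance and direction and then derives two scalar features from it. Several choices here are assumptions rather than details confirmed by the text: pairs are counted in one direction only (a non-symmetric matrix), rows are taken to increase downward when converting an angle to a pixel offset, and homogeneity uses the 1 + |i − j| weighting. All names and the toy image are hypothetical.

```python
# Pixel offsets (dx, dy) per unit distance, with rows increasing downward.
DIRECTIONS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}


def glcm(image, distance, angle, levels):
    """Normalized co-occurrence matrix P(i, j) for one distance and angle."""
    step_x, step_y = DIRECTIONS[angle]
    dx, dy = step_x * distance, step_y * distance
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset is larger than the image")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * P(i, j): large for strong local variation."""
    return sum((i - j) ** 2 * value
               for i, row in enumerate(p) for j, value in enumerate(row))


def homogeneity(p):
    """Sum of P(i, j) / (1 + |i - j|): large when pairs are similar."""
    return sum(value / (1 + abs(i - j))
               for i, row in enumerate(p) for j, value in enumerate(row))


# Hypothetical 4 x 4 image with four gray-levels.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image, distance=1, angle=0, levels=4)
features = (contrast(p), homogeneity(p))
```

Repeating the computation for each distance and angle listed above and concatenating the resulting scalars gives one feature vector per image.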

Classification
After extracting the features, the next important step is to choose and train the classification algorithm. We applied and optimized the SVM with polynomial, radial basis function (RBF), and Gaussian kernels, as well as naïve Bayes and the decision tree. Vladimir Vapnik proposed the SVM for classification problems in 1979. The SVM has recently gained much popularity and is widely used in large-margin classification problems, including medical diagnosis [47,48], machine learning [49], and pattern recognition [50,51]. The SVM has also been successfully used in many other applications, such as signature and text recognition, facial expression recognition, speech recognition, biometrics, emotion recognition, and several content-based applications, as detailed in [52][53][54]. Naïve Bayes belongs to the family of probabilistic networks based on Bayes' theorem. Owing to its performance, naïve Bayes is used in many applications, as detailed in [55][56][57][58]. Moreover, decision tree algorithms have been used in many medical, economic, and scientific applications [59].

Support Vector Machine (SVM)
In classification problems, the SVM is utilized as a supervised ML algorithm. It is used in many applications, including medical systems [47,48], pattern recognition [60], and machine learning [61]. Recently, the SVM has been utilized in speech recognition, text recognition, biometrics, and image retrieval problems. To separate nonlinear problems, the SVM constructs a hyperplane with the largest margin. A good margin produces the lowest generalization error.
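The kernels named in this study determine how similarity between two feature vectors is measured before the maximum-margin hyperplane is fitted. The sketch below shows the standard RBF and polynomial kernel functions; the parameter values (gamma, degree, coef0) are placeholders, since the text does not state the settings that were used, and the feature vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


# Hypothetical two-dimensional feature vectors.
u = [1.0, 2.0]
v = [3.0, 4.0]
identical = rbf_kernel(u, u)       # identical vectors give the maximum, 1
distant = rbf_kernel(u, v)         # similarity decays with distance
poly = polynomial_kernel(u, v)     # (1*3 + 2*4 + 1) ** 2
```

The RBF value depends only on the distance between the vectors, whereas the polynomial value depends on their inner product, which is why the two kernels can produce different decision boundaries on the same features.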

Decision Tree (DTs)
Breiman proposed the DT algorithm in 1984 [62]. It serves as a learning algorithm, decision support tool, or predictive model that handles large input data and predicts the class label or target from numerous input variables. DT classifiers check and compare similarities in the dataset and assign the samples to distinct classes. The DT algorithm was utilized by [63] to classify data based on a choice of attributes that maximizes the data division.

Naïve Bayes (NB)
The NB algorithm was introduced by Wallace and Mosteller in 1963 and belongs to the family of probabilistic classifiers. In 1960, this algorithm was used for clustering problems. Due to large computational errors, the NB methods are biased. This problem can be minimized by reducing the errors in the probability estimates; however, this may not guarantee a reduction in classification errors. The poor performance arises from the bias-variance decomposition between the probability estimation performance and the classification errors [64]. Recently, NB has been used in many applications, as detailed in [55][56][57][58], due to its good performance [65].

Training/Testing Data Formulation
The jack-knife 10-fold cross-validation method was utilized for training/testing data validation. This is one of the most widely used validation methods for small datasets, where it is a standard approach to avoid overfitting [14]. During this process, all of the data are used for both training and testing. The data are initially divided into 10 folds; in each round, 9 folds participate in training, and the remaining fold is used for prediction with the trained model. The entire process is repeated 10 times so that every sample is predicted once. The labels predicted for the unseen samples are finally utilized to determine the classification accuracy. In our case, there were a total of 945 images from both classes. For a larger dataset, the holdout method is usually preferred. For model tuning, the dataset is split into multiple train-test bins. In the standard iterative process, k-1 folds are involved in training the model, and the remaining fold is used for model testing. The general k-fold cross-validation method is reflected in Figure 2.
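The fold bookkeeping described above can be sketched in a few lines. This is an illustrative split generator, not the procedure used in the study: the shuffling seed and the function name are assumptions, and the split is neither stratified by class nor grouped by patient, because the text does not specify either.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i
                 for idx in fold]
        yield train, test


# 945 images, as in the dataset described in this study.
splits = list(k_fold_indices(945, k=10))
fold_sizes = [len(test) for _, test in splits]
```

Every image appears in exactly one test fold, so collecting the predictions across the 10 rounds gives one out-of-sample prediction per image.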

Results and Discussions
This study was specifically conducted to improve lung cancer prediction performance by optimizing feature extraction and machine learning methods. We first extracted the 22 gray-level co-occurrence matrix (GLCM) features, and applied robust ML algorithms to distinguish NSCLC from SCLC. We then used an image enhancement method, i.e., image adjustment, before extracting the GLCM features, and then applied the machine learning algorithms. The results reveal that image enhancement improved the prediction performance of all the classification algorithms.
Figure 3a reflects the texture features extracted from NSCLC and SCLC MRI images. We utilized the supervised machine learning algorithms. The SVM Gaussian and RBF

Figure 7 reflects the AUC separation at folds 1 to 10 for each of the classification algorithms. The maximum AUC was reached at all folds for SVM Gaussian, RBF, polynomial, and decision tree, whereas naïve Bayes yielded a slightly lower AUC for folds 1 to 10 using gamma correction at a gamma value of 0.04.
Figure 8 reflects the lung cancer prediction performance of the decision tree model, obtained by extracting GLCM-based texture features and applying gamma correction with gamma = 0.04 and 0.5 under 10-fold cross-validation. The corresponding prediction plots with mean ± standard deviation are reflected in Figure 9. Using gamma correction at gamma = 0.04, 318 NSCLC examples were predicted as TP, and 38 were FP. Likewise, for SCLC, 552 were predicted as TN, and 11 were predicted as FN. Using gamma correction at gamma = 0.5, NSCLC yielded 349 TP and 7 FP, whereas SCLC yielded 559 TN and 4 FN. In Figure 8, the blue color represents NSCLC, and the red color represents SCLC. The solid line represents correct predictions, and the bold line with crosses represents incorrect predictions. As the confusion matrix indicates, using gamma = 0.04, 38 examples were misclassified as FP, i.e., NSCLC misclassified as SCLC, and 11 examples were misclassified as FN, i.e., SCLC misclassified as NSCLC. In the case of gamma = 0.5, only seven NSCLC examples were misclassified as SCLC, and only four SCLC examples were misclassified as NSCLC, as reflected in Figure 9a.
Table 1 reflects the results based on data augmentation methods to avoid overfitting. We utilized the augmentation settings 'RandRotation' = (−20, 20), 'RandXReflection' = 1, 'RandYReflection' = 1, 'RandXTranslation' = (−3, 3), and 'RandYTranslation' = (−3, 3) on GLCM features extracted from lung cancer SCLC and NSCLC, and then applied the image enhancement method gamma correction with gamma = 0.5. Using data augmentation, we obtained mostly similar results to those obtained without data augmentation. The highest accuracy was yielded by SVM Gaussian, with an accuracy of 100% and an AUC of 1.0; followed by SVM RBF, with an accuracy of 99.89% and an AUC of 1.0; SVM polynomial, with an accuracy of 99.89% and an AUC of 0.9999; decision tree, with an accuracy of 98.91% and an AUC of 0.98; and naïve Bayes, with an accuracy of 98.91% and an AUC of 0.98.
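The decision tree counts reported above for gamma = 0.04 and gamma = 0.5 can be converted into rates with a few lines of arithmetic. The sketch below only restates those counts as the fraction of correctly classified examples overall and per class; the function and key names are hypothetical, and no figures beyond the reported counts are introduced.

```python
def summarize(correct_nsclc, wrong_nsclc, correct_sclc, wrong_sclc):
    """Overall and per-class fraction of correctly classified examples."""
    total = correct_nsclc + wrong_nsclc + correct_sclc + wrong_sclc
    return {
        "total": total,
        "accuracy": (correct_nsclc + correct_sclc) / total,
        "nsclc_correct_rate": correct_nsclc / (correct_nsclc + wrong_nsclc),
        "sclc_correct_rate": correct_sclc / (correct_sclc + wrong_sclc),
    }


# Counts as reported for the decision tree with gamma correction.
gamma_004 = summarize(318, 38, 552, 11)
gamma_05 = summarize(349, 7, 559, 4)
improvement = gamma_05["accuracy"] - gamma_004["accuracy"]
```

Expressed this way, the counts show that the change from gamma = 0.04 to gamma = 0.5 mainly reduces the number of misclassified NSCLC examples.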
Table 2 reflects the lung cancer detection results using different feature extraction and classification methods. In the past, researchers utilized different automated approaches to detect lung cancer. Sousa et al. [69] applied different feature extraction methods, such as gradient, histogram, and spatial methods, and obtained an overall accuracy of 95%. Dandil et al. [6] also extracted different features, such as GLCM, shape-based features, and statistical and energy features, and achieved an accuracy of 95%, with a sensitivity of 97% and a specificity of 96%. Nasrulla et al. [70] computed statistical features and obtained a sensitivity of 94%, a specificity of 90%, and an AUC of 0.990. Han et al. [71] used machine learning techniques to distinguish the SCLC types, and achieved an accuracy of 84.10%. Grossman et al. [72] applied the EfficientNet deep learning model, and obtained a highest accuracy of 90%. Hussain et al. [13] computed different entropy-based features and the nonlinear dynamics to distinguish SCLC from NSCLC, with highly significant results (p-value < 0.000000). In this study, we first applied image enhancement methods, such as gamma correction at different gamma values, contrast stretching at different thresholds, image adjustment, and histogram processing methods, and then computed the GLCM texture features. We obtained improved detection results. The researchers in the existing studies did not utilize image enhancement methods and data augmentation techniques. Applying image enhancement methods to the acquired data further improves the image quality, thereby improving the detection results. The proposed methods improved the classification results, which can be utilized by health practitioners to further improve diagnostic capabilities.
For contrast stretching, we utilized different threshold ranges, i.e., (0.1, 0.90), (0.02, 0.98), and (0.05, 0.95). The ranges (0.02, 0.98) and (0.05, 0.95) yielded the largest improvement in detection performance, which indicates that these ranges are more appropriate for enhancing the image quality of this lung cancer dataset. The threshold range (0.02, 0.98) yielded further improved performance for the selected classifiers. Likewise, the image adjustment enhancement method yielded higher detection performance for all the classifiers, similar to the contrast stretching range (0.02, 0.98). For gamma correction, we set different values of gamma, such as 0.04, 0.4, 0.5, 0.7, 0.9, and 4.0. The lower and higher gamma values reduced the detection performance, whereas the mid-range gamma values yielded a higher detection performance. The gamma value of 0.9 yielded the highest detection performance for all the classifiers, and the decision tree and naïve Bayes algorithms yielded a higher detection performance with it than with the contrast stretching and image adjustment methods. The gray-level thresholding and histogram equalization methods did not enhance the detection performance much.

Conclusions
Lung cancer is the most threatening cancer type in the world. It is the most common and leading cause of cancer deaths internationally. The incidence of cancer-related deaths has multiplied unexpectedly, and lung cancer has become the most prevalent cancer in the majority of countries. This study was conducted to distinguish between NSCLC and SCLC by first extracting hand-crafted texture features and employing supervised machine learning algorithms, such as naïve Bayes, decision tree, and SVM with RBF, Gaussian, and polynomial kernels. We then applied different image enhancement methods, such as image adjustment, contrast stretching, thresholding, and gamma correction, before computing texture features, and fed the features into the machine learning algorithms. The image enhancement methods further improved the lung cancer detection performance. In order to avoid overfitting, we also applied data augmentation methods. The results revealed that the proposed methods are robust and can support the further diagnosis and prognosis of lung cancer by expert radiologists.
Limitations and Future Recommendations: The present study was carried out on a small lung cancer dataset provided by the Lung Cancer Alliance covering two lung cancer types, i.e., NSCLC and SCLC. In the future, we will apply the proposed methods, based on image enhancement, feature extraction and ranking, and machine learning, to larger datasets with more clinical details, disease severity levels, and cancer types, and to larger datasets acquired with different imaging modalities.

Figure 1. Schematic diagram for lung cancer detection based on image enhancement methods on GLCM texture features by utilizing machine learning techniques.

Figure 4. Area under the receiver operating characteristic (AUC) curve: (a) texture with image adjustment, (b) gamma correction at gamma value of 0.04 to distinguish the NSCLC from SCLC.

Appl. Sci. 2022, 12.


Figure 9. Parallel prediction plots: (a) gamma correction at gamma = 0.04 on predictions, (b) gamma correction with gamma = 0.5 on predictions, (c) gamma correction with gamma = 0.04 data using 1 std.

Table 1. Lung cancer detection performance based on data augmentation with gamma correction (gamma = 0.5), and applying the ML methods.

Table 2. Comparison of findings with previous studies.