Breast Density Classification Using Local Quinary Patterns with Various Neighbourhood Topologies

: This paper presents an extension of work from our previous study by investigating the use of Local Quinary Patterns (LQP) for breast density classiﬁcation in mammograms on various neighbourhood topologies. The LQP operators are used to capture the texture characteristics of the ﬁbro-glandular disk region ( FGD roi ) instead of the whole breast area as the majority of current studies have done. We take a multiresolution and multi-orientation approach, investigate the effects of various neighbourhood topologies and select dominant patterns to maximise texture information. Subsequently, the Support Vector Machine classiﬁer is used to perform the classiﬁcation, and a stratiﬁed ten-fold cross-validation scheme is employed to evaluate the performance of the method. The proposed method produced competitive results up to 86.13% and 82.02% accuracy based on 322 and 206 mammograms taken from the Mammographic Image Analysis Society (MIAS) and InBreast datasets, which is comparable with the state-of-the-art in the literature.


Introduction
In 2014, there were more than 55,000 malignant breast cancer cases diagnosed in the United Kingdom (UK), with more than 11,000 mortalities [1].In the United States (US), it was estimated that more than 246,000 malignant breast cancer cases were diagnosed in 2016, with approximately 16% of women expected to die [2].Many studies have indicated that breast density is a strong risk factor for developing breast cancer [3][4][5][6][7][8][9][10][11][12] because breast cancer has a very similar appearance to dense tissues, which makes it difficult to detect in mammograms.Keller et al. [13]investigate the associations between breast cancer and the Laboratory for Individualized Breast Radiodensity Assessment (LIBRA) tool.Their study found that there is a significant association between breast cancer and the Gail risk factor plus Body Mass Index (BMI).
Therefore, an accurate breast density estimation is an important step during the screening procedure because women with dense breasts can be six times more likely to develop breast cancer [1,2].Although most experienced radiologists can do this task, manual density classification is impractical, tiring, time-consuming and often results vary between radiologists.Computer Aided Diagnosis (CAD) systems can reduce these problems providing robust, reliable, fast and consistent diagnosis results.Currently, in the US, breast density is visually assessed by radiologists who classify density according to the fourth edition of Breast Imaging-Reporting and Data System (BI-RADS) into four classes based on the following criteria: (a) BI-RADS I (0-25% dense tissues, predominantly fat); (b) BI-RADS II The remainder of the paper is organised as follows: Section 2 summarises related work in the literature; we present the technical details of the proposed method in Section 3 and present experimental results in Section 4 including the quantitative evaluation and comparisons with existing studies in the literature.Finally, we discuss and present possible future work in Section 5 and conclude the study in Section 6.

Literature Review
One of the earliest approaches to breast density assessment was a study by Boyd et al. [16] using interactive thresholding known as Cumulus, where regions with dense tissue were segmented by manually tuning the grey-level threshold value.The most popular approaches are based on the first and second-order (e.g., Grey Level Co-occurrence Matrix) statistical features as used by Oliver et al. [3], Bovis and Singh [4], Muštra et al. [17] and Parthaláin et al. [6].Texture descriptors such as local binary patterns (LBP) were employed in the study of Chen et al. [7] and Bosch et al. [8] and Textons were used by Chen et al. [7], Bosch et al. [8] and Petroudi et al. [12].Other texture descriptors also have been evaluated such as fractal-based [5,10], topography-based [9], morphology-based [3] and transform-based features (e.g., Fourier, Discrete Wavelet and Scale-invariant feature) [4,8].
There are many breast density classification methods in the literature, but only a few studies have achieved accuracies above 80%.The methods of Oliver et al. [3] and Parthaláin et al. [6] extract a set of features from dense and fatty tissue regions segmented using a fuzzy c-means clustering technique for input into the classifier.Oliver et al. [3] achieved 86% accuracy, and Parthaláin et al. [6], who used a sophisticated feature selection framework, achieved 91.4% accuracy based on the Mammographic Image Analysis Society (MIAS) database [18].Bovis and Singh [4] produced 71.4% accuracy based on a combined classifier paradigm in conjunction with a combination of the Fourier and Discrete Wavelet Transforms, and first and second-order statistical features.Chen et al. [7] made a comparative study on the performance of local binary patterns (LBP), local grey-level appearance (LGA), Textons and basic image features (BIF) and reported accuracies of 59%, 72%, 75% and 70%, respectively.Later, they proposed a method by modelling the distribution of the dense region in topographic representations and reported a slightly higher accuracy of 76%.Petroudi et al. [12] implemented the Textons approach based on the Maximum Response 8 (MR8) filter bank.The χ 2 distribution was used to compare each of the resulting histograms from the training set with all the learned histogram models from the training set and reported 75.5% accuracy.He et al. [11] achieved an accuracy of 70% using the relative proportions of the four Tabár building blocks.Muštra et al. [17] captured the characteristics of the breast region using multi-resolution of first and second-order statistical features and reported 79.3% accuracy.
Tamrakar and Ahuja [19] investigate the study of two-class classification (benign versus malignant) patch-based density problem by considering each of the BI-RADS classes.They investigated various texture descriptors and proposed a new feature extraction method called Histogram of Orientated Texture to achieve more than 92% classification accuracy.Ergin and Kilinc [20] used the Histogram of Oriented Gradients (HOG), Dense Scale Invariant Feature Transform (DSIFT), and Local Configuration Pattern (LCP) methods to extract the rotation-and scale invariant features for all tissue patches to achieve above 90% accuracy.Gedik [21] used fast finite Shearlet transform as a feature extraction technique and used t-test statistics as a feature selection procedure.The author reported an average of more than 97% accuracy for two-class classification across two different datasets.
With the advent of deep learning techniques achieving performance similar to human readers [22], they are gaining attention from computer scientists across different research fields.In medical image analysis, deep learning techniques have been used in segmentation and classification problems related to the brain, eye, chest, digital pathology, breast, cardiac, abdomen and musculoskeletal imaging [23].In the application of breast imaging, particularly in mammography, Kallenberg et al. [24] showed the potential of unsupervised deep learning methods applied to breast density segmentation and mammographic risk scoring on three different clinical datasets, and a strong positive correlation was found compared with manual annotations from expert radiologists.In a study of Ahn et al. [25], a Convolutional Neural Network (CNN) was used to learn the characteristics of dense and fatty tissues from 297 mammograms.Based on the 100 mammograms used as a testing dataset, the authors reported the correlation coefficient between the breast estimates by the CNN and those by the expert's manual measurement to be 0.96.Arevalo et al. [26] proposed a hybrid CNN method to segment breast mass and reported 0.83 in the area under the curve (AUC) value.Qui et al. [27] proposed three pairs of convolution-max-pooling layers that contain 20, 10, and five feature maps to predict short-term breast cancer risk and achieved 71.4%.Using a deeper network approach, Cheng et al. [28] proposed an eight multi-layer deep learning algorithm to employ three pairs of convolution-max-pooling layers for mammographic masses classification and reported AUC of 0.81.In a similar problem domain (mass classification), Jiao et al. [29] used CNN and a decision mechanism along with intensity information and reported an average of 95% accuracy on different sizes of datasets.
Based on the results reported in the literature, the majority of the proposed methods has used traditional machine learning algorithms tested on 4-class breast density classification and achieved below 80% accuracy.This indicates that breast density classification is a challenging task due to the complexity of tissue appearance in the mammograms such as wide variations in texture and ambiguous texture patterns within the breast region.Although numerical results from deep learning based methods are quite promising, none of these studies has attempted the 4-class density classification problem.In fact, the majority of the deep learning based methods are based on 2-class classification problems (e.g., malignant versus benign, and fatty versus dense tissues).In this paper, texture features were extracted from the fibro glandular region (FGD roi ) only instead of from the whole breast region in order to obtain more specific descriptive information.The motivation for this approach is that, in most cases, the non-FGD roi contains mostly fatty tissues regardless of its BI-RADS class, and most dense tissues are located and start to develop within the FGD roi .Therefore, extracting features from the whole breast region means obtaining multiclass texture information, which makes the extracted features less discriminant for the BI-RADS classes.

Methodology
Figure 2 shows an overview of the proposed methodology.Firstly, we segment the breast area and estimate the FGD roi area.Subsequently, we use a simple median filter using a 3 × 3 sliding window for noise reduction and employ the multiresolution LQP operators with different neighbourhood topologies and orientations to capture the microstructure information within the FGD roi .For dimensionality reduction, we select only dominant patterns in the feature space to remove redundant or unnecessary information.Finally, we train the Support Vector Machine (SVM) classifier to build a predictive model and use it to test each unseen case.

Pre-Processing
To segment the breast and pectoral muscle region, we used the method in [30], which is based on Active Contours without edges for the breast boundary estimation and contour growing with edge information for the pectoral muscle boundary estimation.The leftmost image in Figure 3 shows the estimated FGD roi area.To extract FGD roi , we find B w , which is the longest perpendicular distance between the y-axis and the breast boundary (magenta line in Figure 3).The width and the height of the square area of the FGD roi (amber line Figure 3) can be computed as B w × B w with the centre located at the intersection point between B h and B w lines.B h is the height of the breast, which is the longest perpendicular distance between the x-axis and the breast boundary.B h is then relocated to the middle of B w to get the intersection point.The size of the FGD roi varies depending on the width of the breast.Figure 3 shows examples of segmentation results using our method in [30].

Feature Extraction
In the feature extraction stage, we used the LQP mathematical operators to extract local pattern information within the FGD roi , which is a variant of Local Binary Pattern (LBP).The LBP operators were first proposed by Ojala et al. [31,32] to encode pixel-wise information.The value of the LBP code of the pixel (i, j) is given by: where R is the radius of the circle that form the neighbourhood, P is the number of pixels in the neighbourhood, g c is the grey level value of the centre pixel, and g p is the grey level value of the p th neighbour.Later, Tan and Triggs [33] modified the approach by introducing Local Ternary Pattern (LTP) operators, which threshold the neighbouring pixels using a three-value encoding system based on a constant threshold set by the user.In our previous study [34], experimental results suggest that the LTP operators using a combination of uniform and nonuniform mapping patterns ('riu2 ) produced robust texture descriptors that can be used for density classification.Once the LTP code is generated, it is split into two binary patterns by considering its positive and negative components: where τ 1 is a threshold value set by the user.Nanni et al. [35] introduced a five-value encoding system called LQP.The LBP, LTP and LQP are similar in terms of architecture as each is defined using a circle centred on each pixel and the number of neighbours.The main difference is that the LBP, LTP and LQP threshold the neighbouring pixels into two (1 and 0), three (−1, 0 and 1) and five (2, 1, 0, −1 and −2) values, respectively.This means that, for LQP, the difference between the grey-level value of the centre pixel (g c ) and a neighbour's grey level (g p ) can assume five values, whereas LBP and LTP consider two and three values, respectively.The value of LQP code of the pixel (i, j) is given by: where τ 1 and τ 2 are threshold values set by the user, and pattern ∈ {1, 2, 3, 4}.Once the LQP code is generated, it is split into four binary patterns by considering its positive, zero and negative components, as illustrated in Figure 4, using the following conditions: In this case, feature extraction is the process of computing the frequencies of all binary patterns and presenting the occurrences in a histogram, which represents the number of appearances of edges, corners, spots, and lines within the FGD roi .This means that the size of the histogram is 2 P .To enrich texture information, we compute texture information at different resolutions, which can be achieved by concatenating histogram features extracted using different values of R and P in the LQP operators.In this paper, resolution is controlled by the radius of the circle (e.g., different window sizes) and a different number of neighbours.Figure 4 shows an example of converting neighbouring pixels to an LQP code and binary code, resulting in four binary patterns.In the LQP code, there are five numerical values generated based on Equation ( 5), whereas binary patterns 1, 2, 3 and 4 are generated based on Equations ( 6)-( 9), respectively.Figure 6 shows four images of binary patterns generated based on the conditions in Equations ( 6)-( 9).The LQP code image is generated using the term in Equation ( 5).All histograms from binary patterns are concatenated to produce a single histogram.This process is repeated depending on the number of resolutions chosen by the user.In the proposed approach, this process was repeated three times as we used three different resolutions such as

Neighbourhood Topology
The typical neighbourhood topology used in the LBP operators is a circle.In this study, we are interested in investigating the effects of the proposed method when using different neighbourhood topologies.For this purpose, we employed three further topologies, namely ellipse, parabola and hyperbola, as illustrated in Figure 7, where eight neighbours (black dots) were used and distributed equally.The central point is represented by the red dot in each topology.The value for each neighbour is the gray level value of a pixel that is located nearest to it.We summarise the equation and parameters involved in Table 1 for each topology as defined by Nanni et al. [35].Table 1.Parameter descriptions for all neighbourhood topologies.Note that a and b are the semimajor and semiminor axis lengths, and c is the distance between vertex and focus.

Topologies Equations Parameters
Circle (ci) R is the radius of the circle Ellipse (el) a and b are the semimajor and semiminor axis lengths, where a = b Parabola (par) y = − 1 c x 2 + 2c c is the distance between vertex and focus Hyperbola (hy) a and b are the semimajor and semiminor axis lengths

Topology Orientations
In this study, we further investigate the effects on the classification results when using local pattern information extracted at different orientations (θ).For this purpose, we investigated three other directions: 45 • , 90 • and 135 • (note that θ = 0 • is the default orientation).Furthermore, these orientations also can be combined to create a new topology as illustrated in the rightmost image in Figure 8. Topology orientation can be achieved through spatial rotation of the binary pattern.To illustrate this process, consider Figure 9, which represents a binary pattern in an ellipse topology (for simplicity, assume that the pattern was generated after the process in Figure 6 was implemented).In this example, the initial decimal value for the binary pattern 01100001 is 134 but rotating the pattern at θ = 45 • produces a new decimal value of 67 (binary pattern becomes 11000010).Similarly, rotation at θ = 90 • and θ = 135 • result in 161 and 208 decimal values, respectively.The same rotation process applies to the circle, parabola and hyperbola topologies.

Dominant Patterns
Selecting dominant patterns is an important process for dimensionality reduction purposes due to the large number of features (e.g., concatenation of several histograms).According to Guo et al. [36], a set of dominant patterns for an image is the minimum set of pattern types that can cover n(0 < n < 100) of all patterns of an image.In other words, dominant patterns are patterns that occur frequently in training images.Therefore, to find the dominant patterns, we apply the following procedure.Let I 1 , I 2 , ..., I j be images in the training set.Firstly, we compute the multiresolution histogram feature (H LQP I j ) for each training image.Secondly, we perform a bin-wise summation for all the histograms to find the pattern's distribution from the training set.Subsequently, the resulting histogram (H LQP ) is sorted in descending order, and the patterns corresponding to the first D bins are selected, where D can be calculated using the following equation: Here, N is the total number of patterns and n is the threshold value set by the user.For example, n = 97 means removing patterns that have less than 3% occurrence in H LQP .This means only the most frequently occurring patterns will be retained for training and testing.The smaller the value of n, the smaller the number of patterns selected.

Classification
Once the feature extraction is completed, the SVM is employed as our classification approach using a polynomial kernel.The GridSearch technique is used to explore the best two parameters (complexity (C) and exponent (e)).We test all possible values of C and e (C = 1, 2, 3, ..., 10 and e = 1, 2, 3, ..., 5 with interval 1.0) and select the best combination based on the highest accuracy in the training phase.The SVM classifier was trained, and, in the testing phase, each unseen FGD roi from the testing set is classified as BI-RADS I, II, III or IV.

Experimental Results
To test the performance of the method, we used the Mammographic Image Analysis Society (MIAS) dataset [18], which consists of 322 mediolateral-oblique (MLO) mammograms of 161 women.The spatial resolution of the images is 50 × 50 and quantised to eight bits with a linear optical density in the range 0 to 3.2.The density distribution for BI-RADS classes are as follows: 60 (BI-RADS I), 105 (BI-RADS II), 129 (BI-RADS III) and 31 (BI-RADS IV).The second database used in this study is the InBreast dataset [15], which consists of 410 images of 115 women.Since our study concentrates on breast representation in MLO views, only 206 MLO mammograms were taken into our experiment.The pixel size of all images is 70 mm (microns) and 14-bit contrast resolution.The image matrix was 3328 × 4084 or 2560 × 3328 pixels.The density distribution for BI-RADS classes are as follows: 69 (BI-RADS I), 74 (BI-RADS II), 49 (BI-RADS III) and 14 (BI-RADS IV).Each image contains BI-RADS information (e.g., BI-RADS class I, II, III or IV) provided by an expert radiologist.A stratified ten runs 10-fold cross validation scheme was employed, where the patients were randomly split into 90% for training and 10% for testing, and repeated 100 times.Accuracy (Acc) is used to measure the performance of the method, which represents the total number of correctly classified images as a proportion of the total number of images.Both datasets were annotated by expert radiologists based on the fourth edition of the BI-RADS system.Therefore, evaluation of the proposed method is based on the fourth edition guidelines.All images in both datasets are in .tifformat (8 bits) with greylevel range from 0 to 255.

Results on Different Multiresolutions
This section presents quantitative results for the proposed method when using different parameter values (e.g., P, R, a, b and c) in the LQP operators.We evaluated the method using the following three different multiresolutions and tested it on various neighbourhood topologies: 1. Small multiresolution (LQP small ), 2. Medium multiresolution (LQP medium ), 3. Large multiresolution (LQP large ).
All associated parameters are summarised in Table 2.Note that these parameter values were determined empirically, as the aim of our study is not to optimise the LQP parameters but to investigate the effects of neighbourhood topology in the feature extraction process using the LQP operators.Moreover, since there are many possible combinations of parameter values in each topology, testing all the values is impractical and time-consuming.Nevertheless, we selected several values empirically to show the effects of these parameters in our study.We present quantitative results for the proposed method in Figure 10 using small, medium and large multiresolutions covering four different neighbourhood topologies.Experimental results suggest that the LQP medium outperformed LQP small and LQP large regardless of its topology and the number of dominant patterns selected, with the best accuracy of 84.91% (n = 99.2),followed by the LQP medium el approach at 83.03%(n = 93.8).The LQP medium hy achieved the third best accuracy and the LQP medium par outperformed both LQP large par and LQP small par .This suggests that using a combination of medium sizes of R captures more discriminant features regardless of the topology.Previous studies [37][38][39] on texture descriptors across different window sizes found that using a small value of R does not capture sufficient information about the regions due to the limited intensities and grey level variations.On the other hand, using a large R tends to alter the actual representation of the area, especially when one class dominates over another class.For example, if there is a small dense region within the large circle area that mostly contains fatty tissues, combining different multi-resolutions such as LQP small ci (1, 8) + LQP medium ci (5, 10) + LQP large ci (11,16) do not provide significant improvement on the classification accuracy (Acc = 79.65%)due to insufficient textural information in LQP small (1, 8) and over-representation of the actual textural information from LQP large (11,16).We further tested LQP small el (1, 3, 10) + LQP medium el (7,14,14) + LQP large el (15,19,18) and achieved an accuracy of 78.62%.par .Regarding accuracy, the LQP medium el produced just under 2% higher compared to the LQP medium ci but approximately 1% higher than the result of LQP medium hy .For the parabola topology, none of the multiresolution approaches achieved Acc > 80%, and the highest accuracy was obtained using LQP small par , with just over 77% correct classification.
Figure 11 shows quantitative results for LQP medium ci using the following thresholds: ([τ 1 , τ 2 ]): [5,12], [4,9], [3,13] and [5,15].Note that these parameters are determined empirically, and there are many possible values that can be tested.It can be observed that threshold values of [τ 1 = 5, τ 2 = 12] produced the best accuracy of 84.91% with n = 99.2followed by [τ 1 = 5, τ 2 = 15] with n = 96.8.The other threshold values still produced good results compared with most of the proposed methods in the literature.Once again, it can be observed that selecting dominant patterns play a significant role in the classification accuracy.In Figure 11, the proposed method produced very good results when 95.5 < n < 99.

Fibro-Glandular Region versus Whole Breast Region
In this section, we present quantitative results when classifying unseen cases based on local pattern information extracted from the entire breast area (as all of the current studies in the literature have done) versus pattern information obtained from the FGD roi only.
For this purpose, we conducted an experiment using LQP medium ci with [τ 1 = 5, τ 2 = 12].We selected medium multiresolution, as this approach produces the best results compared with small and large multiresolution methods.Results can be seen in Figure 12 (top left) that suggest that textures from the fibro-glandular disk region are sufficient to differentiate breast density.Extracting features from the whole breast produced up to 77.88% accuracy with n = 97.2, which is 7% lower than when features are extracted from the FGD roi only.Secondly, to further support these results, we conducted additional experiments using LQP medium el , LQP medium hy and LQP medium par .As can be observed in Figure 12, regardless of topology, features extracted from the FGD roi once again produced more discriminant descriptors than the ones obtained from the whole breast region.Our explanation for these results is that, in many cases, non-fibro-glandular disk areas predominantly contain fatty tissues regardless of their BI-RADS class because most dense tissues start to develop within the FGD roi .Therefore, capturing texture information outside the FGD roi means extracting similar information resulting in less discriminative features across BI-RADS classes [40].In cases where the non-fibro-glandular disk region is dominated by dense tissue, the FGD roi mostly contains dense masses.For example, Figure 13 shows histograms extracted from the whole breast regions (wb) and the FGD roi .To measure the difference quantitatively, we used the χ 2 distance (d) to gauge the difference between these histograms and found d = 0.0705 for H wb 1 and H wb 2 , and d = 0.122 for H roi 1 and H roi 2 .This means that H wb 1 is more similar to H wb 2 than H roi 1 and H roi 2 .

Results on Different Neighbourhood Topologies
To summarise the performance of the proposed method using different neighbourhood topologies, we present the results evaluated based on the MIAS dataset [18], all in a single graph as shown in Figure 14.The results are based on medium size multiresolution (e.g., LQP medium ci ).All associated parameters remain the same as in Table 2. Experimental results in Figure 14 suggest that three of the topologies employed, namely circle, ellipse and hyperbola, produced consistent results (Acc > 80%) across different values of n (90 < n < 99.9), whereas LQP medium par failed to achieved accuracy above 80%.The circle topology achieved the highest accuracy of 84.91% followed by ellipse and hyperbola with Acc = 83.7% and 82.15%, respectively.In terms of accuracy, the main difference between the LQP medium ci and the LQP medium el is that the LQP medium ci tends to produce higher results when n > 95 and n < 95 for LQP medium el .We further tested the proposed method using the InBreast dataset [15] and present the results in Figure 15.All associated parameters are the same as in Table 2.It can be observed that the ellipse topology performed the best with over 82% accuracy at n = 99.7,whereas the circle topology produced close to 80% at n = 94.4,94.8 and 99.5.The parabola topology achieved the highest accuracy (78.5%) at n = 99.9 and 78% at n = 95.1 for the hyperbola topology.On average, the proposed method achieved 75.1% , 74.6%, 73.9% and 72.5% accuracy for the circle, ellipse, hyperbola and parabola, respectively, across different n.

Results on Different Orientations
To investigate the effect on the performance at various orientations (θ), we conducted several experiments by varying θ with the following values: 0 • (which is the default orientation), 45 • , 90 • and 135 • .The parameter values remain the same for all medium size multiresolution approaches with [τ 1 = 5, τ 2 = 12].Therefore, the new parameter settings are presented in Table 3.Note that only the circle and ellipse topologies are tested due to their promising results presented in Section 4.4.Combining different orientations for the ellipse results in the neighbourhood architecture becoming more complex as presented in the rightmost image in Figure 8.To assess the effect of a single orientation, first, we set a constant value of θ across different resolutions.For example, for the circle (R, P, θ), (1,   The left image of Figure 16 shows classification accuracies using the circle neighbourhood topology.Individually, experimental results suggest that the LQP medium ci S(90 • ) produced the best accuracy of 85.50%, which indicates density patterns are more visible at this orientation followed by θ = 45 • with accuracy 84.93%.Overall results suggest that multiresolution LQP medium ci S(θ) can produce consistent results (>83%) regardless of the orientation with 97 ≤ n ≤ 99.5.Combining all orientations suggests that LQP medium ci S( * ) produced 85.56%, which is very similar to using a single orientation.On the other hand, the right image in Figure 16 shows that LQP medium el S( * ) produced 86.12% accuracy, which is noticeable compared to the second best accuracy of LQP medium el S(0 • ) (83.03%).The LQP medium el S(45 • ), LQP medium el S(90 • ) and LQP medium el S(135 • ) produced the best accuracy of 80.68%, 81.37% and 82.31%, respectively, when tested using 90 ≤ n ≤ 100.
Figure 16.Quantitative results using different orientations evaluated on the MIAS dataset [18].Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.
Figure 17 shows classification accuracies using the circle and ellipse neighbourhood topologies for the InBreast [15] dataset.Note that we used the same parameter values as presented in Table 3.It can be observed that combining all orientations (θ = {0 • , 45 • , 90 • , 135 • }) produced the highest accuracy of just over 80% for both topologies.However, each orientation tends to produce an accuracy between 74% to 77%, which is lower than the results when tested on the MIAS dataset [18].This might be due to a smaller number of samples and imbalance number of instances across different classes, particularly BI-RADS IV (only 6.7%).The accuracy variations shown in Figure 17 indicate that the selection of n is crucial in order to optimise the performance of the proposed method.Individually, the ellipse topology produced the second best accuracy of 80.02% at n = 99.6,whereas the circle topology produced an accuracy of 79.99% at n = 92.8.

Figure 17.
Quantitative results using different orientations evaluated on the InBreast dataset [15].Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.

Summary of Quantitative Results
The quantitative results for the sensitivity (Sen), specificity (Spe) and area under the curve (AUC) are summarised in Tables 4 and 5 for both MIAS [18] and InBreast [15] datasets, respectively.We only tested the LQP medium for circle and ellipse topologies because they are proven to be more effective based on the results presented in Sections 4.1-4.5 compared to the other topologies.Since the classification procedure in this study is a non-binary or multiclass [41] problem, we used the pairwise analysis approach [42] to calculate the Sen, Spe and AUC values to further evaluate the proposed method.The Sen measures the proportion of actual true positives that are correctly identified and the Spe represents the proportion of actual true negatives correctly identified.The AUC (or also known as the A z value) indicates the trade-off between the true positive rate against the false positive rate.Note that the standard deviations in Tables 4 and 5 are quite large due to the effect of imbalance in the number of instances across different classes.For example, the proportion of samples for BI-RADS IV are quite small in both datasets.In a case where the testing set contains only a small number (e.g., 4 samples) of samples from BI-RADS IV and only two of the samples are correctly classified, this results in 50% sensitivity that not only increases the standard deviation but the average Sen, Spe and AUC.Overall, the proposed method achieved the highest Sen = 80.50% (LQP medium ci S(0)) with Spe = 92.81%(LQP medium ci S(0)) and AUC = 89.92%(LQP medium el S(All)) when evaluated on the MIAS dataset [18].Nevertheless, the results are lower when tested on the InBreast dataset [15] with the best values of 76.67%, 87.06% and 86.89% for Sen, Spe and AUC, respectively.

Discussion
We present this section as three subsections that cover quantitative comparison with current methods in the literature, effects of parameter selection on the proposed method and future work.

Quantitative Comparisons
For quantitative comparison with other methods in the literature, to minimise bias, we selected those studies that have used the MIAS database [18], four-class classification, and used the same evaluation technique (10-fold cross validation) as in this study.Table 6 shows studies that have been conducted using the same evaluation technique on four-class classification.Note that all studies have used all of the 322 mammograms in the MIAS database [18].

Authors Summary Accuracy
Parthaláin et al. [6] First and second-order statistical features and morphological features extracted from fatty and non-fatty tissues.The Fuzzy C-means (FCM) clustering approach was used to segment fatty and non-fatty tissues.Fuzzy-rough approaches were employed for feature selection and classification.The feature extraction pipeline is the same as the study of Oliver et al. [3]. 91.4%

Our method
The Local Quinary Pattern (LQP) operators were used to extract local features within the fibro-glandular region using multiresolution and multi-orientation approaches.Only dominant patterns were used for classification using the support vector machine classifier.
86.13% 82.02% Oliver et al. [3] First and second-order statistical features and morphological features extracted from fatty and non-fatty tissues.The Fuzzy C-means clustering approach was used to segment fatty and non-fatty tissues.This is the same as the study of Parthaláin et al. [3].Several machine learning algorithms (e.g., k-Nearest Neighbours, decision tree and Bayes classifiers) were combined as a classification approach.

86%
Rampun et al. [40] The LTP operators were used to extract local features within the fibro-glandular region at different orientations.Feature from all orientations were concatenated to create a long feature vector.Subsequently, the support vector machine (SVM) classifier was employed in the testing phase.

82.33%
Muštra et al. [17] The first and second order statistical features were extracted at different orientations and resolutions.A combined paradigm of feature selection approach was employed to simplify complexity in the feature space.Finally, the k-Nearest Neighbours and Naive Bayesian classifiers were used to classify unseen cases.

79.3%
Chen et al. [7,9] Several texture descriptors based on the following feature extraction techniques were tested: the local binary pattern, local grey-level appearance, basic image features and Texton approaches.The k-NN classification approach was used to differentiate unseen cases into four BI-RADS classes.

59-76%
Petroudi et al. [12] The Maximum Response 8 (MR8) filter bank approach was used to extract the breast local information.Subsequently, the χ 2 distribution was used to compare each of the resulting histograms from the training set to all the learned histogram models from the training set.

75.5%
Bovis and Singh [4] Texture features were extracted using Fourier and Discrete Wavelet transforms, and first and second-order statistical features.In the classification phase, a combined classifier paradigm (e.g., the SVM, Random Forest and k-NN) was used as a classification approach.

71.4%
He et al. [4] Class density was determined based on the relative proportions of the four Tabár building blocks within the whole breast region.

70%
Based on the classification results presented in Table 6, Parthaláin et al. [6] achieved the best accuracy of 91.4% using a combination of first and second-order statistical features and morphological features.Our proposed method achieved up to 86.13% accuracy, which is better than the majority of the methods presented in Table 6.Although Parthaláin et al. [6] achieved the best accuracy, our method is simpler because we use only the LQP operators to extract local pattern information, whereas the method in [6] used several feature extraction techniques.In fact, the Fuzzy C-means clustering is time-consuming, as every pixel needs to be classified as to whether it belongs to the fatty or non-fatty class.Similar to the study of Oliver et al. [3], the FCM clustering technique was employed and the sequential forward selection (SFS) algorithm was used to select a set of discriminant features.Once again, this process is more complicated because each feature will be compared with the other available features when calculating discriminant features.In contrast, our method selected dominant patterns that are based on the number of occurrences.It does not quantitatively compare each feature vector to the other features, which reduces computational time.
Previously, Rampun et al. [40] used the LTP operator to extract local information from the FDG roi and achieved a promising result of 82.33% accuracy.However, this method suffers from having to deal with a large number of features due to the multi-orientation approach (e.g., eight histograms from eight orientations were concatenated).The main difference compared with the proposed method is that the method in [40] concatenated eight orientations and no feature selection was performed, whereas our method performed feature selection by selecting dominant patterns only and investigated different neighbourhood topologies and different resolutions.Muštra et al. [17] developed a method using first and second-order statistical features and employed a complex feature selection framework (e.g., combining several feature selection techniques).Nevertheless, considering the types of features used, they achieved comparable results of almost 80% accuracy.Chen et al. [7,9] investigated several 'bag-of-words' techniques (e.g., LBP, BIF, Textons and LGA) and reported that the Textons (based on MR8) and the LGA approaches achieved the highest accuracies of 76% and 75%, respectively.
Bovis and Singh [4] reported 71.4% accuracy based on a combined classifier paradigm and He et al. [4] produced 70% accuracy using the proportion of four Tabár building blocks.In an unsupervised approach, Petroudi et al. [12] achieved 75.5% classification accuracy using a Textons (MR8) approach to characterise breast appearance.Based on the numerical results presented in the literature, there is still scope for improvement regarding accuracy, which indicates that four-class classification is a challenging task.In our case (based on MIAS database [18]), this might be due to: (a) strong artefacts because images were scanned which also may have altered the actual representation of the breast tissues and (b) an imbalanced number of data across different class.The MIAS database [18] contains 87, 103, 95, 37 cases for BI-RADS I, II, III and IV, respectively.The number of cases for BI-RADS IV is small, which may cause the prediction model to under represent this class.

Effects of Parameter Selection
There are many parameters involved in this study, and all of them were determined empirically.The aim of our study is not to optimise these parameters in our classification method but to investigate whether the LQP operators can produce discriminant features that can represent breast density based on different neighbourhood topologies.Based on the experimental results shown in Sections 4.1-4.5, all parameters have a significant effect on the performance results.However, the LQP medium ci and LQP medium el produced better results than the other topologies.This may be due to the texture appearance in breasts being mostly anisotropic, and these topologies are better at dealing with these types of structural information (due to their anisotropic shapes) than the parabola and hyperbola.In terms of the selection of R, a large value of R (e.g., 15 or 17) caused one class to be over represented compared with the other classes, whereas a small value of R (e.g., 1 or 2) limits the texture information within the region.On the other hand, a large value of P (e.g., 22) increases the number of patterns but at the same time increases the information complexity in the feature space, and small P (e.g., 4 or 6) results in insufficient texture patterns.In terms of τ 1 and τ 2 values, we found that the results in our study are higher than the ones found in the study of Nanni et al. [35].
Using a multiresolution approach increases the opportunity for capturing textural information that is not represented on a single scale.This is similar when dealing with textures that are more visible in multi-orientations rather than a single orientation.Nevertheless, multiresolution and multi-orientation approaches also have their disadvantages such as increasing the number of features, which also means it increases the possibility of capturing similar information.One way to reduce this problem is to select dominant patterns by taking features that have high occurrences only, but this still does not solve the problem of retaining two sets of similar features with high occurrences.In this study, before selecting dominant features, the number of features is approximately 200.Performing feature selection results in a much smaller number of features (between 50 to 80).For example, selecting n = 99.2results in 70 to 80 dominant features, whereas n = 95 retains 55 to 65 dominant features.The main reason for the significant reduction is due to sparse features (large percentage of zeros) that has been reported in [7,9,35].Since the LQP operators produced sparse features in our case, this has significant variation on the classification accuracies.As can be observed in Figures 10-12, even a small n can change the results significantly because a small n can remove a large number of many features.
When combining all four orientations, we found that the LQP medium el S( * ) produced the highest accuracy, over 3% better than using a single orientation.In contrast, the LQP medium ci S( * ) produced only 0.06% improvement over LQP medium el S(90 • ).This is probably because the topology of neighbourhood distribution has changed to more flexible to a 'circle-like' and 'ellipse-like' (see the rightmost image in Figure 8, where the central region is 'circle-like' and the outer shape is 'ellipse-like').In this case, the topology is able to capture information using both circle and ellipse neighbourhood distribution.

Future Work
For future work, we plan to develop a method that can automatically estimate τ 1 and τ 2 as well as combine multiresolution LQP features with 'bag-of-words' features such as Textons and MR8.Furthermore, we plan to use a deep learning approach to find discriminant features and combine them with handcrafted features (e.g., local patterns from LQP operators).Although several studies [22][23][24][25][26][27][28][29] have investigated the use of deep learning in breast density classification, to the best of our knowledge, the separation was only based on two-or three-class classification (none of the deep learning based approaches has been applied to four-class classification).In fact, no study has been conducted combining non-handcrafted and handcrafted features in breast density classification.

Conclusions
In conclusion, we have developed a breast density classification method using multiresolution LQP operators applied only within the fibro-glandular disk area, which is the most prominent region of the breast instead of the whole breast region as used in other current studies [3][4][5][6][7][8][9]11,12].Experimental results indicate that the multiresolution LQP features are robust compared with the other methods such as LBP, Textons based approaches and LTP due to the five encoding system that generates more texture patterns.Moreover, the multiresolution approach provides complementary information from different parameters that cannot be captured in a single resolution.We further studied the effects on performance when using different neighbourhood topologies (e.g., ellipse, parabola and hyperbola), and found that the LQP medium ci outperformed LQP medium el , LQP medium hy and LQP medium par .Nevertheless, when combining four orientations (θ ∈ {0 • , 45 • , 90 • , 135 • }), the proposed method achieved the best accuracy-up to 86.12% (LQP medium el ) based on the MIAS dataset [18], which is better than the other LQP approaches employed in this study.Finally, the proposed method produced competitive results based on two datasets (MIAS [18] and InBreast [15]) compared with some of the best accuracies reported in the literature.

Figure 2 .
Figure 2.An overview of the proposed breast density methodology.

Figure 3 .
Figure 3.Example of segmentation results using our method in[30].

Figure 4 .
Figure 4.An illustration of computing the Local Quinary Pattern (LQP) code using P = 8 and R = 1, resulting in four binary patterns.

Figure 5
Figure 5  shows an example of the feature extraction process using multiresolution LQP operators.Note that each resolution produces four binary patterns (see Figure4) resulting in four histograms.Subsequently, these four histograms are concatenated into a single histogram, which represents the

Figure 5 .
Figure 5.An overview of the feature extraction using multiresolution LQP operators.Black dots in multiresolution LQP operators mean neighbours with less value than the central pixel (red dot).Note that the outer circle with bigger dots represent a larger R value.

Figure 6 .
Figure 6.Four images of binary patterns generated from the LQP code image.

Figure 7 .
Figure 7. Five different neighbourhood topologies employed in our study.

Figure 8 .
Figure 8. Ellipse topology at different orientations and its combination.

Figure 9 .
Figure 9. Spatial rotation in an ellipse topology resulting in different decimal values.The decimal value is calculated in a clockwise direction.

Figure 10 .
Figure 10.Quantitative results using different multiresolutions based on different neighbourhood topologies.Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.

Figure 11 .
Figure 11.Quantitative results using different values of τ 1 and τ 2 on circle neighbourhood topology.Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.

Figure 12 .
Figure 12.Quantitative results based on features extracted from the whole breast versus fibro-glandular disk region covering four different topologies.Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.

Figure 13 .
Figure 13.Histograms extracted from the wb versus region of interest (ROI) with BIRADS class I and IV.

Figure 14 .
Figure 14.Quantitative results (MIAS dataset [18]) of LQP medium ci,el,hy,par based on features extracted from the fibro-glandular disk region on different neighbourhood topologies.Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.

Figure 15 .
Figure 15.Quantitative results (InBreast dataset [15]) of LQP medium ci,el,hy,par based on features extracted from the fibro-glandular disk region on different neighbourhood topologies.Note that the x-axis and y-axis represent the accuracy (percentage of correctly classified cases) and the percentage of dominant patterns, respectively.

Table 3 .
Parameter values for different neighbourhood topologies and orientations.

Table 4 .
[18]all quantitative results for MIAS[18]database using circle and ellipse topologies.The parameters R and P are based on the values in Table3.

Table 5 .
[15]all quantitative results for InBreast[15]database using circle and ellipse topologies.The parameters R and P are based on the values in Table3.

Table 6 .
Quantitative comparison of the current studies in the literature.