A Novel Unsupervised Spectral Clustering for Pure-Tone Audiograms towards Hearing Aid Filter Bank Design and Initial Configurations

Featured Application: This work will contribute to easing and reducing the complexity of the filter bank structure of hearing aids. In addition, it will facilitate the process of configuring hearing aids.

Abstract: The current practice of adjusting hearing aids (HAs) is tiring and time-consuming for both patients and audiologists. Of hearing-impaired people, 40-50% are not satisfied with their HAs. In addition, good HA designs are often avoided because the process of fitting them is exhausting. To improve the fitting process, an unsupervised machine learning (ML) approach is proposed to cluster pure-tone audiograms (PTAs). This work applies spectral clustering (SC) to group audiograms according to their similarity in shape. Different SC approaches are tested and evaluated using the Silhouette, Calinski-Harabasz, and Davies-Bouldin criterion values. The Kutools for Excel add-in is used to generate an audiogram population, which is annotated using the SC results, and the population clusters are evaluated with the same criteria. Finally, these clusters are mapped to a standard set of audiograms used in HA characterization. The results indicate that grouping the data into 8 or 10 clusters yields high evaluation criteria values. The evaluation of the population audiogram clusters shows good performance, with a Silhouette coefficient >0.5. This work introduces a new concept for classifying audiograms with an ML algorithm according to their similarity in shape.


Introduction and Motivation
The World Health Organization (WHO) projects that by 2050 nearly 2.5 billion people will have some degree of hearing loss, which poses an annual global cost of US $980 billion [1]. Daniela Bagozzi, a WHO Senior Information Officer, wrote an article calling on the private sector to provide affordable hearing aids in developing countries (as their current cost ranges from US $200 to over US $500) [2]. In addition, the Healthline Organization reported that a set of hearing aids might cost $5000 [3].
The main components of a digital hearing aid are shown in Figure 1, comprising a microphone, an analogue to digital converter (A/D), filter banks, gain blocks, and a digital to analogue converter (D/A). First, the analogue sound signal detected by the microphone is converted into digital form by the A/D converter. Next, this digital signal is applied to a filter bank. Different digital signal processing techniques can be applied to divide the digitized input sound spectrum into sub-bands with different bandwidths. Then, gain blocks are applied to the outputs of the filter bank to amplify the sound signal to the desired hearing level. In the last stage, the signal is converted from digital to analogue by the D/A converter [4,5]. However, it is better to design digital filters that can match multiple audiograms of patients who suffer from hearing loss. This approach lowers the cost of manufacturing hearing aids, as they can be produced on a large scale, and has been pursued by many research techniques [4,6-9]. On the other hand, it increases the complexity of hearing aid design, requiring high operating power and a large chip area, and can lead to improperly fitted hearing aids [10].
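The filter-bank-plus-gain stage described above can be illustrated with a toy FFT-based sketch. This is a simplification, not an actual hearing aid implementation (real devices use dedicated filter bank structures); the band edges and gains below are illustrative assumptions.

```python
import numpy as np

def apply_subband_gains(signal, fs, band_edges_hz, gains_db):
    """Toy illustration of the filter-bank + gain stage of a digital
    hearing aid: split the spectrum into sub-bands and amplify each
    band by a prescribed gain (in practice derived from the audiogram)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    for (lo, hi), g_db in zip(band_edges_hz, gains_db):
        band = (freqs >= lo) & (freqs < hi)
        spectrum[band] *= 10 ** (g_db / 20.0)   # dB gain -> linear amplitude factor
    return np.fft.irfft(spectrum, n=len(signal))

# Example: a 1 kHz tone amplified by 20 dB (x10 amplitude) in its band only
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
out = apply_subband_gains(tone, fs,
                          [(0, 500), (500, 2000), (2000, 8000)],
                          [0.0, 20.0, 0.0])
```

A real device would of course perform this band-splitting with a causal, low-latency filter bank rather than a whole-signal FFT; the sketch only shows how per-band gains shape the output spectrum.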
This research was motivated by these considerations in hearing aid design, by the impact of hearing loss on nations' economies, and by patients' ability to afford a hearing aid. The main idea is to use artificial intelligence and machine learning to facilitate the whole process for patients and hearing aid designers. In addition, the process of fitting hearing aids is tiring and time-consuming, as it depends on many trials that require the patient to be highly responsive. A study stated that only 50-60% of users are satisfied with their hearing aids [11]. These factors are compounded by a severe shortage of audiologists, who are especially rare and hard to find in rural areas [12,13]. All of this urges the use of new artificial intelligence technology to resolve these problems, especially since the fitting process depends on the skills and experience of audiologists [14].
In this work, the authors apply unsupervised learning to cluster audiograms using spectral clustering. These audiograms are taken from a database of 28,244 audiograms used by Bisgaard, Vlaming and Dahlquist [15] to produce a standard set of audiograms for the IEC (International Electrotechnical Commission) 60118-15 measurement procedure. These audiograms were clustered by vector quantization analysis of size 60. Here, the researchers excluded five of the quantized audiograms, which represent normal hearing levels (0-20 dB) as defined by different health organizations [1,16,17]. These five audiograms are removed since the study aims to produce clusters representing the different audiograms of patients who experience hearing loss. Different audiograms with the same shape but at different levels can be realized with the same set of filters by adjusting the gains to match the required audiogram. Another reason to classify audiograms according to shape is that the hearing aid fitting process becomes easier: a supervised machine learning model can be built on this work to classify patients' audiograms and then program or adjust the hearing aid according to pre-set configurations. These configurations are linked to the clusters produced by this work, so that at the end of the fitting process only fine tuning might be needed.
This introduction is kept simple so that it can be easily comprehended both by experts in hearing aid design from an engineering perspective and by audiologists from a medical perspective. The technical parts can be found in the following sections. This paper is organized as follows. First, we discuss recent audiogram classifiers and their limitations, and then the main contribution of this work is highlighted. Subsequently, the data clustering algorithm is explained in Section 3, where the algorithm description and implementation are discussed in detail. This is followed by a discussion of how the algorithm is evaluated and how the data sets are prepared. Then, the results are presented and discussed in Section 4 to find the optimum number of clusters. The clustering algorithm is also evaluated on the audiogram population that produced the quantized data, and the generated clusters are mapped and compared to the standard sets selected by Bisgaard in the last subsection. Finally, a summary of the results, the conclusion, and prospects for future work are presented in Section 5.

Related Work
In 2016, Rahne et al. built an Excel sheet as an audiogram classifier with pre-set inputs that can be defined according to the inclusion criteria of a clinical trial. This tool provides an inclusion decision based on the predefined audiological criteria [18]. Then, in 2018, Sanchez et al. classified hearing test data in two stages. The first stage used unsupervised learning to define trends and spot patterns in data obtained from different hearing tests. In the second stage, a supervised learning algorithm was built in which different outcomes from different hearing tests were explored: the subjects were assigned a profile, and the data were analyzed again to find the best classification of the subjects into four auditory profiles. This classifier was based on analysis of audiograms, which reflect the loss of sensitivity, together with other hearing tests that reflect the loss of clarity not captured by the audiogram [19]. In 2019, Belitz et al. also combined unsupervised and supervised machine learning methods to map audiograms to a small number of hearing aid configurations. The target of that study was to use these configurations as a starting point for hearing aid fitting. The method was applied in two steps. The first performed different unsupervised clustering algorithms to determine a limited number of pre-set configurations for a hearing aid; the centroids of the clusters were chosen as fitting targets, which can be used as starting configurations for hearing aid adjustment for each individual. The second step assigned each audiogram a class based on the comfort-target clustering of the first stage, using various supervised machine learning techniques to assign a pre-set configuration to each audiogram.
The classifier accuracy of the second stage was low when a single configuration was selected and improved when two configurations were allowed per audiogram [20]. In 2018, a research team took the first steps towards a machine learning classifier by using unsupervised learning to cluster audiograms [21]. In that work, audiograms were clustered with the goal of making them maximally informative, and the clustered data were then prepared to serve as a good training set for supervised machine learning classifiers. The authors built an approach to obtain a set of non-redundant, unannotated audiograms with minimal loss of information from a very large data set. In 2020, the same group used this data preparation procedure to produce a machine learning classifier. They applied supervised ML to 270 audiograms annotated by three experts in the field. The results annotate the audiograms concisely in terms of shape, severity, and symmetry with good accuracy [12]. The classifier can be integrated into a mobile application to help the user describe an audiogram concisely so it can be interpreted by non-experts, who can then decide whether the patient needs to be checked by a specialist. It can partially resolve the shortage of specialists and can be the first step towards a more sophisticated algorithm to assist experts in the audiology field.
Crowson et al. used a deep learning convolutional neural network architecture to classify audiograms into normal hearing, sensorineural, conductive, and mixed hearing loss. The audiograms were converted to JPEG image files, and image transformation techniques (rotation, warping, contrast, lighting, and zoom) were applied to the training set to increase the number of images available for training. They achieved 97.5% accuracy in classifying hearing loss types based on features extracted from the audiograms [13]. That study aimed at classifying audiograms to detect the cause of hearing loss; it was not conducted to help configure hearing aids [13].
Musiba [22] classified audiograms based on the UKHSE (United Kingdom Health and Safety Executive) categorization scheme. The sum of the pure-tone audiometry hearing levels at 1 kHz, 2 kHz, 3 kHz, 4 kHz, and 6 kHz was obtained, compared with the figures set by the UKHSE, and classified as one of the following: acceptable hearing ability, mild hearing impairment, poor hearing, or rapid hearing loss. The aim of this classification was to prompt proper actions to prevent noise-induced hearing loss. The annotation process was carried out by experts in the field who applied the UKHSE standards.
Cruickshanks and his team [23] conducted a longitudinal study of how the shape of audiograms changes over time. The follow-up was carried out in four stages: 1993-1995, 1998-2000, 2003-2005, and 2009-2010. The audiograms were classified into eight levels, and the change in hearing ability over time was recorded based on these classes. Musiba and Cruickshanks [22,23] did not implement any intelligent solutions, as they relied on the experience of specialists in the field.
Classifier techniques found in the literature are summarized in Table 1, showing the limitations and shortcomings of each technique.

Table 1. Reference / Classification Technique / Limitations

Sanchez Lopez et al. [19]
Classification technique: (a) A two-stage classifier. (b) The first stage is unsupervised machine learning, followed by supervised learning. (c) It uses different hearing tests to classify hearing loss into four types related to sensitivity and clarity loss.
Limitations: (a) It uses different types of hearing tests, not only audiograms, to detect the type of hearing loss.

Belitz [20]
Classification technique: (a) A two-step audiogram classifier; the first step is unsupervised learning to cluster audiograms into four pre-set hearing aid configurations. (b) Second, audiograms are mapped to these four configurations with supervised learning.
Limitations: (a) The supervised learning algorithm gives low accuracy when one configuration is assigned to each audiogram; accuracy improves significantly when two candidate configurations are allowed per audiogram. (b) The data are clustered into four classes, which are not enough to describe the different shapes of patients' audiograms.

F. Charih et al. [12]
Classification technique: (a) Supervised learning applied to 270 audiograms annotated by three experts in the field. (b) Audiograms are classified concisely in terms of shape, severity, and symmetry.
Limitations: (a) A limited number of audiograms are used as the training data set. (b) The classifier outputs are only a concise description of audiograms.

Musiba [22]
Classification technique: (a) Uses the sum of hearing levels at frequencies 1-6 kHz to classify the data. (b) Data are classified into four groups to assess hearing ability: acceptable hearing ability, mild hearing impairment, poor hearing, and rapid hearing loss.
Limitations: (a) The output of the classification process is the hearing ability, and experts in the field classify it; the output is therefore dependent on the experience and skills of the annotator.

Cruickshanks et al. [23]
Classification technique: (a) A longitudinal study observing the change of audiogram shape over time. (b) The audiograms are classified into eight levels, and the change in hearing ability over time was recorded.
Limitations: (a) The findings were related to the change of patients' audiograms during the follow-up period. (b) Experts in the field did the classification, so the classes depend on their knowledge.

Crowson et al. [13]
Classification technique: (a) Uses a deep learning convolutional neural network to classify audiograms formatted as JPEG images. (b) The audiograms are classified into four hearing loss categories: normal hearing, sensorineural, conductive, and mixed hearing loss.
Limitations: (a) The outputs are hearing loss types, intended to detect the cause of hearing loss; these classes cannot be used to help in hearing aid design or configuration.
To the best of our knowledge, classifiers built with the purpose of classifying audiograms are very few and are not suitable as a reference for specialists in the field, such as audiologists, hearing aid specialists, and hearing aid designers. This is the first study to classify audiograms according to similarity in shape with the aim of reducing the complexity of the filter bank used to realize a patient's audiogram shape. From a signal processing point of view, it is important to know the shape of the audiogram in order to apply different gains to the different filters that cover the entire hearing band (125 Hz-8 kHz). This classifier is built to capture the different shapes of audiograms, not to classify the type of hearing loss as the existing works do. Audiograms of similar shape at different levels can be realized by one group of filters by changing the gain coefficient of each filter or the overall gain of the cascaded filters. This classification will help hearing aid designers reduce the complexity of their filter designs and can be a good starting point for a future supervised learning algorithm that classifies audiograms according to the detected shapes. Applying sophisticated machine learning algorithms will facilitate the whole process for the experts and increase patients' satisfaction.

Data Clustering Algorithm
The study is conducted to group audiograms according to similarity in shape. For this purpose, spectral clustering was used to provide clusters that can be used in practice by experts in the field. This section starts with a general description of the algorithm, showing the main steps of how it was implemented and evaluated. Then, the details of the implementation process are discussed, and finally the evaluation criteria for different numbers of clusters and for the selected clusters are explained.

Algorithm Description
The spectral clustering algorithm is a graph-based technique to find k clusters in data [24,25]. It computes the similarity matrix of a similarity graph built from the data and from it determines the Laplacian matrix. A similarity graph models the local neighborhood relationships between the data points; the matrix representation of this graph is the similarity matrix, which contains pairwise similarity values between connected nodes and is summarized by the Laplacian matrix. The algorithm then represents the data in a lower-dimensional space, in which the data are clustered. This dimensionality reduction is based on the eigenvectors corresponding to the k smallest eigenvalues of the Laplacian matrix; taken as columns, these eigenvectors form a low-dimensional representation of the input data in a new space where the clusters are well separated [25]. The aim is to classify the data into clusters such that points in the same cluster are similar and points in different clusters are dissimilar [26]. The authors chose spectral clustering for their data as it can produce accurate clustering results by solving for the eigenstructure of the Laplacian matrix. This clustering method can be used for any shape of data, with the advantage of handling non-convex data distributions [27]. Since the data used in this research are mostly convex and sometimes non-convex, spectral clustering is a suitable unsupervised method for detecting different audiogram shapes.
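The embedding stage described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' MATLAB code; the Gaussian kernel and the neighbor count are illustrative assumptions.

```python
import numpy as np

def spectral_embedding(X, k, n_neighbors=3):
    """Sketch of spectral clustering's embedding stage:
    kNN similarity graph -> random-walk Laplacian -> k eigenvectors."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:n_neighbors + 1]  # skip the point itself
        W[i, nbrs] = np.exp(-dist[i, nbrs] ** 2)       # Gaussian similarity
    W = np.maximum(W, W.T)            # connect i and j if either is a neighbor of the other
    L_rw = np.eye(n) - W / W.sum(axis=1)[:, None]      # random-walk Laplacian I - D^{-1} W
    vals, vecs = np.linalg.eig(L_rw)
    order = np.argsort(vals.real)                      # k smallest eigenvalues
    return vecs.real[:, order[:k]]                     # rows = data points in the new space

# Two well-separated groups of points: the embedding rows come out
# (nearly) constant within each group, so a simple clustering of the
# rows (k-means or k-medoids) recovers the groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (6, 2)),
               rng.normal(0.0, 0.1, (6, 2)) + 10.0])
E = spectral_embedding(X, k=2)
```

The rows of `E` are then grouped by an ordinary clustering method; the paper clusters them with k-medoids, as described in the implementation subsection.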
The authors started by clustering the data into seven clusters using spectral clustering. Two methods were used to construct the similarity matrix, namely the nearest-neighbors and radius-search methods. Then, the Laplacian matrix was generated and normalized with different methods, such as random-walk normalization and symmetric normalization. The produced seven clusters were checked by examining the eigenvalues and then visually inspected with a scatter plot of the clusters. If all the eigenvalues were zero or the plot did not indicate credible clusters, the method was discarded. The selected methods were further assessed by checking the eigenvalues once more; if they indicated a gap, the Silhouette coefficient was calculated for these seven clusters. This process was repeated to generate eight clusters and evaluate model performance, and then other numbers of clusters (9, 10, and 11) were also tested. The authors decided to start with seven clusters in order to detect as many audiogram shapes as possible for a future supervised machine learning model with good accuracy; the lower the number of audiogram clusters, the lower the accuracy of the predictions. On the other hand, the authors decided to stop at 11 clusters, as the Silhouette coefficient dropped significantly. The algorithm steps are shown in Figure 2.
The authors then picked the two numbers of clusters with the highest Silhouette coefficients for further evaluation and compared them using the Silhouette coefficient, the Calinski-Harabasz criterion, and the Davies-Bouldin criterion. This was followed by generating an audiogram population, annotating it according to the produced clusters, and evaluating these clusters with the same three criteria, in order to test the clustering method on a large number of audiograms. Finally, the authors mapped the generated clusters to the Bisgaard selected levels to compare them with existing standards used in hearing aid measurements.

Clustering Implementation
Spectral clustering is a well-established algorithm but can be carried out with many different input arguments. The authors tried many of them, and the trials were evaluated statistically. First, the similarity graphs were generated in two ways (using a number of nearest neighbors, or a radius within which to search for neighbors). Then, the similarity graphs were represented by the Laplacian matrix. The clustering results were evaluated for different forms of this matrix: without normalization, with random-walk normalization, and with symmetric normalization. Finally, two clustering methods (k-means and k-medoids) were tested to cluster the eigenvectors of the Laplacian matrix. In each case, the eigenvalues were checked and the silhouette coefficients were calculated for performance evaluation [28].

MATLAB was the selected platform to perform spectral clustering. The similarity graph was generated using nearest neighbors, where two points i and j are connected when either i is among the nearest neighbors of j or j is among the nearest neighbors of i. The distances are calculated using the Euclidean formula and then transformed with a scaled kernel whose scale is selected using a heuristic procedure. The method used to cluster the eigenvectors of the Laplacian matrix is k-medoids. A medoid in the k-medoids algorithm is the most centrally located point, with minimum distance to the other points, and is not influenced by outliers or extreme values [29,30]. Finally, the similarity graph is represented by the Laplacian matrix with random-walk normalization.
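The k-medoids step can be sketched as a bare-bones alternating loop. This is an illustration of the idea, not MATLAB's `kmedoids`; the deterministic initialization (first k points) is a simplifying assumption, where real implementations randomize and restart.

```python
import numpy as np

def k_medoids(X, k, max_iter=100):
    """Bare-bones k-medoids: each cluster is represented by an actual
    data point (the medoid) minimizing the total distance to its
    members, which makes the method robust to outliers."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    medoids = np.arange(k)        # first k points as initial medoids (real code would randomize)
    for _ in range(max_iter):
        labels = np.argmin(D[:, medoids], axis=1)  # assign each point to its nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                costs = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[np.argmin(costs)]  # swap in the cheapest member
        if np.array_equal(new_medoids, medoids):
            break                 # assignments and medoids are stable
        medoids = new_medoids
    return labels, medoids

# Two well-separated groups of 2-D points are split cleanly by k = 2.
A = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.2, 0.1], [0.1, 0.2], [0.2, 0.2]])
X = np.vstack([A, A + 10.0])
labels, medoids = k_medoids(X, 2)
```

Unlike a k-means centroid, each returned medoid is an index of a real data point, which is what makes the method insensitive to extreme values.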

Clustering Performance Evaluation
Four criteria are used to find the best number of clusters and to evaluate the clustering method: the eigenvalues, the silhouette coefficient, the Calinski-Harabasz criterion, and the Davies-Bouldin criterion. To obtain well-separated clusters, the eigenvalues should ideally be zero or small. To determine the proper number of clusters, the number of clusters is increased gradually until a gap is observed in the eigenvalues [31]. If this gap cannot be reached, silhouette analysis is used to measure how well the data are clustered. This analysis yields a coefficient in the range [−1, 1]: coefficients close to +1 indicate that the sample is far away from the neighboring clusters, a value of 0 indicates that the sample is on or very close to the decision boundary between two neighboring clusters, and negative values indicate that the sample is assigned to the wrong cluster. The silhouette index (SI) is the average of these coefficients; the closer to +1, the better the separation between clusters [32,33]. The Calinski-Harabasz index (CHI) is the ratio between the between-cluster dispersion (based on the squared distances between the cluster centers) and the within-cluster dispersion (based on the squared distances of individual data points to their cluster center); the higher the value, the better the performance of the clustering model [34]. The Davies-Bouldin analysis calculates two quantities: the within-cluster scatter and the distance between the centroids of different clusters. For each cluster, the most similar neighboring cluster is identified, and the sum of the two within-cluster scatters is divided by the distance between the cluster centroids. The Davies-Bouldin index (DBI) is the average of these values; it ranges from zero to infinity, and the smaller the value, the better the separation between clusters [35]. The last three criteria are suitable for evaluating the clusters, as they give more accurate results for convex data [36].
The optimal number of clusters occurs at the highest Calinski-Harabasz and silhouette values and the lowest Davies-Bouldin value [37].
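Two of these indices can be computed directly from their definitions. The following is a plain NumPy sketch under the Euclidean-distance assumption, not the exact toolbox implementation.

```python
import numpy as np

def silhouette_index(X, labels):
    """SI: mean over samples of (b - a) / max(a, b), where a is the mean
    distance to the sample's own cluster and b is the smallest mean
    distance to any other cluster."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    scores = []
    for i in range(len(X)):
        own = labels == labels[i]
        a = D[i, own].sum() / max(own.sum() - 1, 1)   # exclude the point itself
        b = min(D[i, labels == c].mean() for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def davies_bouldin_index(X, labels):
    """DBI: for each cluster, take the worst (largest) ratio of summed
    within-cluster scatters to centroid distance; average over clusters."""
    ids = sorted(set(labels))
    cents = np.array([X[labels == c].mean(axis=0) for c in ids])
    scat = np.array([np.linalg.norm(X[labels == c] - cents[j], axis=1).mean()
                     for j, c in enumerate(ids)])
    ratios = [max((scat[j] + scat[m]) / np.linalg.norm(cents[j] - cents[m])
                  for m in range(len(ids)) if m != j)
              for j in range(len(ids))]
    return float(np.mean(ratios))

# Two tight, well-separated groups: SI close to +1 and DBI close to 0.
A = np.array([[0, 0], [0.2, 0], [0, 0.2], [0.2, 0.2], [0.1, 0.1]])
X = np.vstack([A, A + 10.0])
good = np.array([0] * 5 + [1] * 5)        # correct grouping
bad = np.array([0, 1] * 5)                # groups mixed on purpose
si_good, si_bad = silhouette_index(X, good), silhouette_index(X, bad)
dbi_good = davies_bouldin_index(X, good)
```

The mixed labeling scores a much lower SI than the correct one, which is exactly how the paper uses the index to rank candidate clusterings.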

Data Sets Preprocessing
The authors used two sets of data. The first one consists of 55 audiograms, and the second one is generated using the Kutools for Excel add-in.

First Data Set
To perform spectral clustering on a large data set, two steps are required [38]. The first step is data reduction, mostly carried out using k-means to cluster the given data set; from each cluster, some data points near the cluster center are picked, so each cluster is represented by one representative [39,40]. Then, spectral clustering can be applied to construct the similarity matrix and classify the reduced-size data into the final classes.
Bisgaard et al. [15] performed the data reduction on a database of 28,244 audiograms using vector quantization of size 60. The authors of this paper then applied spectral clustering to these 60 audiograms, which can be found in Table A.1 of the Bisgaard work [15]. In fact, the set was further reduced by eliminating five audiograms that represent individuals with normal hearing. These levels are removed since the model is built for patients who experience hearing loss, to assist in configuring or designing hearing aids. The audiograms were measured in standard audiometry booths, with air conduction thresholds measured at 250 Hz, 500 Hz, 1000 Hz, 1500 Hz, 2000 Hz, 3000 Hz, 4000 Hz, 6000 Hz, and 8000 Hz.

Second Data Set
The authors generated a data set of the original population size by repeating the 60 audiograms with the frequencies of occurrence indicated in Table A.1 of the Bisgaard work. These percentages represent the share of the population audiograms lying within a specified range around each of the 60 audiograms; the range was decided by minimizing the Euclidean distance from each measured audiogram to its corresponding "typical" code-vector audiogram. Based on this training technique, the authors believe that repeating these audiograms gives a good representation that carries enough information about the original database. The tool used to generate this data set is the Kutools for Excel add-in.
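The same weighted repetition can be sketched programmatically. The function name and the toy numbers below are illustrative assumptions; the paper performed this step with the Kutools for Excel add-in.

```python
import numpy as np

def expand_population(audiograms, percentages, total):
    """Repeat each standard audiogram according to its percentage of the
    original database, approximating the full audiogram population."""
    counts = np.rint(np.asarray(percentages) / 100.0 * total).astype(int)
    return np.repeat(np.asarray(audiograms), counts, axis=0)

# Toy example: three "standard" audiograms covering 50%, 30% and 20%
# of a 10-audiogram population (thresholds in dB HL at a few frequencies).
standard = [[20, 25, 30, 40],
            [40, 45, 55, 60],
            [60, 70, 80, 85]]
population = expand_population(standard, [50, 30, 20], total=10)
```

With the real Table A.1 percentages and the full database size, the same call would reconstruct a population on the order of the original 28,244 audiograms.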

Results and Discussion
The authors decided to consider a large number of clusters due to the nature of the data: the audiograms overlap heavily, which makes it difficult to detect the different shapes of patients' audiograms with a small number of clusters. In addition, the authors wanted to detect steep sloping audiograms, as realizing and adjusting their filters differs from a technical point of view. This section starts by finding the optimum number of clusters, then assesses the clustering method when applied to the audiogram population. Finally, the authors compare the generated clusters to the standard levels chosen by Bisgaard.

Finding the Optimum Number of Clusters
The Silhouette clustering evaluation criterion, which ranges from −1 to +1, was used to determine the best number of clusters. A positive value implies good clustering, and the best numbers of clusters are those with the highest criterion values. The results are shown in Table 2, indicating that the best number of clusters is 8, followed by 10. The wrongly assigned audiograms were removed in two further consecutive stages, and the corresponding criteria values were recalculated, as shown in Table 2. The criteria values for 8 and 10 clusters are close. The following two subsections show and discuss the results of the different stages for 8 and 10 clusters; these stages remove the wrongly assigned audiograms, i.e., those with a negative Silhouette coefficient.

Eight Clusters Evaluation Criteria
The selected 55 audiograms are classified into 8 clusters using spectral clustering. The stage 1 silhouette plot indicates that seven audiograms are wrongly assigned, with one on the boundary between two clusters (a very small negative Silhouette coefficient of −0.006072). In the second stage, the wrongly assigned audiograms are removed. The second-stage Silhouette plot shows one wrongly assigned audiogram, which is removed in the third stage. The three stages' plots are shown in Figure 3. To evaluate the number of clusters, the eigenvalues were generated, indicating no gap. Thus, the authors calculated the Silhouette, Calinski-Harabasz, and Davies-Bouldin criterion values. The results are shown in Table 3: the first stage has 55 audiograms with SI = 0.3907, CHI = 36.7956, and DBI = 1.0427; stage 2 has 48 audiograms with SI = 0.464, CHI = 38.5503, and DBI = 0.9670; and stage 3 has 47 audiograms with SI = 0.4814, CHI = 38.5476, and DBI = 0.9426. These results indicate that the best clustering performance is at stage 3, where SI is the highest and DBI is the lowest.

Ten Clusters Evaluation Criteria
Similarly, the selected 55 audiograms are classified into 10 clusters. The Silhouette plot indicates that four audiograms are wrongly assigned. The first two stages are conducted to remove the wrongly assigned audiograms, as shown in Figure 4. In the second stage, the plot shows that two audiograms are on the border between two clusters (very small negative Silhouette coefficients of −0.004755 and −0.0007911). In the third stage, all audiograms have positive Silhouette coefficients. Other clustering evaluation criteria values are listed in Table 4. As seen from the results, the best clustering performance is for stage 3, where SI and CHI are the highest and DBI is the lowest.
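The three criteria used throughout this section are all available in scikit-learn, so a stage evaluation could, for instance, be reproduced as follows (toy Gaussian data standing in for the audiograms; higher SI and CHI are better, lower DBI is better):

```python
import numpy as np
from sklearn.metrics import (silhouette_score,
                             calinski_harabasz_score,
                             davies_bouldin_score)

# Two well-separated synthetic clusters in an 8-dimensional space,
# mimicking thresholds at the 8 tested frequencies.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 8)),
               rng.normal(5, 0.5, (20, 8))])
labels = np.array([0] * 20 + [1] * 20)

si = silhouette_score(X, labels)           # in [-1, 1], higher is better
chi = calinski_harabasz_score(X, labels)   # unbounded, higher is better
dbi = davies_bouldin_score(X, labels)      # >= 0, lower is better
print(f"SI={si:.3f}  CHI={chi:.1f}  DBI={dbi:.3f}")
```

On well-separated data all three criteria agree, as they do for stage 3 in Tables 3 and 4.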

Audiograms' Population Clusters Evaluation
The clustering algorithm is applied to the data set generated as described in Section 3.4.2. The original data set (25,307 audiograms) was generated to represent the selected 55 audiograms with their associated percentages of the total population (28,244 audiograms). The authors then applied the spectral clustering algorithm, with the same input arguments used earlier, to this number of audiograms. Still, the analysis failed to generate a similarity matrix to cluster this large data set, reflected by the highly negative Silhouette coefficient of −0.6012. This result matches the findings from the literature: the spectral clustering technique is not practical for large data [41][42][43][44]. Next, the authors annotated the generated 25,307 audiograms with the produced 8 and 10 clusters. The wrongly assigned ones, with negative Silhouette coefficients, were removed in stage 2. Stage 3 has only audiograms with positive Silhouette coefficients in their clusters. Hence, 20,956 audiograms were annotated using 8 clusters in stage 3, and 22,002 audiograms were annotated using 10 clusters. Figure 5 shows Silhouette plots of stages 1 and 3 for 8 clusters, while Figure 6 shows the results of stages 1 and 3 for the 10 clusters. Table 5 summarizes the criteria evaluation values for both numbers of clusters. As can be seen from the results, the SI values are higher than 0.5, which indicates good performance of the algorithm for both 8 and 10 clusters. At stage 3, the data is cleaned and the SI, CHI, and DBI have their best values. These criteria values are better for eight clusters.
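The paper does not spell out the exact annotation rule, but since the population was generated from the 55 selected audiograms, one plausible sketch is a nearest-representative assignment: each generated audiogram inherits the cluster label of the closest representative (the function name and demo values below are ours):

```python
import numpy as np

def annotate_population(population, reps, rep_labels):
    """Label a population too large for spectral clustering directly:
    assign each audiogram the cluster of its nearest representative
    (Euclidean distance over the tested frequencies)."""
    d = np.linalg.norm(population[:, None, :] - reps[None, :, :], axis=-1)
    return rep_labels[d.argmin(axis=1)]

# Two representatives with (arbitrary) cluster labels 3 and 7.
reps = np.array([[0., 0.], [10., 10.]])
pop = np.array([[1., 1.], [9., 9.]])
print(annotate_population(pop, reps, np.array([3, 7])))  # → [3 7]
```

This avoids building the n-by-n similarity matrix that made spectral clustering fail at the population scale.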

Mapping Bisgaard Standard Levels to the Implemented Clusters
The aim of this part is to compare the clustering results with the standard hearing levels chosen by Bisgaard [15]. He chose seven flat and moderately sloping standard audiograms, named N1 to N7, and three steep sloping ones, named S1 to S3 [15]. The eight clusters implemented in this work mapped N1, S1, and N2 to the same cluster; N4 and N5 to the same cluster; and N6 and N7 to the same cluster; meanwhile, N3, S2, and S3 fall into different clusters. For the 10 clusters, however, only S1 and N2 are mapped to the same cluster and N6 and N7 to the same cluster, so N1, N3, N4, N5, S2, and S3 are in different classes. The mapping results are shown in Figure 7, where the x-axis represents the 10 standards N1–N7 and S1–S3, while the y-axis is the cluster number in our work. These mapping results are also displayed in Table 6 for further clarification and comparison.


Results Summary and Conclusions
A comparison between the results for 8 and 10 clusters is summarized in Table 7. As shown in Table 7, the different criteria values are slightly better for 8 clusters than for 10: the Silhouette coefficients and Calinski-Harabasz values are higher for eight clusters, while the Davies-Bouldin values are lower. For the population audiograms, the Silhouette and Davies-Bouldin values are better for eight clusters, but the Calinski-Harabasz value is better for 10 clusters. This can be explained by the fact that Calinski-Harabasz is the criterion most sensitive to the number of observations used in its calculation [45]: the number of audiograms considered in stage 3 is 20,957 for 8 clusters, while it is 22,002 for 10 clusters (as shown in Table 7). The Eigenvalues are small, but no gap is indicated to confirm a selection between 8 and 10 clusters. Since the number of population audiograms considered in the last stage is higher for 10 clusters than for 8, the 10 clusters can be preferred, as more patients' audiogram shapes are covered. The Silhouette coefficients of the audiogram population are higher than 0.5 for both numbers of clusters, which suggests good clustering. The trial by Belitz [20] to classify audiograms for hearing aid adjustment has a low accuracy of 68% when one configuration is assigned to each audiogram. We believe that their accuracy might have increased had they considered a higher number of clusters, as the data is highly overlapping by nature. This matches the results found in this research, which clusters the data into 8 or 10 classes.
This work can be considered the first step towards changing the way hearing aid filter banks are designed. Existing filter bank designs use digital filters with different techniques to divide the entire frequency band (125 Hz–8 kHz) non-uniformly, then apply gain controls to configure the hearing aid to match the patient's audiogram. The current practice aims to design digital filters that can match multiple patients' audiograms, which leads to very complex designs, as conducted in [6][7][8][9]. These designs lower the cost of manufacturing hearing aids, as they can be produced on a large scale since they accommodate multiple users. However, complex designs require high operating power and a big chip area, which leads to an improperly fitted hearing aid. When the complexity is reduced, the hearing aid prototype can effectively match only a limited number of audiograms. On the other hand, a low-complexity hearing aid design makes it properly fitted, as it does not require a large area to implement, since it has a small number of filter coefficients [4,8]. Normally, the hardware complexity of a filter bank structure is measured by the components needed to realize its filters (multipliers, adders, and shifters). However, many studies consider only multipliers, as they are the most power-consuming elements in digital signal processing (DSP) hardware [46]. Hence, to summarize the current practice: trials are made to cover the hearing frequency band with a large number of filters using complex techniques. Another consequence of these complex designs is that the process of adjusting the hearing aid becomes difficult for both patients and audiologists. Instead of attempting to match different types of hearing loss with one design to satisfy the needs of many patients, aiming to lower the manufacturing cost, designs can be implemented according to the categories produced by our intelligent solution.
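As a rough sketch of the multiplier-counting cost proxy mentioned above (our own simplified helper, not a published design rule): a direct-form FIR filter needs one multiplier per distinct nonzero coefficient, and a linear-phase (symmetric) filter shares mirrored coefficients, which roughly halves the count:

```python
def fir_multiplier_count(coeffs, linear_phase=True):
    """Rough hardware-cost proxy for a direct-form FIR filter: one
    multiplier per nonzero coefficient. Linear-phase (symmetric) filters
    reuse each mirrored coefficient, roughly halving the multiplier count."""
    nonzero = sum(1 for c in coeffs if c != 0)
    return (nonzero + 1) // 2 if linear_phase else nonzero

print(fir_multiplier_count([1, 2, 3, 2, 1]))                      # symmetric 5-tap → 3
print(fir_multiplier_count([1, 0, 2, 0, 1], linear_phase=False))  # → 3
```

Summing this proxy over all filters of a bank gives a quick, comparable complexity figure before any hardware synthesis.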
The filter bank can then be designed to match the shapes of a number of these clusters, not all of them. This will result in designs that are less complex, with low delay, a small chip area, and reduced cost. In addition, these clusters will facilitate the process of programming or adjusting the hearing aid to match the user's needs, by assigning each patient's audiogram a configuration related to the produced clusters.
Consequently, configuring a hearing aid will be easier and less exhausting for patients and audiologists, as it will require fewer responses from the patients. The power of intelligent solutions does not depend on the skills, experience, and knowledge of a limited number of skillful, experienced audiologists. In addition, as it requires fewer responses from the patients, it will be of great help to cohorts such as older people, individuals with dementia, and children who experience hearing loss. Moreover, this method can be applied to any set of test frequencies, as it is not restricted to the set used in this study: the data can be pre-processed such that any missing frequency is interpolated. The needed input is the hearing levels tested at eight different frequencies, which can vary according to the protocol used in the hearing test. Hence, this study considers air conduction thresholds at eight test frequencies, with/without masking in the non-test ear. Bone conduction thresholds are not considered.
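The interpolation of missing test frequencies mentioned above could be sketched with a log-frequency `np.interp`, since audiogram test frequencies are octave-spaced (the helper name and example frequencies are illustrative, not the study's protocol):

```python
import numpy as np

def fill_missing_thresholds(target_hz, measured_hz, measured_db):
    """Interpolate hearing levels (dB HL) at a protocol's test frequencies
    from whichever frequencies were actually measured, working on a
    log2-frequency axis because audiogram frequencies are octave-spaced."""
    return np.interp(np.log2(target_hz), np.log2(measured_hz), measured_db)

# Measured at 250, 1000, 4000 Hz; recover the missing 500 and 2000 Hz points.
print(fill_missing_thresholds([500, 2000], [250, 1000, 4000], [20, 30, 40]))
# → [25. 35.]
```

Because 500 Hz sits exactly halfway between 250 Hz and 1000 Hz on the log axis, the interpolated threshold is the midpoint of the neighboring measurements.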
To conclude, the authors do not rely only on rigid statistical analysis; the results should be interpreted according to the solution that needs to be introduced. The authors prefer the 10 clusters, since more shapes of patients' audiograms are then included. In addition, the authors predict that grouping the standard levels S1, N1, N2 and N4, N5 in the same clusters might be a source of confusion for any future supervised machine learning algorithm. Ten clusters might therefore produce a higher-accuracy supervised learning model, given the highly overlapping nature of the data: introducing more clusters might help resolve this problem.
The authors recommend, for future work, the use of regression analysis to generate one polynomial that represents each cluster. These polynomials would be fitted by least-squares regression, minimizing the difference between the clustered audiograms in the same cluster and the predicted polynomial (as carried out in [47,48]).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study. Approval to reuse the data from Table A1 of paper [15] was obtained from SAGE Publishing at no cost for the life of the research work. The permission was obtained on 2 September 2021 via email for request RP-6079.
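The recommended per-cluster regression could be sketched as follows (a hypothetical helper, not the cited works' implementation: it fits one least-squares polynomial to all stacked (log-frequency, threshold) points of a cluster using `np.polyfit`):

```python
import numpy as np

def cluster_polynomial(freqs_hz, audiograms, degree=3):
    """Fit one least-squares polynomial representing a whole cluster:
    stack every (log2-frequency, threshold) pair from the cluster's
    audiograms and fit them jointly, so the polynomial minimizes the
    squared error against all member audiograms at once."""
    x = np.tile(np.log2(freqs_hz), len(audiograms))
    y = np.concatenate([np.asarray(a, float) for a in audiograms])
    return np.polyfit(x, y, degree)

# Two identical flat audiograms: the fitted line is flat at 30 dB HL.
flat = [[30, 30, 30, 30]] * 2
print(cluster_polynomial([250, 500, 1000, 2000], flat, degree=1))  # ≈ [0, 30]
```

Each cluster would then be summarized by a single short coefficient vector, which could seed a hearing aid's initial configuration.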

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.