Innovative Artificial Intelligence Approach for Hearing-Loss Symptoms Identification Model Using Machine Learning Techniques

Physicians depend on their insight and experience, and on a fundamentally symptomatic approach, to decide on the possible ailment of a patient. However, numerous diagnostic phases and lengthy procedures prolong each consultation and consequently make other patients who require attention wait longer. This can cause stress and anxiety in those patients. In this study, we focus on developing a decision-support system for identifying the symptoms of hearing loss. The model is implemented using machine learning techniques. The Frequent Pattern Growth (FP-Growth) algorithm is used as a feature transformation method and the multivariate Bernoulli naïve Bayes classification model as the classifier. To find the correlation that exists between the hearing thresholds and the symptoms of hearing loss, the FP-Growth and association rule algorithms were first used in experiments with small sample and large sample datasets. The results of these two experiments showed that this relationship exists, and the hybrid of the FP-Growth and naïve Bayes algorithms was found to identify hearing-loss symptoms efficiently, with a very small error rate. The average accuracy rate and average error rate for the multivariate Bernoulli model with FP-Growth feature transformation, using five training sets, are 98.25% and 1.73%, respectively.


Introduction
More than 5 percent (466 million) of the world's population is affected by hearing loss (432 million adults, 34 million children). It is predicted that over 900 million people, or one out of ten, will experience hearing loss by 2050 [1]. Disabling hearing loss is a loss of more than 40 decibels (dB) in the better ear of an adult and more than 30 dB in that of a child. Most people with hearing loss live in low- and middle-income countries [1]. Around a third of people over the age of 65 suffer from disabling hearing loss. In South Asia, the Asia Pacific and sub-Saharan Africa, the prevalence in this age group is greatest. Statistics show that in the Asia Pacific, an area of which Malaysia is part, the occurrence of hearing loss is very high [2]. About 31,000 hearing loss cases were reported in Malaysia alone during 1980. National survey statistics for 2005 indicate that the population prevalence was 17.4%, and about 3,962,879 cases were reported during this time. The Ministry of Health of Malaysia reported hearing loss as one of the top 10 illnesses [3].
Hearing loss is among the most prominent conditions affecting children as well as younger and older adults, and it can lead to impairment if it is not properly diagnosed early. An otorhinolaryngologist categorizes the symptoms of a patient according to his/her expertise and after a specific evaluation of the symptoms of hearing loss. The evaluation consists of five procedures performed in order: collection of the patient's case history, otoscopy, audiometric hearing tests, tympanometry and acoustic reflex testing. Given the number of patients who usually visit the ENT departments of various hospitals to have their hearing problems treated, and the amount of time each procedure takes during a consultation with the otorhinolaryngologist, these phases may delay the treatment process and make patients leave the hospital because they have already waited for a long time [4]. A long waiting time can cause anxiety and stress in the patients in the queue [5]. It also harms patients' perception of the health system, and thus it is important to reduce the average waiting time of patients so that the overall cost of consulting hearing-loss patients is reduced [6][7][8]. Procedures or measures to evaluate hearing loss in patients are available. The first step in the investigation is pure tone audiometry [9]. Hearing tests are carried out in a room that is very quiet and noise-free. The audiologist presents sounds through earphones at various frequencies (250-8000 Hz) and sound intensities (−10 to 140 dB) and asks the patient to press a button at the faintest sound the patient can hear. The test results are recorded on an audiogram. Figure 1 displays the hearing loss investigation approach. On a patient's first appointment, physicians refer him to an ENT specialist.
When hearing problems are reported, the physician first takes the case history, one of the most common and basic audiological assessments for hearing loss, which makes a differential diagnosis possible [10]. The next examination is done using an otoscope, with which the physician visually examines the external auditory canal [11]. The ENT professional then refers the patient to an audiologist, who examines the patient's hearing loss using an audiometer that presents pure tones at various frequencies. In conjunction with this examination, tympanometry helps physicians to assess how well the conducting pathway passes sounds to the inner ear. Acoustic reflex testing measures the contraction of the stapedius muscle in the middle ear in response to loud sound [12]. After all of these examination stages, the physician can diagnose whether the patient has conductive, sensorineural or mixed hearing loss, or normal hearing sensitivity, and can identify the illnesses or diseases that cause the patient to lose hearing ability. If conductive or mixed hearing loss occurs, the patient must go for a follow-up audiological evaluation after therapy by an ENT specialist. For sensorineural hearing loss, the ENT practitioner should arrange a hearing aid trial for the patient, with guidance on how the aid is used and managed. The ENT physician also schedules further evaluation with the patient after a few weeks or months [13]. The basic diagnosis and assessment protocol of hearing-loss symptoms for a patient with a hearing problem is illustrated in Figure 2. Without these fundamental procedures, any audiological evaluation process is incomplete for determining the symptoms and type of hearing loss experienced by the patient [14].
The five procedures listed above are essential and fundamental clinical audiological techniques. However, the amount of time spent on these procedures cannot be disregarded, despite their significance in diagnosing the forms and symptoms of hearing loss. The study carried out by [15] demonstrates that collecting the case history alone takes a great deal of time, although it offers valuable information. To attend to the patients who are waiting, the diagnostic process must be accelerated. If a variety of tests is needed before the specialist obtains the diagnostic findings, this may directly delay the treatment of other patients. Another study by [16] implies that a physician can classify symptoms of hearing loss to a considerable extent based on the case history and otoscopy alone. This reveals that the diagnostic protocol can be shortened and the expert can still understand the problem without following all of the processes. Because numerous studies have shown how symptoms of hearing loss are linked to certain variations in the audiogram, a specialist may determine the form and symptoms of the hearing loss from air and bone conduction thresholds without necessarily performing all the diagnostic procedures.
The main objective of the study is to identify symptoms of hearing loss efficiently from pure-tone air and bone conduction thresholds so that hearing loss is easier to investigate. This method includes identifying and using associations between pure-tone audiometry results, the symptoms and other features in patients' audiology datasets to classify symptoms of hearing loss. The symptoms can be precisely predicted using a diagnosis model based on hybrid machine learning approaches, which predicts a class from pure-tone audiometric air or bone conduction input data. Healthcare providers produce vast quantities of untapped data that hold potentially useful information. In determining the symptoms of a disease, medical professionals depend on their experience and knowledge and on a practical diagnostic mechanism. Many diagnostic stages and longer procedures lead to longer appointments, which means that those waiting to be treated have to wait longer. This can contribute to anxiety and stress in these patients. The contributions of our study are as follows:
• This work provides an important opportunity to improve the diagnostic process for hearing-loss symptoms by proposing a symptom detection model that accurately classifies symptoms of hearing loss based on pure-tone audiometry data from air and bone conduction.
• The model is implemented using Frequent Pattern Growth (FP-Growth) and the naïve Bayes (NB) algorithm, where FP-Growth is an unsupervised method used for feature extraction, while the NB models are supervised models used for classification.
• FP-Growth was first applied to small sample and large sample datasets to analyze the correlation between the hearing thresholds and the symptoms of hearing loss. The results of these experiments showed that the hybrid of the FP-Growth and NB models determines hearing-loss symptoms effectively, with a very low error rate.
The organization of this paper is as follows: Section 2 presents the related work for hearing loss identification. Section 3 describes the Materials and Methods for the classification of the hearing-loss symptoms identification model. The experimental results obtained are discussed in Section 4. The study constraints and limitations are discussed in Section 5. The conclusion of this study is made in Section 6.

Related Work
Numerous studies have developed hearing loss strategies or techniques that can support or ease the role of otolaryngology clinicians. To aid physicians with hearing loss diagnosis, the authors of [17] clustered audiogram shapes into homogeneous and heterogeneous groups using the K-means technique. Their research used pure tone data from 1633 individuals. The audiogram shapes were grouped by the K-means clustering algorithm using different numbers of clusters, namely, 4, 5, 6, 7, 8, 9, 10 and 11. ANOVA was used to evaluate the clusters by testing the assumption of homogeneity between the audiogram shapes. The researchers in this study show that the judgment of a clinician during diagnosis is based on personal experience, which is not free of errors. Besides, there is a need for a consistent audiogram classification that can aid doctors in the diagnosis. However, the researchers did not relate any pathology, symptoms or frequency to the classification of these audiograms. Such a correlation would allow clinicians to understand the connection between audiogram types and the characteristics of certain patients.
Moein et al. [18] built a decision-support system for the evaluation of symptoms of hearing loss. For their study, data from 150 patients of an otolaryngology clinic were gathered. The Multi-Layer Perceptron Neural Network (MLP) and Support Vector Machine (SVM) were used to classify hearing loss symptoms into six classes, namely, serous otitis media, otitis media, conductive fixation, cochlear age, cochlear noise and normal. The frequency of each ear condition in the dataset and the labels given for the MLP and SVM are displayed in Table 1. According to the results of the study, the SVM is stronger than the MLP in classifying the data: the SVM achieved 92.5% accuracy compared with 77.5% for the MLP. Despite the high SVM accuracy, which can enhance patient diagnosis, the experiments used small datasets that included only patients with six particular symptoms or disorders. A dataset containing more typical symptoms would be more fitting and would have provided a better test of the efficacy of the SVM on hearing-loss symptoms. Additionally, the Otoneurological System was created by [19] to help identify vertigo-related hearing-loss symptoms. The researchers focused on testing the k-nearest neighbor and naïve Bayes classification techniques to assess the classification accuracy obtained when knowledge learned by machine learning techniques is combined with expert knowledge to extract information from patient data that helps with the diagnosis. An otoneurological dataset consisting of 815 cases was collected. The dataset covers acoustic neurinoma, Meniere's disease, benign positional vertigo, sudden deafness, traumatic vertigo and vestibular neuritis.
The researchers used an extra 1030 cases from a vertigo dataset collected at the Helsinki University Central Hospital to evaluate the accuracy of these techniques. In the study, the two vertigo datasets were used for knowledge discovery, and the result was compared with the otolaryngologist's knowledge. To assess the influence of both the otolaryngologist's knowledge and the results of the machine learning techniques on classification accuracy, the two were combined in different ways. The findings showed that the highest classification accuracy was obtained by combining the machine-learned knowledge with the otolaryngologist's expert knowledge. The system was intended only for the diagnosis of vertigo symptoms, and the evaluation focused on datasets that comprise only vertigo cases. The method used to estimate the predictive accuracy of the knowledge gained from the learning method was another drawback of this experiment: approximately 70% of the cases were used for training the algorithms and only 30% for testing [19]. Thompson et al. [20] used a medical records database to find information on the causes and treatments of tinnitus in order to enhance tinnitus detection, the interpretation of outcomes and overall understanding. This, too, is a study that established a diagnostic method for only a single hearing loss symptom.
A diagnostic model for the identification of vestibular schwannomas from audiometric data was developed and validated by [21], a company that provides an online audiometric hearing test service. Its online application plays a range of tones at varying levels, and users are asked to select the tones they can hear; a report of the result is then sent to them to view. AudioGene is a method that uses machine learning techniques to determine the genetic cause of hearing loss in people segregating autosomal dominant nonsyndromic hearing loss, using phenotypic information derived from audiometric data. The study results show that the causative gene is found within the top three predictions with an accuracy of 68%. However, the study by [21] only provides an audiometric hearing test, which is only one of the five procedures for diagnosing hearing loss. Although AudioGene is a step forward in this regard, because of the immense importance of understanding the genetic cause of hearing loss, understanding other symptoms is also very important, and a prediction accuracy of 68% has to be applied with caution in healthcare [22][23][24].
Bing et al. [1] proposed a predictive model for the hearing outcome in sudden sensorineural hearing loss (SSHL) using machine learning techniques. SSHL is a multifactorial disease with high heterogeneity, hence its outcomes vary widely. Their research aimed to create predictive models for SSHL based on four machine learning methods and to identify the best performer for clinical application. A deep learning method, the deep belief network (DBN), was used alongside support vector machine, logistic regression and neural network models, which were built to classify the dichotomized hearing outcome of SSHL using six feature sets drawn from 149 potential predictors. Accuracy, precision, recall, F-score and the ROC curve were used to compare the predictive performance of the different methods. Overall, the best predictive capacity was achieved by the DBN approach when tested on the raw dataset with 149 variables, achieving an accuracy of 77.58% and an AUC of 0.84. Shew and Staecker [25] used ML to construct disease-specific models that predict different degrees of SNHL in several inner ear pathologies based on a perilymph-derived miRNA expression profile alone. They collected 2-5 µL of perilymph from patients whose inner ears were opened as part of a cochlear implantation or stapedectomy procedure. They then analyzed the miRNA dataset specific to the inner ear pathologies using supervised machine learning classification, modeling and comparing multiple models, including multiclass decision forest, decision jungle, logistic regression and neural networks. They built the model using a 70/30 split, where 70% of the patients were used to train the model and the other 30% were used to test it. The feature importance stage in ML makes it possible to determine which feature was used and with what weight.
Nisar et al. [26] presented a new model that automatically identifies hearing impairment based on cognitively inspired feature extraction and a speech recognition method. In the proposed approach, the user is asked to repeat words spoken by the machine. The user's response is first captured as a speech signal, and the system identifies the right and wrong guesses spoken by the user in order to generate an audiogram and a speech recognition threshold automatically. Several machine learning-based classification methods were finally utilized, including the Hidden Markov Model (HMM), k-NN, SVM and AdaBoost. Overall, the absolute error of the proposed approach compared with testing by a specialized audiologist is less than 4.9 dB and 4.4 dB for pure tone and speech audiometry, respectively, achieving an accuracy of up to 96.67% using the Hidden Markov Model. Cárdenas et al. [27] also presented a machine learning implementation that automatically detects and classifies hearing loss conditions based on feature extraction from synthetically generated brainstem auditory evoked potentials, a need given the shortage of fully fledged databases. The method is based on a multi-layer perceptron, which has been shown to be a valuable and effective instrument in this field. Preliminary results are very encouraging, with accuracy over 90% for a variety of hearing loss conditions; this system is to be deployed as a hardware implementation to make an affordable and portable medical device, as detailed in past work.
In the related works, many studies have proposed computerized hearing loss testing strategies [28,29]. The main aim of these works was to assess hearing disability precisely by minimizing the absolute error rate and maximizing accuracy. However, the methods are confined to air conduction audiometry, and therefore a complete assessment of the patient is not possible without access to other testing modalities, such as bone conduction and speech audiometry. Notably, most of the previously mentioned automated techniques suffer from issues such as wrong results at lower frequencies, ambient noise, difficulty in distinguishing conductive from sensorineural hearing loss, and lower accuracy and efficiency because of the absence of speech audiometry. This work provides an important opportunity to improve the diagnostic process for hearing-loss symptoms by proposing a symptom detection model that accurately classifies symptoms of hearing loss based on pure-tone audiometry data from air and bone conduction. The model is implemented using FP-Growth and NB, where FP-Growth is an unsupervised method used for feature extraction, while the NB models are supervised models used for classification. For this purpose, FP-Growth was first applied to small sample and large sample datasets to analyze the correlation between the hearing thresholds and the symptoms of hearing loss.

Proposed Identification Model
In this section, we introduce and discuss in detail our proposed detection model for hearing-loss symptoms. The model diagram shows the components of the proposed model and how each component processes the data. The NB algorithm and the Frequent Pattern Growth (FP-Growth) algorithm were employed in the model as machine learning (ML) methods. A full description of these methods, with the reasons for employing them in the model, is provided in this section. In the healthcare literature [30][31][32], these methods are commonly used for similar illnesses, where they were reasonably efficient and successful. This motivated us to use these methods in our proposed model. Figure 3 illustrates our proposed classification model for hearing-loss symptoms and how the extracted data are processed into frequent item sets using the FP-Growth algorithm.
Each item set from the dataset reflects several features, and each feature is a part of the vocabulary. In this model, the FP-Growth algorithm performs the feature transformation after the features have been selected and extracted. The NB classification method is trained on the subset of frequent item sets that achieve the minimum support threshold, as shown in Figure 3. In this example, 242 of the 399 training item sets achieved the minimum support threshold and were included in the training set of the NB classifiers. Our model can reduce the data dimensionality and the storage requirement of the classification methods. Besides, it can enhance the performance of the classification methods and eliminate redundancies. Under this condition, the dimensionality of the whole training data is reduced before the data are added to the training set of the classification method. Therefore, each training example in the training set of the NB classification method must include frequent features that achieve the minimum support threshold.
The storage requirement of the classification method is reduced when only frequent features are included, in contrast to the traditional approach, which includes all the features of the training dataset. Redundancy and noise are common characteristics of datasets. The redundancies can be removed by choosing one frequent item set in the data. The algorithm's speed and performance increase once the dataset becomes smaller. In our proposed model, the advantages of feature transformation can be obtained, including construction, selection and extraction. New features can be created through all of these forms of feature transformation [33]. Functional mapping is used to extract new features from old ones [34]. The most important of these methods for our dataset is frequent feature extraction. Another important method is feature construction, which generates additional features to replace missing data. In this study, we employ the FP-Growth algorithm in linear and non-linear spaces to provide a feature construction process that reduces the data dimensionality and recovers missing information [33]. Lower data dimensionality makes the process easier and faster. Feature selection, in turn, can reduce the storage requirement and enhance the performance of the algorithm by removing redundancies and noise [34]. Associative classification, which combines an unsupervised learning method, such as the FP-Growth algorithm or association rules, with a classifier such as NB, performs much better than a standalone classification method [35]. The hybrid of the FP-Growth algorithm and K-nearest neighbor (KNN) can also obtain a high classification accuracy [36]. Our hearing loss detection model uses a combination of unsupervised and supervised ML methods, specifically the FP-Growth algorithm and the NB classifier.
The two versions of the naïve Bayes classification model, the multivariate Bernoulli and the multinomial model, were explained. The multivariate Bernoulli naïve Bayes model was chosen as the classifier because its implementation with the FP-Growth algorithm proved to be more efficient than the multinomial model. This is contrary to the argument of other researchers that the multinomial model outperforms the multivariate Bernoulli model in every respect, as depicted in the chapter using different kinds of datasets. The justifications for adopting these techniques to implement the model are drawn from various healthcare studies that use a similar method with varying degrees of success, as well as from the literature that supports the efficiency of these techniques. The identification model for hearing-loss symptoms was depicted in a diagram and all the components that make up the model were explained. The FP-Growth algorithm serves as a pre-processing mechanism that applies all the elements of data transformation to the data before they become part of the classifier's vocabulary. With that, the advantages inherent in data extraction, selection and construction techniques were all achieved. These advantages include discarding redundant and noisy features in the data, reducing storage requirements and improving the classification algorithm's performance.
Sustainability 2021, 13, 5406
The calculation of the parameters for the prior can be represented as follows. From the union of all item sets that meet the minimum threshold, extract the vocabulary (V) for each class and get the training cases that have that class:
Calculate P(C_j) terms
For each C_j in C do
    Training cases t_j ← all the training cases with class = C_j
The calculation of the parameters for the multinomial likelihood can be represented as follows:
Calculate P(t_k | C_j) terms
Thresholds_j ← single set containing the union of all frequent item sets (vocabulary)
For each t_k in vocabulary
    n_k ← number of occurrences of t_k in the training cases of class = C_j
The algorithm shows the steps for calculating the parameters of the multinomial likelihood P(t_k | C_j), the probability of a threshold t_k given a class C_j. First, the vocabulary is formed from the union of the item sets of the thresholds. Then the number of occurrences n_k of the threshold t_k in the training examples of class C_j, plus the smoothing parameter alpha (α), is divided by the total number of tokens (n) in class C_j plus the additive smoothing alpha (α).
The calculation of the parameters for the multivariate Bernoulli likelihood can be represented as follows:
Calculate P(d_k | C_j) terms
Thresholds_j ← single set containing the union of all frequent item sets (vocabulary)
For each t_k in vocabulary
    n_k ← number of training cases where t_k is present
The algorithm shows the steps for calculating the parameters of the multivariate Bernoulli likelihood P(d_k | C_j). The number of training examples n_k in which the threshold t_k is present is added to the smoothing parameter alpha (α) and divided by the total number of tokens plus the alpha (α).
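The parameter calculations above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study: the function names, the toy thresholds and the class labels are hypothetical, and the smoothed denominators follow the common textbook forms (α times the vocabulary size for the multinomial model, 2α over the number of class cases for the Bernoulli model), which is an assumption, since the text describes the denominator only as the total plus α.

```python
from collections import Counter


def fit_priors(labels):
    """P(C_j): fraction of training cases that carry each class."""
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in counts.items()}


def fit_multinomial(cases, labels, vocabulary, alpha=1.0):
    """P(t_k | C_j) from occurrence counts of each threshold per class."""
    likelihood = {}
    for c in set(labels):
        tokens = Counter()
        for case, y in zip(cases, labels):
            if y == c:
                tokens.update(t for t in case if t in vocabulary)
        n = sum(tokens.values())  # total number of tokens in class c
        # Assumed textbook smoothing: denominator n + alpha * |V|.
        likelihood[c] = {
            t: (tokens[t] + alpha) / (n + alpha * len(vocabulary))
            for t in vocabulary
        }
    return likelihood


def fit_bernoulli(cases, labels, vocabulary, alpha=1.0):
    """P(t_k present | C_j) from the number of cases containing t_k."""
    likelihood = {}
    for c in set(labels):
        class_cases = [set(case) for case, y in zip(cases, labels) if y == c]
        # Assumed textbook smoothing: denominator N_c + 2 * alpha.
        likelihood[c] = {
            t: (sum(t in case for case in class_cases) + alpha)
            / (len(class_cases) + 2 * alpha)
            for t in vocabulary
        }
    return likelihood


# Hypothetical toy data: items are "frequency:threshold" strings.
cases = [
    ["500:30", "1000:35"],
    ["500:30", "2000:40"],
    ["1000:35", "2000:40"],
    ["500:30"],
]
labels = ["Tinnitus", "Tinnitus", "Tinnitus+Vertigo", "Tinnitus+Vertigo"]
vocabulary = {"500:30", "1000:35", "2000:40"}

priors = fit_priors(labels)
multinomial = fit_multinomial(cases, labels, vocabulary)
bernoulli = fit_bernoulli(cases, labels, vocabulary)
```

With the assumed multinomial smoothing, the likelihoods of each class sum to one over the vocabulary, while the Bernoulli model estimates an independent presence probability for every threshold.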

Identifying the Relationship with Association Analysis Algorithms
Unsupervised learning methods, such as association analysis algorithms, are capable of finding hidden correlations in datasets [37]. Frequent features (item sets) and association rules are the patterns discovered with this method. If there is a strong relationship between two item sets in the dataset, this suggests an association rule, which is represented by A → B, where A and B are distinct item sets. The support and confidence metrics are used to measure the correlation of the item set elements in a dataset. The support metric reflects how frequently a rule applies in the dataset at hand. An audiology record ADi contains the item set S when S is a subset of ADi, which is formulated mathematically as follows:

$$\sigma(S) = \left|\{AD_i \mid S \subseteq AD_i,\ AD_i \in D\}\right|$$

Here σ(S) represents the support count of an item set S, and ADi represents an individual audiology record with S as its subset (S ⊆ ADi). This means that each item of S is also an item of ADi, where ADi is an element of the dataset (D). The confidence metric is used to measure the reliability of the inference made by an association rule. It indicates a strong correlation between the items in the antecedent and the consequent of the rule. For instance, a high confidence value for the rule TNTS → 2000:30 indicates a high probability that the hearing threshold 2000:30 (30 dB at 2000 Hz) appears in the individual audiology records ADi that include TNTS. For a rule T → S, the confidence metric reflects how frequently the elements of the item set S appear in the ADi records that contain the item set T. The Support and Confidence measurements can be formulated as follows:

$$\text{Support}(T \rightarrow S) = \frac{\sigma(T \cup S)}{|D|}, \qquad \text{Confidence}(T \rightarrow S) = \frac{\sigma(T \cup S)}{\sigma(T)}$$

where |D| is the number of records in the dataset. The combination of the FP-Growth algorithm and association analysis is powerful and capable of extracting items from the dataset [38]. The FP-Growth algorithm is used to generate the frequent item sets within a dataset of patients with hearing loss. The FP-Growth algorithm represents the dataset in a tree data structure known as the FP-tree.
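The support and confidence measures described above can be illustrated with a short Python sketch. The function names and the four toy audiology records are hypothetical and serve only to show the arithmetic; each record is modeled as a set of symptom and threshold items.

```python
def support_count(itemset, dataset):
    """sigma(S): number of records that contain every item of the item set."""
    items = frozenset(itemset)
    return sum(1 for record in dataset if items <= record)


def support(antecedent, consequent, dataset):
    """Fraction of all records that contain both sides of the rule."""
    both = set(antecedent) | set(consequent)
    return support_count(both, dataset) / len(dataset)


def confidence(antecedent, consequent, dataset):
    """Among records containing the antecedent, the fraction that also
    contain the consequent."""
    base = support_count(antecedent, dataset)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support_count(both, dataset) / base


# Hypothetical toy records (not taken from the study's dataset).
records = [
    frozenset({"Tinnitus", "2000:30"}),
    frozenset({"Tinnitus", "2000:30", "Vertigo"}),
    frozenset({"Tinnitus", "4000:45"}),
    frozenset({"Vertigo", "2000:30"}),
]

# Rule Tinnitus -> 2000:30
rule_support = support({"Tinnitus"}, {"2000:30"}, records)        # 2 of 4 records
rule_confidence = confidence({"Tinnitus"}, {"2000:30"}, records)  # 2 of 3 records
```

A rule would be retained only if both values reach the chosen minimum support and confidence thresholds.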
Each path in the FP-tree maps to a certain training example after it is scanned by the FP-Growth algorithm [39]. Different training examples can share features. The more the paths overlap in the FP-tree, the better the dataset compression. Table 2 illustrates the structure of the dataset in detail. In the FP-tree, for each given path, each node represents a feature with a counter for the number of training examples that are mapped to this path. The root node of the FP-tree is null, representing its starting point. First, the FP-Growth algorithm scans the dataset to count the frequency of each item and removes the items that do not reach the minimum support count, since any item set containing an infrequent item is itself infrequent. Then the FP-Growth algorithm rescans the dataset to build the FP-tree structure, from which the frequent item sets are extracted [40]. For example, tinnitus is the most frequent item in our dataset, followed by vertigo, giddiness, otorrhea and lastly otalgia. To build the FP-tree, the algorithm reads the first training example and generates the nodes Tinnitus → Vertigo. The FP-tree starts from the null node, so the first path created is null → Tinnitus → Vertigo. Each node in this path has a frequency count equal to 1. The second training example creates another path from the nodes Vertigo, Giddiness and Otorrhea as null → Vertigo → Giddiness → Otorrhea. This second path is separate because the second training example does not share the first feature (tinnitus) with the first training example. However, the third training example overlaps with the first training example in the first feature (tinnitus). So, for the path null → Tinnitus → Giddiness → Otorrhea → Otalgia, the count of the tinnitus node becomes two.
The FP-Growth algorithm repeats this process until it reaches the tenth training example. The FP-Growth algorithm then generates the frequent item sets by building conditional FP-trees in a bottom-up approach. It first finds the frequent item sets ending with otalgia, and then looks for the item sets that end with otorrhea, giddiness, vertigo and tinnitus. This process is reasonable as each training example is mapped to a path in the FP-tree. Therefore, for a given feature, the paths containing it are traversed to generate the frequent item sets. We used settings of 0.1 and 0.7 for the minimum support and confidence thresholds, respectively, on the sample audiology dataset of 50 patients. Furthermore, we used settings of 0.2 and 0.7 for the minimum support and confidence thresholds, respectively, on the sample audiology dataset of 339 patients. It is hard to obtain useful results with lower values for the minimum support and confidence thresholds. Therefore, we chose 0.2 (20%) and 0.7 (70%) for the minimum support and confidence threshold values, as these achieved results at an acceptable level. Setting the minimum support to less than 0.1 (10%) of the dataset leads to an undesired result.
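The tree construction described above can be sketched in a few lines of Python. This is a minimal illustration that covers only the two scans and the path insertion, not the mining of conditional trees. The class and function names are hypothetical, and breaking frequency ties by order of first appearance is an assumption made so that the three training examples from the text reproduce the paths given there.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


class FPNode:
    """One node of the FP-tree: an item, its counter and its children."""

    def __init__(self, item: Optional[str] = None) -> None:
        self.item = item          # None marks the null root
        self.count = 0
        self.children: Dict[str, "FPNode"] = {}


def build_fp_tree(examples: List[List[str]], min_count: int = 1) -> FPNode:
    # First scan: count how many examples contain each item.
    counts = Counter(item for example in examples for item in set(example))
    # Assumption: ties in frequency are broken by order of first appearance.
    first_seen: Dict[str, int] = {}
    for example in examples:
        for item in example:
            first_seen.setdefault(item, len(first_seen))
    frequent = {item for item, n in counts.items() if n >= min_count}

    # Second scan: insert each example as a path of frequency-ordered items.
    root = FPNode()
    for example in examples:
        ordered = sorted((i for i in set(example) if i in frequent),
                         key=lambda i: (-counts[i], first_seen[i]))
        node = root
        for item in ordered:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root


def leaf_paths(node: FPNode,
               prefix: Tuple[Tuple[str, int], ...] = ()) -> List[tuple]:
    """All root-to-leaf paths as tuples of (item, count) pairs."""
    if not node.children and prefix:
        return [prefix]
    result: List[tuple] = []
    for child in node.children.values():
        result.extend(leaf_paths(child, prefix + ((child.item, child.count),)))
    return result


# The first three training examples described in the text.
examples = [
    ["Tinnitus", "Vertigo"],
    ["Vertigo", "Giddiness", "Otorrhea"],
    ["Tinnitus", "Giddiness", "Otorrhea", "Otalgia"],
]
root = build_fp_tree(examples)
```

After the third example is inserted, the Tinnitus node directly under the null root carries a count of two, while the separate Vertigo branch created by the second example keeps a count of one.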

Feature Transformation with FP-Growth Algorithm
The FP-Growth algorithm was applied on an audiometry dataset of 399 patients using air and bone conduction audiology medical records. The FP-Growth algorithm acts as a frequent item set extraction algorithm with the minimum support threshold set to 0.4 (40%). Each item set in the training examples that passes the minimum threshold is integrated into the training set for the NB classification method. In contrast to the traditional method, which builds the vocabulary from all item sets (features) in the training examples, the NB vocabulary is built from the union of the generated item sets. Only 242 out of 399 training examples remained after the item set generation process; the other training examples do not contain any of the generated item sets as a subset. Only three symptom types were found in the extracted item sets. Among the 242 training examples, some show tinnitus only, some show both tinnitus and vertigo, and others show tinnitus, vertigo and giddiness. The classifier is therefore supplied with three labels to identify the symptoms from the air and bone conduction audiometry data. The first label is tinnitus, the second label is tinnitus and vertigo, and the third label is a combination of tinnitus, giddiness and vertigo. The air and bone conduction thresholds can contain undesired repeated values, with the same frequency or decibel level recorded for both ears of the patient. This increases the dimensionality of the dataset and features and results in noisy features. The FP-Growth algorithm extracts feature patterns to build up the classification vocabulary. New features can be created by one of the common feature transformations, such as feature construction, selection and extraction [41]. The feature extraction method is used to extract the frequent item sets from the dataset. The feature construction method is a pre-processing method used to reduce the dataset dimensionality.
It is a very critical method, as the success of machine learning approaches depends on this process. The feature selection method is used to select features from the dataset to reduce the storage requirements and enhance the performance of the classification algorithm [42].
In this study, we employed all three feature transformation techniques. The extracted item sets (features) were used to build up the vocabulary. This minimizes the number of features in the vocabulary and therefore the feature dimensionality, which helps the vocabulary to keep only the relevant data. The vocabulary consists of a number of disjoint item sets (features) in the training examples [43]. Thus, the three feature transformations, extraction, selection and construction, are attained. Reducing the storage requirements, removing the noisy features and lowering the computational complexity enhance the performance of the classification algorithm, and a lower number of features means faster processing. Factor analysis, independent component analysis and principal component analysis are the most common techniques used to reduce the feature dimensionality [44]. In this research, we employ the FP-Growth algorithm in our detection model to offer a feature construction process that minimizes the data dimensionality and recovers missing information [45].
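As a rough illustration of this feature transformation, the sketch below builds a vocabulary from the union of the items in a set of frequent item sets and encodes a training example as a binary vector over that vocabulary. The item sets, the function names and the alphabetical ordering of the vocabulary are assumptions chosen for the example, not details reported in the study.

```python
from typing import Iterable, List, Set


def build_vocabulary(frequent_itemsets: Iterable[Set[str]]) -> List[str]:
    """Union of all items in the frequent item sets, in a fixed (sorted) order."""
    return sorted({item for itemset in frequent_itemsets for item in itemset})


def to_binary_vector(example: Iterable[str], vocabulary: List[str]) -> List[int]:
    """1 where a vocabulary item is present in the example, 0 where it is absent."""
    present = set(example)
    return [1 if item in present else 0 for item in vocabulary]


# Hypothetical frequent item sets that passed the minimum support threshold.
frequent_itemsets = [{"TNTS", "2000:30 R"}, {"VTG", "500:20 R"}]
vocabulary = build_vocabulary(frequent_itemsets)

# "8000:80 L" is not in the vocabulary, so it does not add a dimension.
vector = to_binary_vector({"TNTS", "500:20 R", "8000:80 L"}, vocabulary)
```

Items that never appear in a frequent item set are simply dropped from the representation, which is how the vocabulary keeps the feature dimensionality low.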

Patterns Evaluation
The FP-Growth algorithm can generate a large number of item sets and patterns that satisfy the minimum support threshold, particularly when the dataset is very large. The issue is that some of these generated patterns are undesirable. Separating the desirable patterns from the undesirable ones is not trivial, as this decision depends on many aspects. Thus, standard evaluation methods for pattern quality are a necessity. Statistical methods are among those used to evaluate the quality of the generated patterns [46]. Item sets that have few items or that are discovered in few of the training examples can be considered undesirable. An objective interestingness metric, which is based on statistical analysis, can be used to identify and remove these item sets. In the literature, several objective interestingness metrics have been proposed to discover the desirable item sets with respect to specific aspects. An aggregating method is proposed in [47] to discover the desirable association rules using an advanced aggregator. This ranking method comprises two processes. The first process is based on the chi-square test while the second measures the objective interestingness. Objective interestingness measurement is commonly used in the literature. It relies on the relationship between the confidence threshold and the minimum support threshold [48].
A study on objective interestingness measurement was conducted by [49], demonstrating that some interestingness measurements can reduce the number of association rules efficiently. However, the accuracy is not improved. In addition, no individual interestingness metric is superior to the others. Another standard method for evaluating the quality of item sets is subjective arguments. In this method, an item set is desirable if it offers unexpected, beneficial information about the data. In this study, we employed subjective knowledge arguments as an evaluation method because of the prior knowledge obtained from the patients' medical audiology data. The template-based method is employed as a subjective knowledge evaluator to evaluate the quality of the extracted item sets. Thus, the item sets generated by the FP-Growth algorithm are restricted by filtering, keeping only the item sets that contain one or more symptoms, such as vertigo, tinnitus, otalgia, Meniere, and others. In this paper, the template-based method is used because its advantages have been demonstrated in many recent studies. Besides, it can enhance keyword search using semantic data [50]. Only researchers and scientists who are experts in this domain can use their knowledge and experience to discover the important patterns. So, only the patterns selected by the expert template were extracted.
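The template-based filter described above can be sketched as a simple membership test. The template contents, the upper-case matching and the function names below are hypothetical; the text states only that item sets containing one or more symptoms are kept.

```python
from typing import FrozenSet, Iterable, List, Set

# Hypothetical expert template: the symptom tokens an item set must mention.
SYMPTOM_TEMPLATE: FrozenSet[str] = frozenset(
    {"TNTS", "VTG", "GIDDINESS", "OTALGIA", "OTORRHEA"}
)


def matches_template(itemset: Iterable[str],
                     template: FrozenSet[str] = SYMPTOM_TEMPLATE) -> bool:
    """True when the item set contains at least one symptom from the template."""
    return any(item.upper() in template for item in itemset)


def filter_itemsets(itemsets: Iterable[Set[str]],
                    template: FrozenSet[str] = SYMPTOM_TEMPLATE) -> List[Set[str]]:
    """Keep only the item sets selected by the template."""
    return [itemset for itemset in itemsets if matches_template(itemset, template)]


# Hypothetical FP-Growth output: two item sets mention a symptom, one does not.
generated = [
    {"TNTS", "2000:30 R", "F"},
    {"500:20 R", "250:25 R"},
    {"Giddiness", "250:35 R"},
]
kept = filter_itemsets(generated)
```

Item sets made up only of hearing thresholds, such as the second one in the example, are discarded because they carry no symptom that could serve as a class label.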

Symptoms Identification with the Naïve Bayes Algorithm
We performed the classification process on the training set obtained from the frequent item sets produced by the FP-Growth algorithm. We used two common naïve Bayes models, the multinomial model [17] and the multivariate Bernoulli model, to find the most accurate solution. The naïve Bayes method is applied to the hearing loss classification problem to detect the symptoms from the pure-tone air and bone conduction audiology thresholds. In the multivariate Bernoulli method, the vocabulary and a training example act as inputs, which are processed to obtain a binary vector. In this vector, a one reflects the presence of a hearing threshold in the training example while a zero reflects its absence. The vocabulary consists of the different features that form the training examples [18]. The binary vector should be the same length as the vocabulary. The vocabulary contains various features and thresholds. For a given class, the multinomial model produces the fraction of times that the threshold values of the training examples appear. In our proposed model, the number of times the threshold value of a frequent item set occurs is insignificant compared to its state, that is, whether it is present in or absent from the training example. Therefore, we employed the multivariate Bernoulli model for this purpose. Each training example was divided into a number of feature sets to extract the features, including the air and bone conduction audiology threshold symbols, from the dataset. An audiology hearing threshold reflects the frequency and the decibel level at which the pure tone is heard. Every training example is represented by a vector of ones and zeros.
A value of one indicates that the symbol is present in the training example while a value of zero indicates that the symbol is absent from the training example. The prior probabilities of the classes and the conditional probabilities of the features for a given class, estimated from the training examples, were used to train the classification methods [19]. The naïve Bayes process is formulated in the mathematical equations as follows. The Bayes rule is formulated as in [17,20]:
$$P(C \mid D) = \frac{P(D \mid C)\,P(C)}{P(D)}$$
This is applied in the classification method and formulated as
$$C_{map} = \arg\max_{C} P(C \mid D)$$
C_map represents the best class, which is the one among all classes that maximizes P(C|D). Using the Bayes rule, every class is maximized by Equations (4) and (5):
$$C_{map} = \arg\max_{C} \frac{P(D \mid C)\,P(C)}{P(D)} \quad (4)$$
$$C_{map} = \arg\max_{C} P(D \mid C)\,P(C) \quad (5)$$
The denominator P(D) is dropped in Equation (5) because it is the same for every class. The class that maximizes the product P(D|C) P(C) is most likely to be selected. The goal is to select the class, that is, the symptom or set of symptoms, that has the highest probability given the specific audiology thresholds.
To calculate the most likely class, the prior probability of the class is multiplied by the likelihood of each feature given that class. This can be reformulated as
$$C_{NB} = \arg\max_{C_j \in C} P(C_j) \prod_{i \in positions} P(t_i \mid C_j)$$
C_NB is the best class, the one that maximizes the prior class probability P(C_j) multiplied by the probability of each feature given that class. For each hearing threshold position in the data, the probability given the class is computed, and the class with the best overall probability is assigned. The frequent item sets in the data were computed for classification training purposes.
For the training examples (t) that are available in a class (C_j), the prior probability is the number of training examples of class C_j divided by the total number of training examples N_t:
$$P(C_j) = \frac{tcount(c = C_j)}{N_t} \quad (9)$$
In the multinomial model, the likelihood of threshold i (t_i) for the given class (C_j) can be calculated by counting the number of times the threshold t_i occurs for the given class (C_j) in the training examples and then dividing it by the overall number of thresholds across all training examples of class (C_j), where V is the vocabulary, as represented in the following equation:
$$P(t_i \mid C_j) = \frac{count(t_i, C_j)}{\sum_{t \in V} count(t, C_j)}$$
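The multivariate Bernoulli classifier outlined in this section can be sketched in plain Python as follows. The prior follows Equation (9); the add-one (Laplace) smoothing, the log-space scoring, the function names and the toy data are assumptions introduced for this illustration and are not described in the text.

```python
import math
from typing import Dict, List, Sequence, Tuple

Model = Tuple[Dict[str, float], Dict[str, List[float]]]


def train_bernoulli_nb(X: Sequence[Sequence[int]], y: Sequence[str],
                       alpha: float = 1.0) -> Model:
    """Estimate class priors and per-feature presence probabilities."""
    n_features = len(X[0])
    priors: Dict[str, float] = {}
    presence: Dict[str, List[float]] = {}
    for label in sorted(set(y)):
        rows = [x for x, c in zip(X, y) if c == label]
        # Prior as in Equation (9): examples of the class / all examples.
        priors[label] = len(rows) / len(X)
        # Assumption: add-one smoothing keeps every probability strictly
        # between 0 and 1, so the logarithms below are always defined.
        presence[label] = [
            (sum(row[j] for row in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return priors, presence


def predict_bernoulli_nb(x: Sequence[int], model: Model) -> str:
    """Return the class maximizing log P(C) + sum of log feature likelihoods."""
    priors, presence = model
    best_label, best_score = "", -math.inf
    for label, prior in priors.items():
        score = math.log(prior)
        for value, p in zip(x, presence[label]):
            # Present features contribute p, absent features contribute 1 - p.
            score += math.log(p if value else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical binary vectors over a three-item vocabulary.
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
y = ["tinnitus", "tinnitus", "tinnitus+vertigo", "tinnitus+vertigo"]
model = train_bernoulli_nb(X, y)
label_a = predict_bernoulli_nb([1, 1, 0], model)
label_b = predict_bernoulli_nb([0, 0, 1], model)
```

Unlike the multinomial model, the Bernoulli likelihood also scores the absence of a vocabulary item, which matches the presence-or-absence encoding of the hearing thresholds.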

Results
This section discusses the results of the first and second experiments of the study, which are aimed at finding a relationship between the audiometry thresholds and attributes in hearing-loss patient medical records using association analysis. The section also presents the results of the implementation of the identification of hearing-loss symptoms using the FP-Growth feature transformation and the performance of the two naïve Bayes classification models, the multivariate Bernoulli and multinomial models, with and without the FP-Growth feature transformation technique. The reason why the multivariate Bernoulli naïve Bayes classifier model is adopted for the implementation of the proposed model is also explained in this section.

Dataset Used
The National Medical Research Register (NMRR) in Malaysia is considered the official data bank in the medical field. Researchers can register their medical research online at NMRR for review and obtain approval for sample data collection from the authorities concerned. Our research obtained NMRR registration and sample data collection approval. The research uses secondary data, namely the medical records of hearing-loss patients, including their audiometry data. This type of data is typically recorded by audiologists and otolaryngology specialists in the course of diagnosing the patient during consultation. A collection of audiometric data from the period between 2003 and 2012 was obtained from the otolaryngology department of a local Malaysian hospital. The data belonged to 399 patients with hearing difficulties, aged from 3 to 88 years old. The measurements covered 11 frequencies ranging from 0.125 kHz to 8 kHz. To find the link between the symptoms of hearing loss and the pure-tone air conduction audiometry thresholds, the Frequent Pattern Growth (FP-Growth) algorithm combined with a rule mining algorithm was used on a sample dataset of 50 patients with hearing difficulties, with settings of 0.7 and 0.1 for the confidence threshold and the support threshold, respectively. The FP-Growth and rule mining algorithms were also employed on a larger sample dataset of 399 patients with hearing difficulties, with settings of 0.7 and 0.2 for the confidence threshold and the minimum support of the item set generation, respectively. Both studies reveal that there is a correlation between the audiometry thresholds and the symptoms of hearing loss, such as dizziness, vertigo, tinnitus and other medical information.
The small dataset: The FP-Growth algorithm combined with the rule mining algorithm was employed to find the correlation between the audiometric configuration and the characteristics of patients with hearing difficulties on a sample of pure-tone air conduction audiometry data, collected from 50 medical records of patients with hearing loss. The hearing loss characteristics included structured data on age, gender and symptoms. The confidence threshold was set to 0.7 and the minimum support to 0.1, as an association rule that holds in 10% or more of the dataset is more interesting than one that holds in less than 10% of the dataset. The dataset included pure-tone audiometry thresholds and the characteristics of hearing loss from medical records. With these settings, 349 frequent item sets were generated using the FP-Growth algorithm.
Large sample dataset: Using the same method, the experiment was repeated on the entire dataset, which contains the data of 399 hearing-loss patients, including the sample data used in the initial experiment. The confidence threshold remained unchanged at 0.7, while the minimum support was increased to 0.2.

Data Preparation
We prepared the dataset in a format that is easier for the algorithm to read and process. Discrete data were preferred because of the way the item sets are sorted. Some symptoms of hearing loss were abbreviated, including vertigo (VTG) and tinnitus (TNTS), while other symptoms were not abbreviated, including rhinitis, presbycusis, otalgia, giddiness and otorrhea. The patient's characteristics and attributes were also abbreviated, with gender represented as male (M) and female (F). The patient's age within a decade was abbreviated as early (E), mid (M) and late (L). For instance, 5 M (the mid 50s) represents a patient of 55 years old and 8 L (the late 80s) represents a patient of 89 years old. We used a colon (:) in the hearing thresholds as a separator between the sound frequency and the sound level in dB. For instance, a hearing threshold of 500:45 R represents a 500 Hz frequency and 45 dB for the right ear, while 8000:80 L represents an 8000 Hz frequency and 80 dB for the left ear. In another study [30], symptoms of hearing loss, attributes and structured data, such as date of birth, gender, type of hearing device and other medical information, were abbreviated and used with statistical and neural methods for patient classification to help in selecting the most beneficial hearing device for the appropriate patients. It is necessary to change the data format to one that is acceptable for the given algorithm.
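The threshold encoding described above (frequency, a colon, the sound level and an ear marker) can be parsed with a short helper such as the following sketch. The regular expression, the `Threshold` type and the function name are hypothetical, and treating the ear marker as optional is an assumption based on rules such as 250:35 that appear without one.

```python
import re
from typing import NamedTuple, Optional


class Threshold(NamedTuple):
    frequency_hz: int
    level_db: int
    ear: Optional[str]  # "L", "R", or None when no ear marker is given


# Assumed token layout: "<frequency>:<dB>" optionally followed by L or R.
_THRESHOLD_PATTERN = re.compile(r"^\s*(\d+)\s*:\s*(\d+)\s*([LR])?\s*$")


def parse_threshold(token: str) -> Threshold:
    """Split an encoded hearing threshold into frequency, level and ear."""
    match = _THRESHOLD_PATTERN.match(token)
    if match is None:
        raise ValueError(f"not a hearing-threshold token: {token!r}")
    frequency, level, ear = match.groups()
    return Threshold(int(frequency), int(level), ear)


right = parse_threshold("500:45 R")
left = parse_threshold("8000:80 L")
```

Tokens that are not thresholds, such as TNTS or F, raise a `ValueError`, so a caller can separate thresholds from symptoms and patient attributes.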

Performance Evaluation and Validation
The error rate metric was used to measure the performance of our detection model. To calculate the error rate, the repeated random sub-sampling validation method, a cross-validation technique, was applied: at execution time, the dataset is randomly divided into two sets, one used for training and the other for validation. The validation was repeated ten times and the average error rate was then computed. At each iteration, the training examples for the training and test sets were chosen randomly, and the error rates were averaged over the iterations for each partition group. It is also advisable to apply the NB to the dataset prior to the pre-processing step to make a performance comparison using a different representation, taking the risk of using the whole of the information and data rather than the risk of reduced information. Our model was implemented in the Python programming language, and the testing was conducted in the same language. Python was chosen as it is a powerful and efficient programming language for mathematical computation [51][52][53][54]. Otorhinolaryngology specialists were involved in the validation process in the first and second experiments to confirm the results.
These experiments were based on the extracted patterns that reflect the correlation between the audiology thresholds and hearing-loss symptoms.
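A minimal sketch of repeated random sub-sampling validation is given below. The function signature, the fixed random seed and the majority-class stand-in classifier are assumptions made for illustration; any classifier with a fit function and a predict function could be plugged in, and the study's own implementation may differ.

```python
import random
from collections import Counter
from typing import Any, Callable, List, Sequence


def repeated_random_subsampling(
    examples: Sequence[Any],
    labels: Sequence[Any],
    train_size: int,
    fit: Callable[[List[Any], List[Any]], Any],
    predict: Callable[[Any, Any], Any],
    iterations: int = 10,
    seed: int = 0,
) -> float:
    """Average test error rate over `iterations` random train/test splits."""
    if not 0 < train_size < len(examples):
        raise ValueError("train_size must leave at least one test example")
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    error_rates = []
    for _ in range(iterations):
        rng.shuffle(indices)  # a fresh random split at every iteration
        train_idx, test_idx = indices[:train_size], indices[train_size:]
        model = fit([examples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        errors = sum(predict(model, examples[i]) != labels[i] for i in test_idx)
        error_rates.append(errors / len(test_idx))
    return sum(error_rates) / len(error_rates)


# Hypothetical stand-in classifier: always predicts the majority training label.
def fit_majority(train_x: List[Any], train_y: List[Any]) -> Any:
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model: Any, x: Any) -> Any:
    return model


toy_x = [[0, 1]] * 20
toy_y = ["tinnitus"] * 20
average_error = repeated_random_subsampling(
    toy_x, toy_y, train_size=10, fit=fit_majority, predict=predict_majority
)
```

Because every split is drawn independently, the same example can land in the test set in several iterations, which is the main difference from k-fold cross-validation.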

Results from the Association Analysis Using the Small Sample Dataset
The pure-tone air conduction audiometry data for 50 hearing-loss patients were collected to find any possible connection between the audiogram configuration and attributes in the hearing-loss patients' medical records. These attributes consist of structured data, such as symptoms, gender and age. The FP-Growth algorithm and association rule algorithm were used for this purpose. The minimum support for item set generation was set to 10% (0.1) and the minimum confidence for the association rule was set to 70% (0.7). The dataset comprised pure-tone audiometry threshold data in a combined form with additional characteristics as found in the medical records. The FP-Growth algorithm generated 349 frequent item sets, from which association rules were generated. Some of the association rules are interesting, and 93 of them are shown in the exact output format of the FP-Growth and association rule algorithms (Tables 3-5). Moreover, the association rules are further summarized in Tables 6-8. These rules show the hidden relationships in the small sample dataset (50 cases) using 0.1 (10%) as the minimum support threshold. The confidence in the rightmost column measures the strength of association between items in the dataset. Among the item sets, the association rule (TNTS → 2000:30 R, F) denotes a strong correlation between the symptom of tinnitus (TNTS) and a 2000:30 hearing threshold in the right (R) ear among females (F) in the sample dataset of 50 hearing-loss patients. Table 7 provides the summary and meaning of the abbreviations in Table 3. The result (TNTS → 500:55 L, 250:60 L) indicates a strong relationship between tinnitus and the hearing threshold at a mid-frequency of 500 Hz and 55 dB. This rule also shows a possible tinnitus connection with 250 Hz (low frequency) at 60 dB, all in the left (L) ear.
The other generated association rules in the table also show interesting relationships between tinnitus, hearing thresholds and other attributes in the dataset. According to evidence from other researchers, flat, cochlear-type hearing impairment can be detected on the audiogram of tinnitus patients, and low frequencies are most affected. In addition, the shape of the audiogram is often flat or rising, but any configuration is possible. In our results, a low frequency of 250 Hz can be seen. Moreover, as evidenced by the literature, the shape of the audiogram is often flat; that is, the hearing thresholds are mostly at lower sound levels but at various frequencies. Table 7 shows a summary of all the discovered rules. Table 5 shows the association rules for vertigo, one of the symptoms of hearing loss diagnosed in patients. The level of confidence met by the item sets indicates an interesting relationship between vertigo, hearing thresholds and other attributes. The rule (VTG → 1000:10 L, 500:20 L) denotes the probability that vertigo patients experience hearing loss from a mid (500 Hz) to high (1000 Hz) frequency at lower sound levels (10-20 dB) in the left ear. Looking at the association rules as a whole, it can be observed that, bilaterally, there is a relationship between normal hearing (NH) and vertigo at mid to high frequencies (1000 Hz, 2000 Hz and 4000 Hz). The rule (VTG → 4000:65 L, F) denotes a possibility of females having hearing loss at a high frequency of 4000 Hz at 65 dB given that vertigo exists. Table 6 shows the summary of all the discovered rules in Table 5, which includes the symptom of vertigo and some hearing thresholds. Table 4 shows the association rules within the item sets consisting of vertigo together with tinnitus. All the rules show a correlation between vertigo, tinnitus and hearing threshold values.
As evidenced by other studies, vertigo and tinnitus can also occur together [55]. Table 8 shows a summary of all the discovered rules in Table 5 that include the symptoms of tinnitus and vertigo and some hearing thresholds. Table 9 depicts the relationship between giddiness and a hearing threshold in the right ear occurring at a low frequency and a low sound level in females (GIDDINESS → 250:35, F). The other rule (TNTS, GIDDINESS → 250:35 R, F) shows a similarly interesting relationship between giddiness, tinnitus and a low-frequency threshold among females. Table 10 shows the summary of all the discovered rules in Table 9, which includes the symptom of giddiness and some hearing thresholds. Tables 3-5 and Table 9 present the pure-tone audiometry results obtained from the dataset of 50 patients involved in the primary study [56]. Tables 11 and 12 allow a comparison of these results with those obtained from the air and bone conduction audiometry data of 339 patients, and Tables 13 and 14 summarize the results in Tables 11 and 12. The primary study on the dataset of 50 patients with hearing loss found that a correlation exists between audiometry thresholds, gender and hearing-loss symptoms [26,56]. In the tables, tinnitus is represented as TNTS, vertigo as VTG and normal hearing as NH; BILATERAL refers to both ears, M to male, F to female, L to the left ear and R to the right ear.
The support value for itemset generation was set at 0.1 (10%) while the confidence value for the association rule was set at 0.7 (70%). The correlations between pure-tone audiometry thresholds and vertigo, tinnitus and giddiness were an interesting discovery.

Results from Association Analysis Using Large Sample Dataset
Tables 11 and 12 present the association rule results of the current study. A comparison between Table 3 (the tinnitus association rules observed earlier) and Table 11 (the latest observed tinnitus and vertigo association rules) reveals a correlation between tinnitus symptoms and a normal hearing threshold occurring at 500 Hz in the right ear (500:15 R, → ONOFF TNTS), as presented in Table 3. Table 11 reflects this in the results of the current study involving 339 patients, showing a relationship between tinnitus and vertigo symptoms and a normal hearing threshold occurring at 500 Hz in male patients' right ears (TNTS, VTG → 500:20 R, M). A similar relationship is seen between the normal hearing threshold and vertigo, (VTG, M → 500:20 R, NH, BILATERAL) and (VTG → 500:15 R, F), in Table 3 for the initially observed vertigo association rules. Table 11 also presents the association rule (TNTS, VTG → 250:40 R, 400:45 R, F). This rule can be likened to those observed in the primary study in Table 3: (TNTS → 250:30 R, F), (TNTS, M → 250:60 L) and (TNTS → GIDDINESS, 250:35 R, F). A close look at both studies (the primary study and the second study) shows some similarities, as seen in the low-frequency, mild-moderate hearing loss in patients with tinnitus. Table 5 reflects a similar result (250:35 R → VRTG, F).
The correlation between giddiness symptoms and pure-tone thresholds is presented in Table 12. This signifies a correlation between giddiness and normal hearing at high-frequency thresholds in the left and right ears (GIDDINESS → 1000:10 L, 500:20 L, 250:25 R). A related result is reflected in the primary study presented in Table 9 (500:20 R → GIDDINESS). Tables 13 and 14 show the summary of all the discovered rules in Tables 11 and 12, respectively, including the symptoms of tinnitus/vertigo and giddiness and some hearing thresholds.

Symptoms Prediction and Model Evaluation
The significance of feature extraction methods cannot be overemphasized; it underpins the success of many AI methods [56]. The performance of the classifiers using both the multivariate Bernoulli and multinomial models, with and without feature extraction, was compared. This section presents the validation results of the machine learning evaluation on the dataset with 242 training samples.
The validation results for the multivariate Bernoulli model with FP-Growth feature transformation (MVB-FPG) appear in Figure 5, and further details are provided in Table 15. Figure 6 displays the average error rate over 10 iterations for 5 different partitions of the repeated random sub-sampling validation technique, a cross-validation method used to test the accuracy of a classifier. The prediction accuracy is 100% for the partition with 10 training samples, 99.5% with 20 training samples, 99% with 30 training samples, 98.25% with 40 training samples and 94.60% with 50 training samples. The average error rates for the five partitions are therefore 0, 0.5, 1, 1.75 and 5.4%, respectively. The ML method works very well with the multivariate Bernoulli (MVB) model combined with FP-Growth feature processing.
Figure 7 illustrates the validation outcomes over 10 iterations using the different partitions. The results obtained from the multivariate Bernoulli model without FP-Growth feature processing differ from those obtained with it. Figure 6 displays the average error rates using the multivariate Bernoulli approach without the feature processing. The average error rate is high for every partition. The partitions with 50 and 40 training samples have the worst average error rates, 57% and 56%, respectively, and the partitions with 10, 20 and 30 training samples have error rates of up to 50%. Table 16 summarizes the error and accuracy rates of the multivariate Bernoulli naïve Bayes classifier (MVB) model without FP-Growth feature transformation.
Figure 8 shows the average error rates over 10 iterations for the 5 partitions of the validation set using the multinomial NB method with FP-Growth feature processing. The partition with 50 training samples has the highest average error rate, 10% over 10 iterations, which corresponds to an identification rate of 90%. The lowest average error rate, 2%, is for the partition with 10 training samples. The partitions with 20, 30 and 40 training samples have average error rates of 3%, 3.9% and 8.5%, respectively. Table 17 summarizes the error and accuracy rates of the multinomial naïve Bayes classifier (MN-FPG) model with FP-Growth feature transformation.
Figure 8 displays the validation outcomes using the multinomial model without FP-Growth feature processing. The partition with 10 training samples has the lowest average error rate, 42%, while the partition with 20 training samples has the highest, 53%. The partitions with 30, 40 and 50 training samples each have an error rate of 48%. All error rates are averaged over 10 iterations. Table 18 summarizes the error and accuracy rates of the multinomial naïve Bayes classifier model without FP-Growth feature transformation. According to Tables 15-18, a low average (AV) error rate and a high average (AV) accuracy are achieved for the proposed models only when the FP-Growth feature transformation is adopted.
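The repeated random sub-sampling procedure used in this evaluation can be sketched as follows. This is a minimal illustration and not the implementation used in the study: the multivariate Bernoulli classifier is written from scratch with Laplace smoothing (an assumption), each partition is assumed to correspond to a training-set size `n_train` (a hypothetical parameter), and the data are hypothetical toy vectors.

```python
import math
import random


def train_bernoulli_nb(X, y):
    """Fit a multivariate Bernoulli naive Bayes model with Laplace smoothing."""
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # P(feature j = 1 | class c), smoothed so no probability is 0 or 1
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[c] = (math.log(prior), probs)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for binary vector x."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(
            math.log(p) if xj else math.log(1 - p)
            for xj, p in zip(x, probs))
    return max(model, key=score)


def repeated_subsampling_error(X, y, n_train, iterations=10, seed=0):
    """Average validation error over repeated random train/validation splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    errors = []
    for _ in range(iterations):
        rng.shuffle(indices)
        train, valid = indices[:n_train], indices[n_train:]
        model = train_bernoulli_nb([X[i] for i in train],
                                   [y[i] for i in train])
        wrong = sum(predict(model, X[i]) != y[i] for i in valid)
        errors.append(wrong / len(valid))
    return sum(errors) / len(errors)


# Hypothetical toy data: binary feature vectors and symptom labels.
X = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]] * 10
y = ["TNTS", "TNTS", "VTG", "VTG"] * 10
avg_error = repeated_subsampling_error(X, y, n_train=20)
print(f"average error rate: {avg_error:.2%}")
```

Each iteration draws a fresh random split, so the reported figure is the mean of the per-iteration validation errors, which matches the averaging over 10 iterations described above.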

Discussion
The results of this study indicate a possible connection between patients' audiogram configuration and some attributes in their medical records. These attributes include age, gender, symptoms and medical history, as evidenced in the other studies discussed in Section 2. The experiments detected evidence of a relationship between the patient's audiogram configuration and hearing-loss symptoms in both a small sample dataset of 50 hearing-loss patients and a larger dataset of 399 patients. The initial study using the smaller sample dataset found a relationship among tinnitus symptoms, vertigo symptoms, giddiness symptoms and hearing thresholds at different frequencies and sound levels. Attributes such as age and gender also appear in this relationship. The most interesting finding is that the results of the first experiment with the smaller data sample correlate with the results of the second experiment with the larger dataset of 399 hearing-loss patients. For example, the results of the second experiment show that the symptoms of tinnitus and vertigo are related to mild hearing loss at lower frequencies in females (TNTS, VTG → 250:40 R, 400:45 R, F). Similar results were seen in the first experiment in Table 1, where the symptom of tinnitus is related to mild hearing loss at a low frequency (TNTS → 250:30 R, F), (TNTS, M → 250:60 L) and (TNTS → GIDDINESS, 250:35 R, F). This implies a strong similarity between the two sets of results, since low-frequency, mild-to-moderate hearing loss is found among tinnitus patients in both. The same was found for the symptom of vertigo: the result in Table 3 (250:35 R, VTG → 500:25, F) is also reflected in Table 5 (250:35 R → VRTG, F).
A significant result is the relationship between the input data used in the Bayesian classifier and the high accuracy in predicting hearing-loss symptoms. Prediction accuracy is high when the vocabulary of the classification method consists of item sets with a high frequency. The comparison shows that, in terms of prediction accuracy, the multivariate Bernoulli method combined with the FP-Growth feature transformation technique is superior to the multinomial method, whether the latter is used alone or combined with FP-Growth. The multivariate Bernoulli method integrated with the FP-Growth algorithm obtains a 5.4% average error rate on 50 training examples over ten iterations of random sub-sampling. The multinomial method with FP-Growth obtains a 10% average error rate on the same number of training examples and iterations. Because testing and validation with a larger number of random training examples yields a more accurate result and a more reliable model, we prefer the partition with 50 training examples. The experimental results demonstrate that both the multinomial and multivariate Bernoulli methods without FP-Growth perform badly in the partition with 50 training examples, with average error rates of 48% and 57%, respectively. The absence of a feature transformation technique negatively affects the performance of both methods at this dataset size. Notably, the average error rates for both methods without the feature transformation technique are high in all five partitions at the tenth iteration. In addition, the average error rate of the multivariate Bernoulli method is higher than that of the multinomial method. These findings support the outcome of another study [4], which demonstrates that the multinomial method is superior to the multivariate Bernoulli method on four diverse datasets.
Other findings show that the multinomial method is superior to four other probabilistic methods, including the multivariate Bernoulli method, on three text classification problems. Despite these findings, the multivariate Bernoulli method is superior to the multinomial method when combined with the FP-Growth algorithm. The average error rate of the multivariate Bernoulli method is smaller than that of the multinomial method in all five partitions when both are combined with the FP-Growth algorithm. These outcomes contrast with the findings in [4], which indicate that the multinomial method performs better than the multivariate Bernoulli method in prediction accuracy because it uses word frequency counts. The argument in [4] was based on vocabulary size: the multinomial method yields better results with smaller vocabularies, while the multivariate Bernoulli method yields better results with larger ones [4].
However, [57] contradicts this argument, showing that vocabulary size does not affect the performance of either method and that the multinomial method is superior to the multivariate Bernoulli method regardless of vocabulary size. Moreover, the author of [57] argues that reducing the vocabulary size improves classifier performance. Despite the previous studies that favored the multinomial classifier, our study demonstrates that the multivariate Bernoulli method is better than the multinomial method when the vocabulary consists of frequent item sets that occur as subsets of the training examples in our dataset.
According to the SD analysis in Figure 9, MVB-FPG and MN-FPG scored the highest, and almost identical, values across all training and validation data splits; this confirms the stability of the classification performance (accuracy) of the proposed models. As shown in the error rate analysis in Figure 10, MVB-FPG and MN-FPG scored the lowest, and almost identical, values across all training and validation data splits; this confirms the stable, low error rate (misclassification) of the proposed models. In addition, we performed the Wilcoxon signed-rank statistical test [58] to verify whether a significant difference exists between MVB-FPG and MVB on the one hand, and between MN-FPG and MN on the other. The error rate and accuracy values for all classifiers with five training sets were the main input for the Wilcoxon signed-rank test, as shown in Table 19. The main indicator in the Wilcoxon signed-rank test is T-sig, and a result is significant when T-sig < 0.05. According to Table 19, all of the tested results are significant and satisfy the Wilcoxon test.
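The paired comparison described above can be sketched with a small Wilcoxon signed-rank routine. This is an illustration under stated assumptions, not the procedure behind Table 19: it uses the two-sided normal approximation without tie or continuity correction (the text does not state which variant was used), and it takes as input only the five average error rates quoted in this paper for the multinomial model with and without FP-Growth.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.

    Returns (W, z, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped and tied |differences| share the
    average rank. Assumes at least one nonzero difference.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w, z, p


# Average error rates (%) quoted in the text for the partitions with
# 10, 20, 30, 40 and 50 training samples.
mn_fpg = [2.0, 3.0, 3.9, 8.5, 10.0]
mn = [42.0, 53.0, 48.0, 48.0, 48.0]
w, z, p = wilcoxon_signed_rank(mn_fpg, mn)
print(w, round(z, 3), round(p, 3))
```

With only five pairs the normal approximation is coarse, so the result should be read as indicative only.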

Limitations of the Study
This study is not without constraints and limitations. The size of the sample dataset available for the research is a limitation that cannot be overlooked. Prediction accuracy on a large dataset demonstrates the efficiency of an algorithm better than prediction accuracy on a mid-range or small dataset. It is believed that the more training and validation data available to machine learning algorithms, the more reliable the classification or prediction result will be. This limitation arises because much of the patient data collected from the Department of Ear, Nose and Throat at Hospital Pakar Sultanah Fatimah, Muar, came without an audiogram. Some of these patients were diagnosed with nose or throat disease, which does not require any hearing measurement. Another constraint is the format of the collected data. The data were collected on paper and therefore had to be converted into a digital format. This was tedious work because every air and bone conduction hearing threshold value had to be recorded with the corresponding patient data. One drawback of using a small dataset is that not every training example contains, as a subset, an itemset that passes the minimum support value. Such training examples are excluded from the training set, as seen in this study, where only 242 of the 399 training examples in the dataset were chosen. In a very large dataset, a large percentage of the training examples can form the training set because most of them will contain an item set that passes the minimum support value.
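The exclusion effect described above can be illustrated with a minimal sketch. It assumes a brute-force itemset enumeration as a stand-in for FP-Growth, a hypothetical `min_support` of 0.5, and hypothetical toy records; none of these are taken from the study. Each record is encoded as 0/1 flags over the frequent itemsets, and records that contain no frequent itemset are dropped.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force frequent itemsets up to max_size (stand-in for FP-Growth)."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(cset <= t for t in transactions)
            if count / len(transactions) >= min_support:
                frequent.append(cset)
    return frequent


def to_binary_features(transactions, vocabulary):
    """Encode each transaction as 0/1 flags, one per frequent itemset.

    Transactions that contain no frequent itemset are dropped.
    Returns (kept_indices, feature_rows).
    """
    kept, rows = [], []
    for idx, t in enumerate(transactions):
        row = [int(itemset <= t) for itemset in vocabulary]
        if any(row):
            kept.append(idx)
            rows.append(row)
    return kept, rows


# Hypothetical toy records, not data from the study.
transactions = [
    frozenset({"TNTS", "250:30 R", "F"}),
    frozenset({"TNTS", "250:30 R", "M"}),
    frozenset({"VTG", "500:25 R", "F"}),
    frozenset({"GIDDINESS", "1000:10 L"}),
]
vocabulary = frequent_itemsets(transactions, min_support=0.5)
kept, rows = to_binary_features(transactions, vocabulary)
print(vocabulary, kept, rows)
```

In this toy case the last record shares no frequent itemset with the others and is excluded, which is the same mechanism that reduces the usable training set in a small dataset.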

Conclusions
The main contribution of this work is a symptom detection model that accurately classifies symptoms of hearing loss using a hybrid of two machine learning approaches, the Frequent Pattern Growth (FP-Growth) and naïve Bayes (NB) algorithms. FP-Growth is an unsupervised method used for feature extraction, while the NB models are supervised models used for classification. The correlation between the hearing thresholds and symptoms of hearing loss was identified. The experiments were conducted under two scenarios: small sample and large sample datasets. The proposed model efficiently addressed the challenges related to diagnosis and feature extraction. This study has shown that FP-Growth and association analysis algorithms can be used to uncover the hidden relationships between hearing-loss symptoms and audiometry thresholds in patients with hearing loss. A strong correlation between some pure-tone audiometry thresholds and tinnitus, giddiness and vertigo symptoms was discovered in a sample of air conduction pure-tone audiometry data from 50 patients. One of the more significant findings to emerge from this study is the correlation between the results of the first study on a smaller data sample and those of its extension to a dataset of 399 hearing-loss patients. These findings suggest that there is a connection between audiometry thresholds and hearing-loss symptoms. The two experiments showed the existence of this relationship, and the hybrid of the FP-Growth and naïve Bayes algorithms was found to identify hearing-loss symptoms efficiently, with a very small error rate and a high accuracy rate. The average accuracy rate and average error rate for the multivariate Bernoulli model with FP-Growth feature transformation, using five training sets, are 98.25% and 1.73%, respectively. The statistical test confirmed that the proposed model showed significant performance.
In future work, the dataset should be enlarged to improve the efficiency of the machine learning techniques. It is believed that the more training and validation data available to a machine learning algorithm, the more reliable the classification or prediction result will be. To obtain higher accuracy and a better training process, deep learning methods are also suggested.