Article Classification of Knee Joint Vibration Signals Using Bivariate Feature Distribution Estimation and Maximal Posterior Probability Decision Criterion

Analysis of knee joint vibration or vibroarthrographic (VAG) signals using signal processing and machine learning algorithms possesses high potential for the noninvasive detection of articular cartilage degeneration, which may reduce unnecessary exploratory surgery. Feature representation of knee joint VAG signals helps characterize the pathological condition of degenerative articular cartilages in the knee. This paper used the kernel-based probability density estimation method to model the distributions of the VAG signals recorded from healthy subjects and patients with knee joint disorders. The estimated densities of the VAG signals showed explicit distributions of the normal and abnormal signal groups, along with the corresponding contours in the bivariate feature space. The signal classifications were performed by using the Fisher’s linear discriminant analysis, support vector machine with polynomial kernels, and the maximal posterior probability decision criterion. The maximal posterior probability decision criterion was able to provide the total classification accuracy of 86.67% and the area (Az) of 0.9096 under the receiver operating characteristics curve, which were superior to the results obtained by either the Fisher’s linear discriminant analysis (accuracy: 81.33%, Az: 0.8564) or the support vector machine with polynomial kernels (accuracy: 81.33%, Az: 0.8533). Such results demonstrated the merits of the bivariate feature distribution estimation and the superiority of the maximal posterior probability decision criterion for analysis of knee joint VAG signals.


Introduction
The knee is composed of the femur, tibia, fibula, and patella (or knee cap), the extensive ligaments, and two main muscle groups (i.e., quadriceps and the hamstrings) [1].The knee joint has a crescent-shaped fibrocartilaginous structure, named meniscus, which fits into the joint between the tibia and the femur.It allows the bones to slide freely on each other in the joint.A pair of knees support nearly the entire weight of the human body, and help the body perform different mobility functions in the locomotion [2].The knee joint can afford moderate stress without much internal injury, but it is not able to withstand rotational forces that commonly occur in athletic activities.In addition to the sports injury, the degeneration or damage of cartilage in the articular surface may cause rheumatic disorders such as osteoarthritis [3].Screening of knee joint disorders at an early stage can provide preliminary information for physicians to undertake some appropriate therapy options in order to retard the degenerative process of arthritis [1].
Detection of knee joint disorders requires a thorough medical history, physical examination, and may need some additional tests such as magnetic resonance imaging (MRI) or computed tomography (CT).MRI is commonly used to characterize the in vivo orthopaedic condition of articular cartilage in the knee joint [4].It is a popular noninvasive method for the assessment of knee joint degeneration.Although MRI provides good image contrast that helps discriminate the articular cartilage and synovial fluid of the joint, it only displays anatomical morphology of the knee joint rather than the functional condition illustration during leg movement.In the standard CT scanning procedure, the X-ray beam moves in a circle around the body.CT images can provide much detailed information about the bone tissue and bone structure [4].However, CT scanning takes the risks of radiation exposure due to the cumulative number of X-ray examinations.
Arthroscopy is a common surgical procedure to visualize, diagnose, and sometimes treat the interior damage of a knee joint [1].In an arthroscopy examination, orthopaedic surgeons insert an arthroscope fiber through a small incision in the knee to inspect the structure inside the joint.Compared with the traditional open surgery, arthroscopy is a semi-invasive method that does not require to open up the joint fully.The knee arthroscopy just needs to make two small incisions: one for the arthroscope and the other for the surgical instruments to be used in the knee cavity.It could reduce recovery time and may also increase the rate of surgical success due to less trauma to the connective tissue.Although arthroscopy is widely used as the "gold standard" for relatively low-risk assessment of knee joint injury [3], it is not suited for the patients whose knees are in a highly degenerated state due to ligamentous instability, meniscectomy, or patellectomy.In addition, arthroscopy cannot be used for routine examinations of articular cartilage because the same incision may not undergo repeated arthroscope invasions.
Vibration arthrometry is an alternative and useful technology for the screening of knee joint pathology [5,6].The protocol usually requires the subject to perform voluntary leg movement, and uses accelerometers or electro-stethoscope sensors attached on skin of the knee to record the acceleration time series called vibroarthrographic (VAG) signal.The healthy articular surfaces are bathed by joint fluid and can prevent any collision in the knee joint during leg movements.For the patients who suffer from the degeneration of cartilages, the articular surface frictions are very likely to occur in walking and sporting activities [3,7].Knee vibration arthrometry can provide more information about the stability behavior of a knee joint, and is used as a noninvasive detection technique to study knee joint disorders [1].
The purpose of knee joint VAG signal analysis is to interpret distinct features in the temporal and frequency scales [8][9][10][11].With the statistical features extracted [12][13][14][15], computational algorithms can be utilized to distinguish anomalous patterns associated with pathology [10,[16][17][18].The aim of the present study is to use the kernel density estimation method with two-dimensional Gaussian kernels to represent the knee joint VAG signals in the bivariate feature space.The classification task is then performed by using the maximal posterior probability decision criterion that can distinguish the VAG signals into the normal or pathological group.

Dataset
The knee joint VAG signals analyzed in the present work were selected from the dataset used in a few previous related studies [10,[12][13][14][15].The signals were recorded from 75 subjects (47 healthy volunteers and 28 patients with knee joint disorders).The healthy volunteers were confirmed by medical history and physical examination.The patients were suffering from different pathologies, such as chondromalacia, meniscus tear, and anterior cruciate ligament injuries, which were confirmed by arthroscopy examinations.
In the data acquisition procedure, an accelerometer was attached at the middle position of the patella.The subjects who sit on a rigid table were requested to voluntarily swing the leg over the angle range from 135 o (approximately full flexion) to 0 o (full extension), and then back to 135 o in 4 seconds [16].The first half and the second half of the leg swinging are also referred to as extension and flexion movements, respectively.Auscultation of the knee joint using a stethoscope was also performed, and a qualitative description of sound intensity and type was recorded in correspondence with the joint angle.Each VAG signal was digitized with a sampling rate of 2 kHz and the resolution of 12 bit per sample.Prior to feature analysis, the baseline wander in each raw signal was removed by a cascade moving average filter [19], and the random noise was eliminated by a time-delay neural filter [20].

Features Description
Noting that the knee joint VAG signals recorded from the symptomatic patients with knee joint pathologies commonly exhibit higher variability or complexity, the corresponding signal patterns should possess different shape characteristics from those of healthy subjects.For a better representation of the normal and abnormal VAG signals, we considered two waveshape features extracted in the time scale for the classification task.
The first feature was the variance of the mean-squared (VMS) values in every fixed-duration segment (5 ms each) of a given VAG signal [13].The VMS values can be used to describe the local spread or dispersion degree of the signal in a short time span.The Kruskal-Wallis test results showed that the VMS values of the normal and abnormal VAG signals reached the level of significant difference (p < 0.01) [13].On the other hand, the form factor (FF) value was computed for each VAG signal to measure the shape complexity [12].The FF is computed as the ratio of mobility of the first derivative of the signal M s ′ to the mobility of the signal itself M s [21]: where the mobility M s denotes the square root of the ratio of the variance of the first derivative of the signal σ s ′ to the variance of the original signal σ s : Then the FF can be rewritten as where σ s ′′ represents the square root of the variance of the second derivative of the signal s ′′ .The p value of the FF feature estimated by the Kruskal-Wallis test was lower than 0.01 [12], which implied that the distributions for the normal and abnormal VAG signals were also significantly different.

Bivariate Probability Distribution Modeling
In order to estimate the distribution of the knee joint VAG signals in the bivariate feature space, we applied the multivariate kernel density estimation method.The kernel density estimation can be used to approximate arbitrary multimodal distributions without the assumption of underlying density functions [22,23].In the present study, we used the kernel functions to estimate the class-conditional probability density of the bivariate features for the two VAG signal groups respectively.Let u = [u x , u y ] T be the vector of the VMS and FF features.Suppose that a particular VAG group ω, (ω ∈ {ω N , ω A }, ω N and ω A indicating the normal and abnormal signal groups respectively) contains K signals in total.The class-conditional probability density is then estimated as [22]: The bivariate Gaussian kernel function The center of Gaussian kernel is ] T , which is the location of the k-th VAG signal in the group ω.The spread area of the Gaussian kernel is determined by the symmetric and positive definite covariance matrix, which is given by where σ x = 0.0109 and σ y = 3.1436 represent the standard deviation of the VMS and FF features respectively, ρ = 0.7756 denotes the correlation coefficient between these two features, and g is the scaling factor that controls the bandwidth of the Gaussian kernel.In order to estimate the appropriate probability distributions in the bivariate feature space, we searched the scaling factor in the range from 0.01 to 1, and chose the optimal scaling factor g = 0.1 that helped minimize the class-conditional probability error or risk between the normal and abnormal signal groups.

Signal Classification
In the present study, we performed the nonlinear classifications for the normal and abnormal signal groups.The linear classification was implemented by the Fisher's linear discriminant analysis and the nonlinear classification was performed according to the maximal posterior probability decision criterion.For a performance comparison, the support vector machine with polynomial kernels was also employed to distinguish the normal and abnormal signal groups.The details of the algorithms and experiment settings are presented as follows.

Fisher's Linear Discriminant Analysis
Fisher's linear discriminant analysis (FLDA) is a simple but effective pattern classification tool that searches a mapping orientation that leads to the best separation among the classes [24].In other words, the FLDA performs a projection of the multidimensional data onto a straight line so that the dimensionality of the complex dataset can be reduced [25].The FLDA is widely used in practical applications because it does not require the prior assumption that the classes are with normal distributions or equal covariances [26].
The objective of the FLDA algorithm is to seek a linear combination of features that yields the maximization of the discriminant function [27]: where w is the linear combination vector that indicates projection direction, S B represents the inter-class (or between-class) scatter matrix, and S W denotes the intra-class (or within-class) scatter matrix.For the binary classification of the knee joint VAG signals, the scatter matrices S B and S W are defined as [26]: and where m N and m A denotes the mean of u for the normal and abnormal signal groups, respectively.If the intra-class scatter matrix S W is nonsingular, the discriminant function J(w) can be maximized by the Lagrange multiplier method, and the projection direction that makes the best linear separation of the VAG signal groups is given by the solution

Maximal Posterior Probability Decision Criterion
According to Bayes' formula, the joint probability density of finding a signal with the feature vector u in a particular group ω can be written as p(u, ω) = p(u| ω)P (ω) = P (ω| u)p(u) (10) where P (ω) is the prior probability that indicates the occurrence of a signal group, P (ω| u) is the posterior probability that describes the probability of the state of a signal being in the group ω with respect to the corresponding measured feature vector u.For the VAG signal classification, the evidence factor p(u) is the sum of the product between the class-conditional probability density and prior probability for the normal and abnormal signal groups, i.e., Therefore, the posterior probability P (ω |u ) can be computed as [27]: With the measured feature vector u, a given VAG signal is then assigned the class label according to the maximal posterior probability (MPP) criterion as arg max

Support Vector Machine
The support vector machine (SVM) with inner-product kernels is a supervised learning model that can be used to perform nonlinear classification by mapping the inputs into high-dimensional spaces [28,29].By choosing different nonlinear inner-product kernels, the architecture of the SVM is equivalent to the polynomial learning machine (with polynomial kernels), the radial basis function network (with Gaussian kernels), or the multilayer perceptron with a single hidden layer (with sigmoid kernels) [27].The learning of the SVM follows the structural risk minimization rule by means of solving the quadratic programming problem [30].During the model parameter optimization process, a subset of the representative training data is selected as support vectors, with which the SVM constructs a hyperplane as the decision surface to maximize the margin of separation between the classes [27].
To select the appropriate kernels for the SVM, we compared the polynomial kernels, Gaussian kernels, and sigmoid kernels, and found that the polynomial kernels can help the SVM produce the best classification accuracy and confusion matrix results over the current dataset.Therefore, we employed the SVM with polynomial kernels to perform the knee joint VAG signal classification task in the present work.The polynomial kernel is expressed as [30]: where d denotes the degree of the polynomial and t represents the intercept.We searched the polynomial kernel parameters in the range from 1 to 10, and chose d = 3 and t = 4, which could make the SVM achieve the best classification performance.

Results
The estimated distributions of the normal and abnormal knee joint VAG signals in the bivariate VMS-FF feature space are illustrated in Figure 1   The FLDA and the SVM produced the same total classification accuracy of 81.33%, and the MPP method provided a higher correct accurate rate of 86.67%.The confusion matrices provided by the FLDA, SVM and MPP classifiers are listed in Tables 1-3, respectively.The true negative rate of the FLDA (specificity: 0.8936) was worse than that of the SVM (specificity: 0.9149), whereas the FLDA provided a better true positive rate (sensitivity: 0.6786) than the SVM (sensitivity: 0.6429), with an additional abnormal VAG signal correctly classified.Compared with the FLDA and SVM classifiers, the MPP method was able to correctly distinguish more VAG signals for the both normal and abnormal groups, with the highest values of sensitivity (0.75) and specificity (0.9362).The area under the receiver operating characteristic (ROC) curve A z of the SVM was 0.8533 (standard error, SE: 0.0467).Such results was slightly worse than that of the FLDA (A z ± SE: 0.8564 ± 0.0447), because the FLDA successfully classified one more abnormal signal than the SVM.The best ROC curve was obtained with the MPP method, with the highest A z value of 0.9096 (SE: 0.0332).It is very clear that the MPP method outperformed the linear FLDA and the SVM with nonlinear inner-product kernels for the knee joint VAG signal classification.Moreover, the results of the MPP method were better than those of the radial-basis function neural network with the features of form factors, skewness, kurtosis, entropy (A z : 0.8172) [12], or with the features of variance of mean-squared values and turns count with adaptive threshold (A z : 0.857) [13].Most of the misclassified normal signals were recorded from the elderly healthy subjects.These subjects were suspected in the course of chronic articular cartilage disorders, although there was no symptom found in the current physical examinations for them.By checking with the arthroscopy surgery records, we found that majority of the misclassified abnormal VAG signals were associated with the lateral or medial meniscus tears, and the most of the corrected distinguished abnormal signals were recorded from patients who suffered chondromalacia at different grades.Figure 3 shows a pair of abnormal signal segment samples recorded from the patients with Grade I-II chondromalacia (upper trace) and lateral meniscus tear (lower trace), respectively.It can be observed that the morphological characteristics of the two abnormal signal segments were different.The signal with chondromalacia exhibited a higher degree of fluctuations, and the morphology of the signal with lateral meniscus tear was closer to that of the signal recorded from healthy adults.

Discussion and Conclusions
Computer-aided analysis of knee joint vibration signals helps improve the noninvasive detection and localization of articular cartilage disorders at an early stage.The present study used the kernel-based probability density modeling method and maximal posterior probability decision criterion to study the VAG signals.The estimated densities showed the distributions of the VAG signals in the bivariate feature space, and the contours of the normal and abnormal signal groups were more explicit than the scatter plot of these signals.Experiment results of the signal classifications, in terms of total accuracy and area under the receiver operating characteristic curve, demonstrated the merits and superiority of the maximal posterior probability decision criterion, compared with the Fisher's linear discriminant analysis and support vector machine with polynomial kernels.
The present study still has some limitations because the number of the abnormal signals with chondromalacia and meniscus tears was not large enough to support the further study of the pathological subclasses.It is important to extract other morphological features to describe the dynamics characteristics of the abnormal VAG signals with chondromalacia and meniscus tears in the future work.The present protocol did not consider the knee flexion and extension under the loaded conditions, because some subjects with severe knee joint disorders were not able to perform the requested leg movement under loaded conditions.Further experiments will recruit more subjects with knee joint disorders at different stages and some subjects with knee pain but without knee injury.We are interested in studying the morphological features of VAG signals recorded from these subjects at different loaded conditions.We also plan to incorporate the methodology presented in this paper in a follow-up investigation to establish the feature distributions of the subjects with degenerative articular cartilage disorders at different stages.It is positively considered that the progressive changes of feature density distributions could provide useful information about the progress of articular cartilage degeneration, which could assist physicians and medical experts to assess the pathological conditions of patients.
. It can be observed that the normal and abnormal VAG signals locate in the distinctive regions in the feature space.The majority of the normal signals aggregates in the area with the VMS value smaller than 0.012 and the FF value lower than 6.The normal signals recorded from the healthy subjects possess the unimodal probability density, whereas the abnormal signals associated with pathological conditions in the knee joint present more of multimodal characteristics.

Figure 1 .
Figure 1.Distributions of the bivariate features for the normal and abnormal knee joint vibration signals.The distribution of the normal signals is shown with the cold color map (blue representing the highest density), whereas the distribution of the abnormal signals is illustrated with the hot color map (red representing the highest density).

Figure 2 .
Figure2.Decision boundaries provided by the Fisher's linear discriminant analysis (FLDA), the support vector machine (SVM) with polynomial kernels, and the maximum posteriori probability (MPP) estimation method.

Figure 3 .
Figure 3. Segments of the knee joint vibration signals associated with (a) Grade I-II chondromalacia and (b) lateral meniscus tear.

Table 1 .
The confusion matrix and area under the receiver operating characteristic (ROC) curve of the Fisher's linear discriminant analysis (FLDA).A z : area under ROC curve.SE: standard error.

Table 2 .
The confusion matrix and area under the receiver operating characteristic (ROC) curve of the support vector machine (SVM) with polynomial kernels.A z : area under ROC curve.SE: standard error.

Table 3 .
The confusion matrix and area under the receiver operating characteristic (ROC) curve of the maximal posterior probability (MPP) decision.A z : area under ROC curve.SE: standard error.