1. Introduction
Oral frailty is a risk factor for physical frailty and is related to quality of life. In previous studies [1,2], we suggested that patients with dysphagia have a lower position of the hyoid bone on panoramic radiographs. However, this anatomical structure has received little attention in dental treatment. In our recent study [3], we found that, in patients diagnosed with dysphagia by videofluoroscopic examination of swallowing, the vertical position of the hyoid bone was significantly lower than in people without dysphagia. We also investigated a cut-off value for how low the hyoid bone could be observed in the vertical direction before indicating a high probability of dysphagia. On the basis of these articles, we suggested that it is important to check the position of the hyoid bone, not only with respect to oral frailty but also with respect to the risk during dental treatment. General dentists typically do not focus on the position of the hyoid bone. In their place, an artificial intelligence (AI) system could automatically check the position of the hyoid bone and alert them to the risk of dysphagia if the hyoid bone is in a low position. We believe this would make it possible to prevent the decline of oral function from an earlier stage.
Advances in computer processing power have made it possible to analyze vast numbers of images in relatively short time periods. As a result, AI systems have evolved and are now being applied in various fields, ranging from daily life to medicine.
In the field of dentistry, the usefulness of AI in image diagnosis is now being investigated. In addition, researchers have sought to evaluate diagnostic ability using AI in a number of recent studies, as described below.
Kabir et al. [
4], Yilmaz et al. [
5], and JH Lee et al. [
6] investigated the extraction of normal anatomical structures. Shaffi et al. [7] performed tooth lesion detection using deep learning and the Internet for automated healthcare diagnosis.
Regarding diseases, Fatima et al. [8] proposed a lightweight Mask-RCNN model for the detection of periapical disease. Mao et al. [
9] detected furcation involvement on molar teeth, and Son et al. [
10] discussed automatic fracture detection in the maxillofacial area. Ha et al. [
11] evaluated the detection of supernumerary teeth, and Okazaki et al. [
12] investigated diagnostic accuracy in the detection of odontomas and impacted teeth. Other studies have examined multiple diseases, including the detection of cysts and tumors by Yang et al. [13]. In addition, Tareq et al. [14] attempted diagnostic evaluation of dental caries using smartphone images of teeth, and found that this may be useful for remote dental treatment.
Concerning evaluation of the degree of growth and development, Li et al. [
15] assessed the maturity of the cervical spine using AI. With regard to evaluation of non-anatomical structures, Park et al. [
16] investigated automatic extraction of implant bodies on radiographs using AI.
It can be seen, then, that many diagnostic imaging studies using AI have been reported to date. More generally, Putra et al. [
17] focused on diseases such as dental caries, periapical lesions, periodontal disease, and cystic benign tumors. Finally, Thurzo et al. [
18] summarized the frequency and trends of studies on AI in the dental field over the last 10 years, and found that published papers were particularly focused on the field of radiology.
The purpose of this study was to perform image diagnosis of the vertical position of the hyoid bone on panoramic radiographs using AI.
2. Materials and Methods
2.1. Acquisition of Panoramic X-ray Images
Panoramic radiographs of 915 patients aged 20 to 95 years, who visited the Departments of Periodontology, Orthodontics, and Oral Rehabilitation of our university between June 2013 and February 2021 and underwent panoramic radiography, were used for AI analysis.
A Hyper-XF radiography machine (Asahi Roentgen Ind. Co., Ltd., Kyoto, Japan) was used. Exposure parameters were set to 78 to 82 kV, 10 mA, and 12 s. Panoramic radiographs were taken according to a standardized protocol. During exposure, the patient bit down on a cotton roll to prevent infection and was instructed to relax the tongue. Images in which the hyoid bone moved during the exposure were excluded, as were patients with clear suspicion of jaw deformity or tumor based on the images. The image quality of the panoramic radiographs was assessed according to the criteria of Izetti et al. [19]: symmetry, inclination of the occlusal plane, localization of the mandibular condyles, the aspect of the upper teeth root apexes, and the position of the cervical spine were assessed.
Each panoramic radiograph measured 1976 × 976 pixels.
2.2. Ethical Statement
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Showa University (SUDH0034).
2.3. Vertical Position Classification of the Hyoid Bone
The vertical position of the hyoid bone was classified into 6 types based on the classification method described in our previous paper [
3].
Figure 1 shows the scheme of the hyoid bone position classification, cited from our previous paper [3].
Two landmarks were defined as follows:
The bilateral mandible line: A simulated line connecting the right and left angles of the mandible.
The mandibular border line: The line obtained by translating the bilateral mandible line, parallel to itself, to the lowest point of the lower border of the mandible.
An evaluation was conducted of the extent to which the hyoid bone body and greater horn appeared in the area above the mandibular border line. The images were categorized into the following six groups:
Type 0: The hyoid bone could not be observed above the mandibular border line;
Type 1: Only the greater horn was observed above the mandibular border line;
Type 2: A small part of the hyoid body was observed above the mandibular border line;
Type 3: Half of the hyoid body was observed above the mandibular border line;
Type 4: All of the hyoid body was observed above the mandibular border line;
Type 5: The hyoid body overlapped with the mandible.
If the vertical position of the hyoid bone differed between the right and left sides, the lower side was recorded.
Evaluations were carried out by a dental radiologist (Y.M.) with 37 years of experience and another (E.I.) with 3 years of experience. When the ratings of these two radiologists differed, they reviewed the images together to reach a consensus.
Since the hyoid bone can be seen on both the left and right sides of a panoramic X-ray image, the positions of the left and right hyoid bones were evaluated. As a result, a total of 1830 hyoid bone sites in 915 images were evaluated.
Table 1 shows the number of cases by type.
2.4. Convolutional Neural Network Selection
In this study, we used YOLOv5, a deep learning method for object detection, to extract the vertical position of the hyoid bone from panoramic radiographs and develop a learning model that predicts the risk of dysphagia.
Annotations
We annotated the data used for learning; specifically, the correct label and coordinate information of the object were added as annotations.
When providing coordinate information for the image, we specified the mentum in addition to the hyoid bone as the coordinates of the target object so that learning included the positional relationship between the hyoid bone and the virtual line, which is the standard for type classification.
Next, in order to improve learning accuracy, we augmented the data: the dataset was doubled by left–right flipping, with the coordinate information flipped accordingly.
When providing the coordinate information, the hyoid bone area was set from the mentum to the edge of the projected image so that it would have roughly the same area as in the other types.
Figure 2 shows an image with coordinate information on both sides.
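The left–right flip augmentation described above can be sketched as follows. This assumes YOLO-format annotation lines ("class x_center y_center width height", with coordinates normalized to [0, 1]); the class id and box values below are illustrative, not taken from the study's data:

```python
# Flip a YOLO-format annotation line horizontally. In normalized
# coordinates only the x-centre changes: x' = 1 - x. The corresponding
# image would be mirrored with the same transformation.
def flip_annotation(line):
    cls, x, y, w, h = line.split()
    x_flipped = 1.0 - float(x)       # mirror the horizontal centre
    return f"{cls} {x_flipped:.6f} {y} {w} {h}"

# Illustrative annotation: class 3, box centred at (0.25, 0.80).
print(flip_annotation("3 0.250000 0.800000 0.120000 0.060000"))
```

Because the y-centre, width, and height are unchanged by a horizontal mirror, only one coordinate needs to be recomputed per box.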
2.5. Learning Method
The number of groupings and the number of learning groups are shown in
Figure 3. The following four learning methods were used:
Plan 1: Each of the 6 types, from Type 0 to Type 5, was trained and evaluated.
Plan 2: Type 0 was considered difficult to learn because the hyoid bone was not visible, so it was excluded, and the 5 types from Type 1 to Type 5 were trained and evaluated. This learning model was set to output Type 0 when the hyoid bone was not detected.
Plan 3: Types 0 and 1 were combined into one group, designated Class A. Similarly, Types 2 and 3 were grouped together to form Class B, and Types 4 and 5 to form Class C. These three classes were trained and evaluated.
Plan 4: Class A (Types 0 and 1) was not learned, because in Types 0 and 1 the hyoid bone was not observed or only partially visible. The remaining two classes, Class B (Types 2 and 3) and Class C (Types 4 and 5), were learned. This learning model was set to output Class A when the hyoid bone was not detected.
2.6. Cross Validation
In this study, we performed cross validation.
Table 2 shows the number of training sets, validation sets, and test sets for each plan. The number of epochs was set to 100, and the batch size was set to 2. The confidence threshold was examined under various conditions, and the value with the highest average F-value was adopted.
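A minimal k-fold split, as used in cross validation, can be sketched with the standard library alone. The value k = 4 here is an assumption for illustration; the actual split sizes are those reported in Table 2:

```python
# Partition sample indices into k folds; each fold serves once as the
# held-out test set while the remaining folds form the training set.
def k_fold_splits(n_samples, k):
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in folds:
        train = sorted(i for fold in folds if fold is not held_out for i in fold)
        yield train, held_out

# With 8 samples and 4 folds, each index appears in exactly one test set.
for train_idx, test_idx in k_fold_splits(8, 4):
    print(test_idx)
```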
2.7. Evaluation
In this study, recall, precision, F-values, and accuracy were calculated as the evaluation values of the learning model. Recall expresses the proportion of actual positives that were correctly predicted as positive, while precision expresses the proportion of predicted positives that were actually positive. The calculation formulas may be expressed as follows:
Recall (true positive rate, TPR) = TP/(TP + FN).
Precision = TP/(TP + FP).
F-value = 2 × Precision × Recall/(Precision + Recall).
Accuracy = (TP + TN)/(TP + FP + TN + FN).
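As a minimal sketch, these evaluation values can be computed directly from confusion-matrix counts; the counts below are illustrative, not results from this study:

```python
# Compute recall, precision, F-value, and accuracy from the four
# confusion-matrix counts (true/false positives and negatives).
def evaluate(tp, fp, tn, fn):
    recall = tp / (tp + fn)                       # true positive rate
    precision = tp / (tp + fp)
    f_value = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return recall, precision, f_value, accuracy

# Illustrative counts only.
r, p, f, a = evaluate(tp=80, fp=10, tn=90, fn=20)
print(round(r, 2), round(p, 2), round(f, 2), round(a, 2))
```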
PR-AUC (area under the precision–recall curve) was plotted to evaluate classification ability, and the AUC values under the PR curve were calculated. The AUC of a random model is 0.5, and predictive/diagnostic ability was judged based on the AUC value, as follows:
AUC value of 0.9 or higher: high accuracy.
AUC values above 0.7 and below 0.9: moderate accuracy.
AUC value greater than or equal to 0.5 and less than 0.7: low accuracy.
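A PR curve and its area can be sketched from confidence-scored detections as follows. This is a simplified trapezoidal approximation with illustrative labels and scores, not the study's actual evaluation pipeline:

```python
# Build (recall, precision) points by sweeping the detection threshold
# from highest to lowest score, then integrate with the trapezoidal rule.
def pr_auc(labels, scores):
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    tp = fp = 0
    points = [(0.0, 1.0)]            # conventional start of the PR curve
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    auc = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        auc += (r1 - r0) * (p0 + p1) / 2
    return auc

# Illustrative ground-truth labels and detector confidence scores.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
print(round(pr_auc(labels, scores), 2))
```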
3. Results
3.1. PR Curves and AUC Values
Figure 4 shows the PR curves and AUC values for each plan. Plan 2, which did not include Type 0 in learning, had a higher average AUC value than Plan 1, which included Type 0 in learning.
Plans 3 and 4 were evaluated with fewer classes than Plans 1 and 2. Plan 3, which was trained with all classes, had a better average AUC value than Plans 1 and 2, with an average AUC value of 0.9.
Plan 4, which was learned without Class A, had a higher average AUC value than Plan 3, which was learned with Class A included. Plans 1 and 2 both had an average AUC value of less than 0.9, and the lowest AUC values for each group were 0.57 and 0.61, respectively.
On the other hand, Plans 3 and 4 both had an average AUC value greater than 0.9. In addition, the lowest AUC values for each group were 0.86 and 0.95, respectively.
3.2. Precision, Recall, F-Values, and Accuracy
Table 3 shows the evaluation values for precision, recall, F-values, and accuracy. Comparing Plans 1 and 2, the precision and recall values of Type 0 were higher in Plan 2 than in Plan 1. In the evaluation of the six types of Plan 1, the recall values of Types 1 and 2 were lower than those of the other types. Comparing Plans 3 and 4, the precision values for Class A (Types 0 and 1) were higher in Plan 4 than in Plan 3.
Plans 3 and 4, which reduced the number of groupings, had higher precision and recall values than Plans 1 and 2. The highest accuracy value of 0.93 was achieved by Plan 4.
4. Discussion
4.1. Hyoid Bone Detection
In our previous study, we found that, of the 43 patients diagnosed with dysphagia, 28 had either a Type 0 or Type 1 hyoid bone position. We found that dysphagia was observed when the hyoid bone was positioned below the mandibular border line [
2]. However, the hyoid bone is an anatomical structure that has received little attention in the literature concerning panoramic radiographs. In light of this result, in the present study, we aimed to detect the position of the hyoid bone on panoramic radiographs using AI.
In particular, we found that suspected cases of dysphagia in which the hyoid bone was not visible or only partially visible could be accurately assessed.
4.2. Regarding Research Plan Setting Conditions
In Type 0, the position of the hyoid bone could not be learned because the hyoid bone was barely visualized or only partly visible. Plans 1 and 2 were compared in order to examine how much the learning outcome would change by excluding the group that could not be learned.
Plans 3 and 4 were designed to reduce the number of classifications, with Plan 4 examining how much learning outcomes would change by excluding the class in which the hyoid bone was not visible or only partially visible.
4.2.1. Plan 1: Learning the Position of the Hyoid Bone in Six Groups
The precision value of Plan 1 showed the lowest average value, 0.68. This may be because there were too many groups, and because, when the hyoid bone was positioned low, it appeared partially cut off. In addition, calcification of the thyroid cartilage and of the carotid artery was included in the diagnostic area, so there was a high possibility of erroneous recognition.
4.2.2. Plan 2: Learning the Position of the Hyoid Bone in Five Groups
Compared to Plan 1, precision and recall values for Type 0 improved from 0.47 to 0.87, and from 0.69 to 0.80, respectively. Accuracy also improved from 0.68 to 0.76.
However, the average precision and recall values were 0.68 vs. 0.70, and 0.62 vs. 0.69, respectively, indicating no significant improvement.
4.2.3. On Reducing Grouping
Compared to Plan 2, in Plans 3 and 4, where classes of two types were learned, average precision values were higher, at 0.90 and 0.86, respectively, and average recall values were also higher, at 0.83 and 0.87, respectively. As one of the methods for improving diagnostic accuracy, it may be suggested that when evaluating the degree of visibility of anatomical structures, it is possible to improve the learning effect by not dividing into many groups.
We reduced the number of learning groups by combining them, and by not learning groups that were difficult to detect; hence, we were able to obtain high diagnostic performance. As the number of categories to be classified and the maximum number of objects to be found in an image increase, the number of detection results increases, adding to the processing load and possibly reducing accuracy.
This may be because, in supervised learning, reducing the number of groupings makes the classes easier to separate, and diagnostic accuracy can therefore be improved.
Regarding the number of groupings, a previous paper by Okazaki et al. [
12] may be recalled. These authors investigated whether abnormal images of different teeth could be correctly diagnosed; however, the targets of that study were single supernumerary teeth and odontomas. Concerning the number of groups, Park et al. [16] studied the classification of various types of dental implant systems (DISs) using a large-scale multicenter dataset of panoramic X-ray and intraoral images. They found no significant difference between the results for the two image types, and concluded that high precision could be obtained for both.
In this study, we investigated differences in the position of the hyoid bone, but we were able to obtain good results by reducing the number of groups rather than increasing them. This may be due to the increased number of cases per group.
4.3. Recognition of Missing Images
In the case of a simple shape, such as an implant body, AI may be able to provide a diagnosis even if part of the image is missing. However, in the case of a U-shaped structure such as the hyoid bone, when only a small part is within the region of interest, it may be quite difficult to recognize it as a hyoid bone even if it is included. Elmahmudi and Ugail [20] reported learning by dividing a face, so that even if only half of the face was captured, it could be individually recognized as a face. Using this method, the rate of recognizing half of the hyoid bone as the hyoid bone may increase.
4.4. Limitations of This Study
There is a method of measuring the position of the hyoid bone relative to the lower border of the mandible in analysis using lateral cephalometric radiographs, and we have also used this measurement method in our previous research [
3]. However, no previous papers have analyzed the position of the hyoid bone on panoramic radiographs. We consider this point to be a limitation of our research.
4.5. About the AI Program
AI learning in the medical field is mainly evaluated using a method called the convolutional neural network (CNN), specifically, object detection by deep learning with CNN. Object detection methods include R-CNN and YOLO. In the case of R-CNN, methods such as Faster R-CNN, Mask-RCNN and Cascade R-CNN have been developed. Studies on panoramic X-ray diagnosis using R-CNN include that of Li et al. [
15], for the extraction of cervical vertebrae.
Disadvantages of R-CNN include slow processing times and large memory consumption. The reason for this is that it is necessary to trace thousands of anatomical structures of interest or manually select regions containing the anatomical structures one by one. After that, it is necessary to repeat the steps of convolution and pooling for each of them.
Yilmaz et al. [
5] compared the performance of YOLO and R-CNN with respect to object detection in panoramic radiography diagnosis. They examined the accuracy and speed of tooth detection on panoramic radiographs, and concluded that the YOLOv4 method outperformed the Faster R-CNN method in terms of the accuracy of tooth prediction, the speed of detection, and the ability to detect impacted and erupted third molars.
In the present study, we used YOLO, which is one of the more widely used object detection methods. Among object detection algorithms, YOLO is notable for its very high processing speed. YOLO’s object recognition method divides the entire image into square grids in advance, and judges whether the target object is included in each grid. In addition, because bounding-box setting and analysis are performed simultaneously, the analysis speed is greatly improved. As a result, we believe that high-speed real-time object detection will be possible. In addition, false detection, in which an object is recognized from a blank background, is reduced to a considerable degree. YOLO is license-free and can be used commercially; in addition, YOLOv5 runs in Python and can easily be trained on researchers’ own datasets.
In the present study, Type 4, in which the entire hyoid bone was visible, had a large number of cases, and all the plans in our study exhibited high precision and recall values for it. This is probably because such anatomical structures rarely overlap and are relatively easy to find, but it is also thought to reflect the characteristics of learning via YOLO. On the other hand, when only a part of the hyoid bone could be seen, as in Type 1, or when it could not be seen at all, as in Type 0, detection may have been difficult to learn.
5. Conclusions
The vertical hyoid bone position is important for assessing the risk of dysphagia, but dentists usually do not focus on this position. In this study, we were able to create an AI program for the automatic detection of the position of the hyoid bone by reducing the number of classes and increasing the number of cases in each class, and by not learning cases in which the hyoid bone was not visible or only partially visible.
In the future, we would like to create a program that can automatically alert users when the hyoid bone is in a low position.