Periodontal Disease Classification with Color Teeth Images Using Convolutional Neural Networks

Abstract: Oral health plays an important role in people's quality of life as it is related to eating, talking, and smiling. In recent years, many studies have utilized artificial intelligence for oral health care. Many studies have been published on tooth identification or recognition of dental diseases using X-ray images, but studies with RGB images are rarely found. In this paper, we propose a deep convolutional neural network (CNN) model that classifies teeth with periodontal diseases from optical color images captured in front of the mouth. A novel network module with one-dimensional convolutions in parallel is proposed and compared to conventional models, including ResNet152. In the results, the proposed model achieved an accuracy 11.45 percentage points higher than the ResNet152 model, and the proposed structure enhanced the training performance, especially when the amount of training data was insufficient. This paper shows the possibility of utilizing optical color images for the detection of periodontal diseases, which may lead to a mobile oral healthcare system in the future.


Introduction
Oral health plays an important role in people's quality of life [1]. It contributes to well-being because it is related to daily activities such as eating, talking, and smiling. Recently, many researchers have proposed oral healthcare systems based on artificial intelligence (AI), which has many potential advantages [2][3][4]. Most studies on this topic have used dental panoramic radiography (DPR) to identify the types of teeth (often called tooth numbering) and to recognize ages and dental caries. The earliest studies on tooth numbering were conducted in 2005 using periapical images [5,6]. These studies classified molars and premolars using pattern recognition and numbered them according to the position of each tooth. Since then, studies on tooth detection and numbering have become popular with the use of deep learning techniques [7].
In 2017, Miki et al. classified teeth into seven types according to their location using 52 cone-beam computed tomography (CBCT) images. This study used AlexNet as the convolutional neural network (CNN) structure and obtained a classification accuracy of 91% [8]. Later, Oktay et al. pursued an alternative approach, using 100 DPR images to classify teeth into three types. They also used AlexNet as the CNN structure and obtained a classification accuracy of over 90% [9].
In 2018, Zhang et al. proposed a method to recognize 32 teeth and number them according to their location. They utilized the VGG16 model with a dataset of 1000 X-ray images, and their approach achieved a precision and recall of more than 95% [10]. Likewise, Tuzoff et al. proposed a computer-aided diagnosis solution to detect teeth and number them using the VGG16 model with a dataset of 1574 DPR images. Their system achieved over 99% sensitivity and precision for teeth detection and over 98% for teeth numbering [11].
In 2019, Chen et al. studied tooth detection and numbering based on the ResNet model, with 1250 dental periapical images [6]. Muramatsu et al. studied tooth recognition and numbering using 100 DPR images. They used DetectNet, GoogLeNet, and ResNet structures and obtained a teeth detection sensitivity of 96% and a classification accuracy of more than 93% [12].
In 2020, Sukegawa et al. conducted a study that classified 11 implant types using 8859 dental X-ray images that included implants. They used VGG-16 and VGG-19 models and obtained a classification accuracy of more than 90% [13]. Kim et al. studied tooth and implant recognition and numbering using 303 DPR images. They used a region-based CNN (R-CNN) together with heuristics and obtained an accuracy of more than 96% for tooth recognition and more than 84% for tooth numbering [14]. Yasa et al. proposed a faster R-CNN model based on the GoogLeNet Inception v2 network for tooth recognition and numbering using 1125 bitewing radiographs. Their model exhibited high sensitivity and precision, with values of 97.48%, 92.93%, and 95.15% for the sensitivity, precision, and F-measure, respectively [15].
In 2021, Kılıc et al. proposed a method for the automated detection and numbering of deciduous teeth in children using 421 DPR images. They implemented a faster R-CNN model based on the Inception v2 network. The performances were 96%, 95%, and 98% for the F1 score, precision, and sensitivity, respectively [16].
Görürgöz et al. studied tooth recognition and numbering with 1686 X-ray images. They utilized GoogLeNet and obtained an F1 score, precision, and sensitivity of 87%, 78%, and 98%, respectively [17]. Estai et al. studied tooth recognition and numbering based on VGG-16 with 591 DPR images. They achieved a recall and precision of more than 99% for tooth detection and more than 98% for tooth numbering [18].
Thus, the usability of deep learning models for teeth in X-ray images has been widely studied. However, the detection of dental diseases has received comparatively little attention. Prajapati et al. classified 251 DPR images of disease-infected teeth using VGG-16 [19]. They obtained a classification accuracy of more than 88%. In another study, in 2020, You et al. designed a model for judging the area of calculus using tooth images obtained by an intraoral camera [20]. A tooth region was cropped manually to train the AI model, and the calculus regions were identified from a single-tooth image. The model was trained on color images of a tooth using a disclosing agent. Its mean intersection-over-union (MIoU) was 0.724 ± 0.159, compared to an MIoU of 0.652 ± 0.195 for a dentist with 20 years of experience.
Despite these recent advances in teeth classification and oral healthcare using X-ray images [8,11,12,16], relatively few studies in the literature use color images. Li et al. detected plaque regions in tooth color images using super-pixel-level features [21]. The detection accuracy was 86.42% with 607 tooth images. You et al. also presented a method to detect the plaque region in a tooth image using a transfer learning technique. The MIoU of the detected regions was statistically similar to that of the regions detected by dentists [20].
One of the issues in dental disease classification is the collection and preparation of a labeled dataset, as studies using color teeth images are not yet common. It is difficult to obtain a precise model with a small amount of data because deep neural networks generally require a large dataset for training.
In this paper, we propose a neural network model that recognizes the presence of periodontal diseases, including calculus and inflammation, with a small teeth dataset. The teeth images, taken in front of the mouth using an optical camera, were collected over the internet. The proposed method automatically detects the tooth area and classifies tooth images as with or without the diseases. Novel network structures were designed by utilizing one-dimensional convolutions and shortcuts.
The remainder of this paper is organized as follows. Section 2 describes the research method by explaining the data collection and network models. Section 3 presents the experimental results. Finally, Section 4 gives the conclusions.

Data Acquisition
To train the tooth image recognition model, we used 220 frontal tooth images collected over the internet. The collected data include 82 images of healthy teeth and 138 images of teeth with calculus or inflammation. During the data collection, we selected teeth images that covered the entire tooth region, taken with a mouth opener. Figures 1 and 2 show examples of healthy teeth and teeth with calculus. The manual classification of calculus was conducted by two experts on dental hygiene and a dentist. This dataset is open to the public on our website (https://github.com/PKNU-PR-ML-Lab/calculus) (accessed on 20 March 2023). The tooth regions were labeled with rectangles using open-source software (https://roboflow.com) (accessed on 15 February 2023). All labeling tasks were conducted manually to minimize errors (Figure 3). All the images in the dataset were resized to (640, 640, 3) because the deep neural network requires a fixed image size. The images were resized with a fixed aspect ratio, and the empty regions were padded with zeros. Color values were normalized to between 0 and 1.
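The resizing step described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names, the nearest-neighbour sampling, and the centred placement of the resized image on the zero canvas are all assumptions, since the text only states that the aspect ratio is kept, empty regions are zero, and colour values lie between 0 and 1.

```python
def letterbox_geometry(width, height, target=640):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit."""
    scale = min(target / width, target / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Assumption: the resized image is centred; the text does not say where
    # the zero padding is placed.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def letterbox(image, target=640):
    """Resize a nested-list RGB image (rows of [r, g, b] in 0..255) onto a
    target x target zero canvas and scale colour values to [0, 1]."""
    height, width = len(image), len(image[0])
    new_w, new_h, pad_left, pad_top = letterbox_geometry(width, height, target)
    canvas = [[[0.0, 0.0, 0.0] for _ in range(target)] for _ in range(target)]
    for y in range(new_h):
        src_y = min(height - 1, y * height // new_h)  # nearest-neighbour row
        for x in range(new_w):
            src_x = min(width - 1, x * width // new_w)  # nearest-neighbour column
            pixel = image[src_y][src_x]
            canvas[pad_top + y][pad_left + x] = [c / 255.0 for c in pixel]
    return canvas
```

For a 1280 x 960 photograph, `letterbox_geometry(1280, 960)` scales the image to 640 x 480 and leaves 80 rows of zeros above and below it. In practice an image library with proper interpolation would replace the nearest-neighbour loop.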

Method Overview
The proposed method of calculus recognition follows two steps: teeth region detection and the classification of teeth with calculus or inflammation (Figure 4). The teeth region detection was employed to increase the accuracy of the proposed system and to avoid overfitting when the amount of training data is limited. We adopted 10-fold cross-validation to validate the methods precisely. In each fold, ninety percent of the data was used for training and the remainder was used for testing. Training and testing were performed 10 times, and no test data were repeated across the folds. The same divisions into training and test data were used for both the teeth region detection and the classification of calculus or inflammation.
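The 10-fold policy above can be sketched as a small index splitter. This is an illustrative sketch under stated assumptions: the function name, the shuffling, and the fixed seed are not taken from the paper, and no stratification by class is applied because the text does not mention it. Reusing one seed for both the detection and classification stages keeps the divisions identical, as the text requires.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: shuffled once, fixed seed
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With the 220 images described in the text, each fold holds 22 test images
# and 198 training images.
splits = list(k_fold_indices(220, k=10))
```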

Tooth Region Detection
Tooth detection was performed using YOLOv5 [22,23]. YOLOv5 was developed by the Ultralytics group in 2020 and has been widely used for various tasks. YOLOv5 is known to be more accurate and significantly faster than conventional YOLO [23]. YOLOv5 has five sub-models: "x", "l", "m", "s", and "n". We utilized the "s" model, as only one object needs to be detected. The number of training epochs was set to 300, the batch size was set to 8, and pretrained weights were used for transfer learning. Stochastic gradient descent (SGD) was used as the optimizer, and the momentum, learning rate, and decay were set to 0.98, 0.01, and 0.001, which are the default parameters for the optimizer of YOLOv5.
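As a rough illustration of how these settings could be passed to the Ultralytics YOLOv5 training script, the sketch below only assembles the command line and the hyperparameter entries; it does not run training. The file names `teeth.yaml` and `hyp_teeth.yaml` are hypothetical, and mapping the paper's "decay" to the `weight_decay` key of the YOLOv5 hyperparameter file is an assumption.

```python
# Values as stated in the text. The keys follow the YOLOv5 hyperparameter
# YAML files; treating "decay" as weight_decay is an assumption.
HYPERPARAMETERS = {"lr0": 0.01, "momentum": 0.98, "weight_decay": 0.001}


def hyp_to_yaml(hyp):
    """Render the overrides as YAML lines, to be merged into a copy of the
    repository's default hyperparameter file."""
    return "\n".join(f"{key}: {value}" for key, value in hyp.items())


def build_train_command(data_yaml, hyp_yaml):
    """Assemble the argument list for YOLOv5's train.py (not executed here)."""
    return [
        "python", "train.py",
        "--img", "640",              # input size used in the data preparation
        "--batch", "8",
        "--epochs", "300",
        "--data", data_yaml,         # hypothetical dataset description file
        "--weights", "yolov5s.pt",   # pretrained "s" model for transfer learning
        "--hyp", hyp_yaml,           # hypothetical hyperparameter file
    ]


command = build_train_command("teeth.yaml", "hyp_teeth.yaml")
print(" ".join(command))
print(hyp_to_yaml(HYPERPARAMETERS))
```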

Calculus Classification
The proposed network structure for the classification of calculus or inflammation is illustrated in Figure 5. In overview, the network was designed by stacking convolution blocks together with max pooling layers. A global average pooling layer summarizes the extracted features from the convolutional blocks, and dropout and fully connected layers follow to classify the features into two groups.
In this paper, we propose a novel way of utilizing two 1D convolutional layers by placing them in parallel (see the parallel conv block in Figure 5). One of the 1D convolutions was designed to detect features along the horizontal axis, and the other was designed to detect vertical features. The filter sizes of both layers were the same, but the directions of the filters were orthogonal. The number of weights of the network was reduced by two-thirds in comparison to the case of 2D convolutions.
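The effect on the weight count can be illustrated with a simplified tally. The sketch below is an assumption-laden estimate, not the authors' calculation: it ignores biases, assumes both 1D branches have the same input and output channel counts as the 2D layer they replace, and uses a channel count of 64 purely for illustration. Under these assumptions the fraction of weights saved is 1 - 2/S, so it depends on the filter length S, which this section does not state.

```python
def conv2d_weights(s, c_in, c_out):
    """Weights of one S x S 2D convolution (biases ignored)."""
    return s * s * c_in * c_out


def parallel_1d_weights(s, c_in, c_out):
    """Weights of one [1, S] plus one [S, 1] convolution (biases ignored)."""
    return 2 * s * c_in * c_out


def reduction(s, c_in=64, c_out=64):
    """Fraction of weights saved by the parallel 1D pair, equal to 1 - 2/S."""
    return 1.0 - parallel_1d_weights(s, c_in, c_out) / conv2d_weights(s, c_in, c_out)


for s in (3, 5, 7):
    print(s, conv2d_weights(s, 64, 64), parallel_1d_weights(s, 64, 64),
          round(reduction(s), 3))
```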
The parallel convolutional block has shortcut paths for bypassing the 1D convolutional layers inside. By allowing the bypass, the network can learn with deeper layers, as was proven with ResNet [24].
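The idea of the parallel block can be sketched on a single-channel image in plain Python. Several details here are assumptions rather than the published design: the two branches and the shortcut are merged by addition, a ReLU follows the sum, zero padding keeps the output the same size as the input, and the filter length is odd. A real implementation would use a deep learning framework with multi-channel layers.

```python
def conv_1d(image, kernel, horizontal):
    """Zero-padded, same-size 1D convolution over a 2D list.
    horizontal=True applies a [1, S] filter, otherwise an [S, 1] filter."""
    half = len(kernel) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for k, weight in enumerate(kernel):
                yy, xx = (y, x + k - half) if horizontal else (y + k - half, x)
                if 0 <= yy < h and 0 <= xx < w:
                    total += weight * image[yy][xx]
            out[y][x] = total
    return out


def parallel_block(image, kernel_h, kernel_v):
    """Horizontal and vertical 1D branches in parallel, plus a shortcut.
    Assumption: branches and shortcut are summed, then passed through ReLU."""
    horiz = conv_1d(image, kernel_h, horizontal=True)
    vert = conv_1d(image, kernel_v, horizontal=False)
    return [
        [max(0.0, horiz[y][x] + vert[y][x] + image[y][x])
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]


impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
response = parallel_block(impulse, [1, 1, 1], [1, 1, 1])
```

With a single bright pixel as input, the response spreads along the row and the column through the pixel but leaves the corners at zero, which illustrates why such a block does not capture diagonal features directly. With all-zero filters the block returns its input unchanged, which is the bypass behaviour of the shortcut.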
Figure 6 illustrates the variations of convolutional blocks designed to evaluate the effectiveness of the parallel convolutions and shortcuts. Type A removes the shortcut; type B places the two 1D convolutional layers serially; types C and D each use a single 1D convolutional layer. The convolution size of type C was [1,S], whereas that of type D was [S,1]. Type E utilized a conventional 2D convolutional layer. The primary difference between types A and E is the number of weights, as type A does not consider diagonal features.

Tooth Detection
Several studies have utilized deep transfer learning to detect teeth from radiographic tooth images (see Table 1), but studies using color images are rare. Tooth detection was successful when pretrained networks were employed for the faster R-CNN model. Various types of networks, such as AlexNet, VGG, and GoogLeNet, were utilized, and the accuracies were higher than 95% for all the pretrained networks. In this study, we achieved an F1 score of 99.9% and an mAP50 of 99.5% for the teeth region detection. This is partly because the task was relatively easier than individual tooth detection.
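For readers unfamiliar with the detection metrics, the sketch below shows the two building blocks behind them: the intersection-over-union (IoU) of two boxes, which mAP50 thresholds at 0.5 to decide whether a detection matches a labeled box, and the F1 score computed from true positives, false positives, and false negatives. The function names and the (x1, y1, x2, y2) box format are assumptions made for illustration; this is not the evaluation code used in the study.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def is_match_at_50(predicted, labeled):
    """A prediction counts as correct for mAP50 when its IoU is at least 0.5."""
    return box_iou(predicted, labeled) >= 0.5
```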

Classification of Periodontal Disease
The proposed model showed superior accuracy for the classification of periodontal disease, as listed in Table 2. It achieved 74.54%, whereas the ResNet152 model achieved 63.09%. The results also indicate that the proposed parallel convolutions and shortcuts were effective, as the mean accuracy decreased when they were removed or replaced with other components. Removing the shortcut decreased the accuracy by 7.72%, and the models with single 1D vertical/horizontal convolutional layers, serial structures, or 2D convolutional layers achieved 69.54%, 68.19%, 67.73%, and 65.00%, respectively. This indicates that the proposed model learned effectively from the small number of training images.
Table 3 lists recent reports on the performance of the detection and classification of tooth diseases. As discussed in the Introduction, studies on this topic using color images are hard to find, and it is difficult to compare the results directly because of differing experimental conditions such as the dataset. One of the recent advances in this field is the work of Liang et al. [25], who detected and classified calculus, gingivitis, and deposits. The area under the curve (AUC) of their model was 80.11%. The accuracy was not reported in numbers, but the reported AUC graph indicated that the accuracies were lower than 80%. The present study, however, achieved an accuracy 11.45 percentage points higher than ResNet152, and the accuracy is expected to increase with additional training images.

Figure 1. Examples of healthy teeth images.

Figure 2. Examples of teeth images with calculus and inflammation.

Figure 3. Example of frontal tooth image data (a) and its labeled areas (b).

Figure 5. Proposed network structure for calculus and inflammation classifications.

Figure 6. Five different types of convolutional blocks.

Table 1. Results from different studies of tooth recognition models based on transfer learning.