Digestive diseases are a serious threat to human health. A quick, automatic, accurate, and robust fecal examination approach could greatly reduce the burden on medical inspectors. Unfortunately, this type of approach has remained elusive due to the scarcity of datasets and the low accuracy of existing methods; accordingly, time-consuming manual examination methods are still widely used in most hospitals. In manual examinations, medical inspectors must work close to the feces samples, which carries a substantial risk of cross infection. Moreover, fecal examination is unpleasant work because of the overpowering stench. Since January 2020, many people have been infected with the novel coronavirus that causes COVID-19, and many medical staff have been infected through continuous contact with sources of infection, which include not only saliva but also feces. Fecal examinations are highly important in the clinical diagnosis of digestive diseases. If feces samples can be acquired with vision sensors, and the feces in the acquired images can be quickly, automatically, accurately, and robustly detected and recognized, medical staff will save considerable diagnosis time and patients will receive their diagnostic reports more quickly. Fecal examinations are widely applied to evaluate the probability of developing or relapsing into digestive diseases [1], so various related methods have been proposed in recent years.
Fecal examinations can be broadly divided into macroscopic examinations and microscopic examinations. The camera sensor-based macroscopic examination is a fast and convenient assessment method for prescreening various serious digestive diseases. The trait examination is an important item in fecal macroscopic examinations, so we propose a light-weight practical framework for trait recognition that works well in a real hospital environment. The model should be updated as the number of collected samples grows; therefore, subsequent model updates are also considered in this paper. Because of its light-weight structure, our framework can be fine-tuned with low hardware requirements. Moreover, the light-weight framework can be conveniently embedded into a mobile fecal examination machine.
Because of the scarcity of fecal trait datasets labeled by professional doctors, few related automatic diagnosis systems have been designed and this research field is developing slowly. To the best of our knowledge, the only report on fecal trait recognition is found in [2]. However, this method cannot maintain its recognition performance in real hospital environments because its dataset was collected in uncontrollable environments. To alleviate this problem, we compiled a dataset in which all of the acquired images are used only for scientific research, with the consent of the patients. It is worth emphasizing that all the images were labeled by professional doctors, so the dataset has high research and medical value. Moreover, the fecal images were carefully classified into five classes, as shown in Figure 1, namely, tar, paste, mucus, watery, and loose.
Our approach is composed of three stages: illumination normalization, object detection, and classification diagnosis. The first two stages together form the preprocessing stage. The illumination condition comprises the illumination source and the illumination level. The illumination sources in our fecal detection machine are diffuse, such as natural light and ordinary lamps (unlike spotlights), so the principal illumination variation lies in the illumination level. Since illumination conditions are not exactly uniform across hospitals, the digital images acquired with the camera sensors are not always at the same illumination level, and the images therefore need to be normalized to a uniform level. Then, the feces object is detected with segmentation to remove the disturbance of the background. Finally, a light-weight practical convolutional neural network (CNN) is proposed for feature extraction and trait recognition. The main contributions of this paper include:
A novel research field for public health is proposed, in which the fecal image dataset is collected in a real hospital environment and labeled by professional doctors. This dataset has a high clinical value and a high medical value;
A quick, automatic, accurate, and robust diagnosis framework is proposed. The feces object is accurately detected with a well-designed threshold-based segmentation scheme on the selected color component to reduce the background disturbance. We find that a plain CNN does not cope well with illumination variance; in contrast, our framework has a strong tolerance for illumination variance, and the trait classification accuracy is satisfactory;
Our light-weight framework is economical and meets the requirements of practical applications. The computational complexity is low and the number of parameters is small. It is feasible and convenient to fine-tune the structure with common hardware resources.
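The threshold-based segmentation on a selected color component can be sketched as follows. The specific color component and threshold rule are design choices of the framework and are not fixed here, so this minimal sketch assumes one hypothetical channel and an Otsu-style automatic threshold:

```python
import numpy as np

def otsu_threshold(channel):
    """Find the threshold that maximizes between-class variance (Otsu's method)."""
    hist, _ = np.histogram(channel, bins=256, range=(0, 256))
    total = channel.size
    cum_count = np.cumsum(hist)                     # pixels below each level
    cum_sum = np.cumsum(hist * np.arange(256))      # intensity mass below each level
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0 = cum_count[t - 1]       # background class size
        w1 = total - w0             # foreground class size
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_sum[t - 1] / w0
        mu1 = (cum_sum[-1] - cum_sum[t - 1]) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def segment_feces(image, component=2):
    """Threshold one color component of an H x W x 3 image to separate
    the object from the background; component=2 is a hypothetical choice."""
    channel = image[:, :, component]
    return channel > otsu_threshold(channel)  # boolean foreground mask
```

The resulting binary mask can then be used to blank out the background before classification.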
The rest of this paper is organized as follows. Section 2
introduces the related work. Section 3
specifies our framework in detail. The experiments and discussion are in Section 4
. The conclusions are drawn in Section 5.
2. Related Works
Fecal examinations can be roughly divided into traditional methods and computer-aided methods. Traditional fecal examinations include physical-based and chemical-based methods. Kopylov et al. [3
] confirmed the correlation between calprotectin and the small bowel, and assessed the small bowel diagnostic accuracy based on calprotectin. Costea et al. [4
] tested 21 representative DNA extraction protocols and recommended a standardized fecal DNA extraction method. Teimoori et al. [5
] applied a developed recombinant O. viverrini cathepsin F to diagnose human opisthorchiasis. Inpankaew et al. [6
] compared Kato–Katz and a simple sodium nitrate flotation technique in the identification of eggs in feces samples. Cai et al. [7
] applied a TaqMan based real-time polymerase chain reaction to detect C. sinensis DNA in feces samples. The methods mentioned above require demanding professional skills as well as expensive sensors, instruments, and reagents. Moreover, these methods may pollute the environment [8].
Computer-aided medical diagnosis systems for macroscopic examinations have several advantages, such as fast examination speed, high accuracy, low risk of cross infection, and lower professional skill requirements. Many automatic diagnosis systems have been designed with computer technology. Theriot et al. [9
] used a logistic model to classify the patients with non-C. difficile diarrhea, C. difficile infection, and the patients who are asymptomatically colonized with C. difficile. Carvalho et al. [10
] combined fuzzy logic and a support vector machine (SVM) to diagnose lung nodules; the fuzzy rule was designed by a professional doctor. Similarly, Soundararajan et al. [11
] proposed a fuzzy logic-based knowledge system for tuberculosis recognition. The aforementioned methods are based on manual feature extraction.
With the rapid development of artificial intelligence, especially deep learning, in recent years, convolutional neural networks (CNNs) have been widely studied and have achieved excellent results in different computer vision tasks, such as image enhancement [12
], segmentation [13
], tracking [14
], detection [15
], and recognition [16
]. The features learned with a CNN do not rely heavily on manual modeling, so their robustness and accuracy are usually better than those of manual methods. In the intelligent healthcare field, CNNs perform well in analyzing medical images [19
]. Sun et al. [20
] implemented three network structures and some traditional methods, and their deep belief network yielded a satisfactory accuracy in diagnosing lung cancer based on computed tomography (CT) images. Arabasadi et al. [21
] proposed a hybrid method in which a genetic algorithm was used to initialize the parameters, and the CNN was used to extract the features and classify cardiovascular diseases. Oktay et al. [22
] incorporated prior anatomical knowledge into CNNs through a novel regularization model. Since fusion can improve the performance in many ways [23
], Liu et al. [26
] proposed a novel network layer that effectively fuses the global information from the input, and a novel multi-scale input strategy that acquires multi-scale features. Li et al. [27
] proposed a novel 3D self-attention CNN for the low-dose CT denoising problem; the structure acquired more spatial information. Tschandl et al. [28
] trained two CNNs with dermoscopic images and clinical close-ups images, respectively, and combined the outputs of CNNs to diagnose nonpigmented skin cancer. In addition, Singhal et al. [29
] applied a CNN to analyze the emotion variances of people based on electroencephalograms.
Object detection is widely used in the medical diagnosis field. Many object detection methods have been proposed and have yielded remarkable results in recent years [30
]. Yang et al. [32
] proposed a novel object detection method that combined multi-scale features and an attention-based rotation network. Pang et al. [33
] improved “you only look once” (YOLO) [34
] to detect concealed objects. Yang et al. [35
] proposed a real-time cascaded framework to detect tiny faces. Yuan et al. [36
] proposed a scale-adaptive CNN to detect occluded targets and track them. Zhao et al. [37
] detected the salient object according to the difference between the feature maps of different depths. Fu et al. [38
] proposed a general unified framework to detect the salient object, which is composed of a "skip-layer" architecture, a "top-down" architecture, a "short-connection" architecture, and so on. These CNN-based methods yield satisfactory accuracies, but they require a large amount of data and high-performance hardware to train the networks. Because of the particularity of feces samples, in which neither the location nor the shape is fixed, it is infeasible to directly label the images to train CNNs for feces object detection.
Vision systems can provide more efficient and diverse information than other sensing modalities [39], so medical images are usually acquired by visual sensors. However, images typically contain much private information, and few patients are willing to provide their pathological samples, let alone fecal samples. Few researchers have investigated computer-aided fecal examinations because of the lack of datasets. Hachuel et al. [2
] applied ResNet [42
] to classify macroscopic feces images into three classes, namely, constipation, normal, and loose. The images were collected in uncontrollable environments. The structure is very deep, making it unsuitable for a real hospital environment, which is a unified and controllable environment. Nkamgang et al. [43
] extracted histogram orientation gradient (HOG) features from the microscopic images and applied a neuro-fuzzy classifier for the diagnosis of intestinal parasite diseases. Yang et al. [44
] proposed a shallow CNN dubbed StoolNet to classify the colors of fecal images.
4. Experiments and Discussion
The experimental setup is as follows: Intel Core i5-8250U CPU, 8 GB of RAM, and an NVIDIA GeForce MX150 GPU. All code was written in Python, using the PyCharm IDE. The deep learning framework is TensorFlow [49].
To the best of our knowledge, no public fecal examination dataset exists to date. Even though fecal examinations are very important to healthcare, the lack of a dataset has suppressed the development of related studies. Professional doctors carefully categorized the fecal images into five typical trait classes: tar, paste, mucus, watery, and loose. Illumination variance typically deteriorates recognition accuracy, so we needed to test our approach under different illumination conditions. However, it is difficult to collect fecal images under different illuminations, so we simulated the effects of various illumination scales, as shown in Figure 5
. We collected and augmented the fecal trait images using the method in [50
]. Each image was rotated three times and inverted. If the shape of the object in one of the newly augmented images already existed among the previous augmented images, that augmented image was deleted. Hence, each original image has seven augmented images. The augmented images were then corrupted with Gaussian noise. Finally, the total number of images is 6336.
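One way to read the scheme above (three rotations plus an inversion yielding seven variants per original) is as enumerating the eight dihedral symmetries of each image, removing variants whose shape duplicates an earlier one, and then adding noise; this interpretation, and the noise level, are assumptions of the sketch:

```python
import numpy as np

def dihedral_augment(image, noise_sigma=8.0, seed=0):
    """Produce the dihedral variants of an image (identity, three 90-degree
    rotations, and their mirrored versions), drop exact duplicates that
    arise from symmetric shapes, then add Gaussian noise to every variant.
    The noise level and seed are placeholder assumptions."""
    rng = np.random.default_rng(seed)
    variants = []
    for base in (image, np.fliplr(image)):      # original and mirrored
        for k in range(4):                      # 0, 90, 180, 270 degrees
            v = np.rot90(base, k)
            # Delete a variant whose content already exists among the others.
            if not any(np.array_equal(v, u) for u in variants):
                variants.append(v)
    # Corrupt every surviving variant with Gaussian noise.
    return [np.clip(v.astype(float) + rng.normal(0.0, noise_sigma, v.shape),
                    0, 255).astype(np.uint8)
            for v in variants]
```

An asymmetric image yields eight variants (the original plus seven), while a perfectly symmetric one collapses to fewer, matching the duplicate-deletion rule described above.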
In all experiments, 75% and 25% of the dataset were used as the training set and the testing set, respectively. To suppress information learned from the shape, images with the same shape were placed only in the training set or only in the testing set. The shape of an original image served as the reference shape; its rotations and inversion generated different shapes, while the other augmented images were considered to share the reference shape. Two main reasons motivated this: first, to verify the effect of our method in a tough environment, we deliberately increased the difficulty by reducing the number of shape styles that could be learned from the training set; second, two feces samples rarely have the same shape in practice, so this operation is reasonable.
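The shape-disjoint split described above could be implemented by partitioning at the level of shape groups rather than individual images; in this sketch the grouping structure, split ratio handling, and random seed are assumptions:

```python
import random

def shape_disjoint_split(groups, train_fraction=0.75, seed=0):
    """Split a dataset so that all images sharing a shape fall entirely
    in the training set or entirely in the testing set.

    `groups` maps a shape identifier to the list of images with that
    shape (a hypothetical bookkeeping structure)."""
    rng = random.Random(seed)
    shape_ids = sorted(groups)
    rng.shuffle(shape_ids)                      # randomize shape assignment
    cut = int(len(shape_ids) * train_fraction)  # shapes, not images, are split
    train = [img for s in shape_ids[:cut] for img in groups[s]]
    test = [img for s in shape_ids[cut:] for img in groups[s]]
    return train, test
```

Splitting by shape groups means the achieved image-level ratio only approximates 75/25, which is the price of guaranteeing that no shape leaks from training into testing.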
All images were preprocessed; the preprocessing stage includes illumination normalization and object detection with segmentation. The relationship between the network depth and the recognition accuracy is shown in Figure 6
. For the trait recognition task, three layers is a better choice than the other structures. When the number of layers was less than three, the corresponding layers were removed. When the number of layers was larger than three, the number of kernels was doubled layer by layer, and the stride remained one.
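To illustrate the doubling-kernel, stride-one design, the following helper computes per-layer output shapes and parameter counts for a stack of 'same'-padded convolutions; the input resolution, base kernel count, and kernel size are hypothetical, since the paper's exact values are not restated here:

```python
def conv_stack_params(input_shape=(128, 128, 3), base_kernels=16,
                      depth=3, kernel_size=3):
    """Report output shape and parameter count for each layer of a
    stride-1, 'same'-padded conv stack whose kernel count doubles
    layer by layer. All default values are illustrative assumptions."""
    h, w, c = input_shape
    layers = []
    kernels = base_kernels
    for i in range(depth):
        # Each kernel has (k*k*c) weights plus one bias.
        params = (kernel_size * kernel_size * c + 1) * kernels
        layers.append({"layer": i + 1, "out_shape": (h, w, kernels),
                       "params": params})
        c = kernels        # next layer's input channels
        kernels *= 2       # doubling rule from the text
    return layers
```

With these assumed values, a three-layer stack stays under 24k convolutional parameters, which is consistent with the paper's emphasis on a light-weight, easily fine-tuned structure.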
Images acquired with vision sensors in different hospitals are not always at the same illumination level, so we tested the illumination variance tolerance of our approach. If the training images and testing images are under highly different illumination levels, the recognition accuracy can degrade severely. Thus, in this experiment, the training set consisted of images under a single illumination, while the testing set consisted of images under different illuminations. As shown in Table 1
, illumination normalization and object detection can improve the robustness against illumination variance. Moreover, recognition accuracy is always best when the two preprocessing methods are applied jointly. As shown in Figure 7, "W-P", "T-D", and "I-N" denote "without preprocessing", "target detection", and "illumination normalization", respectively.
The target detection method is insensitive to illumination variance, because the proportion between the foreground and the background does not depend on the illumination level. The normalized Hamming distance measures the dissimilarity between the segmented images under the standard illumination and those under other illuminations. According to Table 1
, all the normalized Hamming distances are very low, which confirms that our target detection method is insensitive to illumination variance.
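Reading the normalized Hamming distance as the fraction of disagreeing pixels between two binary segmentation masks, it can be computed directly:

```python
import numpy as np

def normalized_hamming(mask_a, mask_b):
    """Fraction of pixels at which two equally sized binary
    segmentation masks disagree (0.0 = identical, 1.0 = opposite)."""
    if mask_a.shape != mask_b.shape:
        raise ValueError("masks must have the same shape")
    return float(np.count_nonzero(mask_a != mask_b)) / mask_a.size
```

A value near zero for masks produced under different illuminations indicates that the segmentation is stable against illumination change.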
There are very few fecal trait recognition methods, so we compared our method with the method in [2
] and the related state of the art [44
]. In [2
], the method was not designed for a real hospital environment. Furthermore, there are several substantial differences between this paper and [44
], which are summarized as follows:
The recognition tasks are different. In [44
], the method is designed for fecal color recognition, but the main objective of this paper was to design a quick, automatic, accurate, and robust method to classify the traits of fecal images. Color recognition and trait recognition are both important for macroscopic examinations, but typically trait recognition is more difficult than color recognition;
The method described in [44
] cannot maintain its level of performance in the task presented in this paper, which is demonstrated by the experimental results. The developed novel method in this paper can work well in the trait classification task;
The method described in [44
] cannot work well at different illumination levels. In contrast, the illumination problem has been solved well in this paper.
According to Table 2
, when the training set was composed of images under different illuminations, the accuracy of our method was at least 7.78% higher than that of the other methods. Besides, when the training set was composed of images under a single illumination, our method yielded at least a 13.96% higher accuracy than the other methods. The method described in [2] was designed for uncontrollable environments, which means that the structure is too bloated to converge well in controllable environments. In addition, the number of classes in our dataset is larger than in theirs, which implies that the classification difficulty was higher for our dataset. It can be concluded that a greater depth and a larger number of parameters do not always yield better recognition accuracy; instead, they can lead to convergence difficulties, especially in controllable environments. The depth of the CNN in [2
] is 18, while the depth of our network is only three. Hence, our method converges fast and yields a high recognition accuracy in a real hospital environment.
In this paper, we propose a novel, quick, automatic, accurate, and robust fecal trait examination approach. A valuable fecal trait dataset was collected in a real hospital environment, and all the images were labeled by professional doctors. The feces object was accurately detected with a well-designed threshold-based segmentation scheme on the selected color component to reduce the background disturbance. In addition, the illumination normalization scheme has a strong tolerance for illumination variance, and the recognition accuracy meets practical requirements. As a result of the light-weight structure, the computational complexity and the storage cost are both low, which is necessary for the application of automatic fecal examination machines and some edge devices, such as hand-held examination devices. Meanwhile, the shallow structure makes it feasible and convenient to fine-tune the model when more samples are collected. In future work, we will try to develop a general fecal examination system with more functions and collect more samples to enlarge our dataset.