3.2. Quantitative Differential Analysis
Visual interpretation of cellular features using microscopy, i.e., image-based cytometry, plays a fundamental role in cytology and histopathology diagnosis activities [38]. The human visual system is able to qualitatively detect and interpret visual patterns with great efficiency [39,40]. Subjective evaluation is a reliable and accurate methodology, but it requires a great amount of time and effort, which makes it impractical in real-life large-scale applications and unachievable in automated ones. Moreover, subjective evaluation can be further complicated by reproduction devices, environmental conditions, inter-observer variance, and the observer's visual abilities and cognitive biases. To remedy these issues, a quantitative approach based on objective evaluation should be used. Using mathematical models and metrics, biological entities and phenomena can be described in less fuzzy terms, enabling quantitative, unbiased, reproducible and large-scale analysis [41].
Using a digital image as the data source, almost all the biological information is implicitly encoded at different levels of abstraction. To describe the object of interest in a formal and quantitative way, it is necessary to execute a feature extraction process [38]. A feature is a numeric quantity used to describe an image (or a restricted patch of it) at a high level of abstraction [42]. Usually, visual features are associated with the essential constructs of visual language, such as points, lines, shapes, colors, textures and patterns. According to the terminology proposed by Rodenacker et al. [38], a feature is denoted as morphometric if it describes only spatial relations and densitometric if it describes only pixel intensity levels, while a feature that combines both aspects is a structural feature.
A different categorization, proposed by Kothari et al., is based on the data–information–knowledge pyramid model [34]. Pixel-level features are the lowest in the hierarchy because their computation is based on pixel intensity data, and they convey the lowest biological interpretability value [34]. They are the easiest to extract and in some cases are satisfactory for content description. Object-level features convey a higher degree of biological information because they describe shapes, textures and the spatial distribution of cellular structures and substructures. Since the extraction is carried out at the object level, the entire process is highly dependent on the accuracy of detection or segmentation of the target objects, which in turn affects the robustness of any subsequent analysis [34]. Semantic-level features are the highest in the hierarchy because they tend to reduce the semantic gap between pixel data and biological concepts directly interpretable by specialists. Their values are based on classification or statistical rules computed on lower-level features [34].
The selection of the right set of features has a great impact on the analysis quality; as a rule of thumb, two main aspects should be considered:
Interpretability: A feature should describe the encoded visual information in a way that is easily interpretable by the final observers. The description should be based as much as possible on human visual perception principles.
Robustness: Visual information encoded in an image shows a high degree of variance. A robust feature should have properties like equivariance/invariance to illumination levels, noise, scale, orientation or translation.
Among the features known in the literature [38], those that best describe the visual content and discriminate between the investigated processing techniques, i.e., SM and CYT, have been chosen for this study. The workflow depicted in Figure 4 describes two main analysis types based on the region of interest (ROI) scale: global analysis, in which the ROI is the entire field image and intercellular relations are captured, and local analysis, in which the ROI is restricted to the area occupied by a single cell and intracellular features are captured. The main goal of quantitative differential analysis is to measure the differences between the empirical feature distributions of fields processed with SM and those processed with CYT.
To detect cellular structures, the algorithm implemented by the Rhinocyt extraction module [11] has been used. As discussed in Section 2.3, the over-segmentation problem has been solved by using the watershed transform augmented with additional markers extracted using Otsu segmentation and the EDT.
3.2.1. Density
Conceptually, field density is defined as the number of cytological specimens contained in the image field. Its computation is based on a simple pixel counting process. It is important to notice that field density takes into account not only the pixels associated with cells but also artifacts like dust, clots, scratches and stain spots (see Figure 5). The presence of artifacts in field images is marginal; thus, their quantitative contribution to the density measure is negligible. Moreover, field density tries to convey the information associated with the specialist observation process, as the specialist effort is directly proportional to the field density.
To extract the field density, the field RGB image has been processed with the filtering stage of the mean-shift algorithm. Using a weighted average of pixel neighbors (spatial window: 21 px; color window: 51 px), this step acts as a low-pass filter, dampening color shades and attenuating the high-frequency components of pixel colors. The smoothed image has been converted to grayscale to discard any color information, binarized using Otsu's thresholding method and dilated with a disk structuring element of 5-px radius to fill small holes. The resulting binary image has then been used to compute a normalized histogram; the field density and, in a complementary way, the field sparsity are defined as in Equation (1).
d(I′) = H(I′, 1) / (N × M),  s(I′) = H(I′, 0) / (N × M) = 1 − d(I′),  (1)
where I and I′ are the input and output images with size N × M, respectively, and H(I′,0) and H(I′,1) are the absolute frequencies of background and foreground pixels, respectively.
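A minimal sketch of this pipeline with OpenCV and NumPy is given below; the inverted Otsu threshold assumes bright-field images in which specimens are darker than the background, and the function name field_density is illustrative, not the Rhinocyt implementation:

```python
import cv2
import numpy as np

def field_density(field_bgr):
    """Compute field density and sparsity from a BGR field image."""
    # Mean-shift filtering stage: edge-preserving low-pass smoothing
    # (spatial window 21 px, color window 51 px, as in the text).
    smoothed = cv2.pyrMeanShiftFiltering(field_bgr, sp=21, sr=51)
    gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)
    # Otsu binarization; THRESH_BINARY_INV assumes specimens darker
    # than the bright-field background.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Dilation with a disk structuring element of 5-px radius fills small holes.
    disk = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
    binary = cv2.dilate(binary, disk)
    # Equation (1): density = foreground fraction, sparsity = its complement.
    density = np.count_nonzero(binary) / binary.size
    return density, 1.0 - density
```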
Field density is a densitometric pixel-level feature because the computation is mainly based on an image histogram that does not encode any spatial information. In this way, the feature is orientation and translation invariant and possesses a certain degree of robustness to illumination variance since a binarized image is used.
3.2.2. Agglomeration
The agglomeration feature is defined to quantify the cellular clumping phenomenon due to the mechanical forces exerted during the centrifugation process. Cellular clumping tends to reduce intercellular distance, creating cellular agglomerates. In the worst-case scenario, cells overlap, making field reading more difficult [14] and degrading automatic cell detection and extraction performance [35]. Cellular clumping is a generic phenomenon that could be addressed from multiple points of view; the definition proposed here is based on the construction of a cellular network that reflects the cellular spatial distribution patterns, highlighting the connectivity among the image regions [43].
Since feature extraction is based on the spatial position of cells in the field, agglomeration is an object-level feature. The Rhinocyt extraction module has been used for cell detection. As the first step, after the cellular structures have been detected, the associated image regions are used to compute a set of centroids C = {c1, …, cn}; each centroid is in a one-to-one relation with a cell. From the centroid information, an unweighted undirected graph G(V, E), called a region adjacency graph (RAG), is built. Each vertex vk ∈ V is associated with a centroid ck ∈ C, and an edge e ∈ E is defined for each pair of adjacent regions (see Figure 6). Two regions Ri and Rj are considered to be adjacent if Ri ∪ Rj is a connected set, using 8-connectivity. Using the RAG, the image field agglomeration α is defined in Equation (2).
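A plausible form of Equation (2), assuming that α is the number of RAG edges normalized by the number of detected cells (an assumption consistent with the bounds discussed next, though not necessarily the original formulation), is:

α = |E| / |V|.  (2)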
From Equation (2), it is clear that α attains its lower bound (α = 0) when the RAG has only isolated nodes (i.e., E = ∅). On the other hand, there is no clear analytical upper bound (α < ∞).
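A minimal sketch of this computation, assuming a labeled cell mask and the scikit-image RAG implementation (the per-cell normalization reflects the assumption above):

```python
from skimage.graph import RAG

def field_agglomeration(label_image):
    """Agglomeration from a labeled field image (one label per detected cell)."""
    # Region adjacency graph: one node per labeled region, one edge per pair
    # of regions whose pixels touch under 8-connectivity (connectivity=2).
    rag = RAG(label_image, connectivity=2)
    if 0 in rag:
        rag.remove_node(0)  # drop the background label, if present
    if rag.number_of_nodes() == 0:
        return 0.0
    # Assumed Equation (2): edges per cell; 0 when all cells are isolated.
    return rag.number_of_edges() / rag.number_of_nodes()
```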
Particular attention must be paid to feature interpretation: two fields with different numbers of cells could have a similar agglomeration value if they have, in proportion, the same number of agglomerated cells. Agglomeration is a morphometric feature since the computation is mainly based on the RAG that, unlike the original approach [43], encodes only spatial information without considering any pixel intensity information. Moreover, the feature is computed on a grayscale version of the input image and so is robust to color variation.
3.3. Rhinocyt System Evaluation
The same methodology used in the work of Dimauro et al. has been chosen to preserve analysis backward compatibility and validity [11]. The evaluation of the Rhinocyt system has been carried out separately for the extraction and classification modules. Common evaluation metrics found in the machine learning literature have been employed, namely, precision, recall, F-measure and confusion matrix. In both cases, the evaluation has been carried out as a supervised task requiring a manual labeling process on the selected acquired slides.
3.3.1. Extraction Process Evaluation
The extraction of cells from fields can be seen as an object detection task: extracted image regions correspond to detected cells, while discarded regions correspond to candidates not recognized as cells. To evaluate the extraction module, the 85 most-populated CYT fields have been sampled from multiple slides. A total of 449 “objects” have been automatically extracted from these 85 fields by the extraction module. The extracted objects have been analyzed and verified manually by a specialist considering: true positives, extracted objects that were real cells; false positives, extracted objects that were not real cells; false negatives, discarded objects that were real cells; and true negatives, discarded objects that were not real cells. The last category is the most difficult to define, since the “opposite of a cell” could be any object, like a smudge, a clot or an exploded cell; therefore, in the true negative count, only visually identifiable “objects” with a distance from each other greater than an empirical threshold of 100 px have been considered.
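For reference, the metrics named above follow their standard machine learning definitions in terms of these counts:

precision = TP / (TP + FP),  recall = TP / (TP + FN),  F-measure = 2 × precision × recall / (precision + recall).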
The specialist validation was conducted using the following principles:
If the cell area is occluded by more than 75%, the cell is not taken into account.
If the cell area is cut by field image borders, it is not taken into account.
If a single cell is detected in multiple regions covering more than 50% of the cell area, then it is considered a true positive. The same cell images are then manually merged so that the cell is counted only once. An example of this phenomenon, denoted as the “Splitting Problem”, is depicted in Figure 7.
A further analysis of the extraction module algorithm explains the causes behind the splitting problem presented in the previous sections. To detect the watershed transform markers, the extraction algorithm uses a maximum filter with a local window to compute local maxima. The window size is a fixed predefined parameter that does not adapt well to different cell sizes. If the cell area is much greater than the local window, multiple markers will be defined, producing an over-segmentation of the cell. The splitting problem mainly occurs with ciliated cells, which have an elongated morphology. From a clinical perspective, the number of ciliated cells does not influence the diagnosis, as they are normally present throughout the nasal mucosa, so this problem is not considered crucial.
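The marker-detection step described above can be sketched as follows with scikit-image and SciPy; this is an illustrative reconstruction rather than the Rhinocyt code, and the 41-px window is an arbitrary example of the fixed parameter in question:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def detect_cells(binary_mask, window=41):
    """Marker-based watershed over a binary cell mask."""
    # EDT: each foreground pixel gets its distance to the background,
    # so peaks sit near region centers and serve as watershed markers.
    distance = ndi.distance_transform_edt(binary_mask)
    # Fixed-size maximum-filter window: a cell much larger than the window
    # yields several local maxima, hence several markers and an
    # over-segmented ("split") cell.
    coords = peak_local_max(distance, footprint=np.ones((window, window)),
                            labels=binary_mask.astype(int))
    markers = np.zeros(distance.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    return watershed(-distance, markers, mask=binary_mask)
```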
3.3.2. Classification Evaluation
The cell classification experimented with here is fully based on the cell classification module as designed and trained in [11], where the experimentation was carried out on many stained slides from different patients. The Rhinocyt classification module is based on a convolutional neural network (CNN) pre-trained on the CELLS-SM dataset [11,12]. The model architecture is a vanilla CNN: an input layer of 50 px × 50 px full-color images, three blocks of convolutional and pooling layers (with ReLU activation function) for feature extraction, and a fully connected layer followed by a Softmax output layer for multiclass classification. Each input image is classified into a predefined set of seven classes (see Table 2 for details). The class “other” was defined for extracted objects not recognized as cells. The epithelial class merges two different cytotypes: metaplastic and ciliated.
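A minimal Keras sketch of such an architecture is given below; the filter counts and dense-layer width are illustrative assumptions, since the text specifies only the input size, the number of blocks and the output layer:

```python
from tensorflow.keras import layers, models

def build_classifier(num_classes=7):
    # Three convolution + pooling blocks with ReLU, then a fully connected
    # head with a Softmax output, mirroring the architecture described in
    # the text. Filter counts (32/64/128) and the dense width (128) are
    # illustrative assumptions, not the published values.
    model = models.Sequential([
        layers.Input(shape=(50, 50, 3)),  # 50 px x 50 px full-color input
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # 7-way output
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```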