Deep Learning-Based Screening of Urothelial Carcinoma in Whole Slide Images of Liquid-Based Cytology Urine Specimens

Simple Summary
In this study, we aimed to investigate the use of deep learning for classifying whole-slide images of urine liquid-based cytology specimens into neoplastic and non-neoplastic (negative). To do so, we used a total of 786 whole-slide images to train models using four different approaches, and we evaluated them on 750 whole-slide images. The best model achieved good classification performance, demonstrating the promising potential of such models for aiding the screening process for urothelial carcinoma in routine clinical practice.

Abstract
Urinary cytology is an essential diagnostic method in routine urological clinical practice. Liquid-based cytology (LBC) for urothelial carcinoma screening is commonly used in routine clinical cytodiagnosis because of its high cellular yields. Since conventional screening by cytoscreeners and cytopathologists using microscopes is limited in terms of human resources, it is important to integrate new deep learning methods that can automatically and rapidly diagnose a large number of specimens without delay. The goal of this study was to investigate the use of deep learning models for the classification of urine LBC whole-slide images (WSIs) into neoplastic and non-neoplastic (negative). We trained deep learning models using 786 WSIs by transfer learning, fully supervised, and weakly supervised learning approaches. We evaluated the trained models on two test sets, one of which was representative of the clinical distribution of neoplastic cases, with a combined total of 750 WSIs. The best model achieved an area under the curve for diagnosis in the range of 0.984–0.990, demonstrating the promising potential of our model for aiding urine cytodiagnostic processes.


Introduction
In routine clinical practice, clinicians obtain urinary tract cytology specimens for the screening of urothelial carcinoma [1,2]. Urine specimens play a critical role in the clinical evaluation of patients who have clinical signs and symptoms (e.g., haematuria and painful urination) suggestive of pathological changes within the urinary tract [3]. Urothelial carcinoma is the most common malignant neoplasm detected by urine cytology, and its most common site of origin is the bladder. According to the Global Cancer Statistics 2020 [4], bladder cancer is the tenth most commonly diagnosed cancer, with 573,278 new cases and 212,536 deaths worldwide in 2020. Approximately 90% of bladder cancers are urothelial in origin, and primary adenocarcinoma of the bladder is rare [3,5,6]. Bladder cancer often presents insidiously. Haematuria is the most common presentation of bladder cancer; it is typically intermittent, frank, painless, and at times present throughout micturition [3]. Delayed diagnosis of urothelial carcinoma is associated with high-grade muscle invasion, which has the potential to progress rapidly, and with cancer metastasis [3]. Cystoscopy with biopsy is the gold standard for the diagnosis of urothelial carcinoma in clinical practice; however, it is invasive and relatively inconvenient as a follow-up monitoring approach [7]. It has been reported that 48.6% of biopsy-proven low-grade urothelial carcinomas had a urine cytodiagnosis of atypical or suspicious for neoplasia, which has been taken to indicate that existing urine cytology screening and surveillance systems are accurate in diagnosing urothelial carcinoma [8]. Therefore, cytological urothelial carcinoma screening in urine specimens plays a key role in early-stage cancer detection and treatment in routine clinical practice [9,10].
Liquid-based cytology (LBC) was developed as an alternative to conventional smear cytology in the 1990s [11]. LBC has several advantages in preparation and in the diagnostic process compared with the conventional smear [12][13][14][15]. The LBC technique preserves the cells of interest in a liquid medium and removes most of the debris, blood, and exudate either by filtering or by density gradient centrifugation [16,17]. LBC provides automated and standardized processing techniques that produce a uniformly distributed and cell-enriched slide [18][19][20]. Moreover, residual specimens can be used for additional investigations (e.g., immunocytochemistry) [21][22][23]. ThinPrep (Hologic, Inc., Marlborough, MA, USA) and SurePath (Becton Dickinson, Inc., Franklin Lakes, NJ, USA) for LBC specimen preparation have been approved by the US Food and Drug Administration (FDA). Compared to conventional smear cytology, LBC has fewer background elements, provides better cell preservation, and has a higher satisfaction rate [24]. As for sensitivity, it has been reported that LBC achieved 0.58 (CI: 0.51-0.65) and the conventional smear achieved 0.38 [7,11]. The efficiency of diagnosis employing LBC is high because of its cell collection rate. It was shown that the accuracy of diagnoses made employing the LBC method can be increased by understanding the characteristics of the cell morphology in suspicious cases (e.g., high-grade urothelial carcinoma and low-grade urothelial carcinoma [25], and in other malignancies [26,27]). LBC specimens performed significantly better in urinary cytology when evaluating malignant categories, especially high-grade urothelial carcinoma (HGUC), which facilitates a more accurate diagnosis than conventional preparations [15]. 
Moreover, in terms of workload, preparation and screening times were 2.25 and 1.33-2.00 times greater, respectively, when using LBC (ThinPrep) compared with cytocentrifugation (conventional smear cytology) [28]. Therefore, computational screening (cytodiagnostic) aids for urine LBC specimens, as an application of medical image analysis, would be of great benefit for urothelial carcinoma screening.
Whole-slide images (WSIs) are digitisations of conventional glass slides obtained via specialised scanning devices (WSI scanners), and they are considered to be comparable to microscopy for primary diagnosis [29]. It has been reported that evaluation of WSIs is generally equivalent to using conventional glass slides under microscopy [30]. The use of WSIs has, to some degree, met the goals of saving pathologists' working time and providing high-quality pathological images with convenient access through easily navigable online viewing software, which saves resources and costs by eliminating glass slide shipping expenses [30]. The advent of WSIs led to the application of medical image analysis, machine learning, and deep learning techniques for aiding pathologists in inspecting WSIs [31]. Importantly, routine scanning of LBC slides as single-layer WSIs would be suitable for further high-throughput analysis (e.g., automated image-based cytological screening and medical image analysis) [20]. Indeed, deep learning approaches and their clinical applications to classify cytopathological changes (e.g., neoplastic transformation) have been reported in recent years [32][33][34][35][36][37][38][39][40][41].
In this study, we trained deep learning models based on convolutional neural networks (CNN) using a training dataset of 786 urine LBC (ThinPrep) WSIs. We evaluated the model on two test sets with a combined total of 750 WSIs, achieving ROC area under the curve (AUC) for WSI neoplastic classification in the range of 0.984-0.990.

Clinical Cases and Cytopathological Records
In this retrospective study, a total of 1556 LBC ThinPrep Pap test (Hologic, Inc.) conventionally prepared cytopathological glass slide specimens of human urine cytology were collected from a private clinical laboratory in Japan after routine cytopathological review of those glass slides by cytoscreeners and pathologists. The private clinical laboratory in Japan that provided the urine LBC glass slides in the present study was anonymized due to the confidentiality agreement. The LBC specimens were selected randomly to reflect a real clinical setting as much as possible. We also collected LBC specimens so as to compile test sets with an equal balance and a clinical balance of negative and neoplastic cases. The equal balance test set consisted of 50% negative and 50% neoplastic urine LBC cases (Table 1). The clinical balance test set consisted of a ratio of 10 (negative) to 1 (neoplastic) urine LBC cases, based on a real clinical setting reported by the Japanese Society of Clinical Cytology in its statistics on cytodiagnosis from 2016 to 2021 (https://jscc.or.jp/, accessed on 27 January 2022). Prior to the start of the experiments, the cytoscreeners and pathologists excluded inadequate LBC specimens (n = 21) which had inadequate cellularity or had significant artifacts such as dust or ink markings. All WSIs were scanned at a magnification of ×20 using the same Leica Aperio AT2 Digital Whole Slide Scanner (Leica Biosystems, Tokyo, Japan) and were saved in the SVS file format with JPEG2000 compression. Each WSI was observed by at least two cytoscreeners or pathologists to confirm the diagnosis, with the final checking and verification performed by a senior cytoscreener or pathologist. We have confirmed that cytoscreeners and pathologists were able to classify ( LGUC and suspicious for HGUC; Class V: HGUC). The cytoscreeners and pathologists had to agree on whether the output class was negative or neoplastic for each urine LBC WSI.

Annotation
A cohort of 62 training cases and 10 validation cases was manually annotated by experienced pathologists (Table 1). Coarse polygonal annotations were drawn free-hand using an in-house online tool developed by customising the open-source OpenSeadragon tool at https://openseadragon.github.io/ (accessed on 25 July 2021), which is a web-based viewer for zoomable images. On average, the cytoscreeners and pathologists manually annotated 180 cells (or cellular clusters) per WSI. Annotated neoplastic WSIs comprised the Class III, Class IV, and Class V cytodiagnostic classes; Class I and Class II WSIs were not annotated (Table 1). We set three annotation labels for neoplastic urothelial epithelial cells: atypical cell, low-grade urothelial carcinoma (LGUC) cell, and high-grade urothelial carcinoma (HGUC) cell (Table 2 and Figure 1). For example, on the Class III (Figure 1A,B), Class IV (Figure 1C,D), and Class V (Figure 1E,F) WSIs, cytoscreeners and pathologists drew annotations around the atypical cells (Figure 1A,B), LGUC cells (Figure 1C,D), and HGUC cells (Figure 1E,F) based on the representative neoplastic urothelial epithelial cell morphology (e.g., hyperchromatism, irregular chromatin distribution, abnormalities of nuclear shape, increased nuclear/cytoplasmic ratio, irregular nuclear distribution, nuclear enlargement, abnormal cytoplasm, prominent nucleolus, cellular and nuclear polymorphism). A WSI classified as Class V, for example, could contain atypical cell, LGUC cell, and HGUC cell annotations. In contrast, the cytoscreeners and pathologists did not annotate areas where it was difficult to determine cytologically that the cells were neoplastic. The negative subset of the training and validation sets (Table 1) was not annotated, and the entire cell spreading areas within the WSIs were used. The average annotation time per WSI was about 90 min. 
Annotations performed by the cytoscreeners and pathologists were modified (if necessary) and verified by a senior cytoscreener.

Deep Learning Models
We performed training using transfer learning with fine-tuning using two different weight initialisations: ImageNet (IN) and pre-training on a uterine cervix (UC) neoplastic (×10, 1024) dataset from a previous study [35]. We used two different approaches for training during fine-tuning: fully supervised (FS) and weakly supervised (WS) learning. We used a modified version of EfficientNetB1 (ENB1) [42] with a tile size of 1024 × 1024 px. This resulted in a total of four models, all trained at magnification ×10 and tile size 1024 × 1024 px: ENB1-UC-FS+WS, ENB1-UC-WS, ENB1-IN-FS+WS, and ENB1-IN-WS. In addition, for comparison with other model architectures, we trained models using ResNet50V2 [43], DenseNet121 [44], and InceptionV3 [45]. We trained these models using the FS + WS method with initialisation from ImageNet, as we did not have access to models of these architectures trained on uterine cervix specimens. We fine-tuned the models using the partial fine-tuning approach [46], which consists of fine-tuning only the affine parameters of the batch-normalization layers and the final classification layer (Figure 2), starting with weights pre-trained on ImageNet. Figure 2 shows an overview of the training method and the trained deep learning models. The training methodology that we used in the present study was the same as that reported in our previous studies [47]. We performed slide tiling by extracting square tiles from tissue regions of the WSIs. We started by detecting the tissue regions in order to eliminate most of the white background. This was done by thresholding a grayscale version of the WSIs using Otsu's method [48]. For the CNN, we used the EfficientNetB1 architecture [42] with a modified input size of 1024 × 1024 px to allow a larger view; this is based on cytologists' input that they usually need to view the neighbouring cells around a given cell in order to diagnose more accurately. 
We used the partial fine-tuning approach [46] for tuning the CNN component.
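The tissue detection step can be illustrated with a minimal NumPy sketch of Otsu's method applied to an 8-bit grayscale thumbnail. This is not the authors' implementation; the function names `otsu_threshold` and `tissue_mask`, and the assumption that cells are darker than the background, are ours for illustration.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold of an 8-bit (uint8) grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                # weight of the class "value <= t"
    w1 = 1.0 - w0                       # weight of the class "value > t"
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    # Thresholds that leave one class empty are not valid candidates.
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b))

def tissue_mask(gray: np.ndarray) -> np.ndarray:
    """Boolean mask of pixels at or below the Otsu threshold.

    Assumption: stained cells are darker than the white slide background.
    """
    return gray <= otsu_threshold(gray)
```

In practice the mask would be computed on a low-resolution thumbnail of the WSI, and tile coordinates would then be restricted to regions where the mask is true.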
For training and inference, we then proceeded by extracting 1024 × 1024 px tiles from the tissue regions. We performed the extraction in real-time using the OpenSlide library [49]. To perform inference on a WSI, we used a sliding window approach with a fixed-size stride of 512 × 512 px (half the tile size). This results in a grid-like output of predictions on all areas that contained cells, which then allowed us to visualise the prediction as a heatmap of probabilities that we can directly superimpose on top of the WSI. Each tile had a probability of being neoplastic; to obtain a single probability that is representative of the WSI, we computed the maximum probability from all the tiles.
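The sliding window inference and the max aggregation described above can be sketched as follows. This is a simplified illustration under our own assumptions: the helper names are hypothetical, the grid covers the whole image rather than only the detected tissue regions, and the tile probabilities are supplied by the caller instead of a CNN.

```python
import numpy as np

def tile_origins(width, height, tile=1024, stride=512):
    """Top-left corners of a sliding window grid with half-tile stride."""
    xs = range(0, max(width - tile, 0) + 1, stride)
    ys = range(0, max(height - tile, 0) + 1, stride)
    return [(x, y) for y in ys for x in xs]

def heatmap_grid(origins, probs, stride=512):
    """Arrange per-tile probabilities on a 2D grid for visualisation."""
    cols = max(x for x, _ in origins) // stride + 1
    rows = max(y for _, y in origins) // stride + 1
    grid = np.zeros((rows, cols), dtype=np.float32)
    for (x, y), p in zip(origins, probs):
        grid[y // stride, x // stride] = p
    return grid

def wsi_probability(probs):
    """WSI-level probability: the maximum over all tile probabilities."""
    return float(np.max(probs))
```

Taking the maximum means that a single confidently neoplastic tile is enough to flag the whole WSI, which suits a screening setting where sensitivity is the priority.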
During fully supervised learning, we maintained an equal balance of positively and negatively labelled tiles in the training batch. To do so, we extracted the positive tiles randomly from the annotated regions (annotation labels: atypical cell, LGUC cell, and HGUC cell) of neoplastic WSIs, such that at least one annotated cell was visible anywhere inside the 1024 × 1024 px tile. We extracted the negative tiles randomly from anywhere in the tissue regions of negative WSIs (Table 1). We then interleaved the positive and negative tiles to construct an equally balanced batch that was fed as input to the CNN. In addition, to reduce the number of false positives, given the large size of the WSIs, we performed hard mining of tiles: at the end of each epoch, we performed full sliding window inference on all the negative WSIs in order to adjust the random sampling probability such that falsely positively predicted tiles from negative WSIs were more likely to be sampled.
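A minimal sketch of the two ideas in this paragraph, balanced interleaving and hard-negative re-weighting, is given below. The exact re-weighting rule is not stated in the text, so the `threshold` and `boost` parameters are purely illustrative assumptions, as are the function names.

```python
import numpy as np

def interleave_balanced(pos_tiles, neg_tiles):
    """Alternate positive and negative tiles to form a 50/50 batch."""
    n = min(len(pos_tiles), len(neg_tiles))
    batch, labels = [], []
    for p, q in zip(pos_tiles[:n], neg_tiles[:n]):
        batch += [p, q]
        labels += [1, 0]
    return batch, labels

def hard_negative_weights(neg_probs, threshold=0.5, boost=4.0):
    """Sampling probabilities for tiles of negative WSIs.

    Tiles predicted as positive (false positives) get a larger weight.
    `threshold` and `boost` are hypothetical values, not from the paper.
    """
    probs = np.asarray(neg_probs, dtype=float)
    weights = np.where(probs >= threshold, boost, 1.0)
    return weights / weights.sum()
```

The resulting weights could be passed, for example, as the `p` argument of `numpy.random.Generator.choice` when drawing negative tiles for the next epoch.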
During weakly supervised learning, to maintain the balance on the WSI, we oversampled from WSIs to ensure that the model trained on tiles from all WSIs in each epoch. We then switched to hard mining tiles. To perform hard mining, we alternated between training and inference. During inference, the CNN was applied in a sliding window fashion on all the tissue regions in the WSI, and we then selected the k tiles with the highest probability for being positive. This step effectively selects tiles that are most likely to be false positives when the WSI is negative. The selected tiles were placed in a training subset, and once that subset contained N tiles, training was initiated. We used k = 8, N = 256, and a batch size of 32.
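The top-k selection loop can be sketched as follows, using the reported values k = 8 and N = 256. The function names and the representation of a tile as a `(wsi_id, tile_index)` pair are assumptions made for illustration.

```python
import numpy as np

def top_k_tiles(tile_probs, k=8):
    """Indices of the k tiles with the highest positive probability."""
    probs = np.asarray(tile_probs, dtype=float)
    k = min(k, probs.size)
    return np.argsort(probs)[::-1][:k].tolist()

def fill_training_subset(per_wsi_probs, k=8, n=256):
    """Collect top-k tiles per WSI until the subset holds n tiles.

    `per_wsi_probs` is an iterable of (wsi_id, tile_probs) pairs obtained
    from sliding window inference; each tile inherits its WSI label.
    """
    subset = []
    for wsi_id, probs in per_wsi_probs:
        for idx in top_k_tiles(probs, k):
            subset.append((wsi_id, idx))
            if len(subset) == n:
                return subset
    return subset
```

With k = 8 and N = 256, one training subset is filled after inference on 32 WSIs, which corresponds to 8 batches of 32 tiles.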
During training, we performed real-time augmentation of the extracted tiles using variations of brightness, saturation, and contrast. We trained the model using the Adam optimisation algorithm [50], with the binary cross entropy loss, beta 1 = 0.9, beta 2 = 0.999, and a learning rate of 0.001. We applied a learning rate decay of 0.95 every 2 epochs. We used early stopping by tracking the performance of the model on a validation set, and training was stopped automatically when there was no further improvement on the validation loss for 10 epochs. The model with the lowest validation loss was chosen as the final model.
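The learning rate decay and early stopping rules can be written out in plain Python. We assume here that the decay is applied as a step ("staircase") function every 2 epochs; the text does not say whether the decay is stepped or continuous, and the function names are ours.

```python
def learning_rate(epoch, base_lr=0.001, decay=0.95, every=2):
    """Learning rate at a given epoch with a step decay of 0.95 per 2 epochs."""
    return base_lr * decay ** (epoch // every)

def best_epoch_with_early_stopping(val_losses, patience=10):
    """Return (epoch, loss) of the checkpoint kept by early stopping.

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs; the lowest-loss epoch is kept.
    """
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_loss
```

Under this assumption, the learning rate after 20 epochs would be 0.001 × 0.95^10, roughly 0.0006.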

Software and Statistical Analysis
The deep learning models were implemented and trained using the open-source TensorFlow library [51]. AUCs were calculated in Python using the scikit-learn package [52] and plotted using matplotlib [53]. The 95% CIs of the AUCs were estimated using the bootstrap method [54] with 1000 iterations. The ROC curve was computed by varying the probability threshold from 0.0 to 1.0 and computing both the true positive rate (TPR) and the false positive rate (FPR) at each threshold.
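As an illustration of the statistical procedure, the sketch below computes a ROC curve by sweeping the threshold and estimates a percentile bootstrap 95% CI for the AUC. It uses NumPy only rather than scikit-learn, and the threshold grid, the random seed, and the choice of a percentile interval are our assumptions, not details given in the text.

```python
import numpy as np

def roc_points(y_true, y_score, n_thresholds=101):
    """FPR and TPR at thresholds swept from 0.0 to 1.0."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    fpr, tpr = [], []
    for t in np.linspace(0.0, 1.0, n_thresholds):
        pred = y_score >= t
        tpr.append((pred & y_true).sum() / y_true.sum())
        fpr.append((pred & ~y_true).sum() / (~y_true).sum())
    return np.array(fpr), np.array(tpr)

def auc_from_roc(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    # Rates fall as the threshold rises, so reverse to get increasing FPR,
    # and anchor the curve at the origin.
    x = np.concatenate(([0.0], fpr[::-1]))
    y = np.concatenate(([0.0], tpr[::-1]))
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

def bootstrap_auc_ci(y_true, y_score, n_iter=1000, seed=0):
    """Percentile bootstrap 95% CI of the AUC (resampling WSIs)."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_iter):
        idx = rng.integers(0, len(y_true), len(y_true))
        if y_true[idx].min() == y_true[idx].max():
            continue  # a resample needs both classes to define a ROC curve
        aucs.append(auc_from_roc(*roc_points(y_true[idx], y_score[idx])))
    return tuple(np.percentile(aucs, [2.5, 97.5]))
```

A finer threshold grid, or using the distinct score values themselves as thresholds, would give a closer match to library implementations when scores are tightly clustered.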

Insufficient AUC Performance of Whole Slide Image (WSI) Neoplastic Evaluation on Urine LBC WSIs Using Existing Series of LBC Cytopathological Model
Prior to training the urine LBC neoplastic screening models, we applied an existing LBC uterine cervix neoplastic screening model [35] and histopathological classification models and evaluated their AUC performance on the urine LBC test sets (Table 1). The results are summarised in Table 3.

Table 3. ROC-AUC and log-loss scores for existing deep learning models to classify liquid-based cytology (LBC) and histopathology whole slide images (WSIs).

Existing Models ROC-AUC Log Loss
Liquid  [47] with fully supervised (FS) learning [35,55], and weakly supervised (WS) learning [56] approaches as described elsewhere. These models are all based on the EfficientNetB1 convolutional neural network (CNN) architecture. To compare the performance of the transfer learning models (ENB1-UC-FS+WS and ENB1-UC-WS), we trained two models using the EfficientNetB1 architecture at the same magnification (×10) and tile size (1024 × 1024 px). To train the deep learning models, we used a total of 62 neoplastic (with annotation) and 724 negative (without annotation) training set WSIs and 10 neoplastic (with annotation) and 10 negative (without annotation) validation set WSIs (Table 1). This resulted in four different models: ENB1-UC-FS+WS, ENB1-UC-WS, ENB1-IN-FS+WS, and ENB1-IN-WS, which we evaluated on the equal and clinical balance test sets (Table 1). For each test set (equal and clinical balance), we computed the ROC-AUC, log-loss, accuracy, sensitivity, and specificity, and summarised the results in Table 4 and Figures 3 and 4. Overall, the four trained deep learning models achieved equivalent ROC-AUC, log-loss, accuracy, sensitivity, and specificity at the whole-slide level (WSI evaluation) in both the equal and clinical balance test sets (Table 4, Figure 3). However, the appearance of the heatmap images differed among the four trained models (Figure 4). The localization patterns of the predicted tiles were approximately the same among the four models (Figure 4). Looking at heatmap images of the same urine LBC WSIs (WSI-1 and WSI-2) (Figure 4) that were correctly predicted (true positive) as neoplastic by all four trained models, all models satisfactorily predicted (Figure 4E-L,Q-X) the tiles containing neoplastic urothelial epithelial cells (Figure 4A-D,M-P). However, the probabilities assigned to the predicted neoplastic tiles differed markedly among the four models (Figure 4). Among the four trained models, ENB1-UC-FS+WS exhibited the best tile prediction overall based on inspection of the heatmap images (Figure 4E,F,Q,R). 
Therefore, our results show that the ENB1-UC-FS+WS model is the best model for urine LBC neoplastic urothelial epithelial cell screening (Table 4

True Positive Prediction
The model ENB1-UC-FS+WS satisfactorily predicted neoplastic urothelial epithelial cells in urine LBC WSIs (Figure 5). Cytopathologically, the WSI in Figure 5A exhibited atypical urothelial epithelial cells (Figure 5B) and was diagnosed as Class III. The WSI in Figure 5E showed low-grade urothelial carcinoma (LGUC) cells (Figure 5F) and was diagnosed as Class IV. The WSI in Figure 5I showed high-grade urothelial carcinoma (HGUC) cells (Figure 5J) and was diagnosed as Class V. All three WSIs fall under the neoplastic class in this study. The heatmap images show true positive predictions of atypical urothelial cells (Figure 5C,D), LGUC cells (Figure 5G,H), and HGUC cells (Figure 5K,L), which were confirmed by a cytoscreener and a cytopathologist by viewing the original WSIs and the predicted heatmap images. In contrast, in the low probability tiles (light blue and blue background) (Figure 5), two independent cytoscreeners confirmed that there were no neoplastic urothelial epithelial cells.

True Negative Prediction
The model ENB1-UC-FS+WS satisfactorily predicted negative cases (cytopathologically Class I and Class II) in urine LBC WSIs (Figure 6). The heatmap images show true negative predictions, with no tiles predicted as containing neoplastic urothelial epithelial cells (Figure 6C,F). In the zero probability tiles (blue background colour) (Figure 6C,F), there are no neoplastic urothelial epithelial cells, either in the pyuria case (cytodiagnosed as Class I) (Figure 6A), which consisted of infective fluid with a small number of non-neoplastic epithelial cells (Figure 6B), or in the case with urothelial epithelial cells showing slight nuclear enlargement (Figure 6D,E) (cytodiagnosed as Class II).

False Positive Prediction
A cytopathologically diagnosed negative (Class I) case ( Figure 7A) consisted of metaplastic squamous epithelial cells and non-neoplastic urothelial epithelial cells ( Figure 7B Figure 7D). Cytopathologically, there are non-neoplastic urothelial epithelial cells with a slightly increased nuclear cytoplasmic (N/C) ratio and metaplastic squamous epithelial cells ( Figure 7B), which could be a major cause of false positive.

False Negative Prediction
According to the cytodiagnosis report and an additional review by a cytoscreener and a cytopathologist, this urine LBC WSI (Figure 8A) contained cellular clusters of atypical (neoplastic) urothelial epithelial cells (Figure 8B,C) with a high nuclear/cytoplasmic ratio, indicating that this WSI (Figure 8A) should be classified as neoplastic (Class III). However, our model (ENB1-UC-FS+WS) either did not predict the neoplastic urothelial epithelial cells or predicted them with very low probability (Figure 8D-F). We speculate that the clustering of neoplastic urothelial epithelial cells could be a cause of the false negative prediction, owing to the overlapping morphology.

Discussion
In this study, we trained deep learning models for the classification of neoplastic  (Table 1) were collected based on the cytodiagnoses, reviewed by two independent cytoscreeners or cytopathologists, then verified by a senior cytoscreener or cytopathologist. We ensured that we had consensus on the diagnoses of the test set WSIs. Our best model (ENB1-UC-FS+WS) also achieved high accuracy (0.945-0.946), sensitivity (0.940-0.960), and specificity (0.929-0.946) at the WSI level. It has been reported that, at the urine LBC (ThinPrep) WSI level, a deep learning model predicted neoplastic (positive) WSIs with an accuracy of 0.842, a sensitivity of 0.795, and a specificity of 0.845 [57]. Our most recently reported uterine cervix LBC (ThinPrep) model demonstrated an accuracy of 0.907, a sensitivity of 0.850, and a specificity of 0.911 at the WSI level [35]. In this study, we trained a total of four deep learning models (ENB1-UC-FS+WS, ENB1-UC-WS, ENB1-IN-FS+WS, and ENB1-IN-WS) using two different weight initialisations: ImageNet and the pre-trained uterine cervix neoplastic LBC model from a previous study [35] (Figure 2). At the WSI level, these four models showed almost comparable ROC-AUC, log-loss, accuracy, sensitivity, and specificity (Table 4 and Figure 3). However, the tile-level predictions, as visualized by the heatmap images, varied widely between the four models (Figure 4). Based on the WSI and tile-level (heatmap) evaluations, we concluded that the model (ENB1-UC-FS+WS) trained using the pre-trained uterine cervix LBC model [35] as weight initialisation performed best. As for the false negative prediction in the urine LBC WSI that was cytodiagnosed as Class III (Figure 8A), the model (ENB1-UC-FS+WS) could not predict the neoplastic atypical urothelial epithelial cell cluster (Figure 8B-F), in which the neoplastic urothelial epithelial cells were overlapping and the nuclear shapes and structures were hard to determine in the WSI (Figure 8B,C). 
The model (ENB1-UC-FS+WS) could precisely predict urine crystals and cell debris as true negatives. The false negative predictions were most likely due to neoplastic urothelial epithelial cell clusters that mimicked urine crystals or cell debris.
According to the annual statistics on cytodiagnosis by the Japanese Society of Clinical Cytology (https://jscc.or.jp/, accessed on 13 January 2022), there were 2,041,547 urine cytodiagnosis reports in Japan in 2021. In the same year, the total number of cytodiagnoses in Japan was 7,157,413. Therefore, urine cytodiagnosis accounted for approximately 28.5% of all cytodiagnoses. In Japan, urine cytology was the second most common cytology in 2021, with cervical cytology being the most common (3,289,877 cases, 50.0%) (https://jscc.or.jp/, accessed on 13 January 2022). LBC of urine specimens is commonly used in cytology laboratories throughout the world, and various processing methods, such as ThinPrep and SurePath, have been reported [58,59]. The LBC technique preserves the cells of interest (e.g., urothelial epithelial cells) in a liquid medium and removes most of the debris, blood, and exudate either by filtering or by density gradient centrifugation. The efficiency of diagnosis employing LBC is high because of its cell collection rate. It was demonstrated that the accuracy of diagnoses made employing the LBC method can be increased by understanding the characteristics of the urothelial epithelial cell morphology in suspicious cases [25]. Following the appropriate LBC specimen preparation steps, cell morphology (structure) is satisfactorily preserved, which allows more accurate diagnosis of LBC slides, as shown by the significant concordance between cytological and histological diagnosis (92%), the significant number of LGUC cases (20.5%) revealed by urinary cytology and validated by histology, and the low rate (8%) of misjudgement in cytological diagnosis [60]. In addition, the leftover urine LBC material can be used for other techniques such as immunocytochemistry, molecular biology, and flow cytometry [61]. Therefore, LBC has been applied with good results in urine cytology and can be regarded as an appropriate substitute for conventional smear urine cytology. 
LBC techniques open new possibilities for systematic urothelial carcinoma screening by integrating digital pathology WSI techniques and deep learning model(s), resulting in a standardised high-quality readout (e.g., classification).
One limitation of this study is that it primarily included urine LBC (ThinPrep) WSIs (both training and test sets) from a single private clinical laboratory in Japan. Therefore, the deep learning models could potentially be biased to such specimens. Validations on a wide variety of specimens from multiple different origins (both clinical laboratories and hospitals) and other LBC method(s) (e.g., SurePath) would be essential for ensuring the robustness of the models. Another potential validation study should involve the comparison of the performance of the models against cytoscreeners and cytopathologists in a clinical setting.

Conclusions
In the present study, we trained deep learning models for the classification of neoplastic urine LBC WSIs. We evaluated the models on two test sets (equal and clinical balance), achieving ROC-AUCs for diagnosis in the range of 0.984-0.990 with the best model (ENB1-UC-FS+WS). At the WSI level, the model (ENB1-UC-FS+WS) achieved high accuracy (0.945-0.946), sensitivity (0.940-0.960), and specificity (0.929-0.946). Beyond the WSI level, the model (ENB1-UC-FS+WS) also satisfactorily predicted neoplastic urothelial epithelial cells (atypical, LGUC, and HGUC cells) in the heatmap images. Therefore, our model (ENB1-UC-FS+WS) can infer whether a urine LBC WSI is neoplastic (Figure 5) or negative (Figure 6), and its prediction outputs can be inspected easily at the WSI level as well as in the heatmap image. This makes it possible to use a deep learning model such as ours as a tool to aid the urine LBC screening process in the clinical setting (workflow) by ranking cases in order of priority. Cytoscreeners and/or cytopathologists will still need to perform full screening and subclassification (e.g., negative, atypical cells, suspicious for malignancy, and malignant) after the primary screening by our deep learning model; however, this could reduce their working time, as the model would have highlighted the suspected neoplastic regions and they would not have to perform an exhaustive search throughout the entire WSI.

Funding: This study is based on results obtained from a project, JPNP14012, subsidized by the New Energy and Industrial Technology Development Organization (NEDO).

Institutional Review Board Statement:
The experimental protocol in this study was approved by the ethical board of the private clinical laboratory in Japan. All research activities complied with all relevant ethical regulations and were performed in accordance with relevant guidelines and regulations in the private clinical laboratory. Due to the confidentiality agreement with the private clinical laboratory, the name of the private clinical laboratory cannot be disclosed.

Informed Consent Statement:
Written informed consent to use cytopathological samples (liquid-based cytology glass slides) and cytopathological reports for research purposes in this study had previously been obtained from all patients, and the opportunity to refuse participation in research had been guaranteed in an opt-out manner.

Data Availability Statement:
The datasets generated during and/or analysed during the current study are not publicly available due to specific institutional requirements governing privacy protection; however, they are available from the corresponding author and from the private clinical laboratory in Japan on reasonable request. The datasets that support the findings of this study are available from the private clinical laboratory (Japan), but restrictions apply to the availability of these data, which were used under a data-use agreement that was made according to the Ethical Guidelines for Medical and Health Research Involving Human Subjects as set by the Japanese Ministry of Health, Labour and Welfare (Tokyo, Japan) and, thus, are not publicly available. However, the datasets are available from the authors upon reasonable request for private viewing and with permission from the corresponding private clinical laboratory within the terms of the data use agreement and if compliant with the ethical and legal requirements as stipulated by the Japanese Ministry of Health, Labour and Welfare.