Article

Automation of Multi-Class Microscopy Image Classification Based on the Microorganisms Taxonomic Features Extraction

1 Higher School of Digital Culture, ITMO University, St. Petersburg 197101, Russia
2 Faculty of Radio Engineering and Telecommunications, St. Petersburg Electrotechnical University “LETI”, St. Petersburg 197022, Russia
3 Department of Microbiological Synthesis Technology, St. Petersburg State Institute of Technology, St. Petersburg 190013, Russia
4 Information Systems Department, International IT University, Almaty 050000, Kazakhstan
* Author to whom correspondence should be addressed.
J. Imaging 2025, 11(6), 201; https://doi.org/10.3390/jimaging11060201
Submission received: 1 May 2025 / Revised: 11 June 2025 / Accepted: 14 June 2025 / Published: 18 June 2025
(This article belongs to the Section Medical Imaging)

Abstract

This study presents a unified low-parameter approach to multi-class classification of microorganisms (micrococci, diplococci, streptococci, and bacilli) based on automated machine learning. The method is designed to produce interpretable taxonomic descriptors through analysis of the external geometric characteristics of microorganisms, including cell shape, colony organization, and dynamic behavior in unfixed microscopic scenes. A key advantage of the proposed approach is its lightweight nature: the resulting models have significantly fewer parameters than deep learning-based alternatives, enabling fast inference even on standard CPU hardware. An annotated dataset containing images of four bacterial types obtained under conditions simulating real clinical trials has been developed and published to validate the method. The results (Precision = 0.910, Recall = 0.901, and F1-score = 0.905) confirm the effectiveness of the proposed method for biomedical diagnostic tasks, especially in settings with limited computational resources and a need for feature interpretability. Our approach demonstrates performance comparable to state-of-the-art methods while offering superior efficiency and lightweight design due to its significantly reduced number of parameters.

1. Introduction

Infectious diseases remain one of the leading causes of mortality and economic loss on a global scale. According to the WHO [1], in 2019, bacterial infections, including pneumonia, meningitis, and sepsis, killed more than 7 million people, and the growth of antibiotic resistance threatens to negate the achievements of modern medicine [2,3]. The COVID-19 [4] pandemic vividly demonstrated humanity’s vulnerability to pathogens: bacterial co-infections significantly worsened the prognosis in patients with SARS-CoV-2 [5], and errors in diagnosis aggravated the burden on healthcare systems [6,7]. In this context, the rapid and accurate identification of microorganisms is becoming not just a scientific task but a critical tool for saving lives.
Bacteria surround humans everywhere: they are involved in digestion, drug production, food fermentation, and maintaining ecological balance [8]. However, among thousands of bacterial species, dangerous pathogens can cause serious diseases. For example, Lactobacillus promotes intestinal health, while Salmonella leads to food poisoning [9]. This diversity requires a clear distinction between beneficial and harmful microorganisms, especially in clinical practice.
Biologists traditionally classify bacteria based on morphology [10] (cell shape, size, colony structure, etc.) and biochemical properties [11]. For instance, cocci [12] (spherical), bacilli [13] (rod-shaped), and spirilla [14] (spiral-shaped) can be visually distinguished, as shown in Figure 1, yet even within a single group, there exist so-called “look-alike” species. For example, Streptococcus pneumoniae [15] (diplococcus) and Enterococcus faecalis [16] (chains of cocci) can be misidentified during manual microscopy [17]. Moreover, some bacteria alter their shape and metabolism in response to environmental conditions, further complicating their identification even for experienced specialists [18].
Among bacteria requiring close attention, four groups stand out due to their clinical and epidemiological significance:
1. Micrococci [19]: These are facultatively pathogenic Gram-positive cocci that can cause opportunistic infections, particularly in immunocompromised patients. While generally considered low-virulence organisms, their ability to exploit weakened host defenses makes them a notable concern in healthcare settings [20].
2. Diplococci [21]: These pathogens are responsible for severe diseases such as pneumonia and meningitis. Notably, Streptococcus pneumoniae alone accounts for approximately 15% of childhood mortality in low-income countries, highlighting its devastating impact on vulnerable populations.
3. Streptococci [22]: This genus includes pathogens that cause a wide range of diseases, from pharyngitis and scarlet fever to severe post-infectious complications such as rheumatic fever. Globally, streptococcal infections affect over 600 million people annually, underscoring their pervasive public health burden [23].
4. Bacilli [24]: This group encompasses species such as B. anthracis, the causative agent of anthrax, and B. cereus, a common culprit in foodborne illnesses. While B. anthracis poses significant biosecurity risks due to its potential use as a biological weapon, B. cereus is a frequent cause of gastroenteritis, particularly in improperly stored food. Both species exemplify the dual threat posed by bacilli to both individual health and broader biosecurity [25].
Given the high clinical and epidemiological impact of these bacterial groups, accurate identification is critical for effective treatment and infection control. While conventional diagnostic methods, such as Gram staining and culture-based techniques, remain widely used, they have limitations, including long processing times and potential misidentifications due to morphological similarities.
Recent advancements in molecular diagnostics have significantly improved bacterial identification. Methods such as MALDI-TOF mass spectrometry [26], polymerase chain reaction (PCR) [27], and next-generation sequencing (NGS) [28] enable rapid and precise differentiation of closely related bacterial species. These technologies allow clinicians to identify pathogens within hours rather than days, leading to faster and more targeted antibiotic therapy. However, in resource-limited settings, where bacterial infections are most prevalent, accessibility to these advanced methods remains a challenge. Developing cost-effective, rapid diagnostic solutions is, therefore, a key priority for global health initiatives.

2. Related Work

The automatic classification of microorganisms has been extensively studied in biomedical image processing, with a primary focus on deep learning techniques, feature extraction methods, and automated machine learning (AutoML) [29,30,31] approaches. Traditional microbiological diagnostics, which rely on manual microscopy and biochemical assays, have long been the gold standard for bacterial identification but suffer from subjectivity, long processing times, and potential human errors [11,32]. Recent advancements in machine learning have enabled the development of automated methods that enhance diagnostic accuracy and efficiency.
Many publicly available datasets exist for the task of classifying microorganisms in microscope images, and quality metrics for a wide range of models, such as convolutional networks [34] and transformers [35], have been reported for them [33]. However, most of these datasets [36] contain images of a fixed microscopic scene, in which the microorganisms are immobilized and the scene itself has high contrast compared to images of non-fixed substances. In our work, we focus precisely on the study of unfixed microscopic scenes, which prompted us to collect and annotate our own dataset, presented in this article.
Early efforts in bacterial classification employed handcrafted feature extraction techniques, such as texture analysis [37], shape descriptors [38], and statistical models [39]. These approaches combine statistical image analysis with machine learning to classify bacteria based on morphological and textural features extracted from microscopic images, automating identification and reducing reliance on manual analysis. They improve accuracy and scalability, providing a robust framework for high-throughput analysis. Their limitations are that performance depends on image quality, large labeled datasets are required, and some models lack interpretability. The reported results highlight the potential of data-driven bacterial identification but also underscore key challenges in reproducibility and real-world applicability.
These methods allowed for some degree of automation but lacked the adaptability of modern deep-learning models. The advent of convolutional neural networks (CNNs) revolutionized bacterial image classification by enabling end-to-end feature learning directly from images [40,41]. Several studies have demonstrated the effectiveness of CNNs [42,43] in recognizing various bacterial morphologies, including cocci, bacilli, and spirochetes, under different microscopy conditions [44,45].
However, deep learning approaches have significant limitations in microbiology. First, CNNs require large, well-annotated datasets to achieve high accuracy, which can be challenging to obtain in specialized domains [46]. Second, biomedical images often exhibit domain-specific challenges, such as low contrast, noise, and motion artifacts, making deep learning models prone to overfitting or misclassification [47]. Third, the interpretability of CNNs remains a concern, as black-box models hinder the trustworthiness and clinical adoption of AI-based diagnostic systems [48].
Hybrid approaches combining deep learning with traditional feature engineering have been proposed to address these challenges. Methods leveraging deep feature extraction followed by classical classifiers (e.g., Support Vector Machines and Random Forests) have shown promise in improving classification performance while retaining some level of interpretability [49,50]. Additionally, transformer-based architectures, such as Vision Transformers (ViTs), have recently gained traction in medical imaging because they capture long-range dependencies in visual data [51]. However, ViTs require large-scale datasets and computational resources, limiting their feasibility in microbiology [52].
AutoML represents a promising alternative by automating machine learning pipeline selection, optimization, and evaluation. Recent studies have explored AutoML for various biomedical applications, including histopathological image classification [53], tumor detection [54], and genomic analysis [55]. AutoML has been utilized in bacterial classification to optimize feature extraction and model selection, reducing the need for extensive hyperparameter tuning and domain expertise [56]. The ability of AutoML to generate interpretable feature spaces by leveraging geometric and morphological descriptors makes it particularly suitable for microbiological diagnostics [57].
In this study, we apply AutoML to the multi-class classification of microorganisms, including micrococci, diplococci, streptococci, and bacilli. Our approach integrates feature space generation based on external geometric properties with an automated pipeline for classifier selection and optimization. Compared to existing deep learning-based methods, our technique enhances interpretability while maintaining high classification performance under challenging microscopy conditions. Notably, the resulting models have few parameters and demonstrate fast inference even on standard CPU hardware, making the approach suitable for real-time or resource-constrained biomedical applications. The results demonstrate that AutoML can be a robust and scalable tool for biomedical diagnostics, addressing the growing need for rapid and reliable pathogen identification.

3. Problem Statement

The classification of microorganisms is traditionally based on microscopic analysis, where experts identify bacteria by their shape, size, and cell organization. However, this approach has two serious limitations. First, it is inherently subjective, since the interpretation of morphological features depends on the experience of the laboratory technician. For example, diplococci may be mistaken for single cocci, and streptococci for image artifacts [20]. Second, manual analysis requires considerable time. During epidemics, the promptness of diagnosis becomes a critical factor: a 24-hour delay in the identification of Streptococcus pneumoniae increases the risk of death from sepsis by 18% [58].
Automation of the classification process based on computer vision methods is a promising solution that can eliminate subjectivity and increase the speed of analysis. However, modern neural network approaches, in particular convolutional neural networks (CNNs), face several difficulties. First, they require large amounts of annotated data, which are often unavailable in microbiology. Second, biomedical images have specific features, such as low contrast, blurred object boundaries, and motion artifacts in unfixed samples, which make accurate classification difficult [59].
In this study, we solve the problems of multi-class classification of microorganisms by using the AutoML method to construct a feature space and optimize classifiers automatically. There are six classes in our dataset: micrococci, diplococci, streptococci, bacilli, other microorganisms, and areas without bacteria. The proposed method combines the analysis of the geometric characteristics of bacterial cells with the capabilities of AutoML, which makes it possible to achieve high classification accuracy and adaptability to difficult microscopy conditions. The main goal of the work is to develop an effective and interpretable tool for the diagnosis of bacterial infections, which can be used both in well-equipped laboratories and resource-limited conditions.

4. Materials and Methods

Our method is built upon a sequential architecture that progressively transforms raw microscopic images into meaningful predictions. The pipeline begins with image preprocessing aimed at enhancing visual clarity and ensuring consistency across samples. This preparatory step lays the foundation for robust analysis by minimizing noise and standardizing input quality. Next, the system performs targeted extraction of informative geometric and morphological features, capturing the essential structural traits of microorganisms. These primary descriptors are then automatically expanded into higher-order representations, enriching the feature space with more abstract and discriminative patterns. Finally, a classifier processes the resulting feature vectors, yielding accurate microorganism identification. An overview of the full classification workflow is presented in Figure 2.

4.1. Image Preprocessing

Unfixed microscopic scenes suffer from poor contrast, as well as artifacts associated with blur and variable brightness, and it is difficult to select parameters for the corresponding correction filters by hand-crafted rules. We therefore used a lightweight neural network model based on the LFIEM (UniFi modification) architecture [60,61], which predicts the preprocessing filter parameters so as to maximize the downstream quality metrics. This model achieved high results [61] on common color correction datasets while containing several times fewer parameters than its analogs [60,61]. Our experiments also confirmed that using it improves the quality of our classifier.
The preprocessing approach presented in [60] was employed with several refinements, as outlined below. The modified corrective transformation is expressed by the following formula:
$$I_e = I_o + \sum_{i=1}^{n} f_i\big(I_o,\, h_i(I_{so})\big).$$
This architecture consists of multiple independent processing units, with their quantity determined by the selected filters. Each unit $i$ operates on a downsampled version $I_{so}$ of the original image through a parameter generator $h_i$, which derives the corresponding filter parameters $p_i$ for $f_i$. These filters are then individually applied to the initial image $I_o$, and the final enhanced image is obtained by combining the original input with the outputs of the filtering operations.
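The additive combination above can be sketched in a few lines. This is a minimal NumPy sketch: in LFIEM the parameter generators $h_i$ are small neural heads, while here they are stand-in callables, and the downsampling is plain striding.

```python
import numpy as np

def enhance(image, filters, generators, scale=4):
    """Additive filter combination: I_e = I_o + sum_i f_i(I_o, h_i(I_so))."""
    small = image[::scale, ::scale]    # stand-in for the paper's downsampling
    out = image.astype(float)
    for f, h in zip(filters, generators):
        p = h(small)                   # filter parameters predicted from the small image
        out = out + f(image, p)        # each filter contributes a residual term
    return out
```

For instance, a single residual filter `f(img, p) = p * img` with a generator returning 0.5 maps a constant image of ones to 1.5 everywhere.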
To mitigate common challenges in microscopic scene images, such as blurring, low contrast, and insufficient sharpness, we employed specialized filters tailored for corrective transformations.
The sharp filter is defined using the following auxiliary formula:
$$I_{out} = I_{in} * \frac{1}{\nu}\,(K + M \cdot q),$$
where $K$ represents the filter kernel matrix, $M$ denotes a mapping matrix of the same dimensions as $K$, and $\nu$ is the sum of all elements in $(K + M \cdot q)$, ensuring proper normalization of the kernel matrix. This transformation is independently applied to the red, green, and blue color channels, with each controlled by a distinct trainable parameter. The parameters that define the sharp filter modification are as follows:
$$K = \begin{pmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 476 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}, \qquad M = \begin{pmatrix} 0.8 & 0.8 & 0.8 & 0.8 & 0.8 \\ 0.8 & 0.9 & 0.9 & 0.9 & 0.8 \\ 0.8 & 0.9 & 1 & 0.9 & 0.8 \\ 0.8 & 0.9 & 0.9 & 0.9 & 0.8 \\ 0.8 & 0.8 & 0.8 & 0.8 & 0.8 \end{pmatrix}.$$
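Applying the normalized kernel of the sharp filter can be sketched as follows. This NumPy sketch uses the kernel values printed above; `q` is the trainable per-channel parameter, and the edge padding and direct correlation (valid here since the kernel is symmetric) are our assumptions, not details from the paper.

```python
import numpy as np

K = np.array([[1, 4, 6, 4, 1],
              [4, 16, 24, 16, 4],
              [6, 24, 476, 24, 6],
              [4, 16, 24, 16, 4],
              [1, 4, 6, 4, 1]], dtype=float)
M = np.array([[0.8, 0.8, 0.8, 0.8, 0.8],
              [0.8, 0.9, 0.9, 0.9, 0.8],
              [0.8, 0.9, 1.0, 0.9, 0.8],
              [0.8, 0.9, 0.9, 0.9, 0.8],
              [0.8, 0.8, 0.8, 0.8, 0.8]])

def sharpen_channel(channel, q):
    """Convolve one color channel with (K + M*q) / nu."""
    kernel = K + M * q
    kernel = kernel / kernel.sum()          # nu normalization
    pad = kernel.shape[0] // 2
    padded = np.pad(channel, pad, mode="edge")
    h, w = channel.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = (padded[y:y + 5, x:x + 5] * kernel).sum()
    return out
```

Because the kernel is normalized, a constant image passes through unchanged, which is a quick sanity check for the implementation.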
The automatic contrast correction is achieved by adjusting the parameter $r \in [-1, 1]$, which governs the transformation applied to each pixel of the input image. As a result, the original image undergoes the following mapping:
$$I_{out}[x,y] = \begin{cases} (I_{in}[x,y] - 0.5)\cdot\dfrac{1}{1-r}, & \text{if } r > 0,\\[4pt] (I_{in}[x,y] - 0.5)\cdot(1-r), & \text{otherwise}. \end{cases}$$
It is important to emphasize that global exposure adjustments are essential due to the significant variations in illumination conditions observed during microscopic imaging.
The following image transformation performs automatic exposure correction:
$$I_{out}[x,y] = I_{in}[x,y]\cdot 2^{t}.$$
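Under the formulas above, the contrast and exposure filters reduce to a few lines. This sketch mirrors the equations as written; any mid-grey re-centering or output clipping used in the full pipeline is omitted here.

```python
import numpy as np

def auto_contrast(img, r):
    """Piecewise contrast scaling around mid-grey, r in [-1, 1) per the formula."""
    if r > 0:
        return (img - 0.5) * (1.0 / (1.0 - r))
    return (img - 0.5) * (1.0 - r)

def auto_exposure(img, t):
    """Exposure shift of t stops: multiply by 2**t."""
    return img * 2.0 ** t
```

For example, with `r = 0.5` a pixel value of 0.8 is mapped to 0.6, and one stop of exposure (`t = 1`) doubles every pixel.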
By reducing the number of transformations, we successfully integrated predictors for all parameters into a unified neural network encoder.
Thus, the structural design of the preprocessing module is illustrated in Figure 3.
To train our neural network parameter encoder, we manually selected 2000 high-quality microorganism images and applied fixed-value filters to emulate degraded ones. We then trained the hybrid preprocessing scheme to reconstruct the high-quality images, following the procedure described in the original LFIEM paper.

4.2. Contour Primitives Determination

To achieve precise bacterial segmentation within bounding boxes, we employed advanced active contour models combined with morphological optimization. Contour selection (Figure 4) provides the basis for computing the shape characteristics of interest (for example, area features and properties of segmentation masks). The pipeline begins with Selective Adaptive Thresholding–Active Contours (SAT-AC) [62], which integrates region-based energy terms with edge-driven forces for robust boundary detection in low-contrast microscopy images. Gradient enhancement is performed using anisotropic multi-scale Gabor filters [63], improving sensitivity to faint bacterial edges while suppressing noise. For shape preservation, curvature-constrained morphological smoothing [64] is applied, maintaining delicate structures (e.g., flagella or diplococci chains). The preceding preprocessing step (removing blur, increasing contrast, and correcting brightness artifacts) minimizes the risk of artifacts during contour selection. For each pipeline studied, we tuned the parameters of these transformations via grid search and chose the configuration that maximized overall classification performance.
Using the contour characteristic, we calculated the statistical values describing the shapes of microorganisms, as is given further in the text of the article.
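As an illustrative stand-in for the thresholding stage of this pipeline (SAT-AC itself combines several published components that are beyond a short sketch), a local-mean adaptive threshold can be written in plain NumPy; the block size and offset below are illustrative, not the paper's tuned values.

```python
import numpy as np

def adaptive_threshold(gray, block=15, offset=0.02):
    """Mark pixels brighter than their local-mean neighborhood by `offset`."""
    pad = block // 2
    padded = np.pad(gray, pad, mode="edge")
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            local_mean = padded[y:y + block, x:x + block].mean()
            mask[y, x] = gray[y, x] > local_mean + offset
    return mask
```

On a dark field with a bright object, this keeps the object interior while rejecting the background, which is the behavior needed before contour tracing.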

4.3. Automated Feature Generation

Within the framework of the presented methodology, we embarked on an in-depth exploration of the multi-faceted traits demonstrated by microorganisms, as revealed through advanced computational microscopy. At the outset, our procedure emphasizes the deliberate extraction of a wide array of quantifiable descriptors characterizing the biological sample. To streamline both interpretation and subsequent processing stages, these descriptors, along with the analytical techniques employed, are systematically organized into three foundational categories. This tripartite structure not only brings clarity to the complexity of the data but also lays the groundwork for constructing resilient and well-generalized analytical frameworks capable of supporting reliable classification and interpretation.
The first category. To initiate the analysis, we extract a range of prominent geometric descriptors that capture the overall size and spatial configuration of each microorganism observed in the image. Among these descriptors are the diameter and area of the outermost circumscribing circle ($d_{ext}$ and $S_{ext}$, respectively), along with the internal area occupied by the microorganism itself ($S_{obj}$).
Further, we compute the dimensions of the smallest rotated rectangle that can fully enclose the object, denoted by its side lengths $a_1$ and $a_2$. These and other shape-defining parameters collectively form the foundation of our first feature group. Representative examples from this group are presented in Figure 5, organized by microorganism type: micrococci, diplococci, streptococci, and bacilli.
Beyond individual measurements, we also focus on how different features interact by forming pairwise ratios between logically connected parameters. This results in a derived set, $\beta_0$, composed of relational metrics that provide additional insight into the morphological proportions and structural nuances of the analyzed samples:
$$\beta_0^0 = \frac{a_1}{a_2}, \qquad \beta_0^1 = \frac{d_{ext}}{a_1}, \qquad \beta_0^2 = \frac{S_{obj}}{a_1 \cdot a_2}, \qquad \beta_0^3 = \frac{S_{ext}}{a_1 \cdot a_2},$$
and so on.
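Given the primary measurements, the β0 ratios can be computed directly. This is a sketch with our own key names; the inputs are the first-category descriptors defined above.

```python
def beta0_features(d_ext, s_ext, s_obj, a1, a2):
    """Pairwise ratios between logically connected first-category descriptors."""
    return {
        "b0_0": a1 / a2,            # rectangle aspect ratio
        "b0_1": d_ext / a1,         # circle diameter vs. rectangle side
        "b0_2": s_obj / (a1 * a2),  # object fill of the bounding rectangle
        "b0_3": s_ext / (a1 * a2),  # circumscribing circle vs. rectangle area
    }
```

For a perfectly circular micrococcus of radius 1 (so $d_{ext}=2$, $S_{obj}=S_{ext}=\pi$, $a_1=a_2=2$), the aspect and diameter ratios are 1 and the fill ratios equal $\pi/4$, which illustrates how these ratios encode shape rather than absolute size.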
The process of distinguishing between various types of microorganisms and isolating their defining traits relies on a comprehensive utilization of the features contained within the $\beta_0$ set. These features are not limited to raw ratio values $\beta_0^i$ derived from earlier geometric measurements but are further enriched through transformation and combination techniques.
Specifically, enhanced descriptors can be generated by scaling individual ratios with empirically determined coefficients $\alpha_0^i$ or by constructing more complex representations through weighted linear aggregations. These aggregations take the general form
$$\sum_{j} \gamma_0^j \, \alpha_0^{i_j} \, \beta_0^{i_j} \, \mathbb{1}_{A_k}(j),$$
where $\gamma_0^j$ denotes optimization-derived weights obtained during training, and $\mathbb{1}_{A_k}(j)$ is an indicator function defining specific feature groupings $A_k$ within the power set $2^{\{1,\ldots,i_j\}}$.
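The aggregation can be sketched as follows, with each index subset A_k yielding one composite descriptor; the weights and subsets here are placeholders, not trained values from the paper.

```python
def composite_features(betas, alphas, gammas, subsets):
    """One composite descriptor per subset A_k:
    sum over j in A_k of gamma[j] * alpha[j] * beta[j]."""
    return [sum(gammas[j] * alphas[j] * betas[j] for j in A) for A in subsets]
```

For example, with `betas=[1, 2, 3]`, unit `alphas`, `gammas=[1, 0.5, 2]`, and subsets `[{0, 1}, {2}]`, the composites are `2.0` and `6.0`.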
The second category. This stage of the analysis focuses on unveiling deeper and less immediately apparent spatial features of the microorganism, expanding beyond the more basic geometric descriptors discussed earlier. A central aspect involves evaluating the variation in distances from the object’s centroid to its boundary, specifically by measuring both the farthest ($L_{max}$) and nearest ($L_{min}$) contour points relative to the center of mass. Additionally, we consider the radius of the smallest circle that fully encompasses the object ($r_{ext}$), which provides insight into the overall spread of the structure.
To better understand asymmetries and irregularities in object shape, we also calculate the spatial offset between key geometric centers. This includes the distance between the center of mass ($O_m$) and the center of the enclosing circle ($O_{ext}$), as well as the separation between $O_m$ and the center of the rotated rectangle of maximal area ($O_{rec}^{ext}$). These spatial relationships help characterize internal imbalances or directional elongation within the microorganism.
Figure 6 presents representative visualizations of features belonging to this second category of descriptors.
Building on these measurements, we further construct a feature set $\beta_1$ by computing ratios between logically related parameters. Each element of $\beta_1$ represents a quantitative relationship that captures interactions between structural dimensions, such as centrality shifts, enclosure tightness, or directional elongation, allowing for a richer and more discriminative morphological representation:
$$\beta_1^0 = \frac{\mathrm{dist}(O_m; O_{rec}^{ext})}{r_{ext}}, \qquad \beta_1^1 = \frac{L_{min}}{L_{max}}, \qquad \beta_1^2 = \frac{\mathrm{dist}(O_m; O_{ext})}{L_{max}}, \qquad \beta_1^3 = \frac{\mathrm{dist}(O_m; O_{ext})}{r_{ext}},$$
and so forth.
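The $L_{min}$/$L_{max}$ measurement underlying β1 can be sketched directly from a contour point list; the centroid is assumed to have been computed elsewhere (e.g., via image moments).

```python
import math

def radial_extremes(contour, centroid):
    """L_min and L_max: nearest and farthest contour points from the center of mass."""
    distances = [math.dist(point, centroid) for point in contour]
    return min(distances), max(distances)
```

For the eight boundary points of a unit square centered at the origin, this returns 1 (edge midpoints) and √2 (corners); their ratio $\beta_1^1 = 1/\sqrt{2}$ distinguishes such a shape from a circle, where the ratio is 1.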
The feature set $\beta_1$ serves as a foundation for distinguishing microscopic structures and reliably classifying various microorganism types. Rather than relying solely on the raw ratios $\beta_1^i$ derived from prior measurements, we expand the representation space by introducing weighted transformations and composite features.
These extended descriptors emerge from two key operations: scaling individual elements $\beta_1^i$ with experimentally determined coefficients $\alpha_1^i$ and constructing hierarchical linear combinations. The latter are defined by
$$\sum_{j} \gamma_1^j \, \alpha_1^{i_j} \, \beta_1^{i_j} \, \mathbb{1}_{A_k}(j),$$
where $\gamma_1^j$ denotes weights optimized during model training, and $\mathbb{1}_{A_k}(j)$ is an indicator function that selectively activates feature subsets $A_k$ drawn from the power set $2^{\{1,\ldots,i_j\}}$, with $k \in [1..i_j]$.
The third category. The third group of descriptors focuses on more complex geometric interactions identified from the microscopic imagery of microorganisms. In particular, we analyze spatial relationships between key geometric centers and boundaries of the object. Notable parameters in this category include the distance from the center of the circumscribed circle ($O_{ext}$) to the center of the largest-area rotated bounding rectangle ($O_{rec}^{ext}$), denoted as $K_1^{max}$, and the greatest separation between the object’s outer contour and the edges of this rectangle, referred to as $K_1^{rec\,max}$. These values serve as indicators of asymmetry and spatial irregularities within the microorganism’s shape. Additional geometric descriptors complement these measures, offering a more nuanced morphological profile.
Representative illustrations of these features are shown in Figure 7.
To further enrich this representation, we examine functionally linked pairs of parameters and compute their ratios to form a derived numerical feature set $\beta_2$. Each element in this set captures a structural relationship between two measurements, revealing, for example,
$$\beta_2^0 = \frac{K_1^{max}}{K_1^{rec\,max}}, \qquad \beta_2^1 = \frac{K_1^{rec\,max}}{K_1^{max}},$$
along with other combinations that express complementary dependencies.
The full set $\beta_2$ then serves as the basis for identifying the defining features of the observed biological entities and enables accurate classification. Rather than limiting the analysis to raw ratios $\beta_2^i$, we expand the feature space through transformations using empirically selected scaling coefficients $\alpha_2^i$, as well as composite features derived from weighted linear formulations such as
$$\sum_{j} \gamma_2^j \, \alpha_2^{i_j} \, \beta_2^{i_j} \, \mathbb{1}_{A_k}(j),$$
where $\gamma_2^j$ denotes coefficients optimized during the training phase, and the indicator function $\mathbb{1}_{A_k}(j)$ selects active subsets $A_k$ within the combinatorial space $2^{\{1,\ldots,i_j\}}$.
This flexible and layered feature construction framework enables the synthesis of higher-order morphological descriptors by capturing complex dependencies and interactions among geometric attributes. By encoding subtle structural variations and relational patterns, the approach significantly enhances the model’s discriminatory power, improving both the accuracy and robustness of microorganism classification.

4.4. Classifiers

For data classification, we employed a range of classifiers commonly utilized in vector space classification tasks. The evaluated methods encompassed Support Vector Machines (SVMs) [65], Linear Regression (LR) [66], Random Forests (RFs) [67], Gradient Boosting Machines (GBMs) [68], and Fully Connected Neural Networks (FCNs) [69]. As demonstrated in the experimental results, the GBM classifier outperformed the others, achieving the highest classification accuracy.
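A minimal comparison harness for these classifier families can be sketched with scikit-learn on synthetic data. This is illustrative only: the dataset is generated, no hyperparameter tuning is performed, and `LogisticRegression` stands in for the linear baseline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 4-class problem as a stand-in for the morphological feature vectors.
X, y = make_classification(n_samples=400, n_features=16, n_informative=8,
                           n_classes=4, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "SVM": SVC(),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
}
# Macro F1 averages the per-class scores, matching the multi-class setting.
scores = {name: f1_score(yte, model.fit(Xtr, ytr).predict(Xte), average="macro")
          for name, model in models.items()}
```

The same loop applied to the extracted β feature vectors gives the per-classifier comparison reported in the experiments.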

5. Experiments, Results, and Discussion

5.1. Dataset Description

The dataset used in this study consists of ~35,000 microscopic images categorized into six distinct classes: micrococci, diplococci, bacilli, streptococci, other microorganisms, and random microscopy regions without bacteria. We used a 5:1:1 split for training, validation, and test sets, respectively. The dataset was initially balanced; we additionally applied augmentation by rotation and brightness changes within acceptable bounds, which increased the sample size eightfold. These categories reflect both clinically significant bacterial groups and background noise, ensuring robust model generalization. Examples from each class can be found in Figure 8.
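An eightfold augmentation by rotation and brightness change can be sketched as follows; the specific gains of 0.9 and 1.1 are illustrative choices, not the paper's values.

```python
import numpy as np

def augment_eightfold(img):
    """Four 90-degree rotations x two brightness gains = 8 variants per image."""
    variants = []
    for k in range(4):
        rot = np.rot90(img, k)
        for gain in (0.9, 1.1):   # brightness change within acceptable bounds
            variants.append(np.clip(rot * gain, 0.0, 1.0))
    return variants
```

Applied to every image in a balanced dataset, this preserves class balance while multiplying the sample count by eight.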
All microorganism samples used in this study were obtained exclusively from swab specimens collected from commercially available food products. The dataset comprises biological material extracted from open-access food items representing a broad range of product types found in the consumer market. This approach ensured that the dataset reflects realistic conditions relevant to public health and food safety while avoiding the use of any restricted, clinical, or proprietary sources.
The original images were acquired using a Levenhuk MED D30T microscope, capturing live, unfixed bacterial samples under realistic clinical conditions. The images were carefully annotated and split into three subsets: a training set for model learning, a validation set for hyperparameter tuning, and a test set for final performance evaluation.
This dataset has been made publicly available as an open-access microscopy dataset for microorganism classification. By providing a well-annotated and diverse dataset, we aim to facilitate further research in biomedical image processing, machine learning applications, and rapid pathogen identification.

5.2. Experiments

Analyzing the values of ratios and their linear combinations enables the identification of key characteristics of microscopic objects. This plays a crucial role in the development of advanced machine learning models that can accurately detect and classify microorganisms in images. The performance of the models and their respective configurations was systematically evaluated using the following metrics across a set of classifiers:
$$Precision = \frac{TP}{TP + FP},$$
$$Recall = \frac{TP}{TP + FN},$$
$$F_1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}.$$
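The three metrics follow directly from the confusion counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, 9 true positives with 1 false positive and 1 false negative give precision, recall, and F1 all equal to 0.9, close to the aggregate figures reported in the abstract.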
Furthermore, an ablation study was conducted to assess the impact of various filter combinations within the LFIEM framework [60], which was trained on our dataset. The study focused on reconstructing the original image from its distorted counterpart and optimizing the preprocessing pipeline within our integrated classifier scheme. Our filter-based preprocessing module made a significant contribution to overall classifier performance. For training the deep classifier models, we used the following environment: an NVIDIA RTX 4090 GPU with 24 GB GDDR6X VRAM, an AMD Ryzen 9 7950X processor (16 cores, 32 threads, base clock 4.5 GHz, boost up to 5.7 GHz), 128 GB DDR5 RAM, and a 2 TB NVMe SSD (Samsung 990 Pro). The models were trained on images with RandAugment [70], Mixup [71], CutMix [72], and LFIEM-based preprocessing, using mixed precision [73] and distributed training. Optimization used AdamW [74] with an initial learning rate of $3 \times 10^{-4}$, a weight decay of 0.05, and a cosine annealing scheduler [75] over 100 epochs, incorporating label smoothing [76], gradient clipping [77], and early stopping [78] for robust convergence. It should be noted that the proposed method works equally effectively for all types of presented microorganisms, which is confirmed by the corresponding values in the confusion matrices (Figure 9 and Figure 10).
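The cosine annealing schedule with the stated initial learning rate can be sketched as a single function; warmup phases and per-step (rather than per-epoch) variants are omitted.

```python
import math

def cosine_lr(epoch, total=100, lr_max=3e-4, lr_min=0.0):
    """Cosine annealing from lr_max at epoch 0 down to lr_min at the final epoch."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / total))
```

The schedule starts at $3 \times 10^{-4}$, passes through half that value at the midpoint, and decays smoothly to zero at epoch 100.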
In addition to comparing our method with other approaches, we investigated various configurations of our preprocessing pipeline using the filters from the original paper. The results are presented in Table 1. The best configuration of our method (Exposure + Contrast + Sharpness filters) is compared with the highest-performing configurations of the other methods in Table 2.
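An ablation grid like the one in Table 1 can be enumerated programmatically. The sketch below only generates the candidate filter combinations; the size cap is an assumption for illustration, and the application and scoring of each configuration is omitted:

```python
from itertools import combinations

# Filter names taken from the ablation study (Table 1).
FILTER_NAMES = ["Exposure", "Sharpness", "Contrast",
                "Linear transformation", "Trainable kernel", "Blur"]

def enumerate_filter_configs(names, max_size=3):
    """Yield every non-empty filter combination with up to max_size filters."""
    for k in range(1, max_size + 1):
        yield from combinations(names, k)

configs = list(enumerate_filter_configs(FILTER_NAMES))
print(len(configs))  # 41 candidate configurations for 6 filters, sizes 1-3
```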

5.3. Discussion

The proposed approach for microbial typing demonstrates significant advantages, particularly its ability to analyze non-fixed specimens through specialized preprocessing while maintaining high classification performance metrics. Moreover, our analysis method is lightweight and resource-efficient compared to modern deep learning approaches. It enables robust morphological feature extraction and ensures interpretability and transparency throughout the classification pipeline. However, the current implementation exhibits certain limitations regarding feature universality and generation procedures, especially given the vast diversity of microbial morphologies. When analyzing the outliers, we found that the main cause was improper calibration of the capture device integrated with the microscope; further work is planned to correct this shortcoming of the overall system. Future research should also focus on developing automated machine learning (AutoML) procedures for feature generation and on identifying more generalized morphological descriptors to extend the method's applicability across broader microbial taxa. Once these problems are solved, a system for automated laboratory analysis of swabs can be built around our analytical core in conjunction with the corresponding hardware.
It should be noted that, for our model, all three feature groups contributed substantially to classification, as confirmed by the mean Shapley values [79] computed per feature group (Figure 11).
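The per-group aggregation behind Figure 11 can be sketched as follows. The group-to-index mapping and the random matrix below stand in for the real SHAP values and are purely illustrative:

```python
import numpy as np

def mean_abs_shap_by_group(shap_values: np.ndarray, groups: dict) -> dict:
    """Mean |SHAP| aggregated per feature group.
    shap_values: (n_samples, n_features); groups maps name -> feature indices."""
    return {name: float(np.abs(shap_values[:, idx]).mean())
            for name, idx in groups.items()}

# Toy SHAP matrix and an assumed grouping of 9 features into 3 feature groups.
rng = np.random.default_rng(0)
sv = rng.normal(size=(100, 9))
groups = {"group 1": [0, 1, 2], "group 2": [3, 4, 5], "group 3": [6, 7, 8]}
print(mean_abs_shap_by_group(sv, groups))
```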
Also, when analyzing the errors of our classifier, we found that the largest group was caused by unsuccessful frame capture by the microscope camera, i.e., incorrect internal calibration parameters. This problem often cannot be solved by our preprocessing because of very strong distortions or the absence of the necessary information in the frame (Figure 12). As a result, after preprocessing, these images (Figure 12) were classified as containing no microorganisms, although, after recalibrating the camera, micrococci were in fact present in this region. In other words, errors mostly occurred when different internal calibration settings were required to render the visual features of a microorganism. Even though our dataset is balanced and the quality metrics are high, an automation procedure for such studies should also include adjustment of the camera's internal calibration parameters, since different properties of the studied substance are rendered differently depending on them.
To balance and expand the dataset, image transformations that do not change the shape of microorganisms (rotations, brightness changes, etc.) can be applied.
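Such label-preserving augmentations can be implemented with simple array operations. The sketch below (assuming grayscale frames normalized to [0, 1]) applies a 90-degree rotation and a brightness change, neither of which alters cell morphology:

```python
import numpy as np

def augment(image: np.ndarray, k_rot: int = 1, brightness: float = 1.1) -> np.ndarray:
    """Shape-preserving augmentation: 90-degree rotation plus brightness scaling.
    Neither operation changes cell morphology, so the class label stays valid."""
    out = np.rot90(image, k=k_rot)          # lossless rotation in 90-degree steps
    return np.clip(out * brightness, 0.0, 1.0)  # keep intensities in [0, 1]

img = np.full((4, 6), 0.5)                  # dummy grayscale frame in [0, 1]
aug = augment(img, k_rot=1, brightness=1.2)
print(aug.shape)                            # (6, 4): rotation swaps H and W
```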

6. Conclusions

In this study, we developed a hybrid neural network framework tailored for the classification of unfixed microscopic images containing micrococci, diplococci, streptococci, and bacilli. To rigorously assess the effectiveness of our approach, we curated, annotated, and publicly released a specialized dataset. By leveraging explicitly defined, interpretable taxonomic features, our method generated highly distinctive image descriptors, achieving superior performance compared to existing approaches on the evaluated dataset. A notable advantage of our framework is its lightweight architecture: it operates with significantly fewer parameters than typical deep learning models, enabling fast and efficient inference even on standard CPU hardware, making it suitable for real-time applications and deployment in low-resource environments. Additionally, our pipeline facilitated the identification of a set of interpretable taxonomic features, detailed within this work, that can be employed independently of the classifier for manual microbial identification in microscopic images. In future research, we plan to expand this methodology to include additional microbial species and explore its applicability in complex and dynamic microscopic environments.

Author Contributions

Conceptualization, A.S. (Aleksei Samarin); methodology, A.S. (Aleksei Samarin), A.N., A.D. and G.N.; software, A.N., A.S. (Alexander Savelev) and A.T.; validation, A.S. (Alexander Savelev), A.T., A.M., E.M., V.M., A.D. and G.N.; formal analysis, E.K., A.M., E.M. and V.M.; investigation, A.S. (Aleksei Samarin), A.N., A.S. (Alexander Savelev), A.T., A.D. and G.N.; resources, A.S. (Aleksei Samarin) and A.N.; data curation, E.K., A.M., E.M. and V.M.; writing—original draft preparation, A.N. and E.K.; writing—review and editing, A.S. (Aleksei Samarin) and E.K.; visualization, A.N., A.S. (Alexander Savelev) and A.T.; supervision, A.S. (Aleksei Samarin); project administration, A.S. (Aleksei Samarin). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the ITMO University Research Projects in AI Initiative (project No. 640113).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset collected and used during this study has been made publicly available at the following link: https://github.com/itmo-cv-lab/mmicd (accessed on 1 May 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Moser, F.; Bump, J.B. Assessing the World Health Organization: What does the academic debate reveal and is it democratic? Soc. Sci. Med. 2022, 314, 115456. [Google Scholar] [CrossRef] [PubMed]
  2. Global Antimicrobial Resistance and Use Surveillance System Report; Technical Report; World Health Organization: Geneva, Switzerland, 2021.
  3. O’Neill, J. Tackling Drug-Resistant Infections Globally: Final Report and Recommendations; Technical Report; The Review on Antimicrobial Resistance: London, UK, 2016. [Google Scholar]
  4. Rodriguez-Morales, A.; Bonilla-Aldana, D.; Tiwari, R.; Sah, R.; Rabaan, A.; Dhama, K. COVID-19, an Emerging Coronavirus Infection: Current Scenario and Recent Developments—An Overview. J. Pure Appl. Microbiol. 2020, 14, 6150. [Google Scholar] [CrossRef]
  5. Rabaan, A.A.; Alenazy, M.F.; Alshehri, A.A.; Alshahrani, M.A.; Al-Subaie, M.F.; Alrasheed, H.A.; Al Kaabi, N.A.; Thakur, N.; Bouafia, N.A.; Alissa, M.; et al. An updated review on pathogenic coronaviruses (CoVs) amid the emergence of SARS-CoV-2 variants: A look into the repercussions and possible solutions. J. Infect. Public Health 2023, 16, 1870–1883. [Google Scholar] [CrossRef] [PubMed]
  6. Lansbury, L.; Lim, B.; Baskaran, V.; Lim, W.S. Co-infections in people with COVID-19: A systematic review and meta-analysis. J. Infect. 2020, 81, 266–275. [Google Scholar] [CrossRef]
  7. Suneja, M.; Beekmann, S.E.; Dhaliwal, G.; Miller, A.C.; Polgreen, P.M. Diagnostic delays in infectious diseases. Diagnosis 2022, 9, 332–339. [Google Scholar] [CrossRef]
  8. Sender, R.; Fuchs, S.; Milo, R. Revised Estimates for the Number of Human and Bacteria Cells in the Body. PLoS Biol. 2016, 14, e1002533. [Google Scholar] [CrossRef]
  9. Thursby, E.; Juge, N. Introduction to the Human Gut Microbiota. Biochem. J. 2017, 474, 1823–1836. [Google Scholar] [CrossRef]
  10. Sheikh, J.; Tan, T.S.; Malik, S.; Saidin, S.; Chua, L.S. Bacterial Morphology and Microscopic Advancements: Navigating from Basics to Breakthroughs. Microbiol. Immunol. Commun. 2024, 3, 03–41. [Google Scholar] [CrossRef]
  11. Madigan, M.T.; Bender, K.S.; Buckley, D.H.; Sattley, W.M.; Stahl, D.A. Brock Biology of Microorganisms, 16th ed.; Global Edition; Pearson Education Limited: London, UK, 2021. [Google Scholar]
  12. Hiremath, P.; Bannigidad, P. Identification and classification of cocci bacterial cells in digital microscopic images. Int. J. Comput. Biol. Drug Des. 2011, 4, 262–273. [Google Scholar] [CrossRef]
  13. Qian, J.; Wang, Y.; Hu, Z.; Shi, T.; Wang, Y.; Ye, C.; Huang, H. Bacillus sp. as a microbial cell factory: Advancements and future prospects. Biotechnol. Adv. 2023, 69, 108278. [Google Scholar] [CrossRef]
  14. Podkopaeva, D.; Grabovich, M.; Dubinina, G.; Lysenko, A.; Tourova, T.; Kolganova, T. Two new species of microaerophilic sulfur spirilla, Spirillum winogradskii sp. nov. and Spirillum kriegii sp. nov. Microbiology 2006, 75, 172–179. [Google Scholar] [CrossRef]
  15. Loughran, A.J.; Orihuela, C.J.; Tuomanen, E.I. Streptococcus pneumoniae: Invasion and Inflammation. Microbiol. Spectr. 2019, 7. [Google Scholar] [CrossRef]
  16. Archambaud, C.; Nunez, N.; da Silva, R.A.G.; Kline, K.A.; Serror, P. Enterococcus faecalis: An overlooked cell invader. Microbiol. Mol. Biol. Rev. 2024, 88, e00069-24. [Google Scholar] [CrossRef] [PubMed]
  17. Murray, P.R.; Rosenthal, K.S.; Pfaller, M.A. Medical Microbiology, 9th ed.; Elsevier: Amsterdam, The Netherlands, 2020. [Google Scholar]
  18. Justice, S.; Hunstad, D.; Cegelski, L.; Hultgren, S. Morphological plasticity as a bacterial survival strategy. Nat. Rev. Microbiol. 2008, 6, 162–168. [Google Scholar] [CrossRef]
  19. Kocur, M.; Kloos, W.; Schleifer, K. The Genus Micrococcus. In The Prokaryotes; Springer: New York, NY, USA, 2006; Volume 3, pp. 961–971. [Google Scholar] [CrossRef]
  20. Kumar, A.; Roberts, D.; Wood, K.E.; Light, B.; Parrillo, J.E.; Sharma, S.; Suppes, R.; Feinstein, D.; Zanotti, S.; Taiberg, L.; et al. Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit. Care Med. 2006, 34, 1589–1596. [Google Scholar] [CrossRef] [PubMed]
  21. Gurvichz, B. Serotherapy for epidemic cerebrospinal meningitis. Kazan Med. J. 2021, 32, 52–53. [Google Scholar] [CrossRef]
  22. Brouwer, S.; Hernandez, T.; Curren, B.; Harbison-Price, N.; De Oliveira, D.; Jespersen, M.; Davies, M.; Walker, M. Pathogenesis, epidemiology and control of Group A Streptococcus infection. Nat. Rev. Microbiol. 2023, 21, 431–447. [Google Scholar] [CrossRef]
  23. Chan, H.P.; Samala, R.K.; Hadjiiski, L.M.; Zhou, C. Deep Learning in Medical Image Analysis. In Deep Learning in Medical Image Analysis; Advances in Experimental Medicine and Biology; Springer: Cham, Switzerland, 2020; Volume 1213, pp. 3–21. [Google Scholar] [CrossRef]
  24. Mohamad, N.A.; Jusoh, N.A.; Htike, Z.Z.; Win, S.L. Bacteria Identification From Microscopic Morphology: A Survey. Int. J. Soft Comput. Artif. Intell. Appl. (IJSCAI) 2014, 3, 12. [Google Scholar] [CrossRef]
  25. Bottone, E.J. Bacillus cereus, a Volatile Human Pathogen. Clin. Microbiol. Rev. 2010, 23, 382–398. [Google Scholar] [CrossRef]
  26. Singhal, N.; Kumar, M.; Kanaujia, P.; Virdi, J. MALDI-TOF mass spectrometry: An emerging technology for microbial identification and diagnosis. Front. Microbiol. 2015, 6, 791. [Google Scholar] [CrossRef]
  27. Rahman, M.; Uddin, M.; Sultana, R.; Moue, A.; Setu, M. Polymerase Chain Reaction (PCR): A Short Review. Anwer Khan Mod. Med. Coll. J. 2013, 4, 30–36. [Google Scholar] [CrossRef]
  28. Mandlik, J.; Patil, A.; Singh, S. Next-Generation Sequencing (NGS): Platforms and Applications. J. Pharm. Bioallied Sci. 2024, 16, S41–S45. [Google Scholar] [CrossRef] [PubMed]
  29. Truong, A.; Walters, A.; Goodsitt, J.; Hines, K.; Bruss, C.; Farivar, R. Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools. In Proceedings of the 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA, 4–6 November 2019; pp. 1471–1479. [Google Scholar] [CrossRef]
  30. He, X.; Zhao, K.; Chu, X. AutoML: A Survey of the State-of-the-Art. arXiv 2019, arXiv:1908.00709. [Google Scholar] [CrossRef]
  31. Baratchi, M.; Wang, C.; Limmer, S.; van Rijn, J.N.; Hoos, H.H.; Bäck, T.; Olhofer, M. Automated machine learning: Past, present and future. Artif. Intell. Rev. 2024, 57, 122. [Google Scholar] [CrossRef]
  32. Kumar, S.; Das, P. Duration of bacterial infections and the need for rapid diagnostics. J. Clin. Microbiol. 2006, 44, 2578–2583. [Google Scholar]
  33. Wang, X.; Shi, Y.; Guo, S.; Qu, X.; Xie, F.; Duan, Z.; Hu, Y.; Fu, H.; Shi, X.; Quan, T.; et al. A Clinical Bacterial Dataset for Deep Learning in Microbiological Rapid On-Site Evaluation. Sci. Data 2024, 11, 608. [Google Scholar] [CrossRef]
  34. Shaily, T.; Kala, S. Bacterial Image Classification Using Convolutional Neural Networks. In Proceedings of the 2020 IEEE 17th India Council International Conference (INDICON), New Delhi, India, 10–13 December 2020; pp. 1–6. [Google Scholar] [CrossRef]
  35. Visitsattaponge, S.; Bunkum, M.; Pintavirooj, C.; Paing, M.P. A Deep Learning Model for Bacterial Classification Using Big Transfer (BiT). IEEE Access 2024, 12, 15609–15621. [Google Scholar] [CrossRef]
  36. Spahn, C.; Laine, R.; Pereira, P.; Gómez de Mariscal, E.; Chamier, L.; Conduit, M.; Pinho, M.; Holden, S.; Jacquemet, G.; Heilemann, M.; et al. DeepBacs: Bacterial image analysis using open-source deep learning approaches. Commun. Biol. 2022, 5, 688. [Google Scholar] [CrossRef]
  37. Haralick, R.M.; Shanmugam, K. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610–621. [Google Scholar] [CrossRef]
  38. Hu, M.K. Visual Pattern Recognition by Moment Invariants. IRE Trans. Inf. Theory 1962, 8, 179–187. [Google Scholar]
  39. Trattner, S.; Greenspan, H.; Tepper, G.; Abboud, S. Statistical Imaging for Modeling and Identification of Bacterial Types. In Medical Imaging 2004: Image Processing; Fitzpatrick, J.M., Reinhardt, J.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; Volume 3117, pp. 329–340. [Google Scholar] [CrossRef]
  40. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef] [PubMed]
  41. Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.W.M.; van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88. [Google Scholar] [CrossRef] [PubMed]
  42. O’Shea, K.; Nash, R. An Introduction to Convolutional Neural Networks. arXiv 2015, arXiv:1511.08458. [Google Scholar]
  43. Purwono, P.; Ma’arif, A.; Rahmaniar, W.; Imam, H.; Fathurrahman, H.I.K.; Frisky, A.; Haq, Q.M.U. Understanding of Convolutional Neural Network (CNN): A Review. Int. J. Robot. Control Syst. 2023, 2, 739–748. [Google Scholar] [CrossRef]
  44. Talo, M. An Automated Deep Learning Approach for Bacterial Image Classification. arXiv 2019, arXiv:1912.08765. [Google Scholar] [CrossRef]
  45. Sarker, M.I.; Khan, M.M.R.; Prova, S.; Khan, M.; Morshed, M.; Reza, A.W.; Arefin, M. Utilizing Deep Learning for Microscopic Image Based Bacteria Species Identification. In Proceedings of the 2024 International Conference on Computing, Power and Advanced Systems (COMPAS), Cox’s Bazar, Bangladesh, 25–26 September 2024; IEEE: New York, NY, USA, 2024; pp. 1–5. [Google Scholar] [CrossRef]
  46. Zhang, H.; Qie, Y. Applying Deep Learning to Medical Imaging: A Review. Appl. Sci. 2023, 13, 10521. [Google Scholar] [CrossRef]
  47. Mazurowski, M.A.; Buda, M.; Saha, A.; Bashir, M.R. Deep learning in radiology: An overview of the concepts and a survey of the state of the art with focus on MRI. J. Magn. Reson. Imaging 2019, 49, 939–954. [Google Scholar] [CrossRef]
  48. Samek, W.; Montavon, G.; Vedaldi, A.; Hansen, L.K.; Müller, K.-R. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning Models. Digit. Signal Process. 2019, 93, 1–15. [Google Scholar]
  49. Kotwal, D.; Rani, P.; Arif, T.; Manhas, D. Machine Learning and Deep Learning Based Hybrid Feature Extraction and Classification Model Using Digital Microscopic Bacterial Images. SN Comput. Sci. 2023, 4, 21. [Google Scholar] [CrossRef]
  50. Tammineedi, V.S.V.; Naureen, A.; Ashraf, M.S.; Manna, S.; Mateen Buttar, A.; Muneeshwari, P.; Ahmad, M.W. Biomedical Microscopic Imaging in Computational Intelligence Using Deep Learning Ensemble Convolution Learning-Based Feature Extraction and Classification. Comput. Intell. Neurosci. 2022, 2022, 3531308. [Google Scholar] [CrossRef]
  51. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, Virtual Event, 3–7 May 2021. [Google Scholar]
  52. He, K.; Gan, C.; Li, Z.; Rekik, I.; Yin, Z.; Ji, W.; Gao, Y.; Wang, Q.; Zhang, J. Transformers in Medical Image Analysis: A Review. arXiv 2022, arXiv:2202.12165. [Google Scholar] [CrossRef]
  53. Hutter, F.; Kotthoff, L.; Vanschoren, J. (Eds.) Automated Machine Learning: Methods, Systems, Challenges; The Springer Series on Challenges in Machine Learning; Springer: Cham, Switzerland, 2019; ISBN 978-3-030-05318-5. [Google Scholar] [CrossRef]
  54. Demir, C.; Yener, B. Automated Cancer Diagnosis Based on Histopathological Images: A Systematic Survey. Pattern Recognit. 2004, 37, 1033–1049. [Google Scholar] [CrossRef]
  55. Khan, A.; Dwivedi, P.; Mugde, S.; S a, S.; Sharma, G.; Soni, G. Toward Automated Machine Learning for Genomics: Evaluation and Comparison of State-of-the-Art AutoML Approaches. In Automated and AI-Based Approaches for Bioinformatics and Biomedical Research; Academic Press: Cambridge, MA, USA, 2023; pp. 129–152. [Google Scholar] [CrossRef]
  56. Elangovan, K.; Lim, G.; Ting, D. A Comparative Study of an On Premise AutoML Solution for Medical Image Classification. Sci. Rep. 2024, 14, 10483. [Google Scholar] [CrossRef] [PubMed]
  57. Laccourreye, P.; Bielza, C.; Larranaga, P. Explainable Machine Learning for Longitudinal Multi-Omic Microbiome. Mathematics 2022, 10, 1994. [Google Scholar] [CrossRef]
  58. Smith, K.P.; Kirby, J.E. Human Error in Clinical Microbiology. J. Clin. Microbiol. 2020, 58, e00342-20. [Google Scholar]
  59. Hameurlaine, M.; Moussaoui, A.; Benaissa, S. Deep Learning for Medical Image Analysis. In Proceedings of the 4th International Conference on Recent Advances in Electrical Systems; 2019. Available online: https://www.researchgate.net/publication/338490461_Deep_Learning_for_Medical_Image_Analysis (accessed on 13 June 2025).
  60. Tatanov, O.; Samarin, A. LFIEM: Lightweight Filter-based Image Enhancement Model. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 873–878. [Google Scholar] [CrossRef]
  61. Samarin, A.; Nazarenko, A.; Savelev, A.; Toropov, A.; Dzestelova, A.; Mikhailova, E.; Motyko, A.; Malykh, V. A Model Based on Universal Filters for Image Color Correction. Pattern Recognit. Image Anal. 2024, 34, 844–854. [Google Scholar] [CrossRef]
  62. Chen, G.; Zhang, H.; Chen, I.; Yang, W. Active Contours with Thresholding Value for Image Segmentation. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2266–2269. [Google Scholar] [CrossRef]
  63. Li, Y.; Bi, Y.; Zhang, W.; Sun, C. Multi-Scale Anisotropic Gaussian Kernels for Image Edge Detection. IEEE Access 2020, 8, 1803–1812. [Google Scholar] [CrossRef]
  64. Schamberger, B.; Ziege, R.; Anselme, K.; Ben Amar, M.; Bykowski, M.; Castro, A.; Cipitria, A.; Coles, R.; Dimova, R.; Eder, M.; et al. Curvature in Biological Systems: Its Quantification, Emergence, and Implications across the Scales. Adv. Mater. 2023, 35, 2206110. [Google Scholar] [CrossRef]
  65. Hearst, M.; Dumais, S.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  66. Huang, M. Theory and Implementation of linear regression. In Proceedings of the 2020 International Conference on Computer Vision, Image and Deep Learning (CVIDL), Chongqing, China, 10–12 July 2020; pp. 210–217. [Google Scholar] [CrossRef]
  67. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  68. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  69. Scabini, L.F.; Bruno, O.M. Structure and performance of fully connected neural networks: Emerging complex network properties. Phys. A: Stat. Mech. Its Appl. 2023, 615, 128585. [Google Scholar] [CrossRef]
  70. Cubuk, E.D.; Zoph, B.; Shlens, J.; Le, Q.V. Randaugment: Practical automated data augmentation with a reduced search space. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; pp. 3008–3017. [Google Scholar] [CrossRef]
  71. Psaroudakis, A.; Kollias, D. MixAugment & Mixup: Augmentation Methods for Facial Expression Recognition. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA, 19–20 June 2022; pp. 2366–2374. [Google Scholar] [CrossRef]
  72. Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. arXiv 2019, arXiv:1905.04899. [Google Scholar]
  73. Micikevicius, P.; Narang, S.; Alben, J.; Diamos, G.; Elsen, E.; Garcia, D.; Ginsburg, B.; Houston, M.; Kuchaev, O.; Venkatesh, G.; et al. Mixed Precision Training. arXiv 2017, arXiv:1710.03740. [Google Scholar] [CrossRef]
  74. Guan, L. Weight Prediction Boosts the Convergence of AdamW. In Advances in Knowledge Discovery and Data Mining; Springer: Cham, Switzerland, 2023; pp. 329–340. [Google Scholar] [CrossRef]
  75. Liu, Z. Super Convergence Cosine Annealing with Warm-Up Learning Rate. In Proceedings of the CAIBDA 2022; 2nd International Conference on Artificial Intelligence, Big Data and Algorithms, Nanjing, China, 17–19 June 2022; pp. 1–7. [Google Scholar]
  76. Müller, R.; Kornblith, S.; Hinton, G.E. When Does Label Smoothing Help? arXiv 2019, arXiv:1906.02629. [Google Scholar]
  77. Zhang, J.; He, T.; Sra, S.; Jadbabaie, A. Why gradient clipping accelerates training: A theoretical justification for adaptivity. arXiv 2020, arXiv:1905.11881. [Google Scholar]
  78. Bai, Y.; Yang, E.; Han, B.; Yang, Y.; Li, J.; Mao, Y.; Niu, G.; Liu, T. Understanding and Improving Early Stopping for Learning with Noisy Labels. arXiv 2021, arXiv:2106.15853. [Google Scholar]
  79. Lundberg, S.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874. [Google Scholar] [CrossRef]
Figure 1. Bacterial cell morphology: (a) micrococci; (b) streptococci; (c) diplococci; (d) bacilli.
Figure 2. The general structure of the proposed classification model using the image of diplococci as an example.
Figure 3. Image preprocessing module structure using the image of micrococci as an example.
Figure 4. Visualization of the result of the automatic contour extraction procedure, illustrated on a microscope image of a bacillus.
Figure 5. The first category of extracted characteristics of the examined objects. The first line contains features of micrococci, the second of diplococci, the third of streptococci, and the fourth of bacilli.
Figure 6. The second category of extracted characteristics of the examined objects. The first line contains features of micrococci, the second of diplococci, the third of streptococci, and the fourth of bacilli.
Figure 7. The third category of extracted characteristics of the examined objects. The first line contains features of micrococci, the second of diplococci, the third of streptococci, and the fourth of bacilli.
Figure 8. Examples of images from the six classes: (a) an image of micrococci; (b) an image of diplococci; (c) an image of bacilli; (d) an image of streptococci; (e) an image of another microorganism; (f) an image of a random region of a microscopy scene that does not contain any microorganism.
Figure 9. Non-normalized confusion matrix.
Figure 10. Normalized confusion matrix.
Figure 11. Feature contribution analysis. Mean Shapley values by feature categories.
Figure 12. Examples of unsuccessful microscopic scene captures.
Table 1. Comparison of preprocessing methods with Features Gen + GBM classifier.
| Preprocessing Method Combination | Precision | Recall | F1-Score |
|---|---|---|---|
| No preprocessing | 0.812 | 0.803 | 0.807 |
| Exposure only | 0.845 | 0.832 | 0.838 |
| Sharpness only | 0.827 | 0.818 | 0.822 |
| Contrast only | 0.851 | 0.842 | 0.846 |
| Linear transformation only | 0.836 | 0.824 | 0.830 |
| Trainable kernel only | 0.838 | 0.829 | 0.833 |
| Exposure + Sharpness | 0.867 | 0.854 | 0.860 |
| Exposure + Contrast | 0.882 | 0.871 | 0.876 |
| Sharpness + Contrast | 0.875 | 0.863 | 0.869 |
| Blur + Contrast | 0.841 | 0.833 | 0.837 |
| Linear transformation + Sharpness | 0.853 | 0.841 | 0.847 |
| Trainable kernel + Exposure | 0.864 | 0.853 | 0.858 |
| Exposure + Contrast + Blur | 0.878 | 0.866 | 0.872 |
| Exposure + Linear transformation + Sharpness | 0.881 | 0.870 | 0.875 |
| Exposure + Contrast + Sharpness | 0.910 | 0.901 | 0.905 |
| Exposure + Contrast + Sharpness + Blur | 0.892 | 0.881 | 0.886 |
| All filters combined | 0.885 | 0.874 | 0.879 |
Table 2. Comparative analysis of classification pipelines using the MMICD dataset (top-30).
| Filters Configuration | Classifier Model | Params (M) | FLOPs (G) | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| Exposure + Contrast + Sharpness | MobileNetV3 | 5.4 | 0.22 | 0.723 | 0.698 | 0.710 |
| Exposure + Contrast + Sharpness | InceptionResNetV1 | 27.9 | 5.71 | 0.735 | 0.712 | 0.723 |
| Exposure + Contrast | ResNet152 | 60.2 | 11.31 | 0.748 | 0.725 | 0.736 |
| Exposure + Contrast | EfficientNetB0 | 5.3 | 0.39 | 0.752 | 0.731 | 0.741 |
| Exposure + Contrast | Features Gen + SVM | 0.8 | 0.05 | 0.761 | 0.739 | 0.750 |
| Exposure + Contrast | InceptionResNetV2 | 55.9 | 12.98 | 0.768 | 0.745 | 0.756 |
| Exposure + Contrast + Sharpness | Features Gen + SVM | 0.8 | 0.05 | 0.774 | 0.752 | 0.763 |
| Exposure + Contrast + Sharpness | EfficientNetB0 | 5.3 | 0.39 | 0.781 | 0.760 | 0.770 |
| Exposure + Contrast | ResNet101 | 44.6 | 7.85 | 0.785 | 0.764 | 0.774 |
| Exposure + Contrast + Sharpness | EfficientNetB1 | 7.8 | 0.70 | 0.792 | 0.771 | 0.781 |
| Exposure + Contrast | EfficientNetB2 | 9.2 | 1.01 | 0.798 | 0.778 | 0.788 |
| Exposure + Contrast + Sharpness | ResNet101 | 44.6 | 7.85 | 0.803 | 0.784 | 0.793 |
| Exposure + Contrast + Sharpness | EfficientNetB3 | 12.2 | 1.86 | 0.809 | 0.790 | 0.799 |
| Exposure + Contrast | EfficientNetB4 | 19.3 | 3.39 | 0.815 | 0.796 | 0.805 |
| Exposure + Contrast | CoAtNet | 42.1 | 6.52 | 0.821 | 0.803 | 0.812 |
| Exposure + Contrast | EfficientNetB6 | 43.0 | 10.34 | 0.827 | 0.810 | 0.818 |
| Exposure + Contrast | Features Gen + RF | 1.2 | 0.08 | 0.832 | 0.815 | 0.823 |
| Exposure + Contrast | SE-ResNext50 | 27.6 | 4.25 | 0.838 | 0.821 | 0.829 |
| Exposure + Contrast + Sharpness | ResNet152 | 60.2 | 11.31 | 0.843 | 0.827 | 0.835 |
| Exposure + Contrast + Sharpness | Features Gen + RF | 1.2 | 0.08 | 0.849 | 0.833 | 0.841 |
| Exposure + Contrast | Features Gen + GBM | 1.5 | 0.12 | 0.854 | 0.839 | 0.846 |
| Exposure + Contrast + Sharpness | CoAtNet | 42.1 | 6.52 | 0.860 | 0.845 | 0.852 |
| Exposure + Contrast | ViT-L/16 | 304.3 | 190.7 | 0.866 | 0.852 | 0.859 |
| Exposure + Contrast | EfficientNetB3 | 12.2 | 1.86 | 0.872 | 0.858 | 0.865 |
| Exposure + Contrast + Sharpness | EfficientNetB4 | 19.3 | 3.39 | 0.878 | 0.865 | 0.871 |
| Exposure + Contrast + Sharpness | InceptionResNetV2 | 55.9 | 12.98 | 0.884 | 0.871 | 0.877 |
| Exposure + Contrast + Sharpness | SE-ResNext50 | 27.6 | 4.25 | 0.890 | 0.878 | 0.884 |
| Exposure + Contrast + Sharpness | ViT-L/16 | 304.3 | 190.7 | 0.896 | 0.885 | 0.890 |
| Exposure + Contrast + Sharpness | EfficientNetB6 | 43.0 | 10.34 | 0.902 | 0.892 | 0.897 |
| Exposure + Contrast + Sharpness | Features Gen + AutoML | 1.8 | 0.15 | 0.910 | 0.901 | 0.905 |
Note: Params = Number of trainable parameters in millions (M), FLOPs = Floating Point Operations per inference in billions (G). Our method achieved superior performance with significantly lower computational requirements.