Article

Kolmogorov–Arnold Networks for Automated Diagnosis of Urinary Tract Infections

1. Department of Computer Science and Engineering, Government College of Engineering and Textile Technology, Serampore 712201, Calcutta, India
2. Department of Mathematics, St. Joseph's Institute of Technology, Chennai 600119, Tamil Nadu, India
3. Department of Mathematics, Saveetha Engineering College, Chennai 602105, Tamil Nadu, India
4. Department of Information Technology, Government College of Engineering and Textile Technology, Serampore 712201, Calcutta, India
* Author to whom correspondence should be addressed.
J. Mol. Pathol. 2025, 6(1), 6; https://doi.org/10.3390/jmp6010006
Submission received: 4 January 2025 / Revised: 11 February 2025 / Accepted: 3 March 2025 / Published: 5 March 2025
(This article belongs to the Special Issue Automation in the Pathology Laboratory)

Abstract

Medical diagnostics is an important step in the identification and detection of any disease. Generally, diagnosis requires expert supervision, but in recent times, the emergence of machine intelligence and its widespread applications have motivated the integration of machine intelligence with pathological expert supervision. This research aims to expedite the diagnosis of urinary tract infections (UTIs) through visual recognition of Colony-Forming Units (CFUs) in urine cultures. Recognizing the patterns specific to positive, negative, or uncertain UTI suspicion has previously been addressed with several neural networks inheriting the Multi-Layered Perceptron (MLP) architecture, such as the Vision Transformer and Class-Attention in Vision Transformers. In contrast to the fixed edge weights of MLPs, the novel Kolmogorov–Arnold Network (KAN) architecture places a set of trainable activation functions on the edges, thereby enabling better feature extraction. Inheriting the novel KAN architecture, this research proposes a set of three deep learning models, namely K2AN, KAN-C-Norm, and KAN-C-MLP. These models, evaluated on an open-source pathological dataset, outperform the state-of-the-art deep learning models (particularly those inheriting the MLP architecture) by nearly 7.8361%. By enabling rapid UTI detection, the proposed methodology reduces diagnostic delays, minimizes human error, and streamlines laboratory workflows. Further, preliminary results can complement (expert-supervised) molecular testing by enabling it to focus only on clinically important cases, reducing stress on traditional approaches.

1. Introduction

In human beings, the urinary system plays an important role in filtering blood by eliminating waste materials and excess water in the form of urine. The urinary system consists of several important organs, namely the kidneys, ureters, urinary bladder, and urethra. In healthy individuals, urine is either sterile or carries a very low concentration of pathogenic germs; in individuals with a higher concentration of pathogens, urinary tract infections prevail [1]. Urinary tract infections (UTIs) are a common bacterial ailment, affecting 150 million people annually and carrying a high risk of morbidity and expensive medical costs. These infections can affect the urethra, the urinary bladder, or the kidneys. UTIs typically involve members of the Enterobacteriaceae family, with Uropathogenic Escherichia coli being the most commonly isolated pathogen [2]. UTIs initiate when pathogens from the lower alimentary canal, aided by adhesins, colonize the urethra and then ascend to the urinary bladder. The bacteria develop and produce toxins and enzymes that help them counter the host's inflammatory reaction to flush them out. If the infection penetrates the kidney epithelial barrier, subsequent kidney colonization may lead to bacteremia.
A urine culture tests urine samples for bacterial or fungal (yeast) infections, as identified and quantified by trained microbiologists. Now, this is a labor-intensive and time-consuming process, which is also prone to human error that often leads to delays in diagnosis and treatment.
Over the last few decades, artificial intelligence has shown promising results thanks to the availability of large-scale datasets. Across several interdisciplinary domains, artificial intelligence has revolutionized the field of medical diagnosis by recognizing certain critical features of importance. Convolutional neural networks (CNNs) and other machine learning models inheriting the CNN architecture have shown good validation accuracies in image recognition tasks, making them well suited for the analysis of medical images [3]. The application of deep learning to the detection of bacterial growth in urine culture samples offers the potential for faster, more efficient diagnosis of UTIs. Traditionally, almost every machine (deep) learning model is built by inheriting the Multi-Layered Perceptron (MLP) architecture with a fixed activation function, where training varies only the weights and biases of the network; recently, however, a state-of-the-art architecture, the Kolmogorov–Arnold Network (KAN), emerged, with trainable activation functions on each edge of the graphical representation of the network. Several works [4,5] have inherited the KAN architecture and have shown better results in many instances.
In recent years, numerous studies have demonstrated the power of machine intelligence for medical diagnosis with satisfactory results. Agrawal et al. contributed a Content-Based Medical Image Retrieval (CBMIR) system using deep neural models with transfer learning for lung disease detection on COVID-19 X-ray images, which improved diagnostic metrics across subclasses [6]. Sheikh and Chachoo introduced a class-wise dictionary learning approach for low-rank representation-based medical image classification, improving robustness against noise by learning patterns for each class as tuples in the dictionary and addressing performance degradation caused by outliers in medical images; the model achieved good results on a biomedical database [7]. Asghari developed an IoT-based predictive model using smart wearable embedded systems for early colorectal cancer (CRC) detection in elderly patients, analyzing vital health indicators with machine learning methods; the model achieved good results, especially during its deployment in the COVID-19 pandemic [8]. Singh and Kumar developed InDAENET, a Denoising Autoencoder integrated with an Inception network block, to improve the quality of histopathological images of breast cancer; the approach was validated on the BreakHis dataset and found to outperform traditional Denoising Autoencoder methods [9]. Singh and Agarwal developed a novel convolutional neural network (CNN) architecture for the automated classification and segmentation of brain tumors from MRI images; the model was tested on contrast-enhanced T1 MRI images and achieved a classification accuracy of 92.50% using ten-fold cross-validation [10]. Mahajan et al. contributed an ontology-based intelligent system for the prognosis of Myasthenia Gravis (MG) using ontology, semantic web rules, and reasoners to determine patient status (positive or negative) [11].
The diagnosis of microbial infections (like urinary tract infections) has traditionally relied on culture-based methods, biochemical assays, and molecular techniques, such as polymerase chain reaction (PCR) [12], 16S rRNA sequencing [13], and matrix-assisted laser desorption/ionization–time of flight mass spectrometry (MALDI-TOF MS) [14]. These techniques often offer good specificity and therefore accurate identification of pathogens, but they require specialized equipment, trained personnel, and high operating costs. Computer vision-based techniques can reduce the constraints of cost and equipment and can assist expert supervision with machine intelligence. The specific advantages such techniques offer are as follows: they can produce rapid preliminary results, allowing molecular testing to focus only on clinically relevant cases and thus reducing stress on traditional techniques; they can enhance overall pathogen identification by combining phenotypic (culture-based) and genotypic (molecular-based) data; and they are generally far more scalable, i.e., they can be deployed even in places where the resources for traditional techniques do not exist. Particular to molecular biology, multiple works have cited the improved accuracy and scalability of artificial intelligence (computer vision)-based techniques for the diagnosis of microbial infections. Goździkiewicz and her colleagues performed a review of AI-based techniques for the diagnosis of urinary tract infections [15]. They reported that "AI models achieve a high performance in retrospective studies". They further added that, though technically relevant, computer vision is a comparatively new field, thus requiring further research. Shelke et al. identified applications of artificial intelligence to improve existing disease management, antibiotic resistance monitoring, epidemiological surveillance, etc. [16]. They also reported faster, more precise, and scalable applications. Further, Tsitou et al. reviewed the transformative impact of artificial intelligence on microbiology [17]. They suggested the need for "interpretable AI models that align with medical and ethical standards" to digitalize the diagnosis of microbial infections. The need of the hour is therefore to integrate machine intelligence with expert knowledge, and thereby make diagnostics more affordable, reliable, scalable, and robust.
In this research, a dataset consisting of urine cultures on Petri dishes as contributed by da Silva et al. [18] is considered, and several deep learning models (e.g., ResNet-18, DenseNet, GoogLeNet, etc.), including a few state-of-the-art models (e.g., Class-Attention in Vision Transformers, Vision Transformer, etc.), have been applied to the detection of bacterial growth in the urine samples. These models specifically inherit the MLP architecture. As mentioned previously, a good number of studies reveal improved performance on several tasks using the architecture of Kolmogorov–Arnold Networks. Thus, a collection of three deep learning models inheriting the KAN architecture is proposed, namely K2AN, KAN-C-Norm, and KAN-C-MLP. Experiments on the aforementioned urine culture dataset reveal a 7.8361% more accurate classification of UTIs by the KAN-based models than by the state-of-the-art MLP architecture-inheriting models. The contributions of this research can be summed up as follows: To the best of our knowledge, this is the first work on the application of the Kolmogorov–Arnold Network or its variants in urology. Existing models (inheriting the MLP architecture) were able to achieve a maximum of ≈80.33 ± 0.92% accuracy on the classification of urine culture samples, while the best of the three proposed models inheriting the KAN architecture, KAN-C-MLP, achieved a validation accuracy of ≈87.16 ± 0.97%. Further, the proposed model is also computationally lightweight with very few neural layers, suggesting much room for improvement in accuracy and scope for further research.
The organization of the research is as follows: Section 2 presents the dataset and its relevant details, along with the deep learning methodologies developed for this research. Section 3 presents the results on a few well-known metrics. In Section 4, we summarize the pros and cons of the proposed KAN-based urine culture diagnostic system and finally the work is concluded in Section 5.

2. Data and Methods

This section discusses the urine culture dataset, and further, the deep learning methodologies that have been utilized to label them. The dataset and the relevant statistics are delineated in Section 2.1, and the deep learning models are discussed in Section 2.2.

2.1. Data

Gabriel Rodrigues da Silva, from the University of São Paulo, along with his colleagues, provided a dataset [18] consisting of a collection of images of urine culture Petri dishes, captured under standardized lighting and placement conditions to ensure consistency. The authors added that the images were acquired using a hardware chamber equipped with a smartphone camera (12 MP resolution) and an LED lighting source, ensuring uniform illumination; the culture plates were incubated at 35 °C for a minimum duration of 24 h, following standard microbiological procedures. When no significant growth was observed after 24 h, the incubation was extended to 48 h before a final verdict. For preliminary screening [19], cultures producing two or more Colony-Forming Units (CFUs) of the same morphology were classified as Positive. These plates were considered ready for result interpretation after 24 h of incubation. Cultures showing no bacterial growth (0 CFU) were labeled as Negative, whereas cultures with only one CFU or mixed colony growth were categorized as Uncertain. While this classification framework facilitates the early identification of microbial growth, it does not align with traditional clinical microbiology cut-offs, where significant bacteriuria is typically defined as $\geq 10^3$ CFU/mL for symptomatic patients and $\geq 10^5$ CFU/mL for asymptomatic individuals. The choice of two CFUs as a threshold in this dataset (as chosen by da Silva et al.) was designed for the automated detection of early microbial growth in the cultures, thus enabling computer vision-based diagnostics.
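The screening rule described above can be summarized as a small decision function. The sketch below is our own hypothetical encoding of the dataset's labeling logic (the function name and signature are ours, not da Silva et al.'s):

```python
def label_culture(cfu_count: int, mixed_growth: bool = False) -> str:
    """Screening rule from the dataset description: two or more CFUs of the
    same morphology -> Positive; no growth (0 CFUs) -> Negative; a single
    CFU or mixed colony growth -> Uncertain."""
    if mixed_growth:
        return "Uncertain"
    if cfu_count == 0:
        return "Negative"
    if cfu_count == 1:
        return "Uncertain"
    return "Positive"
```

Note that, as the text cautions, this threshold serves early automated screening and is deliberately lower than the clinical bacteriuria cut-offs.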
Figure 1 illustrates examples from the dataset, showcasing three different samples for each of the three categories. Quantitatively, the dataset consists of 498 Positive, 500 Negative, and 502 Uncertain annotated urine sample images, providing a balanced distribution for deep learning model training and evaluation.

2.2. Methods

Supervised by the training dataset described in Section 2.1, the deep learning models learn to classify the urine cultures based on bacterial presence. In the context of this research, we have classified the dataset using models inheriting the Multi-Layered Perceptron and the Kolmogorov–Arnold Network architectures.

2.2.1. Multi-Layered Perceptrons

Multi-Layered Perceptrons (MLPs) are the foundation of many deep learning models used today. They have demonstrated excellent performance compared to deterministic models in a variety of areas, including computer vision, audio processing, natural language processing, etc. A visual representation of a shallow MLP is provided in Figure 2a, in which the edges carry learnable weights $w_i$ and each intermediate node applies a fixed activation function $\sigma$. The components are combined as per the Universal Approximation Theorem (refer to Theorem 1).
Theorem 1.
Let $\sigma : \mathbb{R} \to \mathbb{R}$ be a nonconstant, nonlinear, bounded, and continuous function. Assume $I_m$ is the $m$-dimensional unit hypercube, and the set of all continuous functions on $I_m$ is denoted as $C(I_m)$; then, for any function $f \in C(I_m)$, there exist $N \in \mathbb{Z}^{+}$ and constants $v_i, b_i \in \mathbb{R}$, $i \leq N$, and $\mathbf{w}_i \in \mathbb{R}^{m}$, such that
$$f(\mathbf{x}) \approx \sum_{i=1}^{N} v_i \cdot \sigma\left(\mathbf{w}_i^{T} \cdot \mathbf{x} + b_i\right)$$
Multi-Layered Perceptrons, also known as "nonlinear regressors", are widely used in machine learning. However, they have certain inherent drawbacks. For example, in MLPs, the interconnections between neurons are mediated by a fixed nonlinear activation function; this means that the only flexibility available lies in training the weights attached to the edges (see Figure 2a). As a result, the perceptrons are unable to capture extremely fine details. A prime example may be found in the research paper "Attention is all you need" [20] by Vaswani et al., which introduced the Transformer, a cutting-edge architecture built on Multi-Layered Perceptron blocks; yet even Transformers lack interpretability.
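As a concrete illustration of the form in Theorem 1, the following minimal sketch (pure Python; the weights are arbitrary illustrative values, not trained) evaluates a one-hidden-layer MLP $f(\mathbf{x}) \approx \sum_i v_i \cdot \sigma(\mathbf{w}_i^T \mathbf{x} + b_i)$:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def mlp_1hidden(x, weights, biases, v):
    """One-hidden-layer MLP in the Universal Approximation form:
    f(x) ~ sum_i v_i * sigma(w_i . x + b_i)."""
    return sum(
        v_i * sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i)
        for w_i, b_i, v_i in zip(weights, biases, v)
    )

# Two hidden units acting on a 2-dimensional input (untrained weights).
y = mlp_1hidden([1.0, 2.0],
                weights=[[0.5, -0.25], [1.0, 1.0]],
                biases=[0.0, -1.0],
                v=[2.0, -1.0])
```

Note that the only trainable quantities here are the scalars in `weights`, `biases`, and `v`; the activation `sigmoid` itself is fixed, which is precisely the rigidity that KANs relax.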
The deep learning models with the MLP architecture taken into consideration for the research are as follows:
  • VGG-16 and VGG-19 [21] with 16 and 19 layers, respectively, achieving excellent performance on image recognition tasks.
  • Tiny VGG [22], a simplified, lightweight, and efficient version of the VGG architecture.
  • Vision Permutator [23], which focuses on permuting the input data across different dimensions, enabling efficient and flexible attention mechanisms for visual tasks.
  • Vision Transformer [24], which applies the Transformer [20] architecture directly to image patches, treating them as sequences. ViT has demonstrated state-of-the-art performance on image classification tasks.
  • Class-Attention in Vision Transformers [25], which incorporates attention weights that emphasize the relevance of image patches to a particular class during the classification process.
  • Other well-implemented models like DenseNet [26], Xception [27], ResNet-18 [28], and GoogLeNet [29].

2.2.2. Kolmogorov–Arnold Networks

Recently, Liu, Wang et al., in their groundbreaking contribution, "KAN: Kolmogorov–Arnold Networks" [30], addressed this very issue and proposed a network that places learnable activation functions (parameterized by B-splines) on the edges rather than fixed weights. The components are combined using the Kolmogorov–Arnold Representation Theorem (refer to Theorem 2).
Theorem 2.
Assume $I_m$ is the $m$-dimensional unit hypercube, and the set of all continuous functions on $I_m$ is denoted as $C(I_m)$; then, for any function $f \in C(I_m)$, there exist univariate, continuous functions $\Phi_i$, $\phi_{i,j}$, and $N \in \mathbb{Z}^{+}$, such that
$$f(\mathbf{x}) = \sum_{i=1}^{2N+1} \Phi_i\left( \sum_{j=1}^{N} \phi_{i,j}(x_j) \right)$$
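The superposition in Theorem 2 can be made concrete with a toy example. The sketch below (pure Python; all functions are chosen purely for illustration) evaluates the Kolmogorov–Arnold form and uses it to represent $f(x_1, x_2) = x_1 \cdot x_2$ on positive inputs via $\exp$ and $\log$:

```python
import math

def kan_representation(x, Phi, phi):
    """Evaluate the Kolmogorov–Arnold superposition
    f(x) = sum_{i=1}^{2N+1} Phi_i( sum_{j=1}^{N} phi_{i,j}(x_j) )
    for an N-dimensional input x."""
    N = len(x)
    return sum(
        Phi[i](sum(phi[i][j](x[j]) for j in range(N)))
        for i in range(2 * N + 1)
    )

# Toy instance representing f(x1, x2) = x1 * x2 on positive inputs:
# a single outer term exp(log x1 + log x2); the remaining 2N terms vanish.
def zero(t):
    return 0.0

Phi = [math.exp] + [zero] * 4
phi = [[math.log, math.log]] + [[zero, zero]] * 4
y = kan_representation([2.0, 3.0], Phi, phi)
```

A KAN replaces these hand-picked univariate functions with trainable B-spline parameterizations, which is what the architectures below build on.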
Figure 2b gives a pictorial demonstration of a shallow KAN. The innovation of KANs opens up a broad scope of applicability across all domains and provides a benchmark of their capability in contrast to MLPs. The contributors have bolstered their applicability by proving their superiority over MLPs on several experiments like fitting toy datasets, solving partial differential equations, and solving the toy continual learning problem (supporting the ability of KANs to overcome catastrophic forgetting). The different deep learning architectures utilizing KAN Layers taken into consideration for the research are as follows:
  • K2AN (refer to Figure 3a), with a KAN network after the convolution operations made using KAN-Convolution. Herein, the input features are followed by two consecutive layers of KAN-Convolution and a two-dimensional Max-Pooling layer. The result is eventually flattened to pass through a KAN-Linear layer. For any supervised learning problem with a training set $T = \{(x_i, y_i)\}$, where $\{x_i\}$ are the feature vectors and $\{y_i\}$ are the respective targets,
$$\mathrm{KAN\text{-}Linear}(\mathbf{x}) = \left( \Phi_{\ell} \circ \Phi_{\ell-1} \circ \cdots \circ \Phi_{1} \right)(\mathbf{x}),$$
    where each layer is the matrix of univariate functions
$$\Phi = \begin{pmatrix} \phi_{1,1} & \phi_{1,2} & \cdots & \phi_{1,n_{\text{output}}} \\ \phi_{2,1} & \phi_{2,2} & \cdots & \phi_{2,n_{\text{output}}} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{n_{\text{input}},1} & \phi_{n_{\text{input}},2} & \cdots & \phi_{n_{\text{input}},n_{\text{output}}} \end{pmatrix},$$
    considering a KAN layer of input dimension $n_{\text{input}}$ and output dimension $n_{\text{output}}$, with $\phi(x) = \omega_b \cdot \frac{x}{1+e^{-x}} + \omega_s \cdot \mathrm{spline}(x)$ and $\mathrm{spline}(x) = \sum_i \lambda_i \cdot B_i(x)$. The network finally gives a ternary output on whether the input urine culture image is "positive", "negative", or "uncertain".
  • KAN-C-Norm (refer to Figure 3b), with a batch-normalized version of KAN-Convolution [31]. For any given image $\{x_{i,j}\}_{(i,j)=(0,0)}^{(n,m)}$ and an $N \times M$ kernel, the KAN-Convolution, inheriting from the basics of KAN-Linear, is defined as
$$\mathrm{KAN\text{-}Convolution}(\mathbf{x}) = \begin{pmatrix} \sum_{\kappa=1}^{N} \sum_{\ell=1}^{M} \phi_{\kappa,\ell}\left(x_{1+\kappa,\,1+\ell}\right) & \cdots & \sum_{\kappa=1}^{N} \sum_{\ell=1}^{M} \phi_{\kappa,\ell}\left(x_{1+\kappa,\,n+\ell}\right) \\ \sum_{\kappa=1}^{N} \sum_{\ell=1}^{M} \phi_{\kappa,\ell}\left(x_{2+\kappa,\,1+\ell}\right) & \cdots & \sum_{\kappa=1}^{N} \sum_{\ell=1}^{M} \phi_{\kappa,\ell}\left(x_{2+\kappa,\,n+\ell}\right) \\ \vdots & \ddots & \vdots \\ \sum_{\kappa=1}^{N} \sum_{\ell=1}^{M} \phi_{\kappa,\ell}\left(x_{m+\kappa,\,1+\ell}\right) & \cdots & \sum_{\kappa=1}^{N} \sum_{\ell=1}^{M} \phi_{\kappa,\ell}\left(x_{m+\kappa,\,n+\ell}\right) \end{pmatrix}.$$
    Herein, the input features are followed by a sequence of two consecutive 2D-Convolution and 2D-Batch-Normalization layers. The result is passed through a two-dimensional Max-Pooling layer, then flattened to pass through a KAN-Linear layer, and finally gives a ternary output on whether the input urine culture image is "positive", "negative", or "uncertain".
  • KAN-C-MLP (refer to Figure 3c), with a combination of KAN-Convolution, together with the traditional MLP substructure. Herein, the input features are followed by two consecutive layers of KAN-Convolution and a two-dimensional Max-Pooling layer. The result is eventually flattened to pass through two consecutive Linear layers, and finally gives a ternary output on whether the input urine culture image is “positive”, “negative”, or “uncertain”.
Figure 3 gives the block diagram of the different architectures.
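Both KAN-Linear and KAN-Convolution rest on the same learnable edge activation $\phi(x) = \omega_b \cdot \frac{x}{1+e^{-x}} + \omega_s \cdot \mathrm{spline}(x)$. The following minimal pure-Python sketch evaluates one such edge; note that it substitutes a simplified piecewise-linear hat basis for the higher-order B-splines used in actual KANs, and all weights are illustrative:

```python
import math

def silu(x: float) -> float:
    """Basis term x * sigmoid(x) = x / (1 + e^{-x}) used alongside the spline."""
    return x / (1.0 + math.exp(-x))

def hat(x: float, center: float, width: float) -> float:
    """Degree-1 B-spline (hat function) centred at `center`; a simplified
    stand-in for the higher-order B-spline bases of a real KAN."""
    return max(0.0, 1.0 - abs(x - center) / width)

def kan_edge_activation(x, w_b, w_s, coeffs, centers, width=1.0):
    """Learnable edge activation phi(x) = w_b * x/(1+e^{-x}) + w_s * spline(x),
    with spline(x) = sum_i lambda_i * B_i(x)."""
    spline = sum(c * hat(x, m, width) for c, m in zip(coeffs, centers))
    return w_b * silu(x) + w_s * spline

# Example edge with a single spline basis function centred at 0:
phi_val = kan_edge_activation(0.0, w_b=1.0, w_s=1.0, coeffs=[2.0], centers=[0.0])
```

During training, a KAN adjusts $\omega_b$, $\omega_s$, and the spline coefficients $\lambda_i$ on every edge, which is the extra flexibility the MLP architecture lacks.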

3. Experimental Results

This section compares the well-known deep learning models named in Section 2.2 based on their ability to effectively classify the different samples of urine culture. For comparison, we considered the common metrics for benchmarking classifiers, which are discussed under Section 3.1. Further, the real-time comparison (or contrast) is as in Table 1 under Section 3.2, and the implementational details, especially for the architectures following the substructure of the Kolmogorov–Arnold Networks, are discussed under Section 3.3.

3.1. Metrics

The most common metrics for a classifier model in machine learning (or deep learning) are Accuracy (validation), Precision, Recall, and F1-Score [32]. For simplicity, let us consider patterns belonging to two classes $p_{\text{pos}}$ and $p_{\text{neg}}$, with $c_{\text{pos}}$ and $c_{\text{neg}}$ data points, respectively, such that $c_{\text{pos}} + c_{\text{neg}} = c$, where $c$ is the total number of data points. Suppose the classifier correctly classifies $c^{*}_{\text{pos}}$ out of $c_{\text{pos}}$, and $c^{*}_{\text{neg}}$ out of $c_{\text{neg}}$ data points. Thus, we can annotate the classifications into four categories, namely True Positive ($c^{*}_{\text{pos}}$ entities), True Negative ($c^{*}_{\text{neg}}$ entities), False Positive ($c_{\text{neg}} - c^{*}_{\text{neg}}$ entities), and False Negative ($c_{\text{pos}} - c^{*}_{\text{pos}}$ entities). Based on these annotations, the metrics can be represented mathematically as follows:
  • Accuracy: The ratio of correctly classified instances to the total number of instances evaluated by the classifier. This includes both True Positives and True Negatives among the correctly predicted instances. Mathematically, it is expressed as in Equation (1).
$$\text{Accuracy} = \frac{\#\text{True Positive} + \#\text{True Negative}}{\#\text{All Cases}} = \frac{c^{*}_{\text{pos}} + c^{*}_{\text{neg}}}{c} \tag{1}$$
  • Precision: The ratio of True Positives to the total number of instances predicted as positive by the classifier. It can also be interpreted as the accuracy of the positive predictions. Mathematically, it is expressed as in Equation (2).
$$\text{Precision} = \frac{\#\text{True Positive}}{\#\text{True Positive} + \#\text{False Positive}} = \frac{c^{*}_{\text{pos}}}{c^{*}_{\text{pos}} + c_{\text{neg}} - c^{*}_{\text{neg}}} \tag{2}$$
  • Recall: The ratio of True Positives to the number of actual positive instances. It can also be interpreted as the ability of the classifier to capture positive instances. Mathematically, it is expressed as in Equation (3).
$$\text{Recall} = \frac{\#\text{True Positive}}{\#\text{True Positive} + \#\text{False Negative}} = \frac{c^{*}_{\text{pos}}}{c^{*}_{\text{pos}} + c_{\text{pos}} - c^{*}_{\text{pos}}} = \frac{c^{*}_{\text{pos}}}{c_{\text{pos}}} \tag{3}$$
  • F1-Score: The harmonic mean of precision and recall, balancing the trade-off between them, especially useful when dealing with imbalanced datasets. Mathematically, it is expressed as in Equation (4).
$$\text{F1-Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
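Equations (1)–(4) follow directly from the four confusion-matrix counts; a minimal sketch for the binary case (the example counts are arbitrary, not from this study):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, Precision, Recall and F1-Score from the four
    confusion-matrix counts (binary case, Equations (1)-(4))."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Example: 40 True Positives, 45 True Negatives, 5 False Positives, 10 False Negatives.
acc, prec, rec, f1 = classification_metrics(40, 45, 5, 10)
```

For the three-class problem in this study (positive, negative, uncertain), these quantities would be computed per class and then averaged.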
Apart from these, the time needed to train the respective models (TrT, in seconds) and time to inference (IrT, in milliseconds) are reported in Table 1.

3.2. Comparative Analysis

A comparative analysis is performed to benchmark the efficiency of the deep learning models in terms of their ability to classify the urine culture samples well. The comparison involves deep learning-based models of both the Multi-Layered Perceptron and Kolmogorov–Arnold Network architectures (refer to Section 2.2). For each of the models, a set of 10 executions is performed, and the reported values are the arithmetic average of the observations made throughout the 10 iterations; the standard deviation of the mean is also reported. It is evident from Table 1 that the KAN-C-MLP module (refer to Figure 3c) under the KAN architecture resulted in the most successful classification in contrast to its KAN counterparts (i.e., K2AN and KAN-C-Norm) and the models with the MLP architecture. Notably, amongst several state-of-the-art MLP architecture-based models, like ViT, CAiT, etc., the Vision Permutator achieved the highest performance, with a validation accuracy as high as 81%. Further, confusion matrices for the three top-performing models, ordered by their achieved accuracies, are presented in Figure 4. Additionally, for error analysis (as a scope for future research), a few misclassified data points are shown in Figure 5.
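The mean ± standard deviation entries in Table 1 follow the usual summary over repeated executions; a small sketch with hypothetical accuracy values (not the study's actual per-run measurements):

```python
import statistics

def summarize_runs(values):
    """Mean and sample standard deviation over repeated executions,
    matching the mean ± std convention of Table 1."""
    return statistics.mean(values), statistics.stdev(values)

# Hypothetical validation accuracies (%) over 10 executions:
runs = [86.1, 87.9, 87.0, 86.5, 88.2, 87.4, 86.8, 87.6, 87.1, 87.0]
mean_acc, std_acc = summarize_runs(runs)
```

Reporting the spread alongside the mean matters here because single runs of stochastic training can over- or under-state a model's typical accuracy.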

3.3. Implementation

This research focuses mainly on a computer vision problem that involves implementing several deep learning models on the urine culture dataset. Each of these models had varying parameters and complexities, and due to computational limitations on our local machine, we carried out the experiments on the Google Colab Platform with 12.7 GB of system RAM, 100 GB of disk space, and an NVIDIA T4 GPU. The models were backed by the PyTorch 2.0 framework. Though the MLP-based models are quite common, and their implementations are widely available in several online repositories, KAN is a novel innovation, which makes implementable code difficult to find. Thus, to enable the reproduction of the experiments in this research, all the code and materials have been made available at the GitHub repository, https://github.com/Anurag-Dutta/bactourine (accessed on 2 March 2025). Further, the Vision Transformer model splits the input images into patches of 16 × 16 pixels (treated as tokens). The same patch size was used for the KAN-based architectural models. Further, for the KAN-Convolution and 2D-Convolution layers, the kernel size and the padding are as in Figure 3.

4. Discussion

This study highlights the significant potential of Kolmogorov–Arnold Networks (KANs) in improving the accuracy and efficiency of urine culture-based UTI diagnostics. The results presented in the preceding section indicate that the models inheriting the KAN architecture outperform traditional deep learning models, particularly those based on Multi-Layered Perceptrons (MLPs), with an improvement in classification accuracy of nearly 7.8361%. Despite achieving higher accuracy, the proposed KAN-C-MLP model (the best performer) maintains a relatively lightweight structure, suggesting minimal computational overhead. Notably, the results also demonstrate competitive performance by traditional MLP-based architectures such as the Vision Transformer and Vision Permutator; however, their performance remains constrained by the fixed nonlinearity of their activation functions. KAN-based models, in contrast, can model complex nonlinear relationships in the data distribution, which becomes important in clinical settings, where rapid and accurate detection of UTIs is necessary to avoid delays in diagnosis. While the proposed KAN-based urine culture classification system comes with many positives, a limitation of the system is its inability to distinguish between different bacterial species. Combining molecular confirmation with phenotypic colony recognition would make the process much more efficient.

5. Conclusions

Artificial intelligence and the large-scale availability of data have benefited many branches of science, technology, engineering, and management. Deep learning-based models not only offer higher accuracy but are also time-efficient. Medical science has long been supported by the strong judgmental capabilities of medical professionals, but despite this, false positives and false negatives occur and affect the diagnostic process. Urine culture is performed in a clinical setting to identify bacterial growth in a patient's urine sample, which in turn supports the diagnosis of UTIs. Studying urine samples requires expert supervision and takes a considerable amount of time. This research suggested making use of deep learning-based image recognition techniques to effectively identify the presence of bacteria in a urine sample. Further, while the long-hailed Multi-Layered Perceptron architecture was able to detect UTI presence with an accuracy of nearly 80%, the state-of-the-art Kolmogorov–Arnold Network architecture was able to beat that benchmark and achieve an accuracy of nearly 87%.
As addressed previously, a limitation of the proposed urine culture classification system is that it does not differentiate between bacterial species. For example, Staphylococcus epidermidis, a common skin commensal, is frequently isolated in urine cultures due to improper sample collection, which may result in false positives. In contrast, the presence of high CFU counts of uropathogenic Escherichia coli, Klebsiella pneumoniae, or Proteus mirabilis is clinically significant but might still be misclassified, yielding a false negative. Thus, further research can be conducted by combining phenotypic colony recognition with molecular confirmation. This would make the diagnostic process much more reliable, scalable, and rapid.

Author Contributions

Conceptualization, A.D.; methodology, A.D. and A.R.; software, P.K.K. and M.G.L.; supervision, P.K.K.; validation, P.K.K. and A.R.; visualization, A.R.; writing—original draft, A.D.; writing—review and editing, M.G.L. All authors have read and agreed to the published version of the manuscript.

Funding

The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset of interest in this article can be accessed from the article, https://www.sciencedirect.com/science/article/pii/S235234092300152X. The experimental results can be reproduced by executing the code available at https://github.com/Anurag-Dutta/bactourine.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lee, J.B.L.; Neild, G.H. Urinary tract infection. Medicine 2007, 35, 423–428.
  2. Dougnon, V.; Assogba, P.; Anago, E.; Déguénon, E.; Dapuliga, C.; Agbankpè, J.; Zin, S.; Akotègnon, R.; Moussa, L.B.; Bankolé, H. Enterobacteria responsible for urinary infections: A review about pathogenicity, virulence factors and epidemiology. J. Appl. Biol. Biotechnol. 2020, 8, 117–124.
  3. Yu, H.; Yang, L.T.; Zhang, Q.; Armstrong, D.; Deen, M.J. Convolutional neural networks for medical image analysis: State-of-the-art, comparisons, improvement and perspectives. Neurocomputing 2021, 444, 92–110.
  4. Wang, J.; Cai, P.; Wang, Z.; Zhang, H.; Huang, J. CEST-KAN: Kolmogorov-Arnold Networks for CEST MRI Data Analysis. arXiv 2024, arXiv:2406.16026.
  5. Li, C.; Liu, X.; Li, W.; Wang, C.; Liu, H.; Yuan, Y. U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation. arXiv 2024, arXiv:2406.02918.
  6. Agrawal, S.; Chowdhary, A.; Agarwala, S.; Mayya, V.; Kamath S, S. Content-based medical image retrieval system for lung diseases using deep CNNs. Int. J. Inf. Technol. 2022, 14, 3619–3627.
  7. Sheikh, I.M.; Chachoo, M.A. An enforced block diagonal low-rank representation method for the classification of medical image patterns. Int. J. Inf. Technol. 2022, 14, 1221–1228.
  8. Asghari, P. A diagnostic prediction model for colorectal cancer in elderlies via internet of medical things. Int. J. Inf. Technol. 2021, 13, 1423–1429.
  9. Singh, S.; Kumar, R. Microscopic biopsy image reconstruction using inception block with denoising auto-encoder approach. Int. J. Inf. Technol. 2024, 16, 2413–2423.
  10. Singh, R.; Agarwal, B.B. An automated brain tumor classification in MR images using an enhanced convolutional neural network. Int. J. Inf. Technol. 2023, 15, 665–674.
  11. Mahajan, P.; Agarwal, T.; Vekariya, D.; Gupta, R.; Malviya, A.; Anandaraj, S.P.; Jain, G.; Anand, D. OntoMG: A unique and ontological-based intelligent framework for early identification of myasthenia gravis (MG). Int. J. Inf. Technol. 2024, 16, 3847–3853.
  12. Zering, J.; Stohs, E.J. Urine polymerase chain reaction tests: Stewardship helper or hindrance? Antimicrob. Steward. Healthc. Epidemiol. 2024, 4, e77.
  13. Marshall, C.W.; Kurs-Lasky, M.; McElheny, C.L.; Bridwell, S.; Liu, H.; Shaikh, N. Performance of Conventional Urine Culture Compared to 16S rRNA Gene Amplicon Sequencing in Children with Suspected Urinary Tract Infection. Microbiol. Spectr. 2021, 9, e0186121.
  14. Chen, X.F.; Hou, X.; Xiao, M.; Zhang, L.; Cheng, J.W.; Zhou, M.L.; Huang, J.J.; Zhang, J.J.; Xu, Y.C.; Hsueh, P.R. Matrix-Assisted Laser Desorption/Ionization Time of Flight Mass Spectrometry (MALDI-TOF MS) Analysis for the Identification of Pathogenic Microorganisms: A Review. Microorganisms 2021, 9, 1536.
  15. Goździkiewicz, N.; Zwolińska, D.; Polak-Jonkisz, D. The Use of Artificial Intelligence Algorithms in the Diagnosis of Urinary Tract Infections—A Literature Review. J. Clin. Med. 2022, 11, 2734.
  16. Shelke, Y.P.; Badge, A.K.; Bankar, N.J. Applications of Artificial Intelligence in Microbial Diagnosis. Cureus 2023, 15, e49366.
  17. Tsitou, V.M.; Rallis, D.; Tsekova, M.; Yanev, N. Microbiology in the era of artificial intelligence: Transforming medical and pharmaceutical microbiology. Biotechnol. Biotechnol. Equip. 2024, 38, 2349587. [Google Scholar] [CrossRef]
  18. Da Silva, G.R.; Rosmaninho, I.B.; Zancul, E.; De Oliveira, V.R.; Francisco, G.R.; Dos Santos, N.F.; De Mello Macêdo, K.; Da Silva, A.J.; De Lima, É.K.; Lemo, M.E.B.; et al. Image dataset of urine test results on petri dishes for deep learning classification. Data Brief 2023, 47, 109034. [Google Scholar] [CrossRef]
  19. Brugger, S.D.; Baumberger, C.; Jost, M.; Jenni, W.; Brugger, U.; Mühlemann, K. Automated counting of bacterial colony forming units on agar plates. PLoS ONE 2012, 7, e33695. [Google Scholar] [CrossRef]
  20. Vaswani, A. Attention is all you need. arXiv 2017, arXiv:1706.03762. [Google Scholar]
  21. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  22. Wang, Z.J.; Turko, R.; Shaikh, O.; Park, H.; Das, N.; Hohman, F.; Kahng, M.; Chau, D.H. CNN 101: Interactive visual learning for convolutional neural networks. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1–7. [Google Scholar]
  23. Hou, Q.; Jiang, Z.; Yuan, L.; Cheng, M.-M.; Yan, S.; Feng, J. Vision permutator: A permutable MLP-like architecture for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 1328–1334. [Google Scholar] [CrossRef] [PubMed]
  24. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  25. Touvron, H.; Cord, M.; Sablayrolles, A.; Synnaeve, G.; Jégou, H. Going deeper with image transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 32–42. [Google Scholar]
  26. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  27. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar]
  28. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  29. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
  30. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2404.19756. [Google Scholar]
  31. Bodner, A.D.; Tepsich, A.S.; Spolski, J.N.; Pourteau, S. Convolutional Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2406.13155. [Google Scholar]
  32. Alpaydin, E. Machine Learning; MIT Press: Cambridge, MA, USA, 2021. [Google Scholar]
Figure 1. An instance of the urine culture dataset. Each image in the first column corresponds to the Positive category, wherein there exist 2 CFUs. The second column corresponds to the Negative category, wherein there is 0 CFU, and the third column corresponds to the Uncertain category, wherein there is either 1 CFU or growth of mixed colonies.
Figure 2. Contrast between a Multi-Layered Perceptron and the Kolmogorov–Arnold Network from an architectural viewpoint. The left subfigure is a pictorial demonstration of the well-known Fully Connected Neural Network (which adheres to the Universal Approximation Theorem); the right subfigure shows the analogous Kolmogorov–Arnold Network, where learnable functions replace the weights on the edges, resulting in better (parametrized) modeling of the data distribution. (a) Architectural view of a shallow Multi-Layered Perceptron, wherein each intermediate node applies a fixed activation function $\sigma$ and the edges carry learnable weights $w_i$. They are combined as per the Universal Approximation Theorem, $f(\mathbf{x}) \approx \sum_{i=1}^{N} v_i \cdot \sigma(\mathbf{w}_i^{\top} \mathbf{x} + b_i)$. These are further stacked for deeper networks. (b) Architectural view of a shallow Kolmogorov–Arnold Network, wherein each intermediate node is a summation operator and the edges carry learnable activation functions $\phi$. They are combined as per the Kolmogorov–Arnold Representation Theorem, $f(\mathbf{x}) = \sum_{i=1}^{2N+1} \Phi_i\!\left(\sum_{j=1}^{N} \phi_{i,j}(x_j)\right)$.
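The two formulas in the caption of Figure 2 can be contrasted in a few lines of code. This is a minimal sketch, not the authors' implementation: a practical KAN parametrizes each edge function as a B-spline, whereas here a toy polynomial stands in for the learnable activation, and all names (`phi`, `kan_forward`, `mlp_forward`) are illustrative.

```python
import math

def phi(x, coeffs):
    # Toy learnable edge function: a polynomial whose coefficients play
    # the role of KAN's trainable spline parameters (an assumption made
    # for brevity; real KANs use B-splines).
    return sum(c * x**k for k, c in enumerate(coeffs))

def kan_forward(x, inner, outer):
    # Kolmogorov-Arnold form: f(x) = sum_{i=1}^{2N+1} Phi_i( sum_{j=1}^{N} phi_{i,j}(x_j) )
    # `inner[i][j]` holds the coefficients of phi_{i,j}; `outer[i]` those of Phi_i.
    out = 0.0
    for i, Phi_i in enumerate(outer):
        s = sum(phi(xj, inner[i][j]) for j, xj in enumerate(x))
        out += phi(s, Phi_i)
    return out

def mlp_forward(x, w, b, v):
    # Universal-approximation form: f(x) ~ sum_i v_i * sigma(w_i . x + b_i),
    # with a fixed activation sigma (tanh here) and learnable weights only.
    return sum(vi * math.tanh(sum(wij * xj for wij, xj in zip(wi, x)) + bi)
               for wi, bi, vi in zip(w, b, v))
```

With identity edge functions (coefficients `[0, 1]`), `kan_forward` for N = 2 inputs reduces to summing the inputs across all 2N + 1 = 5 outer branches, which makes the role of the trainable edges easy to see.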
Figure 3. Different architectures built from the KAN substructures KAN-Linear and KAN-Convolution. (a) Dual KAN Convolution (K2AN); (b) KAN Convolution with batch normalization (KAN-C-Norm); (c) Dual KAN Convolution with MLP (KAN-C-MLP).
Figure 4. Confusion matrices for the three top-performing models in ascending order of their achieved accuracies. (a) Confusion matrix for Vision Permutator (achieved accuracy: 79.66%); (b) confusion matrix for Vision Transformer (ViT) (achieved accuracy: 80.33%); (c) confusion matrix for KAN-C-MLP (proposed) (achieved accuracy: 87.16%).
Figure 5. A few instances of misclassified data points (along with their true labels). In most of these cases, samples belonging to the Uncertain category were misclassified. (a) True Label—Uncertain, Predicted Label—Positive; (b) True Label—Uncertain, Predicted Label—Negative; (c) True Label—Positive, Predicted Label—Negative; (d) True Label—Negative, Predicted Label—Positive; (e) True Label—Uncertain, Predicted Label—Negative; (f) True Label—Positive, Predicted Label—Uncertain.
Table 1. Tabular contrast between several deep learning models inheriting the MLP and KAN architectures. For each model, the Accuracy, Precision, Recall, F1-Score, Training Time (TrT), and Inference Time (IrT) are reported. Results are averaged over 10 consecutive executions and reported together with their statistical deviation (in parentheses). The best results (*) have been highlighted.
| Models | Accuracy | Precision | Recall | F1 Score | TrT (s) | IrT (ms) |
|---|---|---|---|---|---|---|
| Tiny VGG | 67.66 (1.5391) | 69.05 (0.9323) | 65.48 (0.8235) | 0.6722 (0.0862) | 512.4 (12.6) | 4.3 (0.5) |
| VGG-16 | 74.66 (2.7392) | 77.37 (1.1347) | 67.48 (1.0925) | 0.7209 (0.0572) | 923.1 (21.7) | 6.8 (0.7) |
| VGG-19 | 73.33 (0.4823) | 72.51 (1.9723) | 72.33 (1.8241) | 0.7242 (0.0214) | 1058.6 (18.2) | 7.4 (0.6) |
| GoogleNet | 78.66 (3.0419) | 80.22 (1.8452) | 74.25 (1.8091) | 0.7712 (0.0627) | 672.9 (14.5) | 5.1 (0.4) |
| ResNet-18 | 78.01 (1.6528) | 77.57 (0.7139) | 77.05 (0.6482) | 0.7731 (0.1012) | 588.3 (16.9) | 3.9 (0.3) |
| Vision Permutator | 79.66 (2.0987) | 80.21 (0.9135) | 78.95 (0.8925) | 0.7957 (0.0514) | 734.2 (19.8) | 6.2 (0.5) |
| DenseNet | 76.01 (1.5763) | 74.74 (0.8945) | 75.40 (0.7936) | 0.7507 (0.1128) | 892.7 (23.4) | 5.9 (0.4) |
| CAiT | 78.66 (2.6345) | 78.57 (1.6345) | 77.75 (1.8415) | 0.7816 (0.0912) | 1001.4 (27.1) | 8.1 (0.6) |
| Xception | 76.33 (1.8724) | 75.34 (1.3497) | 75.92 (1.1095) | 0.7563 (0.0785) | 1187.9 (22.5) | 7.6 (0.5) |
| ViT | 80.33 (0.9156) | 80.51 (1.4802) | 79.08 (1.3496) | 0.7979 (0.0459) | 1403.5 (30.2) | 9.2 (0.7) |
| K2AN | 75.21 (1.1187) | 75.55 (0.4921) | 73.63 (0.2839) | 0.7458 (0.0063) | 678.5 (12.8) | 4.7 (0.3) |
| KAN-C-Norm | 86.95 (0.5634) | 75.57 (0.4283) | 74.02 (0.3991) | 0.7479 (0.0127) | 532.9 (11.6) | 3.5 (0.3) |
| KAN-C-MLP * | 87.16 (0.9654) | 77.03 (1.2136) | 68.01 (1.1906) | 0.7224 (0.1921) | 529.3 (10.4) | 3.2 (0.2) |
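The ~7.8361% improvement quoted in the abstract can be sanity-checked against Table 1. One plausible reading (an assumption, since the paper does not spell out the computation) is the gain of the best proposed model, KAN-C-MLP (87.16%), over the strongest MLP-inheriting baseline, ViT (80.33%), expressed relative to KAN-C-MLP's accuracy:

```python
# Accuracies taken from Table 1; the interpretation of the quoted
# 7.8361% figure is an assumption, not stated explicitly in the paper.
kan_c_mlp = 87.16  # best proposed model
vit = 80.33        # best MLP-inheriting baseline
gain = (kan_c_mlp - vit) / kan_c_mlp * 100
print(f"{gain:.2f}%")  # close to the ~7.84% quoted in the abstract
```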
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
