Application of Machine Learning to Image Classification and Image Segmentation

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 August 2025 | Viewed by 3732

Special Issue Editor


Prof. Dr. Fan Liu
Guest Editor
Department of Computer Science and Technology, College of Computer and Information, Hohai University, Nanjing 210098, China
Interests: computer vision; pattern recognition; deep learning

Special Issue Information

Dear Colleagues,

Image classification and image segmentation are two important tasks in computer vision. Image classification distinguishes categories of targets based on the features reflected in the image information. It uses computers to analyze images quantitatively, assigning each pixel or region in the image to one of several categories. Image segmentation subdivides a digital image into multiple subregions such that features are similar within a subregion and differ significantly between subregions. Its goal is to assign a category label to each pixel, yielding a fine-grained understanding of the image. In recent decades, deep learning techniques have made unprecedented advances in both tasks. Although existing methods show promising performance, numerous open issues remain. This Special Issue aims to present new developments in image classification and image segmentation and to highlight directions for future research.

Prof. Dr. Fan Liu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image classification
  • image segmentation
  • deep learning
  • machine learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information can be found in MDPI's Special Issue policies.

Published Papers (5 papers)

Research

13 pages, 503 KiB  
Article
Deep Learning for Adrenal Gland Segmentation: Comparing Accuracy and Efficiency Across Three Convolutional Neural Network Models
by Vlad-Octavian Bolocan, Oana Nicu-Canareica, Alexandru Mitoi, Maria Glencora Costache, Loredana Sabina Cornelia Manolescu, Cosmin Medar and Viorel Jinga
Appl. Sci. 2025, 15(10), 5388; https://doi.org/10.3390/app15105388 - 12 May 2025
Viewed by 211
Abstract
Adrenal glands are vital endocrine organs whose accurate segmentation on CT imaging presents significant challenges due to their small size and variable morphology. This study evaluates the efficacy of deep learning approaches for automatic adrenal gland segmentation from multiphase CT scans. We implemented three convolutional neural network architectures (U-Net, SegNet, and NablaNet) and assessed their performance on a dataset comprising 868 adrenal glands from contrast-enhanced abdominal CT scans. Performance was evaluated using the Dice similarity coefficient (DSC), alongside practical implementation metrics including training and deployment time. U-Net demonstrated superior segmentation performance (DSC: 0.630 ± 0.05 for right, 0.660 ± 0.06 for left adrenal glands) compared to NablaNet (DSC: 0.552 ± 0.08 for right, 0.550 ± 0.07 for left) and SegNet (DSC: 0.320 ± 0.10 for right, 0.335 ± 0.09 for left). While all models achieved high specificity, boundary delineation accuracy remained challenging. Our findings demonstrate the feasibility of deep learning-based adrenal gland segmentation while highlighting the persistent challenges in achieving the segmentation quality observed with larger abdominal organs. U-Net provides the optimal balance between accuracy and computational requirements, establishing a foundation for further refinement of AI-assisted adrenal imaging tools.
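For readers unfamiliar with the metric reported above, the Dice similarity coefficient between a predicted mask and a reference mask is 2|A ∩ B| / (|A| + |B|). The following is a minimal NumPy sketch of that formula for binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; this is not the evaluation code used in the paper.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Number of pixels labeled foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: one overlapping pixel, two foreground pixels in each mask.
pred_mask = np.array([[1, 1, 0],
                      [0, 0, 0]])
true_mask = np.array([[1, 0, 0],
                      [1, 0, 0]])
print(dice_coefficient(pred_mask, true_mask))  # 2 * 1 / (2 + 2) = 0.5
```

Because the score is normalized by the combined foreground size, a few misplaced boundary pixels cost far more on a small structure than on a large organ, which is consistent with the difficulty the abstract describes.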

20 pages, 623 KiB  
Article
Fast Normalization for Bilinear Pooling via Eigenvalue Regularization
by Sixiang Xu, Huihui Dong, Chen Zhang and Chaoxue Wang
Appl. Sci. 2025, 15(8), 4155; https://doi.org/10.3390/app15084155 - 10 Apr 2025
Viewed by 221
Abstract
Bilinear pooling, as an aggregation approach that outputs second-order statistics of deep learning features, has demonstrated effectiveness in a wide range of visual recognition tasks. Among major improvements to bilinear pooling, matrix square root normalization, applied to the bilinear representation matrix, is regarded as a crucial step for further boosting performance. However, most existing works leverage Newton’s iteration to perform normalization, which becomes computationally inefficient when dealing with high-dimensional features. To address this limitation, through a comprehensive analysis, we reveal that both the distribution and magnitude of eigenvalues in the bilinear representation matrix play an important role in the network performance. Building upon this insight, we propose a novel approach, namely RegCov, which regularizes the eigenvalues when the normalization is absent. Specifically, RegCov incorporates two regularization terms that encourage the network to align the current eigenvalues with the target ones in terms of their distribution and magnitude. We implement RegCov across different network architectures and run extensive experiments on the ImageNet1K and fine-grained image classification benchmarks. The results demonstrate that RegCov maintains robust recognition across diverse datasets and network architectures while achieving superior inference speed compared to previous works.
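To illustrate the kind of normalization the abstract refers to as the existing baseline, the sketch below computes a matrix square root with the coupled Newton-Schulz iteration, a common Newton-type scheme for this purpose. It is an assumption that this variant matches what the cited prior works use, and it is not the RegCov method itself. The function name `newton_schulz_sqrt` and the iteration count are illustrative choices.

```python
import numpy as np


def newton_schulz_sqrt(a, num_iters=15):
    """Approximate the square root of a symmetric positive definite matrix.

    Illustrative sketch of the coupled Newton-Schulz iteration; the name
    and default iteration count are assumptions, not from the paper.
    """
    a = np.asarray(a, dtype=float)
    identity = np.eye(a.shape[0])

    # Pre-normalize by the Frobenius norm so all eigenvalues lie in (0, 1],
    # which keeps the iteration inside its region of convergence.
    norm = np.linalg.norm(a)
    y = a / norm
    z = identity.copy()

    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y = y @ t  # y converges to sqrt(a / norm)
        z = t @ z  # z converges to the inverse of sqrt(a / norm)

    # Undo the pre-normalization.
    return y * np.sqrt(norm)


# Toy example on a small diagonal matrix with a known square root.
cov = np.diag([1.0, 4.0, 9.0])
root = newton_schulz_sqrt(cov)
print(np.round(root, 4))
```

Each iteration costs several d × d matrix products for d-dimensional features, which gives a sense of why the abstract calls this step expensive in high dimensions.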

30 pages, 4682 KiB  
Article
VITA-D: A Radiomic Web Tool for Predicting Vitamin D Deficiency Levels
by Yuliana Jiménez-Gaona, Oscar Vivanco-Galván, Darwin Castillo-Malla, Israel Vivanco-Gualán and Patricia Díaz-Guzmán
Appl. Sci. 2025, 15(4), 1798; https://doi.org/10.3390/app15041798 - 10 Feb 2025
Viewed by 799
Abstract
Background: Vitamin D deficiency is a significant risk factor for several chronic conditions. This study aims to predict vitamin D deficiency levels in a private database, collected from the southern part of Loja-Ecuador, using a graphical web interface tool based on artificial intelligence algorithms. Methods: Two databases were processed using ML training models: SVM, Random Forest (RF), and Linear Regression (LR). (i) Private data were collected from 465 patients at a local university, where vitamin D levels were measured from blood samples by determining the plasma concentration of 25-hydroxy vitamin D with an enzyme-linked immunosorbent assay, and (ii) public data were obtained from the FigShare database. Then, a survey was conducted from April 2022 to June 2023, identifying 157 variables, 18 of which were used for ML training models. Results: Vitamin D deficiency levels reached 18.10 ng/mL in the private data and 20.42 ng/mL in the public data. The RF algorithm achieved 87.73% accuracy, SVM 80.0%, and LR 70.70%. RF was selected as the best-performing model for the web application design, with binary classification of levels: deficiency (Class 0) indicates vitamin D levels below 15 ng/mL, and sufficiency (Class 1) indicates vitamin D levels above 15 ng/mL. Conclusions: The “VITA-D” web application was used to monitor and predict vitamin D levels and deficiency risk based on clinical and sociodemographic data, providing an efficient and cost-effective alternative to traditional vitamin D testing methods.

21 pages, 4590 KiB  
Article
Deep-Learning-Based Land Cover Mapping in Franciacorta Wine Growing Area
by Girma Tariku, Isabella Ghiglieno, Andres Sanchez Morchio, Luca Facciano, Celine Birolleau, Anna Simonetto, Ivan Serina and Gianni Gilioli
Appl. Sci. 2025, 15(2), 871; https://doi.org/10.3390/app15020871 - 17 Jan 2025
Viewed by 1050
Abstract
Land cover mapping is essential to understanding global land-use patterns and studying biodiversity composition and the functioning of ecosystems. The introduction of remote sensing technologies and artificial intelligence models made it possible to base land cover mapping on satellite imagery in order to monitor changes, assess ecosystem health, support conservation efforts, and reduce monitoring time. However, significant challenges remain in managing large, complex satellite imagery datasets and in acquiring specialized datasets, given their high cost and labor intensity, and there is a lack of comparative studies for the selection of optimal deep learning models. No less important is the scarcity of aerial datasets specifically tailored for agricultural areas. This study addresses these gaps by presenting a methodology for semantic segmentation of land covers in agricultural areas using satellite images and deep learning models with pre-trained backbones. We introduce an efficient methodology for preparing semantic segmentation datasets and contribute the “Land Cover Aerial Imagery” (LICAI) dataset for semantic segmentation. The study focuses on the Franciacorta area, Lombardy Region, leveraging the rich diversity of the dataset to effectively train and evaluate the models. We conducted a comparative study, using cutting-edge deep-learning-based segmentation models (U-Net, SegNet, DeepLabV3) with various pre-trained backbones (ResNet, Inception, DenseNet, EfficientNet) on our dataset acquired from Google Earth Pro. Through meticulous data acquisition, preprocessing, model selection, and evaluation, we demonstrate the effectiveness of these techniques in accurately identifying land cover classes. Integrating pre-trained feature extraction networks significantly improves performance across various metrics. Additionally, addressing challenges such as data availability, computational resources, and model interpretability is essential for advancing the field of remote sensing, in support of biodiversity conservation and the provision of ecosystem services and sustainable agriculture.

16 pages, 8947 KiB  
Article
Research on Personnel Image Segmentation Based on MobileNetV2 H-Swish CBAM PSPNet in Search and Rescue Scenarios
by Di Zhao, Weiwei Zhang and Yuxing Wang
Appl. Sci. 2024, 14(22), 10675; https://doi.org/10.3390/app142210675 - 19 Nov 2024
Cited by 1 | Viewed by 887
Abstract
In post-disaster search and rescue scenarios, the accurate image segmentation of individuals is essential for efficient resource allocation and effective rescue operations. However, challenges such as image blur and limited resources complicate personnel segmentation. This paper introduces an enhanced, lightweight version of the Pyramid Scene Parsing Network (MHC-PSPNet). By substituting ResNet50 with the more efficient MobileNetV2 as the model backbone, the computational complexity is significantly reduced. Furthermore, replacing the ReLU6 activation function in MobileNetV2 with H-Swish enhances segmentation accuracy without increasing the parameter count. To further amplify high-level semantic features, global pooled features are fed into an attention mechanism network. The experimental results demonstrate that MHC-PSPNet performs exceptionally well on our custom dataset, achieving 97.15% accuracy, 89.21% precision, an F1 score of 94.53%, and an Intersection over Union (IoU) of 83.82%. Compared to the ResNet50 version, the parameter count is reduced by a factor of approximately 18.6, while detection accuracy improves, underscoring the efficiency and practicality of the proposed algorithm.
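As a reference point for two terms used above, the sketch below writes out the standard definitions of ReLU6, H-Swish (x · ReLU6(x + 3) / 6), and Intersection over Union for binary masks in NumPy. The function names and the empty-mask convention in `iou` are assumptions for illustration; this is not the MHC-PSPNet implementation.

```python
import numpy as np


def relu6(x):
    """ReLU6: clamp activations to the range [0, 6]."""
    return np.clip(np.asarray(x, dtype=float), 0.0, 6.0)


def h_swish(x):
    """H-Swish: x * ReLU6(x + 3) / 6, a piecewise approximation of Swish."""
    x = np.asarray(x, dtype=float)
    return x * relu6(x + 3.0) / 6.0


def iou(pred, target):
    """Intersection over Union between two binary masks (illustrative helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return np.logical_and(pred, target).sum() / union


# Compare the two activations on a few sample inputs.
samples = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
print(relu6(samples))
print(h_swish(samples))
```

Both activations are parameter-free elementwise functions, so swapping one for the other leaves the parameter count unchanged, as the abstract notes.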