Search Results (7)

Search Parameters:
Keywords = trainable dropout

25 pages, 7900 KiB  
Article
Multi-Label Disease Detection in Chest X-Ray Imaging Using a Fine-Tuned ConvNeXtV2 with a Customized Classifier
by Kangzhe Xiong, Yuyun Tu, Xinping Rao, Xiang Zou and Yingkui Du
Informatics 2025, 12(3), 80; https://doi.org/10.3390/informatics12030080 - 14 Aug 2025
Viewed by 46
Abstract
Deep-learning-based multi-label chest X-ray classification has achieved significant success, but existing models still have three main issues: fixed-scale convolutions fail to capture both large and small lesions, standard pooling pays no attention to important regions, and linear classification lacks the capacity to model complex dependencies between features. To address these obstacles, we propose CONVFCMAE, a lightweight yet powerful framework built on a partially frozen backbone (77.08% of the initial layers are fixed) that preserves complex, multi-scale features while decreasing the number of trainable parameters. Our architecture adds (1) a learnable global pooling module, with 1×1 convolutions dynamically weighted by spatial location; (2) a multi-head attention block dedicated to channel re-calibration; and (3) a two-layer MLP enhanced with ReLU, batch normalization, and dropout to increase the non-linearity of the feature space. To further reduce the label noise and class imbalance inherent to the NIH ChestXray14 dataset, we use a combined BCEWithLogits and Focal Loss together with extensive data augmentation. On ChestXray14, the average ROC–AUC of CONVFCMAE is 0.852, a 3.97% improvement over the state of the art. Ablation experiments demonstrate the individual and collective effectiveness of each component, and Grad-CAM visualizations localize pathological regions well, increasing the interpretability of the model. Overall, CONVFCMAE provides a practical, generalizable solution for feature extraction from medical images.
(This article belongs to the Section Medical and Clinical Informatics)
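The learnable global pooling the abstract describes — per-location weights produced by a 1×1 convolution — can be sketched as follows in PyTorch. This is an illustrative reconstruction, not the authors' code; the module name, softmax normalization, and initialization are assumptions.

```python
import torch
import torch.nn as nn

class LearnableGlobalPooling(nn.Module):
    """Pools a feature map with per-location weights produced by a 1x1 conv."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # one score per spatial location

    def forward(self, x):                       # x: (B, C, H, W)
        w = self.score(x)                       # (B, 1, H, W)
        w = torch.softmax(w.flatten(2), dim=-1).view_as(w)  # weights sum to 1 over H*W
        return (x * w).sum(dim=(2, 3))          # weighted average -> (B, C)

pool = LearnableGlobalPooling(64)
feat = torch.randn(2, 64, 7, 7)
out = pool(feat)
print(out.shape)  # torch.Size([2, 64])
```

Because the weights are normalized, a constant feature map pools to that constant, while after training the module can concentrate its mass on lesion regions instead of averaging uniformly.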

16 pages, 2926 KiB  
Article
Few Adjustable Parameters Prediction Model Based on Lightweight Prefix-Tuning: Learning Session Dropout Prediction Model Based on Parameter-Efficient Prefix-Tuning
by Yuantong Lu and Zhanquan Wang
Appl. Sci. 2024, 14(23), 10772; https://doi.org/10.3390/app142310772 - 21 Nov 2024
Viewed by 1803
Abstract
In response to the challenge of low predictive accuracy in scenarios with limited data, we propose a few-adjustable-parameters prediction model based on lightweight prefix-tuning (FAP-Prefix). Prefix-tuning is an efficient fine-tuning method that adjusts only prefix vectors while keeping the model's original parameters frozen. In each transformer layer, the prefix vectors are concatenated with the internal key–value pairs of the transformer structure. By training on the synthesized sequence of the prefix and original input with masked learning, the transformer model learns the features of individual learning behaviors and can also discover hidden connections between consecutive learning behaviors. During fine-tuning, all parameters of the pre-trained model are frozen, and downstream task learning is accomplished by adjusting the prefix parameters alone. The continuous trainable prefix vectors influence subsequent vector representations, yielding session dropout predictions. Experiments show that FAP-Prefix significantly outperforms traditional methods in data-limited settings, with AUC improvements of +4.58%, +3.53%, and +8.49% under 30%, 10%, and 1% data conditions, respectively. It also surpasses state-of-the-art models in prediction performance (AUC +5.42%, ACC +5.3%, F1 score +5.68%).
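The core mechanism — trainable prefix vectors concatenated to each layer's keys and values while the pre-trained projections stay frozen — can be sketched in PyTorch. This is a minimal single-head illustration, not the FAP-Prefix implementation; class name, dimensions, and initialization scale are assumptions.

```python
import torch
import torch.nn as nn

class PrefixSelfAttention(nn.Module):
    """Self-attention whose keys/values are extended with trainable prefix
    vectors; the base projections are frozen, so only prefixes are tuned."""
    def __init__(self, d_model, n_prefix):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        for p in self.qkv.parameters():          # freeze the pre-trained weights
            p.requires_grad = False
        self.prefix_k = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)

    def forward(self, x):                        # x: (B, T, D)
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        k = torch.cat([self.prefix_k.expand(B, -1, -1), k], dim=1)  # (B, P+T, D)
        v = torch.cat([self.prefix_v.expand(B, -1, -1), v], dim=1)
        att = torch.softmax(q @ k.transpose(1, 2) / D ** 0.5, dim=-1)
        return att @ v                           # (B, T, D)

attn = PrefixSelfAttention(d_model=32, n_prefix=4)
y = attn(torch.randn(2, 10, 32))
trainable = sum(p.numel() for p in attn.parameters() if p.requires_grad)
print(y.shape, trainable)  # only the 2 * 4 * 32 = 256 prefix entries are trainable
```

The parameter count makes the "few adjustable parameters" point concrete: the frozen projection holds thousands of weights, while the tunable prefixes hold only 256 here.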

15 pages, 3413 KiB  
Article
Mobile Application for Tomato Plant Leaf Disease Detection Using a Dense Convolutional Network Architecture
by Intan Nurma Yulita, Naufal Ariful Amri and Akik Hidayat
Computation 2023, 11(2), 20; https://doi.org/10.3390/computation11020020 - 31 Jan 2023
Cited by 20 | Viewed by 5620
Abstract
In Indonesia, tomato is one of the horticultural products with the highest economic value. To maintain enhanced tomato production, it is necessary to monitor the growth of tomato plants, particularly the leaves. With the aid of computer technology, which can identify diseases in tomato plant leaves, the quality and quantity of production can be preserved. In this study, a deep learning algorithm with a DenseNet architecture was implemented, and multiple hyperparameter tests were conducted to determine the optimal model. The optimal model used two hidden layers, a DenseNet trainable layer on dense block 5, and a dropout rate of 0.4. Ten-fold cross-validation of this model yielded an accuracy of 95.7% and an F1-score of 95.4%. The model with the best assessment results was then implemented in a mobile application to recognize tomato plant leaves.

12 pages, 1387 KiB  
Article
Less Is More: Adaptive Trainable Gradient Dropout for Deep Neural Networks
by Christos Avgerinos, Nicholas Vretos and Petros Daras
Sensors 2023, 23(3), 1325; https://doi.org/10.3390/s23031325 - 24 Jan 2023
Cited by 4 | Viewed by 3752
Abstract
The undeniable computational power of artificial neural networks has granted the scientific community the ability to exploit available data in ways previously inconceivable. However, deep neural networks require an overwhelming quantity of data to interpret the underlying connections within them and thereby complete the task they have been assigned. Feeding a deep neural network vast amounts of data usually ensures efficiency but may harm the network's ability to generalize. To tackle this, numerous regularization techniques have been proposed, with dropout being one of the most dominant. This paper proposes a selective gradient dropout method which, instead of dropping random weights, learns to freeze the training process of specific connections, thereby increasing the network's sparsity in an adaptive manner and driving it to utilize more salient weights. Experimental results show that the produced sparse network outperforms the baseline on numerous image classification datasets, and these results were obtained after significantly fewer training epochs.
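The mechanism of dropping gradients (rather than activations or weights) can be illustrated with a PyTorch gradient hook. In the paper the mask is itself trainable; here it is fixed for brevity, and the function name and thresholding scheme are assumptions.

```python
import torch
import torch.nn as nn

def apply_gradient_dropout(param, keep_logits):
    """Register a hook that multiplies incoming gradients by a hard mask,
    freezing updates for connections deemed unimportant."""
    def hook(grad):
        keep = (torch.sigmoid(keep_logits) > 0.5).float()  # hard threshold
        return grad * keep
    param.register_hook(hook)

layer = nn.Linear(4, 3)
keep_logits = torch.zeros(3, 4)      # sigmoid(0) = 0.5 -> these gradients are dropped
keep_logits[0].fill_(5.0)            # keep gradients only for the first output unit
apply_gradient_dropout(layer.weight, keep_logits)

layer(torch.ones(2, 4)).sum().backward()
print(layer.weight.grad)  # rows 1 and 2 are zero: those connections are frozen
```

Because the forward pass is untouched, frozen connections still contribute to predictions; only their training is halted, which is what lets sparsity grow adaptively without destroying learned features.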

20 pages, 1461 KiB  
Article
Parallelistic Convolution Neural Network Approach for Brain Tumor Diagnosis
by Goodness Temofe Mgbejime, Md Altab Hossin, Grace Ugochi Nneji, Happy Nkanta Monday and Favour Ekong
Diagnostics 2022, 12(10), 2484; https://doi.org/10.3390/diagnostics12102484 - 13 Oct 2022
Cited by 8 | Viewed by 2399
Abstract
Magnetic Resonance Imaging (MRI) is a prominent technique in medicine that produces a significant and varied range of tissue contrasts in each imaging modality and is frequently employed by medical professionals to identify brain malignancies. Because brain tumors are a very deadly disease, early detection increases the likelihood that the patient will receive appropriate medical care, leading either to full elimination of the tumor or to prolongation of the patient's life. However, manually examining the enormous volume of MRI images to identify a brain tumor is extremely time-consuming and requires a trained medical expert to detect and diagnose brain cancer from multiple MRI scans with various modalities. There is therefore a growing need to automate the detection and diagnosis of brain tumors without human intervention. Another major concern, which most research articles do not consider, is the low quality of MRI images, attributable to noise and artifacts. This article presents a Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm to handle low-quality MRI images by eliminating noisy elements and enhancing the visible trainable features of the image. The enhanced image is then fed to the proposed PCNN to learn the features and classify the tumor using a sigmoid classifier. To train the model properly, a publicly available dataset was collected and utilized, and different optimizers, dropout values, and learning rates were tried in the course of this study. The proposed PCNN with CLAHE achieved an accuracy of 98.7%, a sensitivity of 99.7%, and a specificity of 97.4%. In comparison with other state-of-the-art brain tumor methods and pre-trained deep transfer learning models, the proposed PCNN model obtained satisfactory performance.
(This article belongs to the Special Issue Brain Imaging in Epilepsy)
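CLAHE's core per-tile step — histogram equalization with a clip limit, redistributing the clipped excess before building the CDF — can be sketched in NumPy. Full CLAHE additionally splits the image into tiles and bilinearly interpolates between their mappings; the function and parameter names here are illustrative, not from the paper.

```python
import numpy as np

def clipped_hist_equalize(img, clip_limit=0.01):
    """Histogram equalization with clipping: the per-tile step inside CLAHE.
    Counts above the clip limit are redistributed before building the CDF,
    which bounds the contrast amplification (and hence noise amplification)."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    limit = max(1, int(clip_limit * img.size))
    excess = int(np.sum(np.maximum(hist - limit, 0)))
    hist = np.minimum(hist, limit) + excess // 256   # redistribute clipped mass
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())
    return (cdf[img] * 255).astype(np.uint8)

# A low-contrast tile: intensities clustered around 100.
tile = np.clip(np.random.default_rng(0).normal(100, 10, (64, 64)), 0, 255).astype(np.uint8)
enhanced = clipped_hist_equalize(tile)
print(int(tile.max() - tile.min()), int(enhanced.max() - enhanced.min()))  # range widens
```

The clip limit is what distinguishes CLAHE from plain histogram equalization: without it, near-uniform regions of an MRI slice would have their noise stretched across the full intensity range.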

24 pages, 4543 KiB  
Article
Fully Convolutional Deep Neural Networks with Optimized Hyperparameters for Detection of Shockable and Non-Shockable Rhythms
by Vessela Krasteva, Sarah Ménétré, Jean-Philippe Didon and Irena Jekova
Sensors 2020, 20(10), 2875; https://doi.org/10.3390/s20102875 - 19 May 2020
Cited by 47 | Viewed by 8505
Abstract
Deep neural networks (DNN) are state-of-the-art machine learning algorithms that can learn to self-extract significant features of the electrocardiogram (ECG) and can generally provide high diagnostic accuracy if subjected to robust training and optimization on large datasets at high computational cost. So far, little research has optimized DNNs for shock advisory systems on large ECG arrhythmia databases from out-of-hospital cardiac arrests (OHCA). The objective of this study is to optimize the hyperparameters (HPs) of deep convolutional neural networks (CNN) for detection of shockable (Sh) and non-shockable (NSh) rhythms, and to validate the best HP settings for short and long analysis durations (2–10 s). Large numbers of (Sh + NSh) ECG samples were used for training (720 + 3170) and validation (739 + 5921) from Holters and defibrillators in OHCA. An end-to-end deep CNN architecture was implemented with a one-lead raw ECG input layer (5 s, 125 Hz, 2.5 µV/LSB), a configurable number of 5 to 23 hidden layers, and an output layer with diagnostic probability p ∈ [0: Sh, 1: NSh]. The hidden layers contain N convolutional blocks × 3 layers (Conv1D (filters = Fi, kernel size = Ki), max-pooling (pool size = 2), dropout (rate = 0.3)), one global max-pooling layer, and one dense layer. Random-search optimization of HPs = {N, Fi, Ki}, i = 1, …, N was performed over a large grid N = [1, 2, …, 7], Fi = [5; 50], Ki = [5; 100]. During training, the model with maximal balanced accuracy BAC = (Sensitivity + Specificity)/2 over 400 epochs was stored. The optimization principle is to find the common HP space of a few top-ranked models and predict a robust HP setting as their median value. The optimal models for 1–7 CNN layers were trained with different learning rates LR = [10⁻⁵; 10⁻²], and the best model was finally validated on 2–10 s analysis durations. In total, 4216 random-search models were trained. The optimal models with more than three convolutional layers did not exhibit substantial differences in performance (BAC = 99.31–99.5%). Among them, the best model was {N = 5, Fi = {20, 15, 15, 10, 5}, Ki = {10, 10, 10, 10, 10}, 7521 trainable parameters}, with maximal validation performance for 5-s analysis (BAC = 99.5%, Se = 99.6%, Sp = 99.4%) and a tolerable drop in performance (<2 percentage points) for very short 2-s analysis (BAC = 98.2%, Se = 97.6%, Sp = 98.7%). DNN application in future-generation shock advisory systems can improve the detection of Sh and NSh rhythms and can considerably shorten the analysis duration, complying with resuscitation guidelines for minimal hands-off pauses.
(This article belongs to the Special Issue Recent Advances in ECG Monitoring)
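The best reported setting (N = 5 blocks, Fi = {20, 15, 15, 10, 5}, Ki all 10, 5-s input at 125 Hz) can be reconstructed as a PyTorch sketch. The builder name and the ReLU activation are assumptions, but the block structure follows the abstract, and the trainable-parameter count comes out to the quoted 7521.

```python
import torch
import torch.nn as nn

def build_shock_cnn(filters, kernels):
    """N x (Conv1D -> MaxPool(2) -> Dropout(0.3)), then global max pooling
    and a dense sigmoid output, as described in the abstract."""
    layers, in_ch = [], 1                     # one-lead raw ECG input
    for f, k in zip(filters, kernels):
        layers += [nn.Conv1d(in_ch, f, kernel_size=k, padding="same"),
                   nn.ReLU(),
                   nn.MaxPool1d(2),
                   nn.Dropout(0.3)]
        in_ch = f
    layers += [nn.AdaptiveMaxPool1d(1), nn.Flatten(),
               nn.Linear(in_ch, 1), nn.Sigmoid()]
    return nn.Sequential(*layers)

model = build_shock_cnn(filters=[20, 15, 15, 10, 5], kernels=[10] * 5)
p = model(torch.randn(2, 1, 625))             # 5 s at 125 Hz = 625 samples
n_params = sum(q.numel() for q in model.parameters() if q.requires_grad)
print(p.shape, n_params)  # (2, 1) probabilities; 7521 trainable parameters
```

Matching the published parameter count is a useful sanity check that the block structure has been read correctly from the abstract.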

21 pages, 5393 KiB  
Article
An Encoder-Decoder Based Convolution Neural Network (CNN) for Future Advanced Driver Assistance System (ADAS)
by Robail Yasrab, Naijie Gu and Xiaoci Zhang
Appl. Sci. 2017, 7(4), 312; https://doi.org/10.3390/app7040312 - 23 Mar 2017
Cited by 33 | Viewed by 12921
Abstract
We propose a practical Convolution Neural Network (CNN) model termed the CNN for Semantic Segmentation for driver Assistance system (CSSA). It is a novel semantic segmentation model for probabilistic pixel-wise segmentation, able to predict pixel-wise class labels of a given input image. Scene understanding has recently emerged as an active area of research, and pixel-wise semantic segmentation is a key tool for visual scene understanding. Among future intelligent systems, the Advanced Driver Assistance System (ADAS) is one of the most popular research topics, and the CSSA is a road-scene-understanding CNN that could be a useful constituent of the ADAS toolkit. The proposed network is an encoder-decoder model whose convolutional encoder layers are adopted from the Visual Geometry Group's VGG-16 net, whereas the decoder is inspired by the segmentation network SegNet. The architecture mitigates the limitations of existing methods based on state-of-the-art encoder-decoder designs. The encoder performs convolution, while the decoder is responsible for deconvolution and un-pooling/up-sampling to predict pixel-wise class labels. The key idea is an up-sampling decoder network that maps the low-resolution encoder feature maps back to input resolution. This substantially reduces the number of trainable parameters, since the decoder reuses the encoder's pooling indices to up-sample for pixel-wise classification and segmentation. We experimented with different activation functions, pooling methods, dropout units, and architectures to design an efficient CNN. The proposed network offers a significant improvement in segmentation performance while reducing the number of trainable parameters, and shows considerable improvement over benchmark results on PASCAL VOC-12 and CamVid.
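The parameter-free up-sampling via reused pooling indices — the SegNet-style mechanism the abstract describes — is directly available in PyTorch and can be demonstrated in a few lines (tensor sizes here are illustrative):

```python
import torch
import torch.nn.functional as F

# Encoder max-pooling records the argmax indices; the decoder reuses them to
# place each value back at its original position. The up-sampling step itself
# therefore needs no trainable parameters.
x = torch.randn(1, 8, 16, 16)
pooled, idx = F.max_pool2d(x, kernel_size=2, return_indices=True)
up = F.max_unpool2d(pooled, idx, kernel_size=2)
print(pooled.shape, up.shape)  # (1, 8, 8, 8) -> (1, 8, 16, 16)
```

The un-pooled map is sparse (zeros everywhere except the remembered max locations); subsequent decoder convolutions densify it, which is why the decoder can stay much smaller than a learned-deconvolution design.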
