Search Results (70)

Search Parameters:
Keywords = leaky ReLU

23 pages, 3645 KiB  
Article
Color-Guided Mixture-of-Experts Conditional GAN for Realistic Biomedical Image Synthesis in Data-Scarce Diagnostics
by Patrycja Kwiek, Filip Ciepiela and Małgorzata Jakubowska
Electronics 2025, 14(14), 2773; https://doi.org/10.3390/electronics14142773 - 10 Jul 2025
Viewed by 147
Abstract
Background: Limited availability of high-quality labeled biomedical image datasets presents a significant challenge for training deep learning models in medical diagnostics. This study proposes a novel image generation framework combining conditional generative adversarial networks (cGANs) with a Mixture-of-Experts (MoE) architecture and color histogram-aware loss functions to enhance synthetic blood cell image quality. Methods: RGB microscopic images from the BloodMNIST dataset (eight blood cell types, resolution 3 × 128 × 128) underwent preprocessing with k-means clustering to extract the dominant colors and UMAP for visualizing class similarity. Spearman correlation-based distance matrices were used to evaluate the discriminative power of each RGB channel. A MoE–cGAN architecture was developed with residual blocks and LeakyReLU activations. Expert generators were conditioned on cell type, and the generator’s loss was augmented with a Wasserstein distance-based term comparing red and green channel histograms, which were found most relevant for class separation. Results: The red and green channels contributed most to class discrimination; the blue channel had minimal impact. The proposed model achieved 0.97 classification accuracy on generated images (ResNet50), with 0.96 precision, 0.97 recall, and a 0.96 F1-score. The best Fréchet Inception Distance (FID) was 52.1. Misclassifications occurred mainly among visually similar cell types. Conclusions: Integrating histogram alignment into the MoE–cGAN training significantly improves the realism and class-specific variability of synthetic images, supporting robust model development under data scarcity in hematological imaging. Full article
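
The exact formulation of the color histogram-aware loss term is not given in the abstract; the sketch below is a rough, hypothetical illustration of the idea: normalized red- and green-channel histograms of a batch are compared with a 1-D Wasserstein distance (difference of cumulative histograms). All function names and the bin count are invented for illustration, and torch.histc is not differentiable, so a soft histogram would be needed in actual generator training.

```python
import torch

def channel_histogram(images: torch.Tensor, channel: int, bins: int = 64) -> torch.Tensor:
    """Normalized intensity histogram of one RGB channel for a batch scaled to [0, 1]."""
    values = images[:, channel, :, :].reshape(-1)
    hist = torch.histc(values, bins=bins, min=0.0, max=1.0)
    return hist / hist.sum().clamp_min(1e-8)

def wasserstein_1d(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """1-D Wasserstein distance between two histograms via cumulative sums."""
    return torch.abs(torch.cumsum(p, dim=0) - torch.cumsum(q, dim=0)).sum()

def color_histogram_loss(fake: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    """Penalize red/green histogram mismatch (blue omitted, as in the abstract)."""
    loss = torch.zeros(())
    for ch in (0, 1):  # red, green
        loss = loss + wasserstein_1d(channel_histogram(fake, ch), channel_histogram(real, ch))
    return loss

# Usage with random stand-in batches of 3 x 128 x 128 images:
fake_batch = torch.rand(8, 3, 128, 128)
real_batch = torch.rand(8, 3, 128, 128)
print(color_histogram_loss(fake_batch, real_batch))
```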

36 pages, 9139 KiB  
Article
On the Synergy of Optimizers and Activation Functions: A CNN Benchmarking Study
by Khuraman Aziz Sayın, Necla Kırcalı Gürsoy, Türkay Yolcu and Arif Gürsoy
Mathematics 2025, 13(13), 2088; https://doi.org/10.3390/math13132088 - 25 Jun 2025
Viewed by 415
Abstract
In this study, we present a comparative analysis of gradient descent-based optimizers frequently used in Convolutional Neural Networks (CNNs), including SGD, mSGD, RMSprop, Adadelta, Nadam, Adamax, Adam, and the recent EVE optimizer. To explore the interaction between optimization strategies and activation functions, we systematically evaluate all combinations of these optimizers with four activation functions—ReLU, LeakyReLU, Tanh, and GELU—across three benchmark image classification datasets: CIFAR-10, Fashion-MNIST (F-MNIST), and Labeled Faces in the Wild (LFW). Each configuration was assessed using multiple evaluation metrics, including accuracy, precision, recall, F1-score, mean absolute error (MAE), and mean squared error (MSE). All experiments were performed using k-fold cross-validation to ensure statistical robustness. Additionally, two-way ANOVA was employed to validate the significance of differences across optimizer–activation combinations. This study aims to highlight the importance of jointly selecting optimizers and activation functions to enhance training dynamics and generalization in CNNs. We also consider the role of critical hyperparameters, such as learning rate and regularization methods, in influencing optimization stability. This work provides valuable insights into the optimizer–activation interplay and offers practical guidance for improving architectural and hyperparameter configurations in CNN-based deep learning models. Full article
(This article belongs to the Special Issue Artificial Intelligence and Data Science, 2nd Edition)
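
A minimal sketch of the optimizer-activation grid this study describes, assuming a tiny placeholder CNN and one training step per combination; the benchmark datasets, metrics, cross-validation, ANOVA, and the EVE optimizer are not reproduced here.

```python
import itertools
import torch
import torch.nn as nn

def make_cnn(activation: nn.Module) -> nn.Sequential:
    """Tiny placeholder CNN; the paper's architectures are not reproduced here."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), activation,
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(16 * 16 * 16, 10),
    )

activations = {"ReLU": nn.ReLU(), "LeakyReLU": nn.LeakyReLU(0.01),
               "Tanh": nn.Tanh(), "GELU": nn.GELU()}
optimizers = {"SGD": torch.optim.SGD, "RMSprop": torch.optim.RMSprop,
              "Adadelta": torch.optim.Adadelta, "NAdam": torch.optim.NAdam,
              "Adamax": torch.optim.Adamax, "Adam": torch.optim.Adam}

# One training step per combination on a random batch, just to show the loop structure.
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
for (act_name, act), (opt_name, opt_cls) in itertools.product(activations.items(), optimizers.items()):
    model = make_cnn(act)
    opt = opt_cls(model.parameters(), lr=1e-3)
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"{opt_name:>8} + {act_name:<9} loss={loss.item():.3f}")
```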

15 pages, 1313 KiB  
Article
mTanh: A Low-Cost Inkjet-Printed Vanishing Gradient Tolerant Activation Function
by Shahrin Akter and Mohammad Rafiqul Haider
J. Low Power Electron. Appl. 2025, 15(2), 27; https://doi.org/10.3390/jlpea15020027 - 2 May 2025
Viewed by 712
Abstract
Inkjet-printed circuits on flexible substrates are rapidly emerging as a key technology in flexible electronics, driven by their minimal fabrication process, cost-effectiveness, and environmental sustainability. Recent advancements in inkjet-printed devices and circuits have broadened their applications in both sensing and computing. Building on this progress, this work has developed a nonlinear computational element coined as mTanh to serve as an activation function in neural networks. Activation functions are essential in neural networks as they introduce nonlinearity, enabling machine learning models to capture complex patterns. However, widely used functions such as Tanh and sigmoid often suffer from the vanishing gradient problem, limiting the depth of neural networks. To address this, alternative functions like ReLU and Leaky ReLU have been explored, yet these also introduce challenges such as the dying ReLU issue, bias shifting, and noise sensitivity. The proposed mTanh activation function effectively mitigates the vanishing gradient problem, allowing for the development of deeper neural network architectures without compromising training efficiency. This study demonstrates the feasibility of mTanh as an activation function by integrating it into an Echo State Network to predict the Mackey–Glass time series signal. The results show that mTanh performs comparably to Tanh, ReLU, and Leaky ReLU in this task. Additionally, the vanishing gradient resistance of the mTanh function was evaluated by implementing it in a deep multi-layer perceptron model for Fashion MNIST image classification. The study indicates that mTanh enables the addition of 3–5 extra layers compared to Tanh and sigmoid, while exhibiting vanishing gradient resistance similar to ReLU. These results highlight the potential of mTanh as a promising activation function for deep learning models, particularly in flexible electronics applications. Full article
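
The mTanh transfer characteristic is a printed-circuit result and is not given in the abstract, so the sketch below only illustrates the kind of software probe used to compare vanishing-gradient behavior: stack a deep MLP with a chosen activation and report the gradient norm reaching the first layer. The layer sizes, depth, and the set of activations compared are placeholders.

```python
import torch
import torch.nn as nn

def first_layer_grad_norm(activation: nn.Module, depth: int = 20, width: int = 64) -> float:
    """Gradient magnitude reaching the first layer of a deep MLP after one backward pass."""
    layers = [nn.Linear(784, width), activation]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), activation]
    layers += [nn.Linear(width, 10)]
    model = nn.Sequential(*layers)

    x = torch.rand(32, 784)                    # stand-in for Fashion-MNIST images
    y = torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return model[0].weight.grad.norm().item()

for name, act in [("Tanh", nn.Tanh()), ("Sigmoid", nn.Sigmoid()),
                  ("ReLU", nn.ReLU()), ("LeakyReLU", nn.LeakyReLU(0.01))]:
    print(f"{name:<10} first-layer grad norm: {first_layer_grad_norm(act):.2e}")
```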

25 pages, 10770 KiB  
Article
Lung Segmentation with Lightweight Convolutional Attention Residual U-Net
by Meftahul Jannat, Shaikh Afnan Birahim, Mohammad Asif Hasan, Tonmoy Roy, Lubna Sultana, Hasan Sarker, Samia Fairuz and Hanaa A. Abdallah
Diagnostics 2025, 15(7), 854; https://doi.org/10.3390/diagnostics15070854 - 27 Mar 2025
Cited by 1 | Viewed by 1439
Abstract
Background: Examining chest radiograph images (CXR) is an intricate and time-consuming process, sometimes requiring the identification of many anomalies at the same time. Lung segmentation is key to overcoming this challenge through different deep learning (DL) techniques. Many researchers are working to improve the performance and efficiency of lung segmentation models. This article presents a DL-based approach to accurately identify the lung mask region in CXR images to assist radiologists in recognizing early signs of high-risk lung diseases. Methods: This paper proposes a novel technique, Lightweight Residual U-Net, combining the strengths of the convolutional block attention module (CBAM), the Atrous Spatial Pyramid Pooling (ASPP) block, and the attention module, which consists of only 3.24 million trainable parameters. Furthermore, the proposed model has been trained using both the ReLU and LeakyReLU activation functions, with LeakyReLU yielding superior performance. The study indicates that the Dice loss function is more effective in achieving better results. Results: The proposed model is evaluated on three benchmark datasets: JSRT, SZ, and MC, achieving Dice scores of 98.72%, 97.49%, and 99.08%, respectively, outperforming the state-of-the-art models. Conclusions: Using the capabilities of DL and cutting-edge attention processes, the proposed model improves current efforts to enhance lung segmentation for the early identification of many serious lung diseases. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
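
The full Lightweight Residual U-Net topology is not spelled out in the abstract; as a hedged sketch, the snippet below shows two ingredients the abstract does name, a LeakyReLU convolutional block and a soft Dice loss for binary lung masks. All channel counts and the input size are placeholders.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv -> BatchNorm -> LeakyReLU, the activation choice reported to work best."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.01),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

def dice_loss(logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss for binary segmentation masks."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

# Quick shape check on a random CXR-sized batch.
x = torch.rand(2, 1, 256, 256)
mask = (torch.rand(2, 1, 256, 256) > 0.5).float()
print(dice_loss(ConvBlock(1, 1)(x), mask))
```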

28 pages, 14703 KiB  
Article
FTIR-SpectralGAN: A Spectral Data Augmentation Generative Adversarial Network for Aero-Engine Hot Jet FTIR Spectral Classification
by Shuhan Du, Yurong Liao, Rui Feng, Fengkun Luo and Zhaoming Li
Remote Sens. 2025, 17(6), 1042; https://doi.org/10.3390/rs17061042 - 16 Mar 2025
Cited by 1 | Viewed by 849
Abstract
Aiming at the overfitting problem caused by the limited sample size in the spectral classification of aero-engine hot jets, this paper proposes a synthetic spectral enhancement classification network, FTIR-SpectralGAN, for the FTIR spectra of aero-engine hot jets. Firstly, passive telemetry FTIR spectrometers were used to measure the hot jet spectrum data of six types of aero-engines, and a spectral classification dataset was created. Then, the FTIR-SpectralGAN network was designed, which consists of a generator and a discriminator. The generator architecture comprises six Conv1DTranspose layers, with five of these layers integrated with BN and LeakyReLU layers to introduce noise injection. This design enhances the generation capability for complex patterns and facilitates the transformation from noise to high-dimensional data. The discriminator employs a multi-task dual-output structure, consisting of three Conv1D layers combined with LeakyReLU and Dropout techniques. This configuration progressively reduces feature dimensions and mitigates overfitting. During training, the generator learns the underlying distribution of spectral data, while the discriminator distinguishes between real and synthetic data and performs spectral classification. The dataset was randomly partitioned into training, validation, and test sets in an 8:1:1 ratio. As the training strategy, an unbalanced alternating training approach was adopted, where the generator is trained first, followed by the discriminator and then the generator again. Additionally, weighted mixed loss and label smoothing strategies were introduced to enhance network training performance. Experimental results demonstrate that the spectral classification accuracy reaches up to 99%, effectively addressing the overfitting issue commonly encountered in CNN-based classification tasks with limited samples. Comparative experiments show that FTIR-SpectralGAN outperforms classical data augmentation methods and CVAE-based synthetic data enhancement approaches. It also achieves higher robustness and classification accuracy compared to other spectral classification methods. Full article
(This article belongs to the Special Issue Recent Advances in Infrared Target Detection)
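
A rough PyTorch analogue of the layout described in the abstract: six transposed 1-D convolutions with BN + LeakyReLU in the generator, and three 1-D convolutions with LeakyReLU and Dropout plus a dual real/fake and class output in the discriminator. Spectral length, latent size, channel counts, and strides are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

LATENT, N_CLASSES, SPEC_LEN = 64, 6, 512   # assumed sizes, not from the paper

def gen_block(in_ch, out_ch, final=False):
    layers = [nn.ConvTranspose1d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)]
    if not final:                            # five of the six layers get BN + LeakyReLU
        layers += [nn.BatchNorm1d(out_ch), nn.LeakyReLU(0.2)]
    return layers

generator = nn.Sequential(                   # six ConvTranspose1d layers: length 8 -> 512
    *gen_block(LATENT, 256), *gen_block(256, 128), *gen_block(128, 64),
    *gen_block(64, 32), *gen_block(32, 16), *gen_block(16, 1, final=True),
)

class Discriminator(nn.Module):
    """Three Conv1d blocks with LeakyReLU + Dropout and a dual (real/fake, class) output."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.Dropout(0.3),
            nn.Conv1d(16, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.Dropout(0.3),
            nn.Conv1d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.Dropout(0.3),
            nn.Flatten(),
        )
        feat_dim = 64 * (SPEC_LEN // 8)
        self.validity = nn.Linear(feat_dim, 1)             # real vs. synthetic spectrum
        self.classifier = nn.Linear(feat_dim, N_CLASSES)   # engine type

    def forward(self, x):
        h = self.features(x)
        return self.validity(h), self.classifier(h)

noise = torch.randn(4, LATENT, 8)            # length 8 is doubled six times to 512
fake_spectra = generator(noise)
validity, class_logits = Discriminator()(fake_spectra)
print(fake_spectra.shape, validity.shape, class_logits.shape)
```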

24 pages, 8271 KiB  
Article
Research on a Potato Leaf Disease Diagnosis System Based on Deep Learning
by Chunhui Zhang, Shuai Wang, Chunguang Wang, Haichao Wang, Yingjie Du and Zheying Zong
Agriculture 2025, 15(4), 424; https://doi.org/10.3390/agriculture15040424 - 18 Feb 2025
Cited by 2 | Viewed by 1042
Abstract
Potato is the fourth largest food crop in the world. Disease is an important factor restricting potato yield. Disease detection based on deep learning has strong advantages in network structure, training speed, detection accuracy, and other aspects. This article took potato leaf diseases (early blight and viral disease) as the research objects, collected disease images to construct a disease dataset, and expanded the dataset through data augmentation methods to improve the quantity and diversity of the dataset. Four classic deep learning networks (VGG16, MobilenetV1, Resnet50, and Vit) were used to train the dataset, and the VGG16 network had the highest accuracy of 97.26%; VGG16 was chosen as the basic research network. A new, improved algorithm, VGG16S, was proposed to solve the problem of large network parameters by using three improvement methods: changing the network structure of the VGG16 network from “convolutional layer + flattening layer + fully connected layer” to “convolutional layer + global average pooling”, integrating CBAM attention mechanism, and introducing Leaky ReLU activation function for learning and training. The improved VGG16S network has a parameter size of 15 M (1/10 of VGG16), and the recognition accuracy of the test set is 97.87%. This article used response surface analysis to optimize hyperparameters, and the test results indicated that VGG16S, after hyperparameter tuning, had further improved its diagnostic performance. At last, this article completed ablation experiments and public dataset testing. The research results will provide a theoretical basis for the timely adoption of corresponding prevention and control measures, improving the yield and quality of potatoes and increasing economic benefits. Full article
(This article belongs to the Section Digital Agriculture)
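
A hedged sketch of the head replacement the abstract describes, swapping VGG16's flatten + fully connected classifier for global average pooling; the CBAM module and the LeakyReLU substitution inside the backbone are omitted, and the use of torchvision's VGG16 and a three-class label set are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 3  # e.g., healthy, early blight, viral disease (assumed labels)

backbone = models.vgg16(weights=None)   # weights="IMAGENET1K_V1" to start from pretraining

# Replace "flatten + fully connected layers" with "global average pooling + small classifier",
# which removes most of VGG16's roughly 138M parameters.
model = nn.Sequential(
    backbone.features,                  # convolutional layers only
    nn.AdaptiveAvgPool2d(1),            # global average pooling
    nn.Flatten(),
    nn.Linear(512, NUM_CLASSES),
)

x = torch.rand(2, 3, 224, 224)
print(model(x).shape)                   # torch.Size([2, 3])
print(sum(p.numel() for p in model.parameters()) / 1e6, "M parameters")
```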

20 pages, 1764 KiB  
Article
A Temporal Convolutional Network–Bidirectional Long Short-Term Memory (TCN-BiLSTM) Prediction Model for Temporal Faults in Industrial Equipment
by Jinyin Bai, Wei Zhu, Shuhong Liu, Chenhao Ye, Peng Zheng and Xiangchen Wang
Appl. Sci. 2025, 15(4), 1702; https://doi.org/10.3390/app15041702 - 7 Feb 2025
Cited by 2 | Viewed by 1804
Abstract
Traditional algorithms and single predictive models often face challenges such as limited prediction accuracy and insufficient modeling capabilities for complex time-series data in fault prediction tasks. To address these issues, this paper proposes a combined prediction model based on an improved temporal convolutional network (TCN) and bidirectional long short-term memory (BiLSTM), referred to as the TCN-BiLSTM model. This model aims to enhance the reliability and accuracy of time-series fault prediction. It is designed to handle continuous processes but can also be applied to batch and hybrid processes due to its flexible architecture. First, preprocessed industrial operation data are fed into the model, and hyperparameter optimization is conducted using the Optuna framework to improve training efficiency and generalization capability. Then, the model employs an improved TCN layer and a BiLSTM layer for feature extraction and learning. The TCN layer incorporates batch normalization, an optimized activation function (Leaky ReLU), and a dropout mechanism to enhance its ability to capture multi-scale temporal features. The BiLSTM layer further leverages its bidirectional learning mechanism to model the long-term dependencies in the data, enabling effective predictions of complex fault patterns. Finally, the model outputs the prediction results after iterative optimization. To evaluate the performance of the proposed model, simulation experiments were conducted to compare the TCN-BiLSTM model with mainstream prediction methods such as CNN, RNN, BiLSTM, and A-BiLSTM. The experimental results indicate that the TCN-BiLSTM model outperforms the comparison models in terms of prediction accuracy during both the modeling and forecasting stages, providing a feasible solution for time-series fault prediction. Full article
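
The abstract lists the ingredients of each branch but not their dimensions; the sketch below wires a dilated causal convolution block (with batch normalization, LeakyReLU, and dropout) into a bidirectional LSTM, with all sizes chosen arbitrarily for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """Dilated causal Conv1d -> BatchNorm -> LeakyReLU -> Dropout with a residual connection."""
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 2, dropout: float = 0.2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation        # left-pad so the convolution stays causal
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.post = nn.Sequential(nn.BatchNorm1d(channels), nn.LeakyReLU(0.01), nn.Dropout(dropout))

    def forward(self, x):                              # x: (batch, channels, time)
        out = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return x + self.post(out)

class TCNBiLSTM(nn.Module):
    def __init__(self, n_features: int = 8, hidden: int = 32):
        super().__init__()
        self.proj = nn.Conv1d(n_features, hidden, kernel_size=1)
        self.tcn = nn.Sequential(TCNBlock(hidden, dilation=1), TCNBlock(hidden, dilation=2))
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)           # fault indicator / next-step prediction

    def forward(self, x):                              # x: (batch, time, features)
        h = self.tcn(self.proj(x.transpose(1, 2)))     # -> (batch, hidden, time)
        out, _ = self.bilstm(h.transpose(1, 2))        # -> (batch, time, 2 * hidden)
        return self.head(out[:, -1])                   # prediction from the last time step

print(TCNBiLSTM()(torch.rand(4, 100, 8)).shape)        # torch.Size([4, 1])
```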

29 pages, 81603 KiB  
Article
A Pixel-Based Machine Learning Atmospheric Correction for PeruSAT-1 Imagery
by Luis Saldarriaga, Yumin Tan, Neus Sabater and Jesus Delegido
Remote Sens. 2025, 17(3), 460; https://doi.org/10.3390/rs17030460 - 29 Jan 2025
Viewed by 1183
Abstract
Atmospheric correction is essential in remote sensing, as it reduces the effects of light absorption and scattering by suspended particles and gases, enabling accurate surface reflectance computation from the observed Top-of-Atmosphere (TOA) reflectance. Each satellite sensor requires a customized atmospheric correction processor due to its unique system characteristics. Currently, PeruSAT-1, the first Peruvian remote sensing satellite, does not include this capability in its image processing pipeline, which poses challenges for its effectiveness in defense, security, and natural disaster management applications. This research investigated pixel-based machine learning methods for atmospheric correction of PeruSAT-1, using Sentinel-2 harmonized Bottom-of-Atmosphere (BOA) surface reflectance images as a benchmark, alongside additional atmospheric, terrain, and acquisition parameters. A robust dataset was developed to align data across temporal, spatial, geometric, and contextual conditions. Experimental results showed R2 values between 0.886 and 0.938, and RMSE values ranging from 0.009 to 0.025 compared to the benchmarks. Among the models tested, the Feedforward Neural Network (FFNN) using the Leaky ReLU activation function achieved the best overall performance. These findings confirm the robustness of this approach, offering a scalable methodology for satellites with similar characteristics and establishing a foundation for a customized atmospheric correction pipeline for PeruSAT-1. Future work will focus on diversifying the dataset across spectral and seasonal conditions and optimizing the modeling to address challenges in shorter wavelengths and high-reflectance surfaces. Full article
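
A minimal sketch of a pixel-based regressor of the kind described: a feedforward network with Leaky ReLU activations mapping a pixel's TOA reflectance plus ancillary atmospheric, terrain, and geometry features to BOA surface reflectance. The four-band input, feature count, layer widths, and optimizer are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

N_BANDS, N_AUX = 4, 6   # assumed: 4 PeruSAT-1 bands + 6 atmospheric/terrain/geometry features

model = nn.Sequential(                 # per-pixel TOA -> BOA reflectance regressor
    nn.Linear(N_BANDS + N_AUX, 64), nn.LeakyReLU(0.01),
    nn.Linear(64, 64), nn.LeakyReLU(0.01),
    nn.Linear(64, N_BANDS),            # predicted surface (BOA) reflectance per band
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One training step on random stand-in pixels; harmonized Sentinel-2 BOA values would be the targets.
toa_plus_aux = torch.rand(1024, N_BANDS + N_AUX)
boa_target = torch.rand(1024, N_BANDS)
loss = loss_fn(model(toa_plus_aux), boa_target)
optimizer.zero_grad(); loss.backward(); optimizer.step()
print(loss.item())
```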

25 pages, 6447 KiB  
Article
ReLU, Sparseness, and the Encoding of Optic Flow in Neural Networks
by Oliver W. Layton, Siyuan Peng and Scott T. Steinmetz
Sensors 2024, 24(23), 7453; https://doi.org/10.3390/s24237453 - 22 Nov 2024
Viewed by 1237
Abstract
Accurate self-motion estimation is critical for various navigational tasks in mobile robotics. Optic flow provides a means to estimate self-motion using a camera sensor and is particularly valuable in GPS- and radio-denied environments. The present study investigates the influence of different activation functions—ReLU, leaky ReLU, GELU, and Mish—on the accuracy, robustness, and encoding properties of convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs) trained to estimate self-motion from optic flow. Our results demonstrate that networks with ReLU and leaky ReLU activation functions not only achieved superior accuracy in self-motion estimation from novel optic flow patterns but also exhibited greater robustness under challenging conditions. The advantages offered by ReLU and leaky ReLU may stem from their ability to induce sparser representations than GELU and Mish do. Our work characterizes the encoding of optic flow in neural networks and highlights how the sparseness induced by ReLU may enhance robust and accurate self-motion estimation from optic flow. Full article
(This article belongs to the Section Navigation and Positioning)
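
One way to quantify a "sparser representations" claim is to measure the fraction of near-zero activations a layer produces under each activation function. The sketch below does this for a single fully connected layer on random input, purely to illustrate such a metric; the paper's networks, optic flow data, and exact sparseness measure are not reproduced, and the threshold is arbitrary.

```python
import torch
import torch.nn as nn

def activation_sparseness(act: nn.Module, n_units: int = 256, tol: float = 1e-3) -> float:
    """Fraction of unit activations with magnitude <= tol, averaged over a random batch."""
    layer = nn.Sequential(nn.Linear(128, n_units), act)
    with torch.no_grad():
        out = layer(torch.randn(512, 128))
    return (out.abs() <= tol).float().mean().item()

for name, act in [("ReLU", nn.ReLU()), ("LeakyReLU", nn.LeakyReLU(0.1)),
                  ("GELU", nn.GELU()), ("Mish", nn.Mish())]:
    print(f"{name:<10} sparseness: {activation_sparseness(act):.2f}")
```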

14 pages, 12763 KiB  
Article
Semantic Segmentation Model-Based Boundary Line Recognition Method for Wheat Harvesting
by Qian Wang, Wuchang Qin, Mengnan Liu, Junjie Zhao, Qingzhen Zhu and Yanxin Yin
Agriculture 2024, 14(10), 1846; https://doi.org/10.3390/agriculture14101846 - 19 Oct 2024
Cited by 10 | Viewed by 1445
Abstract
The wheat harvesting boundary line is vital reference information for the path tracking of an autonomously driving combine harvester. However, unfavorable factors, such as a complex light environment, tree shade, weeds, and wheat stubble color interference in the field, make it challenging to identify the wheat harvest boundary line accurately and quickly. Therefore, this paper proposes a harvest boundary line recognition model for wheat harvesting based on the MV3_DeepLabV3+ network framework, which can quickly and accurately complete the identification in complex environments. The model uses the lightweight MobileNetV3_Large as the backbone network and the LeakyReLU activation function to avoid the neural death problem. Depth-separable convolution is introduced into Atrous Spatial Pyramid Pooling (ASPP) to reduce the complexity of network parameters. The cubic B-spline curve-fitting method extracts the wheat harvesting boundary line. A prototype harvester for wheat harvesting boundary recognition was built, and field tests were conducted. The test results show that the wheat harvest boundary line recognition model proposed in this paper achieves a segmentation accuracy of 98.04% for unharvested wheat regions in complex environments, with an IoU of 95.02%. When the combine harvester travels at 0~1.5 m/s, the normal speed for operation, the average processing time and pixel error for a single image are 0.15 s and 7.3 pixels, respectively. This method could achieve high recognition accuracy and fast recognition speed. This paper provides a practical reference for the autonomous harvesting operation of a combine harvester. Full article
(This article belongs to the Special Issue Agricultural Collaborative Robots for Smart Farming)
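
The segmentation network is not reproduced here; the sketch below only illustrates the final step named in the abstract, fitting a cubic B-spline to candidate boundary pixels extracted from a segmentation mask. The toy mask and the extraction rule (first unharvested column per row) are assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Toy binary mask: 1 = unharvested wheat, 0 = harvested area (stand-in for the network output).
h, w = 120, 160
mask = np.zeros((h, w), dtype=np.uint8)
for row in range(h):
    mask[row, int(60 + 20 * np.sin(row / 15.0)):] = 1   # wavy harvest boundary

# Candidate boundary point per row: first unharvested column (an assumed extraction rule).
rows = np.arange(h)
cols = mask.argmax(axis=1)

# Cubic B-spline fit of the boundary line; s controls smoothing of pixel noise.
tck, _ = splprep([cols.astype(float), rows.astype(float)], k=3, s=50.0)
boundary_x, boundary_y = splev(np.linspace(0.0, 1.0, 200), tck)
print(boundary_x[:5], boundary_y[:5])
```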

18 pages, 4823 KiB  
Article
Prediction of PM2.5 Concentration Based on Deep Learning for High-Dimensional Time Series
by Jie Hu, Yuan Jia, Zhen-Hong Jia, Cong-Bing He, Fei Shi and Xiao-Hui Huang
Appl. Sci. 2024, 14(19), 8745; https://doi.org/10.3390/app14198745 - 27 Sep 2024
Cited by 2 | Viewed by 1059
Abstract
PM2.5 poses a serious threat to human life and health, so the accurate prediction of PM2.5 concentration is essential for controlling air pollution. However, previous studies lacked the generalization ability to predict high-dimensional PM2.5 concentration time series. Therefore, a new model for predicting PM2.5 concentration is proposed in this paper to address this gap. Firstly, the leaky rectified linear unit (LeakyReLU) was used to replace the activation function in the Temporal Convolutional Network (TCN) to better capture long-range dependencies in the feature data. Next, the residual structure, dilation rate, and feature-matching convolution position of the TCN were adjusted to improve the performance of the improved TCN (LR-TCN) and reduce the amount of computation. Finally, a new prediction model (GRU-LR-TCN) was established, which adaptively fuses the predictions of the Gated Recurrent Unit (GRU) and LR-TCN with weights based on the inverse ratio of their root mean square errors (RMSE). The experimental results show that, for monitoring station #1001, LR-TCN improved the RMSE, mean absolute error (MAE), and coefficient of determination (R2) by 12.9%, 11.3%, and 3.8%, respectively, compared with baselines. Compared with LR-TCN, GRU-LR-TCN improved the symmetric mean absolute percentage error (SMAPE) by 7.1%. In addition, comparisons with other models on additional air quality datasets show advantages on all indicators, further demonstrating that the GRU-LR-TCN model generalizes well across datasets and is efficient and applicable for predicting urban PM2.5 concentration. This can contribute to enhancing air quality and safeguarding public health. Full article
(This article belongs to the Section Ecology Science and Engineering)
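
The fusion rule described above, weighting the GRU and LR-TCN predictions by the inverse ratio of their validation RMSE, can be sketched in a few lines; variable names and the example numbers are illustrative only.

```python
import numpy as np

def inverse_rmse_fusion(pred_gru: np.ndarray, pred_lrtcn: np.ndarray,
                        rmse_gru: float, rmse_lrtcn: float) -> np.ndarray:
    """Weight each model's PM2.5 prediction by the inverse of its validation RMSE."""
    w_gru, w_lrtcn = 1.0 / rmse_gru, 1.0 / rmse_lrtcn
    total = w_gru + w_lrtcn
    return (w_gru * pred_gru + w_lrtcn * pred_lrtcn) / total

# Example: the model with the lower RMSE contributes more to the fused prediction.
fused = inverse_rmse_fusion(np.array([40.0, 55.0]), np.array([44.0, 51.0]),
                            rmse_gru=8.0, rmse_lrtcn=6.0)
print(fused)
```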

18 pages, 3237 KiB  
Article
Lightweight Wheat Spike Detection Method Based on Activation and Loss Function Enhancements for YOLOv5s
by Jingsong Li, Feijie Dai, Haiming Qian, Linsheng Huang and Jinling Zhao
Agronomy 2024, 14(9), 2036; https://doi.org/10.3390/agronomy14092036 - 6 Sep 2024
Cited by 2 | Viewed by 1032
Abstract
Wheat spike count is one of the critical indicators for assessing the growth and yield of wheat. However, illumination variations, mutual occlusion, and background interference have greatly affected wheat spike detection. A lightweight detection method based on YOLOv5s was proposed. Initially, the original YOLOv5s was improved by combining an additional small-scale detection layer and integrating the ECA (Efficient Channel Attention) attention mechanism into all C3 modules (YOLOv5s + 4 + ECAC3). After comparing GhostNet, ShuffleNetV2, and MobileNetV3, the GhostNet architecture was finally selected as the optimal lightweight model framework based on its superior performance in various evaluations. Subsequently, the incorporation of five different activation functions into the network led to the identification of the RReLU (Randomized Leaky ReLU) activation function as the most effective in augmenting the network's performance. Ultimately, the network's CIoU (Complete Intersection over Union) loss function was optimized using the EIoU (Efficient Intersection over Union) loss function. Despite a minor reduction of 2.17% in accuracy for the refined YOLOv5s + 4 + ECAC3 + G + RR + E network when compared to the YOLOv5s + 4 + ECAC3, there was a marginal improvement of 0.77% over the original YOLOv5s. Furthermore, the parameter count was diminished by 32% and 28.2% relative to the YOLOv5s + 4 + ECAC3 and YOLOv5s, respectively. The model size was reduced by 28.0% and 20%, and the Giga Floating-point Operations Per Second (GFLOPs) were lowered by 33.2% and 9.5%, respectively, signifying a substantial improvement in the network's efficiency without significantly compromising accuracy. This study offers a methodological reference for the rapid and accurate detection of agricultural objects through the enhancement of a deep learning network. Full article
(This article belongs to the Section Precision and Digital Agriculture)
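
Swapping the activation function throughout an existing detector can be done by walking the module tree and replacing activation modules in place; the sketch below replaces SiLU (YOLOv5's default activation) with RReLU on a generic network. It is a generic utility, not the authors' training code, and the small CNN used to demonstrate it is a placeholder for a YOLOv5s-style backbone.

```python
import torch
import torch.nn as nn

def replace_activation(model: nn.Module, old: type, new_factory) -> int:
    """Recursively replace every `old` activation module with a freshly built replacement."""
    replaced = 0
    for name, child in model.named_children():
        if isinstance(child, old):
            setattr(model, name, new_factory())
            replaced += 1
        else:
            replaced += replace_activation(child, old, new_factory)
    return replaced

# Placeholder network standing in for a detector backbone built around SiLU.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.SiLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.SiLU(),
)
count = replace_activation(net, nn.SiLU, lambda: nn.RReLU(lower=1/8, upper=1/3))
print(f"replaced {count} activations")
print(net(torch.rand(1, 3, 64, 64)).shape)
```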

25 pages, 1209 KiB  
Article
Skin Cancer Classification Using Fine-Tuned Transfer Learning of DENSENET-121
by Abayomi Bello, Sin-Chun Ng and Man-Fai Leung
Appl. Sci. 2024, 14(17), 7707; https://doi.org/10.3390/app14177707 - 31 Aug 2024
Cited by 15 | Viewed by 4035
Abstract
Skin cancer diagnosis greatly benefits from advanced machine learning techniques, particularly fine-tuned deep learning models. In our research, we explored the impact of traditional machine learning and fine-tuned deep learning approaches on prediction accuracy. Our findings reveal significant improvements in predictability and accuracy with fine-tuning, particularly evident in deep learning models. The CNN, SVM, and Random Forest Classifier achieved high accuracy. However, fine-tuned deep learning models such as EfficientNetB0, ResNet34, VGG16, Inception_v3, and DenseNet121 demonstrated superior performance. To ensure comparability, we fine-tuned these models by incorporating additional layers, including one flatten layer and three densely interconnected layers. These layers play a crucial role in enhancing model efficiency and performance. The flatten layer preprocesses multidimensional feature maps, facilitating efficient information flow, while subsequent dense layers refine feature representations, capturing intricate patterns and relationships within the data. Leveraging LeakyReLU activation functions in the dense layers mitigates the vanishing gradient problem and promotes stable training. Finally, the output dense layer with a sigmoid activation function simplifies decision making for healthcare professionals by providing binary classification output. Our study underscores the significance of incorporating additional layers in fine-tuned neural network models for skin cancer classification, offering improved accuracy and reliability in diagnosis. Full article
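
A hedged sketch of the fine-tuning head described above, one flatten layer and three dense layers with LeakyReLU followed by a sigmoid output for binary classification, attached here to torchvision's DenseNet-121 feature extractor. The dense-layer widths and the freezing strategy are assumptions; the abstract does not specify them.

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.densenet121(weights=None)     # weights="IMAGENET1K_V1" for transfer learning
backbone.classifier = nn.Sequential(            # replace the default classifier head
    nn.Flatten(),
    nn.Linear(1024, 512), nn.LeakyReLU(0.01),
    nn.Linear(512, 128), nn.LeakyReLU(0.01),
    nn.Linear(128, 32), nn.LeakyReLU(0.01),
    nn.Linear(32, 1), nn.Sigmoid(),              # binary benign/malignant output
)

# Optionally freeze the convolutional features and train only the new head.
for p in backbone.features.parameters():
    p.requires_grad = False

x = torch.rand(2, 3, 224, 224)
print(backbone(x).shape)                         # torch.Size([2, 1])
```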

17 pages, 3337 KiB  
Article
Evaluating Activation Functions in GAN Models for Virtual Inpainting: A Path to Architectural Heritage Restoration
by Ana M. Maitin, Alberto Nogales, Emilio Delgado-Martos, Giovanni Intra Sidola, Carlos Pesqueira-Calvo, Gabriel Furnieles and Álvaro J. García-Tejedor
Appl. Sci. 2024, 14(16), 6854; https://doi.org/10.3390/app14166854 - 6 Aug 2024
Cited by 3 | Viewed by 2172
Abstract
Computer vision has advanced considerably in recent years. Several tasks, such as image recognition, classification, and image restoration, are regularly solved with applications using artificial intelligence techniques. Image restoration comprises different use cases such as style transfer, resolution enhancement, and the completion of missing parts. The latter is also known as image inpainting, in this case virtual image inpainting, which consists of reconstructing missing regions or elements. This paper evaluates the performance of a deep learning method for virtual image inpainting that reconstructs missing architectural elements in images of ruined Greek temples, in order to measure the performance of different activation functions. Unlike a previous study related to this work, a direct reconstruction process without segmented images was used. Two evaluation methods are presented: an objective one (mathematical metrics) and an expert one (visual perception) to measure the performance of the different approaches. Results conclude that ReLU outperforms the other activation functions, while Mish and Leaky ReLU perform poorly; for Swish, the expert evaluations highlight a gap between mathematical metrics and human visual perception. Full article

17 pages, 4274 KiB  
Article
Improved Brain Tumor Segmentation in MR Images with a Modified U-Net
by Hiam Alquran, Mohammed Alslatie, Ali Rababah and Wan Azani Mustafa
Appl. Sci. 2024, 14(15), 6504; https://doi.org/10.3390/app14156504 - 25 Jul 2024
Cited by 2 | Viewed by 3671
Abstract
Detecting brain tumors is crucial in medical diagnostics due to the serious health risks these abnormalities present to patients. Deep learning approaches can significantly improve localization in various medical issues, particularly brain tumors. This paper emphasizes the use of deep learning models to segment brain tumors using a large dataset. The study involves comparing modifications to U-Net structures, including kernel size, number of channels, dropout ratio, and changing the activation function from ReLU to Leaky ReLU. Optimizing these parameters has notably enhanced brain tumor segmentation in MR images, achieving a Global Accuracy of 99.4% and a dice similarity coefficient of 90.2%. The model was trained, validated, and tested on many magnetic resonance images, with a training time not exceeding 19 min on a powerful GPU. This approach can be extended in medical care and hospitals to assist radiologists in identifying tumor locations and suspicious regions, thereby improving diagnosis and treatment effectiveness. The software could also be integrated into MR equipment protocols. Full article
