Search Results (22)

Search Parameters:
Keywords = convolution-augmented gated attention

31 pages, 15872 KB  
Article
Gated Attention-Augmented Double U-Net for White Blood Cell Segmentation
by Ilyes Benaissa, Athmane Zitouni, Salim Sbaa, Nizamettin Aydin, Ahmed Chaouki Megherbi, Abdellah Zakaria Sellam, Abdelmalik Taleb-Ahmed and Cosimo Distante
J. Imaging 2025, 11(11), 386; https://doi.org/10.3390/jimaging11110386 - 1 Nov 2025
Viewed by 752
Abstract
Segmentation of white blood cells is critical for a wide range of applications. It aims to identify and isolate individual white blood cells from medical images, enabling accurate diagnosis and monitoring of diseases. In the last decade, many researchers have focused on this task using U-Net, one of the most used deep learning architectures. To further enhance segmentation accuracy and robustness, recent advances have explored the combination of U-Net with other techniques, such as attention mechanisms and aggregation techniques. However, a common challenge in white blood cell image segmentation is the similarity between the cells’ cytoplasm and other surrounding blood components, which often leads to inaccurate or incomplete segmentation due to difficulties in distinguishing low-contrast or subtle boundaries, leaving a significant gap for improvement. In this paper, we propose GAAD-U-Net, a novel architecture that integrates attention-augmented convolutions to better capture ambiguous boundaries and complex structures such as overlapping cells and low-contrast regions, followed by a gating mechanism to further suppress irrelevant feature information. These two key components are integrated in the Double U-Net base architecture. Our model achieves state-of-the-art performance on white blood cell benchmark datasets, with a 3.4% Dice similarity coefficient (DSC) improvement specifically on the SegPC-2021 dataset. The proposed model achieves superior performance as measured by the mean intersection over union (IoU) and DSC, with notably strong segmentation performance even for difficult images. Full article
(This article belongs to the Special Issue Computer Vision for Medical Image Analysis)
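The gating mechanism described in the abstract is not specified in detail here; a minimal NumPy sketch of an additive attention gate of the kind popularized by Attention U-Net (all shapes and weight names are illustrative assumptions, not the paper's exact design):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, W_x, W_g, psi):
    """Additive attention gate: suppress skip-connection features (x)
    that are irrelevant given the gating signal (g) from a coarser level."""
    a = np.maximum(W_x @ x + W_g @ g, 0.0)   # ReLU of the combined projections
    alpha = sigmoid(psi @ a)                  # per-position attention in (0, 1)
    return x * alpha                          # rescale the skip features

rng = np.random.default_rng(0)
C, N = 4, 6                                   # channels, spatial positions
x = rng.standard_normal((C, N))               # skip-connection features
g = rng.standard_normal((C, N))               # gating signal
W_x = rng.standard_normal((C, C))
W_g = rng.standard_normal((C, C))
psi = rng.standard_normal((1, C))             # collapse channels to one gate map
out = attention_gate(x, g, W_x, W_g, psi)
print(out.shape)  # (4, 6)
```

Because the gate is a sigmoid, the output magnitude never exceeds the input magnitude, which is what "suppressing irrelevant feature information" amounts to numerically.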

21 pages, 5368 KB  
Article
Predicting Urban Traffic Under Extreme Weather by Deep Learning Method with Disaster Knowledge
by Jiting Tang, Yuyao Zhu, Saini Yang and Carlo Jaeger
Appl. Sci. 2025, 15(17), 9848; https://doi.org/10.3390/app15179848 - 8 Sep 2025
Viewed by 1823
Abstract
Meteorological and climatological trends are changing the way urban infrastructure systems need to be operated and maintained. Urban road traffic fluctuates more significantly under the interference of strong wind–rain weather, especially during tropical cyclones. Deep learning-based methods have significantly improved the accuracy of traffic prediction under extreme weather, but their robustness still has much room for improvement. As the frequency of extreme weather events increases due to climate change, accurately predicting spatiotemporal patterns of urban road traffic is crucial for a resilient transportation system. The compounding effects of the hazards, environments, and urban road network determine the spatiotemporal distribution of urban road traffic during an extreme weather event. In this paper, a novel Knowledge-driven Attribute-Augmented Attention Spatiotemporal Graph Convolutional Network (KA3STGCN) framework is proposed to predict urban road traffic under compound hazards. We design a disaster-knowledge attribute-augmented unit to enhance the model’s ability to perceive real-time hazard intensity and road vulnerability. The attribute-augmented unit includes dynamic hazard attributes and static environment attributes in addition to the road traffic information. We also improve feature extraction by combining a Graph Convolutional Network, a Gated Recurrent Unit, and an attention mechanism. A real-world dataset in Shenzhen City, China, was employed to validate the proposed framework. The findings show that the prediction accuracy of traffic speed can be significantly increased by 12.16% to 31.67% when disaster information is supplemented, and the framework performs robustly across different road vulnerabilities and hazard intensities. The framework can be migrated to other regions and disaster scenarios to strengthen city resilience. Full article
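The attribute-augmented graph-convolution step can be sketched roughly as follows; the symmetric normalization and the concatenation of hazard/environment attributes with traffic features are standard-GCN assumptions for illustration, not the authors' exact formulation:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(1)
n_roads = 5
A = (rng.random((n_roads, n_roads)) > 0.6).astype(float)   # road adjacency
A = np.maximum(A, A.T)                                      # undirected network
speed = rng.random((n_roads, 1))            # road traffic information
hazard = rng.random((n_roads, 2))           # dynamic hazard attributes (e.g. wind, rain)
env = rng.random((n_roads, 3))              # static environment attributes
H = np.concatenate([speed, hazard, env], axis=1)  # attribute augmentation
W = rng.standard_normal((H.shape[1], 8))
H_next = gcn_layer(A, H, W)
print(H_next.shape)  # (5, 8)
```

In the full framework these spatial features would then feed a GRU over time steps, with attention weighting the steps; the sketch covers only the augmented spatial step.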

26 pages, 1790 KB  
Article
A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset
by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose
AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025
Cited by 1 | Viewed by 2261
Abstract
In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform when applied to fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations. These included the use of a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baseline models, such as the custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augmentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67%), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community. Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments. Full article
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
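The Squeeze-and-Excitation block mentioned above has a compact, well-known form; a NumPy sketch with illustrative shapes (the reduction ratio and dimensions are assumptions, not values from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(feature_map, W1, W2):
    """Squeeze-and-Excitation: reweight channels by globally pooled statistics."""
    # Squeeze: global average pool over spatial dims -> one scalar per channel
    z = feature_map.mean(axis=(1, 2))
    # Excitation: bottleneck MLP producing per-channel weights in (0, 1)
    s = sigmoid(W2 @ np.maximum(W1 @ z, 0.0))
    # Scale: broadcast channel weights over the spatial dimensions
    return feature_map * s[:, None, None]

rng = np.random.default_rng(2)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
r = 2                                     # reduction ratio (illustrative)
W1 = rng.standard_normal((C // r, C))
W2 = rng.standard_normal((C, C // r))
y = se_block(x, W1, W2)
print(y.shape)  # (8, 4, 4)
```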

21 pages, 7844 KB  
Article
WRRT-DETR: Weather-Robust RT-DETR for Drone-View Object Detection in Adverse Weather
by Bei Liu, Jiangliang Jin, Yihong Zhang and Chen Sun
Drones 2025, 9(5), 369; https://doi.org/10.3390/drones9050369 - 14 May 2025
Cited by 8 | Viewed by 4976
Abstract
With the rapid advancement of UAV technology, robust object detection under adverse weather conditions has become critical for enhancing UAVs’ environmental perception. However, object detection in such challenging conditions remains a significant hurdle, and standardized evaluation benchmarks are still lacking. To bridge this gap, we introduce the Adverse Weather Object Detection (AWOD) dataset—a large-scale dataset tailored for object detection in complex maritime environments. The AWOD dataset comprises 20,000 images captured under three representative adverse weather conditions: foggy, flare, and low-light. To address the challenges of scale variation and visual degradation introduced by harsh weather, we propose WRRT-DETR, a weather-robust object detection framework optimized for small objects. Within this framework, we design a gated single-head global–local attention backbone block (GLCE) to fuse local convolutional features with global attention, enhancing small object distinguishability. Additionally, a Frequency–Spatial Feature Augmentation Module (FSAE) is introduced to incorporate frequency-domain information for improved robustness, while an Attention-based Cross-Fusion Module (ACFM) facilitates the integration of multi-scale features. Experimental results demonstrate that WRRT-DETR outperforms SOTA methods on the AWOD dataset, exhibiting superior robustness and detection accuracy in complex weather conditions. Full article

28 pages, 4882 KB  
Article
A Daily Runoff Prediction Model for the Yangtze River Basin Based on an Improved Generative Adversarial Network
by Tong Liu, Xudong Cui and Li Mo
Sustainability 2025, 17(7), 2990; https://doi.org/10.3390/su17072990 - 27 Mar 2025
Cited by 2 | Viewed by 1146
Abstract
Hydrological runoff prediction plays a crucial role in water resource management and sustainable development. However, it is often constrained by the nonlinearity, strong stochasticity, and high non-stationarity of hydrological data, as well as the limited accuracy of traditional forecasting methods. Although Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) have been widely used for data augmentation to enhance predictive model training, their direct application as forecasting models remains limited. Additionally, the architectures of the generator and discriminator in WGAN-GP have not been fully optimized, and their potential in hydrological forecasting has not been thoroughly explored. Meanwhile, the strategy of jointly optimizing Variational Autoencoders (VAEs) with WGAN-GP is still in its infancy in this field. To address these challenges and promote more accurate and sustainable water resource planning, this study proposes a comprehensive forecasting model, VXWGAN-GP, which integrates Variational Autoencoders (VAEs), WGAN-GP, Convolutional Neural Networks (CNN), Bidirectional Long Short-Term Memory Networks (BiLSTM), Gated Recurrent Units (GRUs), and Attention mechanisms. The VAE enhances feature representation by learning the data distribution and generating new features, which are then combined with the original features to improve predictive performance. The generator integrates GRU, BiLSTM, and Attention mechanisms: GRU captures short-term dependencies, BiLSTM captures long-term dependencies, and Attention focuses on critical time steps to generate forecasting results. The discriminator, based on CNN, evaluates the differences between the generated and real data through adversarial training, thereby optimizing the generator’s forecasting ability and achieving high-precision runoff prediction. 
This study conducts daily runoff prediction experiments at the Yichang, Cuntan, and Pingshan hydrological stations in the Yangtze River Basin. The results demonstrate that VXWGAN-GP significantly improves the quality of input features and enhances runoff prediction accuracy, offering a reliable tool for sustainable hydrological forecasting and water resource management. By providing more precise and robust runoff predictions, this model contributes to long-term water sustainability and resilience in hydrological systems. Full article
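The WGAN-GP objective underlying the model's adversarial training is standard and worth writing out (this is the generic Gulrajani et al. formulation, not anything specific to VXWGAN-GP):

```latex
% Critic (discriminator) loss with gradient penalty, where \hat{x} is
% sampled uniformly along lines between real and generated samples:
L_D = \mathbb{E}_{\tilde{x}\sim P_g}\!\left[D(\tilde{x})\right]
    - \mathbb{E}_{x\sim P_r}\!\left[D(x)\right]
    + \lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}\!\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x})\rVert_2 - 1\right)^{2}\right]
```

The gradient-penalty term replaces WGAN's weight clipping, enforcing the 1-Lipschitz constraint softly; in this paper the generator being penalized is the GRU/BiLSTM/Attention forecaster rather than an image synthesizer.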

20 pages, 9472 KB  
Article
A Novel RUL-Centric Data Augmentation Method for Predicting the Remaining Useful Life of Bearings
by Miao He, Zhonghua Li and Fangchao Hu
Machines 2024, 12(11), 766; https://doi.org/10.3390/machines12110766 - 30 Oct 2024
Cited by 3 | Viewed by 1393
Abstract
Maintaining the reliability of rotating machinery in industrial environments entails significant challenges. The objective of this paper is to develop a methodology that can accurately predict the condition of rotating machinery in order to facilitate the implementation of effective preventive maintenance strategies. This article proposes a novel RUL-centric data augmentation method, designated DF-MDAGRU, for predicting the remaining useful life (RUL) of bearings. The model is based on an encoder–decoder framework that integrates time–frequency domain feature enhancement with multidimensional dynamic attention gated recurrent units for feature extraction. It enhances time–frequency domain features through a Discrete Wavelet Downsampling module (DWD) and a Convolutional Fourier Residual Block (CFRB), and employs a Multiscale Channel Attention Module (MS-CAM) and a Multiscale Convolutional Spatial Attention Mechanism (MSSAM) to extract channel and spatial feature information. Finally, the output predictions are processed through linear regression to achieve the final RUL estimation. Experimental results demonstrate that the proposed method outperforms other state-of-the-art approaches on the FEMTO-ST and XJTU datasets. Full article

21 pages, 7299 KB  
Article
RDAG U-Net: An Advanced AI Model for Efficient and Accurate CT Scan Analysis of SARS-CoV-2 Pneumonia Lesions
by Chih-Hui Lee, Cheng-Tang Pan, Ming-Chan Lee, Chih-Hsuan Wang, Chun-Yung Chang and Yow-Ling Shiue
Diagnostics 2024, 14(18), 2099; https://doi.org/10.3390/diagnostics14182099 - 23 Sep 2024
Cited by 3 | Viewed by 2059
Abstract
Background/Objective: This study aims to utilize advanced artificial intelligence (AI) image recognition technologies to establish a robust system for identifying features in lung computed tomography (CT) scans, thereby detecting respiratory infections such as SARS-CoV-2 pneumonia. Specifically, the research focuses on developing a new model called Residual-Dense-Attention Gates U-Net (RDAG U-Net) to improve accuracy and efficiency in identification. Methods: This study employed Attention U-Net, Attention Res U-Net, and the newly developed RDAG U-Net model. RDAG U-Net extends the U-Net architecture by incorporating ResBlock and DenseBlock modules in the encoder to retain training parameters and reduce computation time. The training dataset includes 3,520 CT scans from an open database, augmented to 10,560 samples through data enhancement techniques. The research also focused on optimizing convolutional architectures, image preprocessing, interpolation methods, data management, and extensive fine-tuning of training parameters and neural network modules. Result: The RDAG U-Net model achieved an outstanding accuracy of 93.29% in identifying pulmonary lesions, with a 45% reduction in computation time compared to other models. The study demonstrated that RDAG U-Net performed stably during training and exhibited good generalization capability by evaluating loss values, model-predicted lesion annotations, and validation-epoch curves. Furthermore, using ITK-Snap to convert 2D predictions into 3D lung and lesion segmentation models, the results delineated lesion contours, enhancing interpretability. Conclusion: The RDAG U-Net model showed significant improvements in accuracy and efficiency in the analysis of CT images for SARS-CoV-2 pneumonia, achieving a 93.29% recognition accuracy and reducing computation time by 45% compared to other models. These results indicate the potential of the RDAG U-Net model in clinical applications, as it can accelerate the detection of pulmonary lesions and effectively enhance diagnostic accuracy. Additionally, the 2D and 3D visualization results allow physicians to understand lesions' morphology and distribution better, strengthening decision support capabilities and providing valuable medical diagnosis and treatment planning tools. Full article

22 pages, 4040 KB  
Article
CSINet: A Cross-Scale Interaction Network for Lightweight Image Super-Resolution
by Gang Ke, Sio-Long Lo, Hua Zou, Yi-Feng Liu, Zhen-Qiang Chen and Jing-Kai Wang
Sensors 2024, 24(4), 1135; https://doi.org/10.3390/s24041135 - 9 Feb 2024
Cited by 5 | Viewed by 2316
Abstract
In recent years, advancements in deep Convolutional Neural Networks (CNNs) have brought about a paradigm shift in the realm of image super-resolution (SR). While augmenting the depth and breadth of CNNs can indeed enhance network performance, it often comes at the expense of heightened computational demands and greater memory usage, which can restrict practical deployment. To mitigate this challenge, we have incorporated a technique called factorized convolution and introduced the efficient Cross-Scale Interaction Block (CSIB). CSIB employs a dual-branch structure, with one branch extracting local features and the other capturing global features. Interaction operations take place in the middle of this dual-branch structure, facilitating the integration of cross-scale contextual information. To further refine the aggregated contextual information, we designed an Efficient Large Kernel Attention (ELKA) using large convolutional kernels and a gating mechanism. By stacking CSIBs, we have created a lightweight cross-scale interaction network for image super-resolution named “CSINet”. This innovative approach significantly reduces computational costs while maintaining performance, providing an efficient solution for practical applications. The experimental results convincingly demonstrate that our CSINet surpasses the majority of the state-of-the-art lightweight super-resolution techniques used on widely recognized benchmark datasets. Moreover, our smaller model, CSINet-S, shows an excellent performance record on lightweight super-resolution benchmarks with extremely low parameters and Multi-Adds (e.g., 33.82 dB@Set14 × 2 with only 248 K parameters). Full article
(This article belongs to the Section Sensing and Imaging)
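The combination of a large convolutional kernel with a gating mechanism, as in ELKA, can be illustrated in one dimension; this is a deliberately simplified sketch (single channel, a plain averaging kernel), not the paper's module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def large_kernel_gated_attention(x, kernel):
    """Modulate features by an attention map from a large-kernel convolution,
    then gate the result against the input (1-D, single-channel simplification)."""
    attn = np.convolve(x, kernel, mode="same")   # large receptive field
    return x * sigmoid(attn)                      # sigmoid gate bounds output by |x|

rng = np.random.default_rng(3)
x = rng.standard_normal(32)                       # one row of feature values
kernel = np.ones(11) / 11.0                       # "large" 11-tap averaging kernel
y = large_kernel_gated_attention(x, kernel)
print(y.shape)  # (32,)
```

The point of the large kernel is that each gated position sees an 11-sample context rather than a 3-sample one, approximating the cross-scale context aggregation the abstract describes.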

20 pages, 23037 KB  
Article
A Novel Piecewise Cubic Hermite Interpolating Polynomial-Enhanced Convolutional Gated Recurrent Method under Multiple Sensor Feature Fusion for Tool Wear Prediction
by Jigang He, Luyao Yuan, Haotian Lei, Kaixuan Wang, Yang Weng and Hongli Gao
Sensors 2024, 24(4), 1129; https://doi.org/10.3390/s24041129 - 8 Feb 2024
Cited by 7 | Viewed by 2859
Abstract
The monitoring of the lifetime of cutting tools often faces problems such as life data loss, drift, and distortion. In this situation, the accuracy of lifetime prediction is greatly compromised. The recent rise of deep learning, such as Gated Recurrent Units (GRUs), Hidden Markov Models (HMMs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Attention networks, and Transformers, has dramatically mitigated these data problems in tool lifetime prediction, substantially enhancing the accuracy of tool wear prediction. In this paper, we introduce a novel approach known as PCHIP-Enhanced ConvGRU (PECG), which leverages multiple-feature fusion for tool wear prediction. When compared to traditional models such as CNNs, the CNN Block, and GRUs, our method consistently outperformed them across all key performance metrics, with a primary focus on accuracy. PECG addresses the challenge of missing tool wear measurement data in relation to sensor data. By employing PCHIP interpolation to fill in the gaps in the wear values, we have developed a model that combines the strengths of both CNNs and GRUs with data augmentation. The experimental results demonstrate that our proposed method achieved an exceptional relative accuracy of 0.8522, while also exhibiting a Pearson’s Correlation Coefficient (PCC) exceeding 0.95. This innovative approach not only predicts tool wear with remarkable precision, but also offers enhanced stability. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
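The PCHIP gap-filling step is the most reusable piece of this pipeline. A minimal sketch using SciPy's `PchipInterpolator` (the wear values and time grid below are made-up illustrative data, not from the paper):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Sparse, monotone wear measurements with a gap between t=2 and t=6
t_measured = np.array([0.0, 1.0, 2.0, 6.0, 7.0, 8.0])
wear = np.array([0.00, 0.05, 0.09, 0.30, 0.38, 0.50])

# PCHIP is shape-preserving: unlike a plain cubic spline it will not
# overshoot the data, which matters for physically monotone wear curves.
fill = PchipInterpolator(t_measured, wear)
t_dense = np.arange(0.0, 8.5, 0.5)          # regular grid matching sensor rate
wear_filled = fill(t_dense)                  # gap-free target series for training
print(wear_filled.min(), wear_filled.max())
```

The filled series can then be paired sample-for-sample with the sensor data as the regression target for the ConvGRU.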

30 pages, 38046 KB  
Article
MosReformer: Reconstruction and Separation of Multiple Moving Targets for Staggered SAR Imaging
by Xin Qi, Yun Zhang, Yicheng Jiang, Zitao Liu and Chang Yang
Remote Sens. 2023, 15(20), 4911; https://doi.org/10.3390/rs15204911 - 11 Oct 2023
Cited by 2 | Viewed by 1780
Abstract
Maritime moving target imaging using synthetic aperture radar (SAR) demands high resolution and wide swath (HRWS). Using the variable pulse repetition interval (PRI), staggered SAR can achieve seamless HRWS imaging. The reconstruction should be performed since the variable PRI causes echo pulse loss and nonuniformly sampled signals in azimuth, both of which result in spectrum aliasing. The existing reconstruction methods are designed for stationary scenes and have achieved impressive results. However, for moving targets, these methods inevitably introduce reconstruction errors. The target motion coupled with non-uniform sampling aggravates the spectral aliasing and degrades the reconstruction performance. This phenomenon becomes more severe, particularly in scenes involving multiple moving targets, since the distinct motion parameter has its unique effect on spectrum aliasing, resulting in the overlapping of various aliasing effects. Consequently, it becomes difficult to reconstruct and separate the echoes of the multiple moving targets with high precision in staggered mode. To this end, motivated by deep learning, this paper proposes a novel Transformer-based algorithm to image multiple moving targets in a staggered SAR system. The reconstruction and the separation of the multiple moving targets are achieved through a proposed network named MosReFormer (Multiple moving target separation and reconstruction Transformer). Adopting a gated single-head Transformer network with convolution-augmented joint self-attention, the proposed MosReFormer network can mitigate the reconstruction errors and separate the signals of multiple moving targets simultaneously. Simulations and experiments on raw data show that the reconstructed and separated results are close to ideal imaging results which are sampled uniformly in azimuth with constant PRI, verifying the feasibility and effectiveness of the proposed algorithm. Full article
(This article belongs to the Special Issue Advances in Radar Imaging with Deep Learning Algorithms)

13 pages, 2784 KB  
Article
Deep Learning-Based Evaluation of Ultrasound Images for Benign Skin Tumors
by Hyunwoo Lee, Yerin Lee, Seung-Won Jung, Solam Lee, Byungho Oh and Sejung Yang
Sensors 2023, 23(17), 7374; https://doi.org/10.3390/s23177374 - 24 Aug 2023
Cited by 2 | Viewed by 2679
Abstract
In this study, a combined convolutional neural network for the diagnosis of three benign skin tumors was designed, and its effectiveness was verified through quantitative and statistical analysis. To this end, 698 sonographic images were taken and diagnosed at the Department of Dermatology at Severance Hospital in Seoul, Korea, between 10 November 2017 and 17 January 2020. Through an empirical process, a convolutional neural network combining two structures, which consist of a residual structure and an attention-gated structure, was designed. Five-fold cross-validation was applied, and the train set for each fold was augmented by the Fast AutoAugment technique. As a result of training, for three benign skin tumors, an average accuracy of 95.87%, an average sensitivity of 90.10%, and an average specificity of 96.23% were derived. Also, through statistical analysis using a class activation map and physicians’ findings, it was found that the judgment criteria of physicians and the trained combined convolutional neural network were similar. This study suggests that the model designed and trained in this study can be a diagnostic aid to assist physicians and enable more efficient and accurate diagnoses. Full article

43 pages, 9293 KB  
Review
A Review on Neural Network Based Models for Short Term Solar Irradiance Forecasting
by Abbas Mohammed Assaf, Habibollah Haron, Haza Nuzly Abdull Hamed, Fuad A. Ghaleb, Sultan Noman Qasem and Abdullah M. Albarrak
Appl. Sci. 2023, 13(14), 8332; https://doi.org/10.3390/app13148332 - 19 Jul 2023
Cited by 44 | Viewed by 8182
Abstract
The accuracy of solar energy forecasting is critical for power system planning, management, and operation in the global electric energy grid. Therefore, it is crucial to ensure a constant and sustainable power supply to consumers. However, existing statistical and machine learning algorithms are not reliable for forecasting due to the sporadic nature of solar energy data. Several factors influence the performance of solar irradiance, such as forecasting horizon, weather classification, and performance evaluation metrics. Therefore, we provide a review paper on deep learning-based solar irradiance forecasting models. These models include Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), Generative Adversarial Networks (GAN), Attention Mechanism (AM), and other existing hybrid models. Based on our analysis, deep learning models perform better than conventional models in solar forecasting applications, especially in combination with some techniques that enhance the extraction of features. Furthermore, the use of data augmentation techniques to improve deep learning performance is useful, especially for deep networks. Thus, this paper is expected to provide a baseline analysis for future researchers to select the most appropriate approaches for photovoltaic power forecasting, wind power forecasting, and electricity consumption forecasting in the medium term and long term. Full article
(This article belongs to the Special Issue Applications of Neural Network Modeling in Distribution Network)

24 pages, 6919 KB  
Article
Milling Surface Roughness Prediction Based on Physics-Informed Machine Learning
by Shi Zeng and Dechang Pi
Sensors 2023, 23(10), 4969; https://doi.org/10.3390/s23104969 - 22 May 2023
Cited by 23 | Viewed by 6784
Abstract
Surface roughness is a key indicator of the quality of mechanical products, which can precisely portray the fatigue strength, wear resistance, surface hardness and other properties of the products. The convergence of current machine-learning-based surface roughness prediction methods to local minima may lead to poor model generalization or results that violate existing physical laws. Therefore, this paper combined physical knowledge with deep learning to propose a physics-informed deep learning method (PIDL) for milling surface roughness predictions under the constraints of physical laws. This method introduced physical knowledge in the input phase and training phase of deep learning. Data augmentation was performed on the limited experimental data by constructing surface roughness mechanism models with tolerable accuracy prior to training. In the training, a physically guided loss function was constructed to guide the training process of the model with physical knowledge. Considering the excellent feature extraction capability of convolutional neural networks (CNNs) and gated recurrent units (GRUs) in the spatial and temporal scales, a CNN–GRU model was adopted as the main model for milling surface roughness predictions. Meanwhile, a bi-directional gated recurrent unit and a multi-headed self-attentive mechanism were introduced to enhance data correlation. In this paper, surface roughness prediction experiments were conducted on the open-source datasets S45C and GAMHE 5.0. In comparison with the results of state-of-the-art methods, the proposed model has the highest prediction accuracy on both datasets, and the mean absolute percentage error on the test set was reduced by 3.029% on average compared to the best comparison method. Physical-model-guided machine learning prediction methods may be a future pathway for machine learning evolution. Full article
(This article belongs to the Section Physical Sensors)
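The physics-guided loss described in this abstract can be sketched as a data-fitting term plus a penalty for physically implausible predictions. The sketch below is a minimal toy, not the paper's actual loss: it assumes the kinematic roughness bound Ra ≈ f²/(32r) (feed f, tool nose radius r) as the mechanism constraint, and all function and parameter names are hypothetical.

```python
import numpy as np

def physics_guided_loss(y_pred, y_true, feed, tool_radius, lam=0.1):
    """Data loss plus a physics penalty (illustrative sketch only).

    The penalty uses the ideal kinematic roughness Ra ~ f^2 / (32 r),
    a standard mechanism model; the paper's actual constraint may differ.
    """
    data_loss = np.mean((y_pred - y_true) ** 2)
    ra_ideal = feed ** 2 / (32.0 * tool_radius)
    # Penalize predictions below the kinematic lower bound, i.e.
    # roughness values the mechanism model says are impossible.
    violation = np.maximum(ra_ideal - y_pred, 0.0)
    return data_loss + lam * np.mean(violation ** 2)
```

When predictions respect the bound, the penalty vanishes and the loss reduces to the ordinary data term; gradients of the penalty only activate on physically inconsistent outputs.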

14 pages, 3251 KB  
Article
CGA-MGAN: Metric GAN Based on Convolution-Augmented Gated Attention for Speech Enhancement
by Haozhe Chen and Xiaojuan Zhang
Entropy 2023, 25(4), 628; https://doi.org/10.3390/e25040628 - 6 Apr 2023
Cited by 3 | Viewed by 4640
Abstract
In recent years, neural networks based on attention mechanisms have seen increasing use in speech recognition, separation, and enhancement, as well as other fields. In particular, the convolution-augmented transformer has performed well, as it can combine the advantages of convolution and self-attention. Recently, [...] Read more.
In recent years, neural networks based on attention mechanisms have seen increasing use in speech recognition, separation, and enhancement, as well as other fields. In particular, the convolution-augmented transformer has performed well, as it can combine the advantages of convolution and self-attention. Recently, the gated attention unit (GAU) was proposed; compared with traditional multi-head self-attention, GAU-based approaches are effective and computationally efficient. In this paper, we propose a network for speech enhancement called CGA-MGAN, a MetricGAN based on convolution-augmented gated attention. CGA-MGAN captures local and global correlations in speech signals simultaneously by fusing convolution and gated attention units. Experiments on Voice Bank + DEMAND show that our proposed CGA-MGAN model achieves excellent performance (3.47 PESQ, 0.96 STOI, and 11.09 dB SSNR) with a relatively small model size (1.14 M parameters). Full article
(This article belongs to the Topic Machine and Deep Learning)
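The gated attention unit this abstract builds on can be sketched minimally. The NumPy toy below follows the published GAU structure (two gated branches, a shared low-dimensional query/key projection, squared-ReLU attention), but it is a shape-level illustration only: weights are random, there is a single head, and the convolution augmentation and normalization of the paper's CGAU are omitted.

```python
import numpy as np

def gau(x, d_ff=16, s=8, rng=None):
    """Minimal single-head gated attention unit (GAU) sketch.

    x: (n, d) sequence of n tokens. Weights are random here purely
    to illustrate shapes; a real layer would learn them.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = x.shape
    Wu = rng.standard_normal((d, d_ff)) * 0.02
    Wv = rng.standard_normal((d, d_ff)) * 0.02
    Wz = rng.standard_normal((d, s)) * 0.02
    Wo = rng.standard_normal((d_ff, d)) * 0.02
    u = np.maximum(x @ Wu, 0.0)              # gate branch
    v = np.maximum(x @ Wv, 0.0)              # value branch
    z = x @ Wz                               # shared query/key projection
    a = np.maximum(z @ z.T / n, 0.0) ** 2    # squared-ReLU attention (n, n)
    return (u * (a @ v)) @ Wo                # gate * attended values -> (n, d)
```

The key efficiency point is that queries and keys come from one small shared projection z, so the attention matrix is cheap relative to multi-head self-attention with full-width heads.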

14 pages, 3955 KB  
Article
TS-CGANet: A Two-Stage Complex and Real Dual-Path Sub-Band Fusion Network for Full-Band Speech Enhancement
by Haozhe Chen and Xiaojuan Zhang
Appl. Sci. 2023, 13(7), 4431; https://doi.org/10.3390/app13074431 - 31 Mar 2023
Viewed by 2623
Abstract
Speech enhancement based on deep neural networks faces difficulties, as modeling more frequency bands can lead to a decrease in the resolution of low-frequency bands and increase the computational complexity. Previously, we proposed a convolution-augmented gated attention unit (CGAU), which captured local and [...] Read more.
Speech enhancement based on deep neural networks faces difficulties, as modeling more frequency bands can lower the resolution of the low-frequency bands and increase computational complexity. Previously, we proposed a convolution-augmented gated attention unit (CGAU), which captured local and global correlations in speech signals through the fusion of convolution and gated attention units. In this paper, we further improved the CGAU and propose a two-stage complex and real dual-path sub-band fusion network for full-band speech enhancement called TS-CGANet. Specifically, we propose a dual-path CGA network to enhance low-band (0–8 kHz) speech signals. In the medium band (8–16 kHz) and high band (16–24 kHz), noise suppression is performed only in the magnitude domain. Experiments on the Voice Bank + DEMAND dataset show that the proposed TS-CGANet consistently outperforms state-of-the-art full-band baselines. Full article
(This article belongs to the Topic Machine and Deep Learning)
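The sub-band scheme described above splits a full-band (48 kHz) spectrogram at 8 kHz and 16 kHz so the low band can be enhanced in the complex domain while the upper bands receive magnitude-only suppression. A minimal sketch of that split, assuming linearly spaced STFT bins up to Nyquist (the function name and interface are illustrative, not the paper's code):

```python
import numpy as np

def split_subbands(spec, sr=48000):
    """Split a full-band STFT (freq_bins x frames) into three sub-bands:
    low 0-8 kHz, mid 8-16 kHz, high 16-24 kHz.

    Assumes bins are linearly spaced from 0 Hz to sr/2 inclusive.
    """
    n_bins = spec.shape[0]
    hz_per_bin = (sr / 2) / (n_bins - 1)
    lo_end = int(round(8000 / hz_per_bin)) + 1   # bins covering 0-8 kHz
    mid_end = int(round(16000 / hz_per_bin)) + 1  # bins up to 16 kHz
    return spec[:lo_end], spec[lo_end:mid_end], spec[mid_end:]
```

For example, with a 960-point FFT at 48 kHz (481 bins, 50 Hz per bin), the low band keeps 161 bins and the mid and high bands 160 each; the three pieces tile the full spectrum with no overlap.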
