MDPI - Publisher of Open Access Journals

18 pages, 2153 KB

Open Access Article

MusicDiffusionNet: Enhancing Text-to-Music Generation with Adaptive Style and Multi-Scale Temporal Mixup Strategies

by Leiheng Xu, Jiancong Chen, Chengcheng Li and Jinsong Liang

Appl. Sci. 2026, 16(4), 2066; https://doi.org/10.3390/app16042066 - 20 Feb 2026

Viewed by 634

Text-to-music generation aims to automatically produce audio content with semantic consistency and coherent musical structure based on natural language descriptions. However, existing methods still face challenges in terms of style diversity, rhythmic consistency, and long-term structural modeling. To address these issues, we propose a novel text-to-music generation model, termed MusicDiffusionNet (MDN), which integrates diffusion models with the WaveNet architecture to jointly model musical semantics and temporal structure in a continuous latent space. By decoupling high-level semantic conditioning from low-level audio generation, MDN enhances its ability to model long-range musical structure while improving semantic alignment between text and generated music with stable generation behavior. Building upon this framework, we further design two complementary mixing strategies to improve generation quality and structural coherence. Adaptive Style Mixing (ASM) performs weighted interpolation among stylistically similar music samples in the style embedding space, incorporating key and harmonic compatibility constraints to expand the style distribution while avoiding dissonance. Multi-scale Temporal Mixing (MTM) adopts beat-aware temporal decomposition, mixing, and reorganization across multiple time scales, thereby enhancing the modeling of both local and global temporal variations while preserving rhythmic periodicity and musical groove. Both strategies are integrated into the diffusion process as conditional augmentation mechanisms, contributing to improved learning stability and representational capacity under limited data conditions. Experimental results on the Audiostock dataset demonstrate that MDN and its mixing strategies achieve consistent improvements across multiple objective metrics, including generation quality, style diversity, and rhythmic coherence, validating the effectiveness of the proposed approach for text-to-music generation. Full article

25 pages, 18819 KB

Open Access Article

Application of the Two-Layer Regularized Gated Recurrent Unit (TLR-GRU) Model Enhanced by Sliding Window Features in Water Quality Parameter Prediction

by Xianhe Wang, Meiqi Liu, Ying Li, Adriano Tavares, Weidong Huang and Yanchun Liang

Entropy 2026, 28(2), 186; https://doi.org/10.3390/e28020186 - 6 Feb 2026

Viewed by 451

Abstract

Water quality monitoring is critical for public health, ecology, and economic sustainability, but traditional methods are limited by temporal-spatial coverage and cost, failing to meet real-time assessment needs. Deep learning for water quality prediction is often hindered by high complexity and noise in raw time series. This study aims to address the high complexity and noise of hydrological time series by proposing a prediction framework integrating sliding window feature enhancement, principal component analysis (PCA), and a two-layer regularized gated recurrent unit (TLR-GRU). The core goal is to achieve high-precision real-time prediction of four key water quality parameters (dissolved oxygen (DO), ammonia nitrogen ($\mathrm{NH}_{3}$-N), total phosphorus (TP), and total nitrogen (TN)) for aquaculture and irrigation. Sample entropy (SampEn, $m = 2$, $r = 0.2 \times \mathrm{std}(X)$), a univariate complexity metric capturing intra-series pattern repetition, quantifies time series regularity, showing sliding windows reduce SampEn by filtering transient noise while retaining ecological patterns. This optimization synergizes with TLR-GRU’s regularization (L2, Dropout) to avoid overfitting. A total of 4970 water quality records (2020–2023, 4 h sampling interval) were collected from a monitoring station in a typical aquaculture-irrigated water body. After dimensionality reduction via PCA, experimental results demonstrate that the TLR-GRU model outperforms six state-of-the-art deep learning models (e.g., TLD-LSTM, WaveNet) on both the base dataset and the sliding window-enhanced dataset. On the latter, DO and TP test set $R^{2}$ rise from 0.82 to 0.93 and 0.81 to 0.92, with RMSE decreasing by 49.4% and 55.6%, respectively. This framework supports water resource management, applicable to rivers and lakes beyond aquaculture. Future work will optimize the model and integrate multi-source data.

(This article belongs to the Special Issue Entropy in Machine Learning Applications, 2nd Edition)
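The sample entropy setting quoted in the abstract above (m = 2, r = 0.2 × std(X)) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `sample_entropy`, the Chebyshev distance, and the `<=` tolerance test are assumptions based on the common textbook definition, and conventions differ slightly between libraries.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Illustrative SampEn(m, r) with r = r_factor * std(x) (assumed convention)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the first n - m start positions for both template lengths,
        # so the two counts are directly comparable.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to every later template
            # (self-matches are excluded by construction).
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return float(-np.log(a / b))
```

A lower value indicates a more regular series, which is the sense in which the abstract reports that sliding windows reduce SampEn.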

17 pages, 1253 KB

Open Access Article

Wavelet-Enhanced Transformer for Adaptive Multi-Period Time Series Forecasting

by Ping Yu, Hoiio Kong and Zijun Li

Appl. Sci. 2025, 15(23), 12698; https://doi.org/10.3390/app152312698 - 30 Nov 2025

Cited by 2 | Viewed by 2166

Abstract

Time series analysis is of critical importance in a wide range of applications, including weather forecasting, anomaly detection, and action recognition. Accurate time series forecasting requires modeling complex temporal dependencies, particularly multi-scale periodic patterns. To address this challenge, we propose a novel Wavelet-Enhanced Transformer (Wave-Net). Wave-Net transforms 1D time series data into 2D matrices based on periodicity, enhancing the capture of temporal patterns through convolutional filters. This paper introduces Wave-Net, a model that incorporates wavelet and Fourier transforms for feature extraction, along with an enhanced cycle offset and optimized dynamic K for improved robustness. The Transformer layer is further refined to bolster long-term modeling capabilities. Evaluations on real-world benchmarks demonstrate that Wave-Net consistently achieves state-of-the-art performance across mainstream time series analysis tasks. Full article

(This article belongs to the Special Issue AI-Based Supervised Prediction Models)
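The abstract above says the model reshapes 1D series into 2D matrices based on periodicity. A rough sketch of one common way to do this is to pick the dominant period from the Fourier amplitude spectrum and fold the series row by row. The function name `fold_by_period` and the period-selection rule are assumptions for illustration; the paper's wavelet features, cycle offset, and dynamic K are not reproduced here.

```python
import numpy as np

def fold_by_period(x, top_k=1):
    """Fold a 1D series into (rows, period) matrices for its top_k dominant periods."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Amplitude spectrum of the mean-removed series; bin 0 (the mean) is ignored.
    amp = np.abs(np.fft.rfft(x - x.mean()))
    amp[0] = 0.0
    top_bins = np.argsort(amp)[::-1][:top_k]
    folded = []
    for f in top_bins:
        if f == 0:
            continue  # no usable periodic component
        period = max(1, n // int(f))  # frequency bin f corresponds to roughly n / f samples
        rows = n // period
        # Drop the incomplete tail so the series fills a full matrix.
        folded.append((period, x[:rows * period].reshape(rows, period)))
    return folded
```

Each row of the resulting matrix holds one cycle, so a 2D convolutional filter sees variation within a period along one axis and variation across periods along the other.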

26 pages, 3670 KB

Open Access Article

A Novel WaveNet Deep Learning Approach for Enhanced Bridge Damage Detection

by Mohab Turkomany, AbdelAziz Ibrahem AbdelLatef and Nasim Uddin

Appl. Sci. 2025, 15(22), 12228; https://doi.org/10.3390/app152212228 - 18 Nov 2025

Cited by 1 | Viewed by 2600

Abstract

Bridges are vital components of global infrastructure, with millions constructed over the years. Many of them face aging and are vulnerable to risks. Traditional bridge inspection methods are costly and time-consuming. They often rely on many manual laborers without providing system-level insights. Moreover, these outdated approaches make it difficult to obtain a clear representation of the current bridge health. This paper introduces a novel framework based on deep learning (DL) for identifying local bridge damage using acceleration data collected by Unmanned Aerial Vehicle (UAV)-mounted sensors. The framework employs WaveNet, which was designed as a generative audio DL model. Its causal dilated convolution deals with long-range temporal correlations without recurrence. Two WaveNet regressors are used to predict the damage location and its severity. The methodology is integrated with an optimized sensor spacing strategy for UAV deployments. The results demonstrate that the severity model achieved an average R² = 0.98, while the location model reached R² = 0.85. Optimal sensor spacing “S” was found at S = 1.0 m for localization and S = 0.5 m for severity. A field-simulated case was accurately identified by the two models, representing the potential of the proposed framework for more reliable bridge health monitoring. Full article

(This article belongs to the Special Issue State-of-the-Art Structural Health Monitoring Application)
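The causal dilated convolution mentioned in the abstract above can be sketched in a few lines of NumPy. This is a generic single-channel illustration rather than the paper's regressor; the names `causal_dilated_conv1d` and `receptive_field` are hypothetical helpers.

```python
import numpy as np

def causal_dilated_conv1d(x, weights, dilation):
    """y[t] = sum_j weights[j] * x[t - j * dilation], with zeros before t = 0."""
    x = np.asarray(x, dtype=float)
    k = len(weights)
    pad = (k - 1) * dilation
    # Left-pad only, so no output sample depends on future inputs.
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for j, w in enumerate(weights):
        start = pad - j * dilation  # tap j looks back j * dilation steps
        y += w * xp[start:start + len(x)]
    return y

def receptive_field(kernel_size, dilations):
    """Number of past samples visible to a stack of causal dilated layers."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With kernel size 2 and dilations 1, 2, 4, 8, four stacked layers already cover 16 samples, which is how such stacks reach long-range temporal correlations without recurrence.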

23 pages, 3845 KB

Open Access Article

A Spatiotemporal Forecasting Method for Cooling Load of Chillers Based on Patch-Specific Dynamic Filtering

by Jie Li, Zhengri Jin and Tao Wu

Sustainability 2025, 17(21), 9883; https://doi.org/10.3390/su17219883 - 5 Nov 2025

Cited by 1 | Viewed by 837

Abstract

Accurate cooling load forecasting in chiller units is critical for building energy optimization, yet remains challenging due to non-stationary nonlinear dynamics driven by coupled external weather variability (solar radiation, ambient temperature) and internal thermal loads. Conventional models fail to capture the spatiotemporal coupling inherent in load time series, violating their stationarity assumptions. To address this, this research proposes OptiNet, a spatiotemporal forecasting framework integrating patch-specific dynamic filtering with graph neural networks. OptiNet partitions multi-sensor data into non-overlapping time patches to develop a dynamic spatiotemporal graph. A learnable routing mechanism then performs adaptive dependency filtering to capture time-varying temporal–spatial correlations, followed by graph convolution for load prediction. Validated on long-term industrial logs (52,075 multi-sensor samples at 20 min; district cooling plant in Zhangjiang, Shanghai, with multiple chillers, towers, pumps, building meters, and a weather station), OptiNet achieves consistently lower MAE and MSE than Graph WaveNet across 6–144-step horizons and sampling frequencies of 20–60 min; among 30 settings it leads in 26, with MSE reductions up to 27.8% (60 min, 72-step) and typical long-horizon (72–144 steps) gains of ≈2–18% MSE and ≈1–15% MAE. Crucially, the model provides interpretable spatial-temporal dependencies (e.g., “Zone B solar radiation influences Unit 2 load with 4-h lag”), enabling data-driven chiller sequencing strategies that reduce electricity consumption by 12.7% in real-world deployments, directly advancing energy-efficient building operations.

(This article belongs to the Special Issue AI- and IoT-Driven Solutions for Industrial Sustainability and Smart Manufacturing)

14 pages, 3118 KB

Open Access Article

Reconstruction Modeling and Validation of Brown Croaker (Miichthys miiuy) Vocalizations Using Wavelet-Based Inversion and Deep Learning

by Sunhyo Kim, Jongwook Choi, Bum-Kyu Kim, Hansoo Kim, Donhyug Kang, Jee Woong Choi, Young Geul Yoon and Sungho Cho

Sensors 2025, 25(19), 6178; https://doi.org/10.3390/s25196178 - 6 Oct 2025

Cited by 1 | Viewed by 974

Abstract

Fish species’ biological vocalizations serve as essential acoustic signatures for passive acoustic monitoring (PAM) and ecological assessments. However, limited availability of high-quality acoustic recordings, particularly for region-specific species like the brown croaker (Miichthys miiuy), hampers data-driven bioacoustic methodology development. In this study, we present a framework for reconstructing brown croaker vocalizations by integrating fk14 wavelet synthesis, PSO-based parameter optimization (with an objective combining correlation and normalized MSE), and deep learning-based validation. Sensitivity analysis using a normalized Bartlett processor identified delay and scale (length) as the most critical parameters, defining valid ranges that maintained waveform similarity above 98%. The reconstructed signals matched measured calls in both time and frequency domains, replicating single-pulse morphology, inter-pulse interval (IPI) distributions, and energy spectral density. Validation with a ResNet-18-based Siamese network produced near-unity cosine similarity (~0.9996) between measured and reconstructed signals. Statistical analyses (95% confidence intervals; residual errors) confirmed faithful preservation of SPL values and minor, biologically plausible IPI variations. Under noisy conditions, similarity decreased as SNR dropped, indicating that environmental noise affects reconstruction fidelity. These results demonstrate that the proposed framework can reliably generate acoustically realistic and morphologically consistent fish vocalizations, even under data-limited scenarios. The methodology holds promise for dataset augmentation, PAM applications, and species-specific call simulation. Future work will extend this framework by using reconstructed signals to train generative models (e.g., GANs, WaveNet), enabling scalable synthesis and supporting real-time adaptive modeling in field monitoring. Full article

(This article belongs to the Topic Advances in Underwater Signal Processing and Communication: Challenges, Innovations, and Applications)

38 pages, 12663 KB

Open Access Article

A Transformer-Based Hybrid Neural Network Integrating Multiresolution Turbulence Intensity and Independent Modeling of Multiple Meteorological Features for Wind Speed Forecasting

by Hongbin Liu, Ziyan Wang, Yizhuo Liu, Jie Zhou, Chen Chen, Haoyuan Ma, Xi Huang, Hongqing Wang and Xiaodong Ji

Energies 2025, 18(17), 4571; https://doi.org/10.3390/en18174571 - 28 Aug 2025

Cited by 2 | Viewed by 1340

Abstract

Aiming at the nonlinear, nonstationary, and multiscale fluctuation characteristics of wind speed series, this study proposes a wind speed-forecasting framework that integrates multi-resolution turbulence intensity features and a Transformer-based hybrid neural network. Firstly, based on multi-resolution turbulence intensity and stationary wavelet transform (SWT), the original wind speed series is decomposed into eight pairs of mean wind speeds and turbulence intensities at different time scales, which are then modeled and predicted in parallel using eight independent LSTM sub-models. Unlike traditional methods treating meteorological variables such as air pressure, temperature, and wind direction as static input features, WaveNet, LSTM, and TCN neural networks are innovatively adopted here to independently model and forecast these meteorological series, thoroughly capturing their dynamic influences on wind speed. Finally, a Transformer-based self-attention mechanism dynamically integrates multiple outputs from the four sub-models to generate final wind speed predictions. Experimental results averaged over three datasets demonstrate superior accuracy and robustness, with MAE, RMSE, MAPE, and $R^{2}$ values around 0.65, 0.87, 23.24%, and 0.92, respectively, for a 6 h forecast horizon. Moreover, the proposed framework consistently outperforms all baselines across four categories of comparative experiments, showing strong potential for practical applications in wind power dispatching.

13 pages, 6118 KB

Open Access Article

Wave-Net: A Marine Raft Aquaculture Area Extraction Framework Based on Feature Aggregation and Feature Dispersion for Synthetic Aperture Radar Images

by Chengyi Wang, Lei Wang and Ningyang Li

Sensors 2025, 25(7), 2207; https://doi.org/10.3390/s25072207 - 31 Mar 2025

Cited by 2 | Viewed by 953

Abstract

Monitoring raft aquaculture areas plays an important role in the sustainability of marine aquaculture. With the advantages of full-time observation and ability to penetrate clouds, synthetic aperture radar (SAR) imaging has replaced laborious on-site investigation and has become the preferred approach. However, the existing deep learning-based semantic segmentation approaches generally suffer from speckle noise and have difficulty with multi-scale structures, which blurs the boundaries of raft aquaculture areas, and therefore, they connect them incorrectly. To cope with this problem, a wave-shaped neural network (Wave-Net), which is mainly composed of a feature aggregation part and a feature dispersion part, was proposed. Its feature aggregation part extracts both global and local features from different scales of raft aquaculture areas with asymmetric V-shaped subnetworks. Then, its feature dispersion part uses asymmetric Ʌ-shaped subnetworks to refine the boundaries of different scales of raft aquaculture areas. During these processes, both residual connections and reconstruction losses are adopted between the identical scales of feature maps to promote feature fusion and parameter optimization. The experimental results revealed that the proposed Wave-Net model solved the issue of blurred boundaries and achieved better segmentation accuracy with limited samples. Full article

(This article belongs to the Special Issue Sensors and Sensing Technologies for Precise Earth Observation)

18 pages, 6225 KB

Open Access Article

Research on Urban Road Traffic Flow Prediction Based on Sa-Dynamic Graph Convolutional Neural Network

by Song Hu, Jian Gu and Shun Li

Mathematics 2025, 13(3), 416; https://doi.org/10.3390/math13030416 - 27 Jan 2025

Cited by 5 | Viewed by 3192

Abstract

Neural network models based on GNNs often achieve good results in traffic flow prediction tasks of traffic networks. However, most existing GNN-based methods apply a fixed graph structure to capture spatial dependencies between nodes, and fixed graph structures may not be able to reflect the spatiotemporal changes in node dependencies. To address this, introducing a self-attention mechanism applied to an adaptive adjacency matrix, the neural network architecture is improved based on Graph WaveNet, and a new approach called self-attention dynamic graph wave network (SA-DGWN) is proposed, which can fit the spatiotemporal dependencies of the road network. In an experiment, traffic flow data were extracted based on RFID from certain roads in Nanjing, China. The results show that under the same configuration, compared to Graph WaveNet, MAE, MAPE, and RMSE from the proposed method reduced by 3.08%, 3.68%, and 2.6%, respectively. In addition, for the training data, we explored the impact of temporal feature and sampling periods on the training effect. The additional results indicate that adding hour-minute-second information to the input improved the model’s accuracy, reducing MAE, MAPE, and RMSE by 15.28%, 12.28%, and 14.01%, respectively. Adding day-of-the-week features also brought substantial performance improvements. For different sampling periods, the model performed better overall with a 10 min sampling period compared to 5 min and 15 min periods. For single-step prediction tasks, the longer the sampling period, the better the prediction effect. Full article

(This article belongs to the Section E1: Mathematics and Computer Science)
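The adaptive adjacency matrix that the abstract above builds on comes from Graph WaveNet, where two learnable node-embedding matrices E1 and E2 are combined as softmax(ReLU(E1 E2ᵀ)). The sketch below shows only that baseline form in NumPy with random embeddings; the self-attention step that SA-DGWN adds on top is not described in the abstract and is not reproduced here. The function name `adaptive_adjacency` is a hypothetical helper.

```python
import numpy as np

def adaptive_adjacency(e1, e2):
    """Row-normalised adjacency softmax(ReLU(e1 @ e2.T)) from node embeddings."""
    scores = np.maximum(e1 @ e2.T, 0.0)                  # ReLU drops negative affinities
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilise the exponentials
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)  # each row sums to 1

# Example with 5 road nodes and 3-dimensional embeddings (random stand-ins
# for parameters that would normally be learned during training).
rng = np.random.default_rng(0)
source_emb = rng.standard_normal((5, 3))
target_emb = rng.standard_normal((5, 3))
adj = adaptive_adjacency(source_emb, target_emb)
```

Because the embeddings are trained with the rest of the network, the graph is learned from data instead of being fixed in advance, which is the limitation of fixed graph structures that the abstract points to.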

21 pages, 12323 KB

Open Access Article

NILM for Commercial Buildings: Deep Neural Networks Tackling Nonlinear and Multi-Phase Loads

by M. J. S. Kulathilaka, S. Saravanan, H. D. H. P. Kumarasiri, V. Logeeshan, S. Kumarawadu and Chathura Wanigasekara

Energies 2024, 17(15), 3802; https://doi.org/10.3390/en17153802 - 2 Aug 2024

Cited by 6 | Viewed by 2441

Abstract

As energy demand and electricity costs continue to rise, consumers are increasingly adopting energy-efficient practices and appliances, underscoring the need for detailed metering options like appliance-level load monitoring. Non-intrusive load monitoring (NILM) is particularly favored for its minimal hardware requirements and enhanced customer experience, especially in residential settings. However, commercial power systems present significant challenges due to greater load diversity and imbalance. To address these challenges, we introduce a novel neural network architecture that combines sequence-to-sequence, WaveNet, and ensembling techniques to identify and classify single-phase and three-phase loads using appliance power signatures in commercial power systems. Our approach, validated over four months, achieved an overall accuracy exceeding 93% for nine devices, including six single-phase and four three-phase loads. The study also highlights the importance of incorporating nonlinear loads, such as two different inverter-type air conditioners, within NILM frameworks to ensure accurate energy monitoring. Additionally, we developed a web-based NILM energy dashboard application that enables users to monitor and evaluate load performance, recognize usage patterns, and receive real-time alerts for potential faults. Our findings demonstrate the significant potential of our approach to enhance energy management and conservation efforts in commercial buildings with diverse and complex load profiles, contributing to more efficient energy use and addressing climate change challenges. Full article

(This article belongs to the Section F: Electrical Engineering)

20 pages, 3042 KB

Open Access Article

Voice-Controlled Intelligent Personal Assistant for Call-Center Automation in the Uzbek Language

by Abdinabi Mukhamadiyev, Ilyos Khujayarov and Jinsoo Cho

Electronics 2023, 12(23), 4850; https://doi.org/10.3390/electronics12234850 - 30 Nov 2023

Cited by 6 | Viewed by 5740

Abstract

The demand for customer support call centers has surged across various sectors due to the pandemic. Yet, the constraints of round-the-clock human services and fluctuating wait times pose challenges in fully meeting customer needs. In response, there’s a growing need for automated customer service systems that can provide responses tailored to specific domains and in the native languages of customers, particularly in developing nations like Uzbekistan where call center usage is on the rise. Our system, “UzAssistant,” is designed to recognize user voices and accurately present customer issues in standardized Uzbek, as well as vocalize the responses to voice queries. It employs feature extraction and recurrent neural network (RNN)-based models for effective automatic speech recognition, achieving an impressive 96.4% accuracy in real-time tests with 56 participants. Additionally, the system incorporates a sentence similarity assessment method and a text-to-speech (TTS) synthesis feature specifically for the Uzbek language. The TTS component utilizes the WaveNet architecture to convert text into speech in Uzbek. Full article

(This article belongs to the Section Computer Science & Engineering)

14 pages, 666 KB

Open Access Article

Any-to-One Non-Parallel Voice Conversion System Using an Autoregressive Conversion Model and LPCNet Vocoder

by Kadria Ezzine, Joseph Di Martino and Mondher Frikha

Appl. Sci. 2023, 13(21), 11988; https://doi.org/10.3390/app132111988 - 2 Nov 2023

Cited by 6 | Viewed by 3060

Abstract

We present an any-to-one voice conversion (VC) system, using an autoregressive model and LPCNet vocoder, aimed at enhancing the converted speech in terms of naturalness, intelligibility, and speaker similarity. As the name implies, non-parallel any-to-one voice conversion does not require paired source and target speeches and can be employed for arbitrary speech conversion tasks. Recent advancements in neural-based vocoders, such as WaveNet, have improved the efficiency of speech synthesis. However, in practice, we find that the trajectory of some generated waveforms is not consistently smooth, leading to occasional voice errors. To address this issue, we propose to use an autoregressive (AR) conversion model along with the high-fidelity LPCNet vocoder. This combination not only solves the problems of waveform fluidity but also produces more natural and clear speech, with the added capability of real-time speech generation. To precisely represent the linguistic content of a given utterance, we use speaker-independent PPG features (SI-PPG) computed from an automatic speech recognition (ASR) model trained on a multi-speaker corpus. Next, a conversion model maps the SI-PPG to the acoustic representations used as input features for the LPCNet. The proposed autoregressive structure enables our system to produce the following prediction step outputs from the acoustic features predicted in the previous step. We evaluate the effectiveness of our system by performing any-to-one conversion pairs between native English speakers. Experimental results show that the proposed method outperforms state-of-the-art systems, producing higher speech quality and greater speaker similarity. Full article

(This article belongs to the Section Acoustics and Vibrations)

20 pages, 1450 KB

Open Access Article

Detection of Ocean Internal Waves Based on Modified Deep Convolutional Generative Adversarial Network and WaveNet in Moderate Resolution Imaging Spectroradiometer Images

by Zhongyi Jiang, Xing Gao, Lin Shi, Ning Li and Ling Zou

Appl. Sci. 2023, 13(20), 11235; https://doi.org/10.3390/app132011235 - 12 Oct 2023

Cited by 8 | Viewed by 2998

Abstract

The generation and propagation of internal waves in the ocean are a common phenomenon that plays a pivotal role in the transport of mass, momentum, and energy, as well as in global climate change. Internal waves serve as a critical component of oceanic processes, contributing to the redistribution of heat and nutrients in the ocean, which, in turn, has implications for global climate regulation. However, the automatic identification of internal waves in oceanic regions from remote sensing images has presented a significant challenge. In this research paper, we address this challenge by designing a data augmentation approach grounded in a modified deep convolutional generative adversarial network (DCGAN) to enrich MODIS remote sensing image data for the automated detection of internal waves in the ocean. Utilizing t-distributed stochastic neighbor embedding (t-SNE) technology, we demonstrate that the feature distribution of the images produced by the modified DCGAN closely resembles that of the original images. By using t-SNE dimensionality reduction technology to map high-dimensional remote sensing data into a two-dimensional space, we can better understand, visualize, and analyze the quality of data generated by the modified DCGAN. The images generated by the modified DCGAN not only expand the dataset’s size but also exhibit diverse characteristics, enhancing the model’s generalization performance. Furthermore, we have developed a deep neural network named “WaveNet,” which incorporates a channel-wise attention mechanism to effectively handle complex remote sensing images, resulting in high classification accuracy and robustness. It is important to note that this study has limitations, such as the reliance on specific remote sensing data sources and the need for further validation across various oceanic regions. These limitations are essential to consider in the broader context of oceanic research and remote sensing applications. 
We initially pre-train WaveNet using the EuroSAT remote sensing dataset and subsequently employ it to identify internal waves in MODIS remote sensing images. Experiments show the highest average recognition accuracy achieved is an impressive 98.625%. When compared to traditional data augmentation training sets, utilizing the training set generated by the modified DCGAN leads to a 5.437% enhancement in WaveNet’s recognition rate. Full article

(This article belongs to the Special Issue Remote Sensing Image Processing and Application)

13 pages, 2155 KB

Open Access Article

Multitask Attention-Based Neural Network for Intraoperative Hypotension Prediction

by Meng Shi, Yu Zheng, Youzhen Wu and Quansheng Ren

Bioengineering 2023, 10(9), 1026; https://doi.org/10.3390/bioengineering10091026 - 31 Aug 2023

Cited by 7 | Viewed by 2809

Abstract

Timely detection of and response to intraoperative hypotension (IOH) during surgery are crucial to avoid severe postoperative complications. Although several machine learning methods have been proposed to predict IOH, their performance still has room for improvement. In this paper, we propose a ResNet-BiLSTM model based on multitask training and an attention mechanism for IOH prediction. We trained and tested the proposed model using bio-signal waveforms obtained from patient monitoring during non-cardiac surgery. We selected three models that process time-series data (WaveNet, CNN, and TCN) for comparison. The experimental results show that the proposed model achieves the lowest MSE (43.83) and the highest accuracy (0.9224) among the compared models, which include WaveNet (51.52, 0.9087), CNN (318.52, 0.5861), and TCN (62.31, 0.9045), suggesting better regression and classification performance. Ablation experiments on the multitask and attention mechanisms show that these mechanisms improved MSE and accuracy. These results demonstrate the effectiveness of the proposed model in predicting IOH.

(This article belongs to the Special Issue Monitoring and Analysis of Human Biosignals, Volume II)

22 pages, 2839 KB

Open AccessArticle

Wind Power Forecasting Based on WaveNet and Multitask Learning

by Hao Wang, Chen Peng, Bolin Liao, Xinwei Cao and Shuai Li

Sustainability 2023, 15(14), 10816; https://doi.org/10.3390/su151410816 - 10 Jul 2023

Cited by 12 | Viewed by 4229

Abstract

Accurately predicting the power output of wind turbines is crucial for ensuring the reliable and efficient operation of large-scale power systems. To address the inherent limitations of physical models, statistical models, and machine learning algorithms, we propose a novel framework for wind turbine power prediction. This framework combines a special type of convolutional neural network, WaveNet, with a multigate mixture-of-experts (MMoE) architecture. The combination aims to overcome these limitations by effectively capturing and utilizing complex patterns and trends in the time series data. First, the maximum information coefficient (MIC) method is applied to handle data features, and the wavelet transform technique is employed to remove noise from the data. Subsequently, WaveNet utilizes its scalable convolutional network to extract representations of wind power data and effectively capture long-range temporal information. These representations are then fed into the MMoE architecture, which treats multistep time series prediction as a set of independent yet interrelated tasks, allowing information to be shared among tasks to prevent error accumulation and improve prediction accuracy. We conducted predictions for various forecasting horizons and compared the performance of the proposed model against several benchmark models. The experimental results confirm the strong predictive capability of the WaveNet–MMoE framework.

(This article belongs to the Topic Research and Application of Artificial Intelligence in Wind and Wave Energy)

Search Results (34)