Search Results (2,810)

Search Parameters:
Keywords = encoder decoder

27 pages, 3121 KB  
Article
DI-WOA: Symmetry-Aware Dual-Improved Whale Optimization for Monetized Cloud Compute Scheduling with Dual-Rollback Constraint Handling
by Yuanzhe Kuang, Zhen Zhang and Hanshen Li
Symmetry 2026, 18(2), 303; https://doi.org/10.3390/sym18020303 - 6 Feb 2026
Abstract
With the continuous growth in the scale of engineering simulation and intelligent manufacturing workflows, more and more problem-solving tasks are migrating to cloud computing platforms to obtain elastic computing power. However, a core operational challenge for cloud platforms is the difficulty of stably obtaining high-quality scheduling solutions that are both efficient and free of symmetric redundancy, owing to the coupling of multiple constraints, partial resource interchangeability, inconsistent multi-objective evaluation scales, and heterogeneous resource fluctuations. To address this, this paper proposes a Dual-Improved Whale Optimization Algorithm (DI-WOA) together with a modeling framework featuring discrete–continuous divide-and-conquer modeling, unified monetization of the objective function, and separation of soft and hard constraints; its iterative trajectory follows an augmented Lagrangian dual-rollback mechanism and is built on a three-layer “discrete gene–real-valued encoding–decoder” structure. Scalability experiments show that as the number of tasks J increases, DI-WOA ranks best or second-best at most scale points, indicating its effectiveness in reducing unified billing costs even under intensified task coupling and resource contention. Ablation results demonstrate that the complete DI-WOA achieves final objective values (OBJ) 8.33%, 5.45%, and 13.31% lower than the baseline, the variant without dual update (w/o dual), and the variant without perturbation (w/o perturb), respectively, significantly enhancing convergence and final solution quality on this scheduling model. In robustness experiments, DI-WOA exhibits the lowest or second-lowest OBJ and soft-constraint violation, indicating higher controllability under perturbations. In multi-workload generalization experiments, DI-WOA achieves the best or second-best mean OBJ across all scenarios with H = 3/4, leading the runner-up algorithm by up to 13.85% and demonstrating good adaptability to workload variations. A comprehensive analysis of the experimental results shows that DI-WOA can stably produce high-quality scheduling solutions that are efficient and free of symmetric redundancy in complex and diverse environments. Full article
(This article belongs to the Section Computer)
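For orientation, the whale optimization algorithm that DI-WOA builds on alternates encircling, random-search, and spiral moves around the best solution found so far. The following minimal Python sketch shows the canonical WOA step against an illustrative sphere objective; it is not the authors' dual-improved variant, whose dual-rollback and perturbation mechanisms would wrap around a loop of this shape.

```python
import numpy as np

def woa_step(pop, best, a, rng):
    """One canonical whale-optimization update (not the DI-WOA variant)."""
    new_pop = np.empty_like(pop)
    for i, x in enumerate(pop):
        r1, r2 = rng.random(), rng.random()
        A, C = 2 * a * r1 - a, 2 * r2          # coefficient values
        if rng.random() < 0.5:                 # encircling / exploration
            if abs(A) < 1:                     # move toward the current best
                new_pop[i] = best - A * np.abs(C * best - x)
            else:                              # move toward a random whale
                rand = pop[rng.integers(len(pop))]
                new_pop[i] = rand - A * np.abs(C * rand - x)
        else:                                  # logarithmic spiral around best
            l = rng.uniform(-1, 1)
            new_pop[i] = np.abs(best - x) * np.exp(l) * np.cos(2 * np.pi * l) + best
    return new_pop

rng = np.random.default_rng(0)
obj = lambda x: np.sum(x ** 2, axis=-1)        # toy stand-in for the monetized cost
pop = rng.uniform(-5, 5, size=(30, 10))
best = pop[obj(pop).argmin()]
for t in range(200):
    a = 2 - 2 * t / 200                        # a decays linearly from 2 to 0
    pop = woa_step(pop, best, a, rng)
    best = pop[obj(pop).argmin()] if obj(pop).min() < obj(best) else best
print(obj(best))
```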
22 pages, 1944 KB  
Article
Automated Radiological Report Generation from Breast Ultrasound Images Using Vision and Language Transformers
by Shaheen Khatoon and Azhar Mahmood
J. Imaging 2026, 12(2), 68; https://doi.org/10.3390/jimaging12020068 - 6 Feb 2026
Abstract
Breast ultrasound imaging is widely used for the detection and characterization of breast abnormalities; however, generating detailed and consistent radiological reports remains a labor-intensive and subjective process. Recent advances in deep learning have demonstrated the potential of automated report generation systems to support clinical workflows, yet most existing approaches focus on chest X-ray imaging and rely on convolutional–recurrent architectures with limited capacity to model long-range dependencies and complex clinical semantics. In this work, we propose a multimodal Transformer-based framework for automatic breast ultrasound report generation that integrates visual and textual information through cross-attention mechanisms. The proposed architecture employs a Vision Transformer (ViT) to extract rich spatial and morphological features from ultrasound images. For textual embedding, pretrained language models (BERT, BioBERT, and GPT-2) are implemented in various encoder–decoder configurations to leverage both general linguistic knowledge and domain-specific biomedical semantics. A multimodal Transformer decoder is implemented to autoregressively generate diagnostic reports by jointly attending to visual features and contextualized textual embeddings. We conducted an extensive quantitative evaluation using standard report generation metrics, including BLEU, ROUGE-L, METEOR, and CIDEr, to assess lexical accuracy, semantic alignment, and clinical relevance. Experimental results demonstrate that BioBERT-based models consistently outperform general domain counterparts in clinical specificity, while GPT-2-based decoders improve linguistic fluency. Full article
(This article belongs to the Section AI in Imaging)
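As a rough picture of the cross-attention coupling described above, the sketch below implements one decoder layer that self-attends causally over report tokens and cross-attends to ViT patch features. Dimensions (d = 256, 196 patches) and the layer layout are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class CrossAttnDecoderLayer(nn.Module):
    """Text tokens self-attend causally, then cross-attend to image features."""
    def __init__(self, d=256, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
        self.n1, self.n2, self.n3 = nn.LayerNorm(d), nn.LayerNorm(d), nn.LayerNorm(d)

    def forward(self, txt, img):
        T = txt.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool), 1)  # no peeking ahead
        q = self.n1(txt)
        txt = txt + self.self_attn(q, q, q, attn_mask=causal)[0]
        txt = txt + self.cross_attn(self.n2(txt), img, img)[0]      # attend to ViT patches
        return txt + self.ff(self.n3(txt))

img_feats = torch.randn(2, 196, 256)   # e.g. 14x14 ViT patch embeddings
tokens = torch.randn(2, 32, 256)       # embedded report tokens generated so far
out = CrossAttnDecoderLayer()(tokens, img_feats)
print(out.shape)                       # torch.Size([2, 32, 256])
```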
21 pages, 734 KB  
Article
Hybrid Deep Learning Model for EI-MS Spectra Prediction
by Bartosz Majewski and Marta Łabuda
Int. J. Mol. Sci. 2026, 27(3), 1588; https://doi.org/10.3390/ijms27031588 - 5 Feb 2026
Abstract
Electron ionization (EI) mass spectrometry (MS) is a widely used technique for compound identification and the production of reference spectra. However, incomplete coverage of reference spectral libraries limits reliable analysis of newly characterized molecules. This study presents a hybrid deep learning model for predicting EI-MS spectra directly from molecular structure. The approach combines a graph neural network encoder with a residual neural network decoder, followed by refinement using cross-attention, bidirectional prediction, and probabilistic, chemistry-informed masks. Trained on the NIST14 EI-MS database (≤500 Da), the model achieves strong library matching performance (Recall@10 ≈ 80.8%) and high spectral similarity. The proposed hybrid GNN (Graph Neural Network)–ResNet (Residual Neural Network) model can generate high-quality synthetic EI-MS spectra to supplement existing libraries, potentially reducing the cost and effort of experimental spectrum acquisition. The results demonstrate the potential of data-driven models to augment EI-MS libraries, while highlighting remaining challenges in generalization and spectral uniqueness. Full article
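The Recall@10 figure quoted above depends on how a predicted spectrum is scored against a library. A common choice, shown here as an assumption rather than the authors' exact metric, is cosine similarity between intensity vectors binned at unit m/z:

```python
import numpy as np

def bin_spectrum(mz, intensity, max_mz=500):
    """Bin a peak list into a fixed-length vector at unit m/z resolution."""
    vec = np.zeros(max_mz)
    idx = np.clip(np.round(mz).astype(int), 0, max_mz - 1)
    np.add.at(vec, idx, intensity)                 # accumulate peaks per bin
    return vec / (np.linalg.norm(vec) + 1e-12)

def recall_at_k(query, library, true_idx, k=10):
    """Is the true compound among the k best cosine matches?"""
    scores = library @ query                       # library rows are unit-norm
    return true_idx in np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(1)
library = np.stack([bin_spectrum(rng.uniform(20, 500, 40), rng.random(40))
                    for _ in range(1000)])
query = library[123] + 0.05 * rng.random(500)      # noisy "predicted" spectrum
query /= np.linalg.norm(query)
print(recall_at_k(query, library, true_idx=123))   # True
```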
20 pages, 3823 KB  
Article
DA-TransResUNet: Residual U-Net Liver Segmentation Model Integrating Dual Attention of Spatial and Channel with Transformer
by Kunzhan Wang, Xinyue Lu, Jing Li and Yang Lu
Mathematics 2026, 14(3), 575; https://doi.org/10.3390/math14030575 - 5 Feb 2026
Abstract
Precise medical image segmentation plays a vital role in disease diagnosis and clinical treatment. Although U-Net-based architectures and their Transformer-enhanced variants have achieved remarkable progress in automatic segmentation tasks, they still face challenges in complex medical imaging scenarios, particularly in simultaneously modeling fine-grained local details and capturing long-range global contextual information, which limits segmentation accuracy and structural consistency. To address these challenges, this paper proposes a novel medical image segmentation framework termed DA-TransResUNet. Built upon a ResUNet backbone, the proposed network integrates residual learning, Transformer-based encoding, and a dual-attention (DA) mechanism in a unified manner. Residual blocks facilitate stable optimization and progressive feature refinement in deep networks, while the Transformer module effectively models long-range dependencies to enhance global context representation. Meanwhile, the proposed DA-Block jointly exploits local and global features as well as spatial and channel-wise dependencies, leading to more discriminative feature representations. Furthermore, embedding DA-Blocks into both the feature embedding stage and skip connections strengthens information interaction between the encoder and decoder, thereby improving overall segmentation performance. Experimental results on the LiTS2017 and Sliver07 datasets demonstrate that the proposed method achieves incremental improvement in liver segmentation. In particular, on the LiTS2017 dataset, DA-TransResUNet achieves a Dice score of 97.39%, a VOE of 5.08%, and an RVD of −0.74%, validating its effectiveness for liver segmentation. Full article
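A dual-attention block of the kind described, combining channel reweighting with a spatial attention map, can be sketched as below. This is a generic CBAM-style layout under assumed shapes, not the paper's exact DA-Block.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Channel attention (global pooling + MLP) times spatial attention (7x7 conv)."""
    def __init__(self, c, reduction=8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c // reduction, 1), nn.ReLU(),
            nn.Conv2d(c // reduction, c, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)                               # reweight channels
        pooled = torch.cat([x.mean(1, keepdim=True),          # avg over channels
                            x.amax(1, keepdim=True)], dim=1)  # max over channels
        return x * self.spatial(pooled)                       # reweight positions

feat = torch.randn(1, 64, 56, 56)      # an encoder feature map
print(DualAttention(64)(feat).shape)   # torch.Size([1, 64, 56, 56])
```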
25 pages, 1263 KB  
Article
LFTD: Transformer-Enhanced Diffusion Model for Realistic Financial Time-Series Data Generation
by Gyumun Choi, Donghyeon Jo, Wonho Song, Hyungjong Na and Hyungjoon Kim
AI 2026, 7(2), 60; https://doi.org/10.3390/ai7020060 - 5 Feb 2026
Abstract
Firm-level financial statement data form multivariate annual time series with strong cross-variable dependencies and temporal dynamics, yet publicly available panels are often short and incomplete, limiting the generalization of predictive models. We present Latent Financial Time-Series Diffusion (LFTD), a structure-aware augmentation framework that synthesizes realistic firm-level financial time series in a compact latent space. LFTD first learns information-preserving representations with a dual encoder: an FT-Transformer that captures within-year interactions across financial variables and a Time Series Transformer (TST) that models long-horizon evolution across years. On this latent sequence, we train a Transformer-based denoising diffusion model whose reverse process is FiLM-conditioned on the diffusion step as well as year, firm identity, and firm age, enabling controllable generation aligned with firm- and time-specific context. A TST-based Cross-Decoder then reconstructs continuous and binary financial variables for each year. Empirical evaluation on Korean listed-firm data from 2011 to 2023 shows that augmenting training sets with LFTD-generated samples consistently improves firm-value prediction for market-to-book and Tobin’s Q under both static (same-year) and dynamic (t → t + 1) forecasting settings and outperforms conventional generative augmentation baselines and ablated variants. These results suggest that domain-conditioned latent diffusion is a practical route to reliable augmentation for firm-level financial time series. Full article
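FiLM conditioning, which the reverse diffusion process uses to inject the diffusion step, year, firm identity, and firm age, modulates hidden features with a learned scale and shift. A minimal sketch under assumed dimensions:

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise linear modulation: h -> gamma(cond) * h + beta(cond)."""
    def __init__(self, cond_dim, hidden_dim):
        super().__init__()
        self.to_scale_shift = nn.Linear(cond_dim, 2 * hidden_dim)

    def forward(self, h, cond):
        gamma, beta = self.to_scale_shift(cond).chunk(2, dim=-1)
        return (1 + gamma) * h + beta      # identity-centered modulation

h = torch.randn(8, 16, 64)                 # latent sequence: batch x years x dim
cond = torch.randn(8, 1, 32)               # embedded (step, year, firm id, age)
print(FiLM(32, 64)(h, cond).shape)         # torch.Size([8, 16, 64])
```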
32 pages, 5567 KB  
Article
Optimized Image Segmentation Model for Pellet Microstructure Incorporating KL Divergence Constraints
by Yuwen Ai, Xia Li, Aimin Yang, Yunjie Bai and Xuezhi Wu
Mathematics 2026, 14(3), 574; https://doi.org/10.3390/math14030574 - 5 Feb 2026
Abstract
Accurate segmentation of pellet microstructure images is crucial for evaluating their metallurgical performance and optimizing production processes. To address the challenges posed by complex structures, blurred boundaries, and fine-grained textures of hematite and magnetite in pellet micrographs, this study proposes an intelligently optimized hybrid VGG16-U-Net semantic segmentation model. The model incorporates an improved SPC-SA channel self-attention mechanism in the encoder to enhance deep feature representation, while a simplified SAN and SAW module is integrated into the decoder to strengthen its response to key mineral regions. Additionally, a hybrid loss strategy with KL regularization is employed to optimize training. Experimental results show that the model achieves an mIoU of 85.58%, an mPA of 91.54%, and an overall accuracy of 93.58%. Compared with the baseline models, the proposed method achieves a modest improvement. Full article
(This article belongs to the Special Issue Mathematical Methods for Image Processing and Computer Vision)
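One plausible reading of the hybrid loss with KL regularization is a cross-entropy segmentation term plus a KL penalty pulling the average predicted class mix toward a reference distribution. The sketch below illustrates that combination; the weighting and the reference proportions are assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, target, ref_dist, lam=0.1):
    """Cross-entropy + KL(mean predicted class mix || reference mix)."""
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1).mean(dim=(0, 2, 3))   # average class mix
    kl = torch.sum(probs * (probs.log() - ref_dist.log()))
    return ce + lam * kl

logits = torch.randn(2, 3, 64, 64)          # 3 classes: matrix, hematite, magnetite
target = torch.randint(0, 3, (2, 64, 64))
ref = torch.tensor([0.5, 0.3, 0.2])         # assumed prior class proportions
print(hybrid_loss(logits, target, ref))
```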
27 pages, 2289 KB  
Article
Knowledge-Injected Transformer (KIT): A Modular Encoder–Decoder Architecture for Efficient Knowledge Integration and Reliable Question Answering
by Lyudmyla Kirichenko, Daniil Maksymenko, Olena Turuta, Sergiy Yakovlev and Oleksii Turuta
Appl. Sci. 2026, 16(3), 1601; https://doi.org/10.3390/app16031601 - 5 Feb 2026
Abstract
Decoder-only language models (LMs) store factual knowledge directly in their parameters, resulting in large model sizes, costly retraining when facts change, and limited controllability in knowledge-intensive information systems. These models frequently mix stored knowledge with user-provided context, which leads to hallucinations and reduces reliability. To address these limitations, we propose KIT (Knowledge-Injected Transformer), a modular encoder–decoder architecture that separates syntactic competence from factual knowledge representation. In KIT, the decoder is pre-trained on knowledge-agnostic narrative corpora to learn language structure, while the encoder is trained independently to compress structured facts into compact latent representations. During joint training, the decoder learns to decompress these representations and generate accurate, fact-grounded responses. The modular design provides three key benefits: (1) factual knowledge can be updated by retraining only the encoder, without modifying decoder weights; (2) strict domain boundaries can be enforced, since the modular design provides a structural foundation for reducing knowledge-source confusion and hallucinations, though its actual effectiveness awaits validation on standard hallucination benchmarks; and (3) interpretability is improved because each generated token can be traced back to encoder activations. A real-world experimental evaluation demonstrates that KIT achieves competitive answer accuracy while offering superior controllability and substantially lower update costs compared to decoder-only baselines. These results indicate that modular encoder–decoder architectures represent a promising and reliable alternative for explainable, adaptable, and domain-specific question answering in modern information systems. Full article
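The separation KIT describes, a decoder consuming compact fact latents produced by a swappable encoder, can be pictured as conditioning a decoder on encoded facts as memory. The toy sketch below shows the wiring only; module sizes and the conditioning mechanism are assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

d = 128
fact_encoder = nn.Sequential(nn.Linear(64, d), nn.ReLU(), nn.Linear(d, d))
decoder_layer = nn.TransformerDecoderLayer(d, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)

facts = torch.randn(4, 5, 64)            # 5 structured facts per query
tokens = torch.randn(4, 20, d)           # embedded question/answer tokens
memory = fact_encoder(facts)             # compact latent fact representations
out = decoder(tgt=tokens, memory=memory) # decoder attends only to encoded facts
print(out.shape)                         # torch.Size([4, 20, 128])
# Updating knowledge = retraining fact_encoder; decoder weights stay fixed.
```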
23 pages, 6932 KB  
Article
RocSync: Millisecond-Accurate Temporal Synchronization for Heterogeneous Camera Systems
by Jaro Meyer, Frédéric Giraud, Joschua Wüthrich, Marc Pollefeys, Philipp Fürnstahl and Lilian Calvet
Sensors 2026, 26(3), 1036; https://doi.org/10.3390/s26031036 - 5 Feb 2026
Abstract
Accurate spatiotemporal alignment of multi-view video streams is essential for a wide range of dynamic-scene applications such as multi-view 3D reconstruction, pose estimation, and scene understanding. However, synchronizing multiple cameras remains a significant challenge, especially in heterogeneous setups combining professional- and consumer-grade devices, visible and infrared sensors, or systems with and without audio, where common hardware synchronization capabilities are often unavailable. This limitation is particularly evident in real-world environments, where controlled capture conditions are not feasible. In this work, we present a low-cost, general-purpose synchronization method that achieves millisecond-level temporal alignment across diverse camera systems while supporting both visible (RGB) and infrared (IR) modalities. The proposed solution employs a custom-built LED Clock that encodes time through red and infrared LEDs, allowing visual decoding of the exposure window (start and end times) from recorded frames for millisecond-level synchronization. We benchmark our method against hardware synchronization and achieve a residual error of 1.34 ms RMSE across multiple recordings. In further experiments, our method outperforms light-, audio-, and timecode-based synchronization approaches and directly improves downstream computer vision tasks, including multi-view pose estimation and 3D reconstruction. Finally, we validate the system in large-scale surgical recordings involving over 25 heterogeneous cameras spanning both IR and RGB modalities. This solution simplifies and streamlines the synchronization pipeline and expands access to advanced vision-based sensing in unconstrained environments, including industrial and clinical applications. Full article
(This article belongs to the Section Sensing and Imaging)
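The core decoding trick, recovering a timestamp from which LEDs are lit in a frame, amounts to reading a binary counter optically. A simplified decoder under assumed conventions (16 LEDs encoding milliseconds as a plain binary count; the actual RocSync clock layout may differ):

```python
def decode_led_clock(led_states):
    """Read a millisecond timestamp from LED on/off states (MSB first)."""
    t = 0
    for lit in led_states:
        t = (t << 1) | int(lit)
    return t

# A partially "smeared" exposure can be bracketed by decoding the earliest
# and latest consistent states visible during the exposure window.
start_states = [0,0,0,0, 0,0,1,1, 1,0,1,0, 0,1,1,0]   # LEDs at exposure start
end_states   = [0,0,0,0, 0,0,1,1, 1,0,1,1, 0,0,1,0]   # LEDs at exposure end
t0, t1 = decode_led_clock(start_states), decode_led_clock(end_states)
print(f"exposure: {t0} ms -> {t1} ms, midpoint {(t0 + t1) / 2} ms")
```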
5 pages, 398 KB  
Proceeding Paper
A Lightweight Deep Learning Framework for Robust Video Watermarking in Adversarial Environments
by Antonio Cedillo-Hernandez, Lydia Velazquez-Garcia and Manuel Cedillo-Hernandez
Eng. Proc. 2026, 123(1), 25; https://doi.org/10.3390/engproc2026123025 - 5 Feb 2026
Abstract
The widespread distribution of digital videos in social networks, streaming services, and surveillance systems has increased the risk of manipulation, unauthorized redistribution, and adversarial tampering. This paper presents a lightweight deep learning framework for robust and imperceptible video watermarking designed specifically for cybersecurity environments. Unlike heavy architectures that rely on multi-scale feature extractors or complex adversarial networks, our model introduces a compact encoder–decoder pipeline optimized for real-time watermark embedding and recovery under adversarial attacks. The proposed system leverages spatial attention and temporal redundancy to ensure robustness against distortions such as compression, additive noise, and adversarial perturbations generated via the Fast Gradient Sign Method (FGSM) or recompression attacks from generative models. Experimental simulations using a reduced Kinetics-600 subset demonstrate promising results, achieving an average PSNR of 38.9 dB, SSIM of 0.967, and Bit Error Rate (BER) below 3% even under FGSM attacks. These results suggest that the proposed lightweight framework achieves a favorable trade-off between resilience, imperceptibility, and computational efficiency, making it suitable for deployment in video forensics, authentication, and secure content distribution systems. Full article
(This article belongs to the Proceedings of First Summer School on Artificial Intelligence in Cybersecurity)
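The FGSM perturbations used in the robustness evaluation are a one-step gradient-sign attack. A minimal sketch against an arbitrary differentiable loss (the placeholder below stands in for the watermark decoder's bit-recovery loss, which the abstract does not spell out):

```python
import torch

def fgsm(frames, loss_fn, epsilon=2 / 255):
    """One-step FGSM: perturb inputs along the sign of the loss gradient."""
    frames = frames.clone().requires_grad_(True)
    loss_fn(frames).backward()
    with torch.no_grad():
        adv = frames + epsilon * frames.grad.sign()
    return adv.clamp(0, 1)

# Placeholder loss: any scalar depending on the frames works here; in practice
# it would be the bit-recovery loss of the watermark decoder.
frames = torch.rand(1, 3, 8, 64, 64)          # batch x C x T x H x W video clip
loss_fn = lambda x: (x.mean(dim=(2, 3, 4)) - 0.5).pow(2).sum()
adv = fgsm(frames, loss_fn)
print((adv - frames).abs().max())             # bounded by epsilon
```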
35 pages, 6562 KB  
Article
Sub-Hourly Multi-Horizon Quantile Forecasting of Photovoltaic Power Using Meteorological Data and a HybridCNN–STTransformer
by Guldana Taganova, Alma Zakirova, Assel Abdildayeva, Bakhyt Nurbekov, Zhanar Akhayeva and Talgat Azykanov
Algorithms 2026, 19(2), 123; https://doi.org/10.3390/a19020123 - 3 Feb 2026
Abstract
The rapid deployment of photovoltaic generation increases uncertainty in power-system operation and strengthens the need for ultra-short-term forecasts with reliable uncertainty estimates. Point-forecasting approaches alone are often insufficient for dispatch and reserve decisions because they do not quantify risk. This study investigates probabilistic forecasting of short-horizon solar generation using quantile regression on a public dataset of solar output and meteorological variables. This study proposes a hybrid attention–convolution model that combines an attention-based encoder to capture long-range temporal dependencies with a causal temporal convolution module that extracts fast local fluctuations using only past information, preventing information leakage. The two representations are fused and decoded jointly across multiple future horizons to produce consistent quantile trajectories. Experiments against representative machine-learning and deep-learning baselines show improved probabilistic accuracy and competitive central forecasts, while illustrating an important sharpness–calibration trade-off relevant to risk-aware grid operation. Key novelties include a multi-horizon quantile formulation at 15 min resolution for one-hour-ahead PV increments, a HybridCNN–STTransformer that fuses causal temporal convolutions with Transformer attention, and a horizon-token decoder that models inter-horizon dependencies to produce consistent multi-step quantile trajectories; reliability/sharpness diagnostics and post hoc calibration are discussed for operational risk-aware use. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
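Quantile trajectories of this kind are typically trained with the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically around each target quantile. A compact multi-quantile version, with shapes assumed:

```python
import torch

def pinball_loss(pred, target, quantiles):
    """pred: (batch, horizons, n_quantiles); target: (batch, horizons)."""
    q = torch.tensor(quantiles).view(1, 1, -1)
    err = target.unsqueeze(-1) - pred
    return torch.maximum(q * err, (q - 1) * err).mean()

pred = torch.randn(32, 4, 3)                  # 4 horizons x quantiles (0.1, 0.5, 0.9)
target = torch.randn(32, 4)
print(pinball_loss(pred, target, [0.1, 0.5, 0.9]))
```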
22 pages, 6571 KB  
Article
A Nested U-Network with Temporal Convolution for Monaural Speech Enhancement in Laser Hearing
by Bomao Zhou, Jin Tang and Fan Guo
Modelling 2026, 7(1), 32; https://doi.org/10.3390/modelling7010032 - 3 Feb 2026
Abstract
The laser Doppler vibrometer (LDV) offers long-distance, non-contact measurement with high sensitivity and plays an increasingly important role in industrial, military, and security fields. Remote speech acquisition technology based on LDV has progressed significantly in recent years. However, unlike microphone recordings, LDV-captured signals suffer from severe distortion, which degrades the quality of the captured speech. This paper proposes a nested U-network with gated temporal convolution (TCNUNet) for monaural enhancement of LDV-captured speech. Specifically, the network is based on an encoder–decoder structure with skip connections and introduces a nested U-Net (NUNet) in the encoder to better reconstruct speech signals. In addition, a temporal convolutional network with a gating mechanism is inserted between the encoder and decoder. The gating mechanism helps to control the information flow, while the temporal convolution models long-range temporal dependencies. In a real-world environment, we designed an LDV monitoring system to collect and enhance voice signals remotely. Different datasets were collected from various target objects to fully validate the performance of the proposed network. Compared with baseline models, the proposed model achieves state-of-the-art performance. Finally, the results of the generalization experiment also indicate that the proposed model generalizes to some degree across different languages. Full article
(This article belongs to the Special Issue AI-Driven and Data-Driven Modelling in Acoustics and Vibration)
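The gated temporal convolution inserted between encoder and decoder can be sketched as a causal dilated convolution whose sigmoid gate controls the information flow, in the spirit of WaveNet-style TCNs; the channel count, kernel size, and dilation below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedTCNBlock(nn.Module):
    """Causal dilated conv with tanh/sigmoid gating."""
    def __init__(self, ch, kernel=3, dilation=2):
        super().__init__()
        self.pad = (kernel - 1) * dilation                   # left-pad => causal
        self.filt = nn.Conv1d(ch, ch, kernel, dilation=dilation)
        self.gate = nn.Conv1d(ch, ch, kernel, dilation=dilation)

    def forward(self, x):
        y = F.pad(x, (self.pad, 0))                          # pad the past only
        y = torch.tanh(self.filt(y)) * torch.sigmoid(self.gate(y))
        return x + y                                         # residual connection

x = torch.randn(1, 64, 400)            # batch x channels x time frames
print(GatedTCNBlock(64)(x).shape)      # torch.Size([1, 64, 400])
```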
19 pages, 3447 KB  
Article
Hybrid Decoding with Co-Occurrence Awareness for Fine-Grained Food Image Segmentation
by Shenglong Wang and Guorui Sheng
Foods 2026, 15(3), 534; https://doi.org/10.3390/foods15030534 - 3 Feb 2026
Abstract
Fine-grained food image segmentation is essential for accurate dietary assessment and nutritional analysis, yet remains highly challenging due to ambiguous boundaries, inter-class similarity, and dense layouts of meals containing many different ingredients in real-world settings. Existing methods based solely on CNNs, Transformers, or Mamba architectures often fail to simultaneously preserve fine-grained local details and capture contextual dependencies over long distances. To address these limitations, we propose HDF (Hybrid Decoder for Food Image Segmentation), a novel decoding framework built upon the MambaVision backbone. Our approach first employs a convolution-based feature pyramid network (FPN) to extract multi-stage features from the encoder. These features are then thoroughly fused across scales using a Cross-Layer Mamba module that models inter-level dependencies with linear complexity. Subsequently, an Attention Refinement module integrates global semantic context through spatial–channel reweighting. Finally, a Food Co-occurrence Module explicitly enhances food-specific semantics by learning dynamic co-occurrence patterns among categories, improving segmentation of visually similar or frequently co-occurring ingredients. Evaluated on FoodSeg103 and UEC-FoodPIX Complete, two widely used benchmarks for fine-grained food segmentation, HDF achieves a 52.25% mean Intersection-over-Union (mIoU) on FoodSeg103 and a 76.16% mIoU on UEC-FoodPIX Complete, outperforming current state-of-the-art methods by a clear margin. These results demonstrate that HDF’s hybrid design and explicit co-occurrence awareness effectively address key challenges in food image segmentation, providing a robust foundation for practical applications in dietary logging, nutritional estimation, and food safety inspection. Full article
(This article belongs to the Section Food Analytical Methods)
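The mIoU figures reported above are the per-class intersection-over-union averaged over classes. For reference, a standard confusion-matrix implementation, independent of HDF itself:

```python
import numpy as np

def mean_iou(pred, gt, n_classes):
    """mIoU from a confusion matrix; rows = ground truth, cols = prediction."""
    cm = np.bincount(gt.ravel() * n_classes + pred.ravel(),
                     minlength=n_classes ** 2).reshape(n_classes, n_classes)
    inter = np.diag(cm)
    union = cm.sum(0) + cm.sum(1) - inter
    iou = inter / np.maximum(union, 1)          # avoid division by zero
    return iou[union > 0].mean()                # ignore absent classes

rng = np.random.default_rng(0)
gt = rng.integers(0, 5, (64, 64))
pred = np.where(rng.random((64, 64)) < 0.8, gt, rng.integers(0, 5, (64, 64)))
print(mean_iou(pred, gt, n_classes=5))
```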
25 pages, 11231 KB  
Article
Uncertainty Quantification Analysis of Dynamic Responses in Plate Structures Based on a Physics-Informed CVAE Model
by Shujing Tang, Xuewen Yin and Wenwei Wu
Appl. Sci. 2026, 16(3), 1496; https://doi.org/10.3390/app16031496 - 2 Feb 2026
Abstract
The propagation of uncertainties in structural dynamic responses, arising from variations in material properties, geometry, and boundary conditions, is of critical concern in a variety of engineering applications. Conventional methods like high-fidelity Monte Carlo simulation are computationally prohibitive, while existing surrogate models improve efficiency at the expense of accuracy. To achieve a trade-off between accuracy and efficiency, a Physics-Informed Conditional Variational Autoencoder (PI-CVAE) model is proposed. It integrates a novel dual-branch encoder for time-frequency feature extraction, a learnable frequency-filtering decoder, and a holistic physics-informed loss function to enable efficient generation of dynamic responses with high accuracy and adequate physics consistency. Comprehensive numerical analysis of plate structures demonstrates that the proposed approach achieves remarkable accuracy (maximum FRF error < 0.2% and R² > 0.99) and a computational speedup of 8–11 times compared with conventional simulation techniques. By maintaining high accuracy while efficiently propagating uncertainties, the PI-CVAE model provides a practical framework for probabilistic vibration analysis, especially during the acoustic design phase. Full article
(This article belongs to the Special Issue Machine Learning in Vibration and Acoustics (3rd Edition))
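The holistic physics-informed loss presumably combines the usual conditional-VAE evidence lower bound with a physics-consistency penalty. A schematic version follows, with the physics residual left as a placeholder since the paper's exact constraint terms are not given in the abstract:

```python
import torch
import torch.nn.functional as F

def pi_cvae_loss(x_hat, x, mu, logvar, physics_residual, beta=1.0, lam=0.1):
    """Reconstruction + KL(q(z|x,c) || N(0,I)) + physics-consistency penalty."""
    recon = F.mse_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl + lam * physics_residual

x = torch.randn(16, 256)                       # true dynamic response samples
x_hat = torch.randn(16, 256)                   # decoder output
mu, logvar = torch.randn(16, 32), torch.randn(16, 32)
residual = torch.tensor(0.05)                  # placeholder physics residual
print(pi_cvae_loss(x_hat, x, mu, logvar, residual))
```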
21 pages, 4327 KB  
Article
Engineering-Oriented Ultrasonic Decoding: An End-to-End Deep Learning Framework for Metal Grain Size Distribution Characterization
by Le Dai, Shiyuan Zhou, Yuhan Cheng, Lin Wang, Yuxuan Zhang and Heng Zhi
Sensors 2026, 26(3), 958; https://doi.org/10.3390/s26030958 - 2 Feb 2026
Abstract
Grain size is critical for metallic material performance, yet conventional ultrasonic methods rely on strong model assumptions and exhibit limited adaptability. We propose a deep learning architecture that uses multimodal ultrasonic features with spatial coding to predict the grain size distribution of GH4099. A-scan signals from C-scan measurements are converted to time–frequency representations and fed to an encoder–decoder model that combines a dual convolutional compression network with a fully connected decoder. A thickness-encoding branch enables feature decoupling under physical constraints, and an elliptic spatial fusion strategy refines predictions. Experiments show mean and standard deviation MAEs of 1.08 and 0.84 μm, respectively, with a KL divergence of 0.0031, outperforming attenuation- and velocity-based methods. Input-specificity experiments further indicate that transfer learning calibration quickly restores performance under new conditions. These results demonstrate a practical path for integrating deep learning with ultrasonic inspection for accurate, adaptable grain-size characterization. Full article
(This article belongs to the Special Issue Ultrasonic Sensors and Ultrasonic Signal Processing)
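The first preprocessing step, converting A-scan signals into time-frequency representations, is commonly done with a short-time Fourier transform. In the sketch below, the sampling rate, window length, and toy echo signal are arbitrary assumptions, not the paper's settings:

```python
import numpy as np
from scipy import signal

fs = 100e6                                     # assumed 100 MHz sampling rate
t = np.arange(2048) / fs
ascan = np.sin(2 * np.pi * 5e6 * t) * np.exp(-((t - 5e-6) / 2e-6) ** 2)  # toy echo

f, tt, Zxx = signal.stft(ascan, fs=fs, nperseg=128, noverlap=96)
tf_image = np.abs(Zxx)                         # magnitude spectrogram fed to the encoder
print(tf_image.shape)                          # (freq bins, time frames)
```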
38 pages, 3226 KB  
Article
Optimization of High-Frequency Transmission Line Reflection Wave Compensation and Impedance Matching Based on a DQN-GA Hybrid Algorithm
by Tieli Liu, Jie Li, Xi Zhang, Debiao Zhang, Chenjun Hu, Kaiqiang Feng, Shuangchao Ge and Junlong Li
Electronics 2026, 15(3), 645; https://doi.org/10.3390/electronics15030645 - 2 Feb 2026
Abstract
In high-frequency circuit design, parameters such as the characteristic impedance and propagation constant of transmission lines directly affect key performance metrics, including signal integrity and power transmission efficiency. To address the challenge of optimizing impedance matching for high-frequency PCB transmission lines, this study applies a hybrid deep Q-network—genetic algorithm (DQN-GA) that integrates deep reinforcement learning with a genetic algorithm (GA). Unlike existing methods that primarily focus on predictive modeling or single-algorithm optimization, the proposed approach introduces a bidirectional interaction mechanism for algorithm fusion: transmission line structures learned by the deep Q-network (DQN) are encoded as chromosomes to enhance the diversity of the genetic algorithm population; simultaneously, high-fitness individuals from the genetic algorithm are decoded and stored in the experience replay pool of the DQN to accelerate its convergence. Simulation results demonstrate that the DQN-GA algorithm significantly outperforms both unoptimized structures and standalone GA methods, achieving substantial improvements in fitness scores and S11 transmission coefficients. This algorithm effectively overcomes the limitations of conventional approaches in addressing complex reflected wave compensation problems in high-frequency applications, providing a robust solution for signal integrity optimization in high-speed circuit design. This study not only advances the field of intelligent circuit optimization but also establishes a valuable framework for the application of hybrid algorithms to complex engineering challenges. Full article
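The bidirectional coupling described above, with DQN-proposed structures seeding the GA population and GA elites feeding the DQN's experience replay, can be outlined as follows. This is a structural sketch with toy fitness and proposal functions; the real chromosome encoding of transmission-line geometry and the S11-based fitness are not specified here.

```python
import random
from collections import deque

def fitness(chrom):                 # toy stand-in for the S11-based fitness
    return -sum((g - 0.5) ** 2 for g in chrom)

def dqn_propose(n_genes):           # placeholder for a structure from the DQN policy
    return [random.random() for _ in range(n_genes)]

replay_buffer = deque(maxlen=10_000)
population = [dqn_propose(8) for _ in range(20)]      # DQN solutions as chromosomes

for generation in range(50):
    population.append(dqn_propose(8))                 # keep injecting DQN structures
    population.sort(key=fitness, reverse=True)
    elites = population[:5]
    replay_buffer.extend((tuple(e), fitness(e)) for e in elites)  # -> DQN experience
    # simple crossover + mutation to refill the population
    children = []
    while len(elites) + len(children) < 20:
        a, b = random.sample(elites, 2)
        cut = random.randrange(1, 8)
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:
            child[random.randrange(8)] = random.random()
        children.append(child)
    population = elites + children

print(round(fitness(population[0]), 4), len(replay_buffer))
```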