Search Results (45)

Search Parameters:
Keywords = cross channel prior

29 pages, 15488 KiB  
Article
GOFENet: A Hybrid Transformer–CNN Network Integrating GEOBIA-Based Object Priors for Semantic Segmentation of Remote Sensing Images
by Tao He, Jianyu Chen and Delu Pan
Remote Sens. 2025, 17(15), 2652; https://doi.org/10.3390/rs17152652 - 31 Jul 2025
Viewed by 310
Abstract
Geographic object-based image analysis (GEOBIA) has demonstrated substantial utility in remote sensing tasks. However, its integration with deep learning remains largely confined to image-level classification, primarily because the irregular shapes and fragmented boundaries of segmented objects limit its applicability to semantic segmentation. While convolutional neural networks (CNNs) excel at local feature extraction, they inherently struggle to capture long-range dependencies. In contrast, Transformer-based models are well suited to global context modeling but often lack fine-grained local detail. To overcome these limitations, we propose GOFENet (Geo-Object Feature Enhanced Network), a hybrid semantic segmentation architecture that effectively fuses object-level priors into deep feature representations. GOFENet employs a dual-encoder design combining CNN and Swin Transformer architectures, enabling multi-scale feature fusion through skip connections to preserve both local and global semantics. An auxiliary branch of cascaded atrous convolutions injects segmented-object information into the learning process. Furthermore, we develop a cross-channel selection module (CSM) for refined channel-wise attention, a feature enhancement module (FEM) to merge global and local representations, and a shallow–deep feature fusion module (SDFM) to integrate pixel- and object-level cues across scales. Experimental results on the GID and LoveDA datasets demonstrate that GOFENet achieves superior segmentation performance, with 66.02% and 51.92% mIoU, respectively. The model is strong at delineating large-scale land cover features, producing sharper object boundaries and reducing classification noise while preserving the integrity and discriminability of land cover categories.
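
The abstract does not spell out the CSM internals; as a minimal sketch, a squeeze-and-excitation-style gate captures the idea of refined channel-wise attention over fused features (PyTorch; the module name, reduction ratio, and layout are assumptions, not the paper's design):

```python
import torch
import torch.nn as nn

class ChannelSelection(nn.Module):
    """SE-style channel gate illustrating the kind of cross-channel
    selection a CSM performs; the paper's exact design is an assumption."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight channels, suppressing uninformative ones
```

In GOFENet's setting such a gate would plausibly sit after CNN and Transformer features are concatenated, letting the network damp uninformative channels before fusion.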

25 pages, 4344 KiB  
Article
YOLO-DFAM-Based Onboard Intelligent Sorting System for Portunus trituberculatus
by Penglong Li, Shengmao Zhang, Hanfeng Zheng, Xiumei Fan, Yonchuang Shi, Zuli Wu and Heng Zhang
Fishes 2025, 10(8), 364; https://doi.org/10.3390/fishes10080364 - 25 Jul 2025
Viewed by 263
Abstract
This study addresses the challenges of manual measurement bias and low robustness in detecting small, occluded targets in complex marine environments during real-time onboard sorting of Portunus trituberculatus. We propose YOLO-DFAM, an enhanced YOLOv11n-based model that replaces the global average pooling in the Focal Modulation module with a spatial–channel dual-attention mechanism and incorporates the ASF-YOLO cross-scale fusion strategy to improve feature representation across varying target sizes. These enhancements significantly boost detection, achieving an mAP@50 of 98.0% and a precision of 94.6%, outperforming RetinaNet-CSL and Rotated Faster R-CNN by up to 6.3% while maintaining real-time inference at 180.3 FPS with only 7.2 GFLOPs. Unlike prior static-scene approaches, our unified framework integrates attention-guided detection, scale-adaptive tracking, and lightweight weight estimation for dynamic marine conditions. A ByteTrack-based tracking module with dynamic scale calibration, EMA filtering, and optical-flow compensation ensures stable multi-frame tracking. Additionally, a region-specific allometric weight estimation model (R² = 0.9856) reduces dimensional errors by 85.7% and keeps prediction errors below 4.7% using only 12 spline-interpolated calibration sets. YOLO-DFAM provides an accurate, efficient solution for intelligent onboard fishery monitoring.
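
As a rough illustration of swapping global average pooling for a spatial–channel dual-attention gate, here is a hedged PyTorch sketch; the real YOLO-DFAM module layout is not given in the abstract, so the kernel size and reduction ratio are assumptions:

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """Generic spatial-channel dual-attention gate of the kind described;
    the concrete YOLO-DFAM design is an assumption."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)                 # channel gate
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(s)              # spatial gate
```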

11 pages, 670 KiB  
Article
LLM-Enhanced Chinese Morph Resolution in E-Commerce Live Streaming Scenarios
by Xiaoye Ouyang, Liu Yuan, Xiaocheng Hu, Jiahao Zhu and Jipeng Qiang
Entropy 2025, 27(7), 698; https://doi.org/10.3390/e27070698 - 29 Jun 2025
Viewed by 378
Abstract
E-commerce live streaming in China has become a major retail channel, yet hosts often employ subtle phonetic or semantic “morphs” to evade moderation and make unsubstantiated claims, posing risks to consumers. To address this, we study the Live Auditory Morph Resolution (LiveAMR) task, which restores morphed speech transcriptions to their true forms. Building on prior text-based morph resolution, we propose an LLM-enhanced training framework that mines three types of explanation knowledge (predefined morph-type labels, LLM-generated reference corrections, and natural-language rationales constrained for clarity and comprehensiveness) from a frozen large language model. These annotations are concatenated with the original morphed sentence and used to fine-tune a lightweight T5 model under a standard cross-entropy objective. In experiments on two test sets (in-domain and out-of-domain), our method achieves substantial gains over baselines, improving F0.5 by up to 7 pp in-domain (to 0.943) and 5 pp out-of-domain (to 0.799) compared to a strong T5 baseline. These results demonstrate that structured LLM-derived signals can be mined without fine-tuning the LLM itself and injected into small models to yield efficient, accurate morph resolution.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
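
A minimal sketch of the training setup the abstract describes: the three kinds of LLM-mined explanation knowledge are concatenated with the morphed sentence, and a small T5 is fine-tuned under cross-entropy. The field template and the t5-small checkpoint are placeholders, not the paper's exact format (a Chinese-capable checkpoint such as mT5 would be the realistic choice):

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Placeholder checkpoint and prompt template; both are assumptions.
tok = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

morphed    = "..."       # morphed transcription from the live stream
morph_type = "phonetic"  # predefined morph-type label
reference  = "..."       # LLM-generated reference correction
rationale  = "..."       # natural-language rationale from the frozen LLM

source = f"resolve: {morphed} | type: {morph_type} | ref: {reference} | why: {rationale}"
target = "..."           # ground-truth restored sentence

batch  = tok(source, return_tensors="pt")
labels = tok(target, return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss  # standard cross-entropy objective
loss.backward()  # one fine-tuning step (optimizer omitted for brevity)
```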

38 pages, 3580 KiB  
Review
A Review of Unmanned Visual Target Detection in Adverse Weather
by Yifei Song and Yanfeng Lu
Electronics 2025, 14(13), 2582; https://doi.org/10.3390/electronics14132582 - 26 Jun 2025
Viewed by 416
Abstract
Visual target detection under adverse weather conditions presents a fundamental challenge for autonomous driving, particularly in achieving all-weather operational capability. Unlike existing reviews that concentrate on individual technical domains such as image restoration or detection robustness, this review introduces a collaborative “restoration–detection” framework. It systematically examines state-of-the-art methods for degraded image recovery and for improving detection model robustness, encompassing both traditional, physically driven approaches and contemporary deep learning paradigms, and provides a comprehensive overview and comparative analysis of these advances. For degraded image recovery, traditional methods such as those based on the dark channel prior offer interpretability in specific scenarios, whereas deep learning methods have achieved significant breakthroughs in modeling complex degradations and enhancing cross-domain generalization through data-driven training. For detection robustness, traditional techniques built on anisotropic filtering and deep learning detectors such as SSD, R-CNN, and the YOLO series contribute to perceptual stability through feature optimization and end-to-end learning, respectively. The review summarizes 11 types of mainstream public datasets, examining their multimodal annotation systems and annotation discrepancies, and extensively evaluates algorithm performance using PSNR, SSIM, and mAP, among other metrics. Significant bottlenecks persist in dynamic weather coupling modeling, multimodal heterogeneous data fusion, and edge deployment efficiency. Future research should focus on physically guided hybrid learning architectures, dynamic and adaptive timing calibration, and flexible multimodal fusion frameworks that overcome the limitations of complex environment perception. This review serves as a systematic reference for both the theoretical development and the practical implementation of autonomous driving visual detection under severe weather conditions.
(This article belongs to the Section Computer Science & Engineering)
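
Among the traditional restoration methods the review covers, the dark channel prior is compact enough to sketch end to end. The following follows the classic He et al. recipe with commonly used defaults for the patch size and omega; the review itself does not prescribe an implementation:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel_dehaze(img: np.ndarray, patch: int = 15,
                        omega: float = 0.95, t0: float = 0.1) -> np.ndarray:
    """Dark-channel-prior dehazing; img is float RGB in [0, 1]."""
    dark = minimum_filter(img.min(axis=2), size=patch)       # dark channel
    # Atmospheric light: mean colour of the brightest 0.1% dark-channel pixels.
    idx = np.argsort(dark.ravel())[-max(1, dark.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission estimate from the normalised dark channel.
    t = 1.0 - omega * minimum_filter((img / A).min(axis=2), size=patch)
    t = np.clip(t, t0, 1.0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)              # scene radiance
```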

28 pages, 2317 KiB  
Article
Cross-Feature Hybrid Associative Priori Network for Pulsar Candidate Screening
by Wei Luo, Xiaoyao Xie, Jiatao Jiang, Linyong Zhou and Zhijun Hu
Sensors 2025, 25(13), 3963; https://doi.org/10.3390/s25133963 - 26 Jun 2025
Viewed by 246
Abstract
To enhance pulsar candidate recognition performance and improve model generalization, this paper proposes the cross-feature hybrid associative prior network (CFHAPNet). CFHAPNet incorporates a novel architecture and strategies to integrate multi-class heterogeneous feature subimages from each candidate into multi-channel data processing. By implementing cross-attention mechanisms and other enhancements for multi-view feature interactions, the model significantly strengthens its ability to capture fine-grained image texture details and weak prior semantic information. Through comparative analysis of feature weight similarity between subimages and average fusion weights, CFHAPNet efficiently identifies and filters genuine pulsar signals from candidate images collected across astronomical observatories. Additionally, refinements to the original loss function enhance convergence, further improving recognition accuracy and stability. To validate CFHAPNet’s efficacy, we compare its performance against several state-of-the-art methods on diverse datasets. The results demonstrate that under similar data scales, our approach achieves superior recognition performance. Notably, on the FAST dataset, the accuracy, recall, and F1-score reach 97.5%, 98.4%, and 98.0%, respectively. Ablation studies further reveal that the proposed enhancements improve overall recognition performance by approximately 5.6% compared to the original architecture, achieving an optimal balance between recognition precision and computational efficiency. These improvements make CFHAPNet a strong candidate for future large-scale pulsar surveys using new sensor systems.
(This article belongs to the Section Intelligent Sensors)
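
The cross-attention used for multi-view feature interaction could look roughly like the following sketch, where one flattened feature subimage attends to another; the dimensions and the residual-plus-norm layout are assumptions:

```python
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    """Cross-attention between two feature subimage views, in the spirit
    of CFHAPNet's multi-view interaction; internals are assumptions."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_view, context_view):
        # query_view, context_view: (batch, tokens, dim) flattened subimages
        out, _ = self.attn(query_view, context_view, context_view)
        return self.norm(query_view + out)  # residual connection + norm
```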

20 pages, 4198 KiB  
Article
HiDRA-DCDNet: Dynamic Hierarchical Attention and Multi-Scale Context Fusion for Real-Time Remote Sensing Small-Target Detection
by Jiale Wang, Zhe Bai, Ximing Zhang, Yuehong Qiu, Fan Bu and Yuancheng Shao
Remote Sens. 2025, 17(13), 2195; https://doi.org/10.3390/rs17132195 - 25 Jun 2025
Viewed by 402
Abstract
Small-target detection in remote sensing presents three fundamental challenges: limited pixel representation of targets, multi-angle imaging-induced appearance variance, and complex background interference. This paper introduces a dual-component neural architecture comprising Hierarchical Dynamic Refinement Attention (HiDRA) and the Densely Connected Dilated Block (DCDBlock) to address these challenges systematically. The HiDRA mechanism implements a dual-phase feature enhancement process: channel competition through bottleneck compression for discriminative feature selection, followed by spatial-semantic reweighting for foreground–background decoupling. The DCDBlock architecture synergizes multi-scale dilated convolutions with cross-layer dense connections, establishing persistent feature propagation pathways that preserve critical spatial details across network depths. Extensive experiments on the AI-TOD, VisDrone, MAR20, and DOTA-v1.0 datasets demonstrate our method’s consistent superiority, achieving average absolute gains of +1.16% (mAP50), +0.93% (mAP95), and +1.83% (F1-score) over prior state-of-the-art approaches across all benchmarks. With 8.1 GFLOPs computational complexity and 2.6 ms inference time per image, our framework demonstrates practical efficacy for real-time remote sensing applications, achieving a superior accuracy–efficiency trade-off compared to existing approaches.
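
A minimal sketch of a densely connected dilated block in the spirit of the DCDBlock: each branch sees the concatenation of all previous outputs at a growing dilation rate. The widths and dilation rates are assumptions, since the abstract gives only the design idea:

```python
import torch
import torch.nn as nn

class DCDBlock(nn.Module):
    """Densely connected dilated convolutions; rates/widths are assumptions."""
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList()
        in_ch = channels
        for d in dilations:
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_ch, channels, 3, padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ))
            in_ch += channels  # dense connections widen the next input
        self.fuse = nn.Conv2d(in_ch, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        return self.fuse(torch.cat(feats, dim=1))
```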

26 pages, 3494 KiB  
Article
A Hyper-Attentive Multimodal Transformer for Real-Time and Robust Facial Expression Recognition
by Zarnigor Tagmatova, Sabina Umirzakova, Alpamis Kutlimuratov, Akmalbek Abdusalomov and Young Im Cho
Appl. Sci. 2025, 15(13), 7100; https://doi.org/10.3390/app15137100 - 24 Jun 2025
Viewed by 457
Abstract
Facial expression recognition (FER) plays a critical role in affective computing, enabling machines to interpret human emotions through facial cues. While recent deep learning models have achieved progress, many still fail under real-world conditions such as occlusion, lighting variation, and subtle expressions. In this work, we propose FERONet, a novel hyper-attentive multimodal transformer architecture tailored for robust and real-time FER. FERONet integrates a triple-attention mechanism (spatial, channel, and cross-patch), a hierarchical transformer with token merging for computational efficiency, and a temporal cross-attention decoder to model emotional dynamics in video sequences. The model fuses RGB, optical flow, and depth/landmark inputs, enhancing resilience to environmental variation. Experimental evaluations across five standard FER datasets (FER-2013, RAF-DB, CK+, BU-3DFE, and AFEW) show that FERONet achieves superior recognition accuracy (up to 97.3%) and real-time inference speeds (<16 ms per frame), outperforming prior state-of-the-art models. The results confirm the model’s suitability for deployment in applications such as intelligent tutoring, driver monitoring, and clinical emotion assessment.
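
Token merging is the simplest piece of FERONet's efficiency story to illustrate. The sketch below halves the token count by averaging neighbouring pairs; this is only a generic stand-in, as the abstract does not specify the actual merging rule:

```python
import torch

def merge_tokens(x: torch.Tensor) -> torch.Tensor:
    """Halve the token count by averaging neighbouring pairs.
    x: (batch, tokens, dim) -> (batch, ceil(tokens / 2), dim)."""
    b, n, d = x.shape
    if n % 2:  # pad to an even length by repeating the last token
        x = torch.cat([x, x[:, -1:, :]], dim=1)
    return x.reshape(b, -1, 2, d).mean(dim=2)
```

Applied between transformer stages, a step like this shrinks the attention cost quadratically while keeping a coarse summary of the merged tokens.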

21 pages, 83137 KiB  
Article
RGB-FIR Multimodal Pedestrian Detection with Cross-Modality Context Attentional Model
by Han Wang, Lei Jin, Guangcheng Wang, Wenjie Liu, Quan Shi, Yingyan Hou and Jiali Liu
Sensors 2025, 25(13), 3854; https://doi.org/10.3390/s25133854 - 20 Jun 2025
Viewed by 370
Abstract
Pedestrian detection is an important research topic in visual cognition and autonomous driving. The YOLO family of models has significantly improved both detection speed and accuracy, and multimodal YOLO models based on RGB-FIR image pairs have become a research hotspot for all-day detection. Existing work focuses on fusion modules placed after the RGB and FIR branch backbones, yielding a back-end fusion framework; it overlooks the complementarity and prior knowledge between modalities and scales during front-end raw feature extraction, so the performance of back-end fusion depends heavily on how well each branch represents its modality up front. This paper proposes a novel RGB-FIR multimodal backbone framework based on a cross-modality context attentional model (CCAM). Different from existing works, a multi-level fusion framework is designed. At the front end of the RGB-FIR parallel backbone, a CCAM module is constructed for the raw features at each scale: the fused lower-level RGB-FIR features are used to optimize the spatial weighting of the upper-level RGB and FIR features, achieving cross-modality, cross-scale complementarity between adjacent feature extraction stages. At the back end, a channel–spatial joint attention model (CBAM) is combined with self-attention to obtain the final RGB-FIR fusion features at each scale from the CCAM-optimized branch features. Comparative experiments on multiple public RGB-FIR datasets across several evaluation metrics show that the method significantly improves the accuracy and robustness of pedestrian detection over current RGB-FIR multimodal YOLO models.
(This article belongs to the Section Intelligent Sensors)
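
The front-end CCAM idea, fused lower-level RGB-FIR features gating the spatial weighting of the next scale's RGB and FIR features, might be sketched as follows; the stride-2 gate assumes the upper level runs at half resolution, and all internals are assumptions:

```python
import torch
import torch.nn as nn

class CCAM(nn.Module):
    """Fused lower-level features produce a spatial weight map that gates
    the next-scale RGB and FIR features; internals are assumptions."""
    def __init__(self, low_channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            # Stride 2 assumes the upper level is at half resolution.
            nn.Conv2d(low_channels, 1, kernel_size=3, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, fused_low, rgb_high, fir_high):
        w = self.gate(fused_low)            # (B, 1, H/2, W/2) spatial prior
        return rgb_high * w, fir_high * w   # gate both modality branches
```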

24 pages, 7475 KiB  
Article
Application of a Dual-Stream Network Collaboratively Based on Wavelet and Spatial-Channel Convolution in the Inpainting of Blank Strips in Marine Electrical Imaging Logging Images: A Case Study in the South China Sea
by Guilan Lin, Sinan Fang, Manxin Li, Hongtao Wu, Chenxi Xue and Zeyu Zhang
J. Mar. Sci. Eng. 2025, 13(5), 997; https://doi.org/10.3390/jmse13050997 - 21 May 2025
Cited by 1 | Viewed by 487
Abstract
Electrical imaging logging technology precisely characterizes borehole-wall formation features through high-resolution resistivity images. However, blank strips caused by the mismatch between instrument pads and borehole diameter seriously affect the accuracy of fracture identification and formation continuity interpretation in marine oil and gas reservoirs. Existing inpainting methods struggle to reconstruct complex geological textures while maintaining structural continuity, particularly in balancing low-frequency formation morphology against high-frequency fracture details. To address this, this paper proposes an inpainting method using a dual-stream network based on the collaborative optimization of wavelet and spatial-channel convolution. A texture-aware data prior algorithm generates a high-quality, geologically plausible training dataset. A dual-stream encoder–decoder architecture is adopted, with a wavelet transform convolution (WTConv) module enhancing the generator's multi-scale perception for collaborative analysis of the low-frequency formation structure and high-frequency fracture details. Spatial-channel convolution (SCConv) strengthens the feature fusion module, and a dynamic gating mechanism optimizes the cross-modal interaction between texture and structural features. A multi-objective loss function further constrains the semantic coherence and visual authenticity of the reconstruction. Experiments on Block X in the South China Sea show a mean absolute error (MAE) of 6.893, a structural similarity index (SSIM) of 0.779, and a peak signal-to-noise ratio (PSNR) of 19.087, significantly better than the improved filtersim, U-Net, and AOT-GAN methods. The correlation of the pixel distribution between the inpainted area and the original image reaches 0.921–0.997, verifying precise matching of low-frequency morphology and high-frequency details. Cross-block inpainting experiments confirm the method's applicability, effectively removing the interference of blank strips from the interpretation of marine oil and gas reservoirs. The approach provides a geologically interpretable, intelligent inpainting tool for electrical imaging logging interpretation of complex reservoirs, with clear engineering value for improving the efficiency of oil and gas exploration and development.
(This article belongs to the Special Issue Research on Offshore Oil and Gas Numerical Simulation)
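
The low-/high-frequency split that a wavelet transform convolution (WTConv) module builds on can be illustrated with a single-level Haar decomposition implemented as a fixed grouped convolution. This is a generic sketch, not the paper's module:

```python
import torch
import torch.nn.functional as F

def haar_dwt(x: torch.Tensor) -> torch.Tensor:
    """Single-level 2D Haar decomposition per channel.
    x: (B, C, H, W) -> (B, 4*C, H/2, W/2) with LL, LH, HL, HH bands."""
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
    hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    k = torch.stack([ll, lh, hl, hh]).unsqueeze(1)     # (4, 1, 2, 2)
    c = x.shape[1]
    k = k.repeat(c, 1, 1, 1).to(x.dtype).to(x.device)  # one filter bank per channel
    return F.conv2d(x, k, stride=2, groups=c)
```

The LL band carries the low-frequency formation morphology and the LH/HL/HH bands the high-frequency detail, which is exactly the split the dual-stream design exploits.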

13 pages, 1584 KiB  
Article
Radiomics and AI-Based Prediction of MGMT Methylation Status in Glioblastoma Using Multiparametric MRI: A Hybrid Feature Weighting Approach
by Erdal Tasci, Ying Zhuge, Longze Zhang, Holly Ning, Jason Y. Cheng, Robert W. Miller, Kevin Camphausen and Andra V. Krauze
Diagnostics 2025, 15(10), 1292; https://doi.org/10.3390/diagnostics15101292 - 21 May 2025
Viewed by 948
Abstract
Background/Objectives: Glioblastoma (GBM) is a highly aggressive primary central nervous system tumor with a median survival of 14 months. MGMT (O6-methylguanine-DNA methyltransferase) promoter methylation status is a key biomarker, serving as both a prognostic indicator and a predictor of chemotherapy response in GBM. Patients with MGMT-methylated disease progress later and survive longer (median survival 22 vs. 15 months) than patients with MGMT-unmethylated disease. Patients with GBM undergo brain MRI prior to diagnosis and following surgical resection for radiation therapy planning and ongoing follow-up, yet there is currently no imaging biomarker for GBM. Studies have attempted to connect MGMT methylation status to MRI appearance to determine whether brain MRI can provide MGMT status non-invasively and more expeditiously. Methods: Artificial intelligence (AI) can identify MRI features that are indistinguishable to the human eye and can be linked to MGMT status. We analyzed the UPenn-GBM dataset patients for whom methylation status was available (n = 146), employing a novel radiomic method grounded in hybrid feature selection and weighting to predict MGMT methylation status. Results: The best classification configuration achieved a mean accuracy of 81.6% using 101 selected features and five-fold cross-validation. Conclusions: This compares favorably with similar studies in the literature. Validation on external datasets remains critical to enhance generalizability and produce robust results while reducing bias. Future directions include multi-channel data integration with radiomic features and deep and ensemble learning methods to improve predictive performance.
(This article belongs to the Special Issue The Applications of Radiomics in Precision Diagnosis)
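
A minimal sketch of the reported evaluation protocol: feature selection down to 101 features inside a five-fold cross-validation loop, which keeps selection leakage-free. The univariate filter and logistic classifier are stand-ins for the paper's hybrid selection-and-weighting scheme, and the data here are synthetic placeholders:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data: rows are patients, columns radiomic features.
rng = np.random.default_rng(0)
X = rng.normal(size=(146, 500))   # 146 patients, as in the study
y = rng.integers(0, 2, size=146)  # MGMT methylated vs. unmethylated

# Putting selection inside the pipeline refits it per fold, avoiding leakage.
clf = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=101)),  # 101 features, as reported
    ("model", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```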

17 pages, 1557 KiB  
Article
MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles
by Binghui Yang, Tao Tao, Wenfei Wu, Yongjun Zhang, Xiuyuan Meng and Jianfeng Yang
Drones 2025, 9(5), 322; https://doi.org/10.3390/drones9050322 - 22 Apr 2025
Viewed by 663
Abstract
Real-time 3D object detection is a cornerstone of safe operation for drones and autonomous vehicles (AVs): drones must avoid millimeter-scale power lines in cluttered airspace, while AVs require instantaneous recognition of pedestrians and vehicles in dynamic urban environments. Although detection methods based on point clouds, cameras, and multimodal fusion have made significant progress, the computational complexity of existing high-precision models struggles to meet the real-time requirements of vehicular edge devices, and lightweighting often brings multimodal feature coupling failure and an imbalance between classification and localization performance. To address these challenges, this paper proposes a knowledge distillation framework for multimodal 3D object detection, incorporating attention guidance, rank-aware learning, and interactive feature supervision to achieve efficient model compression and performance optimization. Specifically, to focus the student model on key channel and spatial features, we introduce attention-guided feature distillation leveraging a bird's-eye-view foreground mask and a dual-attention mechanism; to mitigate the classification degradation that accompanies moving from two-stage to single-stage detectors, we propose ranking-aware category distillation that models the anchor-level distribution; and to address insufficient cross-modal feature extraction, we enhance the student network's image features with the teacher network's point-cloud spatial priors, constructing a LiDAR-image cross-modal feature alignment mechanism. Experimental results demonstrate the effectiveness of the proposed approach: on the KITTI dataset, our method improves network performance by 4.89% even after halving the number of channels.
(This article belongs to the Special Issue Cooperative Perception for Modern Transportation)
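
The attention-guided feature distillation term is straightforward to sketch: the student imitates the teacher's bird's-eye-view features only where a foreground mask is active. How MultiDistiller builds the mask and weights the loss is not given in the abstract, so those details are assumptions:

```python
import torch

def masked_feature_distill(student_bev: torch.Tensor,
                           teacher_bev: torch.Tensor,
                           fg_mask: torch.Tensor,
                           eps: float = 1e-6) -> torch.Tensor:
    """Foreground-masked feature imitation loss (a sketch, not the
    paper's exact formulation).
    student_bev, teacher_bev: (B, C, H, W); fg_mask: (B, 1, H, W) in [0, 1]."""
    diff = (student_bev - teacher_bev).pow(2).mean(dim=1, keepdim=True)
    return (diff * fg_mask).sum() / (fg_mask.sum() + eps)
```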

16 pages, 927 KiB  
Article
Cross-Layer Stream Allocation of mMIMO-OFDM Hybrid Beamforming Video Communications
by You-Ting Chen, Shu-Ming Tseng, Yung-Fang Chen and Chao Fang
Sensors 2025, 25(8), 2554; https://doi.org/10.3390/s25082554 - 17 Apr 2025
Viewed by 393
Abstract
This paper proposes a source encoding rate control and cross-layer data stream allocation scheme for uplink millimeter-wave (mmWave) multi-user massive MIMO (MU-mMIMO) orthogonal frequency division multiplexing (OFDM) hybrid beamforming video communication systems. Unlike most previous studies that focus on the downlink scenario, our proposed scheme optimizes the uplink transmission while also addressing the limitation of prior works that only consider single-data-stream users. A key distinction of our approach is the integration of cross-layer resource allocation, which jointly considers both the physical-layer channel state information (CSI) and the application-layer video rate-distortion (RD) function. While traditional methods optimize for spectral efficiency (SE), our proposed method directly maximizes the peak signal-to-noise ratio (PSNR) to enhance video quality, aligning with the growing demand for high-quality video communication. We introduce a novel iterative cross-layer dynamic data stream allocation scheme, where the initial allocation is based on conventional physical-layer data stream allocation, followed by iterative refinement. Through multiple iterations, users with lower PSNR can dynamically contend for data streams, leading to a more balanced and optimized resource allocation. Our approach is a general framework that can incorporate any existing physical-layer data stream allocation as an initialization step before iteration. Simulation results demonstrate that the proposed cross-layer scheme outperforms three conventional physical-layer schemes by 0.4 to 1.14 dB in PSNR for 4–6 users, at the cost of a 1.8 to 2.3× increase in computational complexity (requiring 3.6–5.8 iterations).
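
A toy sketch of the iterative contention idea: in each pass, the worst-PSNR user takes a data stream from the best-PSNR user whenever that raises the minimum PSNR. The psnr_fn evaluator is a hypothetical stand-in for the paper's CSI-plus-rate-distortion computation:

```python
import numpy as np

def reallocate(streams, psnr_fn, iters=6):
    """Greedy cross-layer refinement sketch. `streams` holds per-user
    stream counts; `psnr_fn(streams)` returns per-user PSNRs (assumed)."""
    streams = np.array(streams)
    for _ in range(iters):
        p = psnr_fn(streams)
        lo, hi = int(np.argmin(p)), int(np.argmax(p))
        if hi == lo or streams[hi] <= 1:
            break  # nothing left to trade
        trial = streams.copy()
        trial[lo] += 1  # worst user contends for a stream
        trial[hi] -= 1
        if psnr_fn(trial).min() > p.min():
            streams = trial  # keep the trade if min-PSNR improves
        else:
            break
    return streams
```

Any conventional physical-layer allocation can seed `streams`, matching the paper's use of an existing scheme as the initialization step.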

18 pages, 6983 KiB  
Article
Multiscale Convolution-Based Efficient Channel Estimation Techniques for OFDM Systems
by Nahyeon Kwon, Bora Yoon and Junghyun Kim
Electronics 2025, 14(2), 307; https://doi.org/10.3390/electronics14020307 - 14 Jan 2025
Viewed by 1074
Abstract
With the advancement of wireless communication technology, the significance of efficient and accurate channel estimation has grown substantially. Deep learning-based methods have recently been adopted to estimate channels with higher precision than traditional methods, even in the absence of prior channel statistics. In this paper, we propose two deep learning-based channel estimation models, CAMPNet and MSResNet, designed to consider channel characteristics from a multiscale perspective. The convolutional attention and multiscale parallel network (CAMPNet) accentuates critical channel characteristics using parallel multiscale features and convolutional attention, while the multiscale residual network (MSResNet) integrates information across scales through cross-connected multiscale convolutional structures. Both models are designed to perform robustly in environments with complex frequency-domain information and varied Doppler shifts. Experimental results demonstrate that CAMPNet and MSResNet outperform existing channel estimation methods across various channel models. Notably, the proposed models excel in high signal-to-noise ratio (SNR) environments, achieving up to a 48.98% reduction in mean squared error (MSE) over existing methods at an SNR of 25 dB. In experiments evaluating generalization, they show greater stability and robustness than existing methods. These results suggest that deep learning-based channel estimation can overcome the limitations of existing methods, offering high performance and efficiency in real-world communication environments.
(This article belongs to the Section Circuit and Signal Processing)
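
The multiscale-parallel ingredient shared by CAMPNet and MSResNet might be sketched as parallel convolutions with different kernel sizes over the channel grid, fused by a 1x1 convolution; the widths and kernel choices are assumptions:

```python
import torch
import torch.nn as nn

class MultiscaleBlock(nn.Module):
    """Parallel kernels see the time-frequency channel grid at different
    receptive fields, then are fused; widths are assumptions."""
    def __init__(self, channels: int):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2)
            for k in (3, 5, 7)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = torch.cat([p(x) for p in self.paths], dim=1)
        return x + self.fuse(y)  # residual keeps training stable
```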

18 pages, 3854 KiB  
Article
IL-6-Inducing Peptide Prediction Based on 3D Structure and Graph Neural Network
by Ruifen Cao, Qiangsheng Li, Pijing Wei, Yun Ding, Yannan Bin and Chunhou Zheng
Biomolecules 2025, 15(1), 99; https://doi.org/10.3390/biom15010099 - 10 Jan 2025
Viewed by 1207
Abstract
Interleukin-6 (IL-6) is a potent glycoprotein that plays a crucial role in regulating innate and adaptive immunity, as well as metabolism. The expression and release of IL-6 are closely correlated with the severity of various diseases, and IL-6-inducing peptides are critical for developing immunotherapies and diagnostic biomarkers for some diseases. Most existing methods for predicting IL-6-inducing peptides use traditional machine learning with feature selection based on prior knowledge, and none take into account the three-dimensional (3D) structure of peptides, which is essential to their functional properties. In this study, we propose a novel IL-6-inducing peptide prediction method called DGIL-6, which integrates 3D structural information with graph neural networks. DGIL-6 represents a peptide sequence as a graph in which each amino acid is a node and the adjacency matrix, representing the relationships between nodes, is derived from the predicted residue contact graph of the sequence. In addition to commonly used amino acid representations, such as one-hot encoding and position encoding, the pre-trained model ESM-1b is employed to extract amino acid features as node features. To consider node weighting and information updating simultaneously, a dual-channel design combining a Graph Attention Network (GAT) and a Graph Convolutional Network (GCN) is adopted, and the features extracted by the two channels are merged to classify IL-6-inducing peptides. A series of experiments including cross-validation, independent testing, ablation studies, and visualizations demonstrate the effectiveness of DGIL-6.
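
The dual-channel GAT-plus-GCN design sketches naturally in PyTorch Geometric: both branches run over the residue contact graph and their outputs are merged for classification. Hidden sizes and the pooling choice are assumptions:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv, GCNConv, global_mean_pool

class DualChannelGNN(nn.Module):
    """GAT branch (node weighting) + GCN branch (information update) over
    the residue contact graph; dimensions are assumptions."""
    def __init__(self, in_dim: int, hid: int = 64):
        super().__init__()
        self.gat = GATConv(in_dim, hid, heads=2, concat=False)
        self.gcn = GCNConv(in_dim, hid)
        self.head = nn.Linear(2 * hid, 2)  # IL-6-inducing vs. not

    def forward(self, x, edge_index, batch):
        a = torch.relu(self.gat(x, edge_index))
        g = torch.relu(self.gcn(x, edge_index))
        h = torch.cat([a, g], dim=1)       # merge the two channels
        return self.head(global_mean_pool(h, batch))
```

Here `x` would hold the per-residue features (one-hot, positional, and ESM-1b embeddings) and `edge_index` the predicted contact edges.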

25 pages, 9089 KiB  
Article
Remotely Powered Two-Wire Cooperative Sensors for Bioimpedance Imaging Wearables
by Olivier Chételat, Michaël Rapin, Benjamin Bonnal, André Fivaz, Benjamin Sporrer, James Rosenthal and Josias Wacker
Sensors 2024, 24(18), 5896; https://doi.org/10.3390/s24185896 - 11 Sep 2024
Viewed by 1501
Abstract
Bioimpedance imaging aims to generate a 3D map of the resistivity and permittivity of biological tissue from multiple impedance channels measured with electrodes applied to the skin. When the electrodes are distributed around the body (for example, delineating a cross section of the chest or a limb), bioimpedance imaging is called electrical impedance tomography (EIT) and yields functional 2D images. Conventional EIT systems individually cable each electrode to master electronics in a star configuration, which works well for rack-mounted equipment but is too bulky for a wearable system. Previously presented cooperative sensors solve this cabling problem with active (dry) electrodes connected via a two-wire parallel bus; the bus can be implemented with two unshielded wires or even two conductive textile layers, replacing the cumbersome wiring of the conventional star arrangement. Prior research demonstrated cooperative sensors for measuring bioimpedances, successfully realizing a measurement reference signal, sensor synchronization, and data transfer, though still relying on individual batteries to power the sensors. Subsequent research using cooperative sensors for biopotential measurements proposed removing the batteries and having the central unit supply power over the two-wire bus. Building on our previous research, this paper presents the application of this method to the measurement of bioimpedances. Two approaches are discussed, one using discrete, commercially available components and the other an application-specific integrated circuit (ASIC). Initial experimental results reveal that both approaches are feasible, but the ASIC approach offers advantages in medical safety, power consumption, and size.
