MDPI - Publisher of Open Access Journals

34 pages, 4124 KiB

Open Access Article

Prompt-Gated Transformer with Spatial–Spectral Enhancement for Hyperspectral Image Classification

by Ruimin Han, Shuli Cheng, Shuoshuo Li and Tingjie Liu

Remote Sens. 2025, 17(15), 2705; https://doi.org/10.3390/rs17152705 - 4 Aug 2025

Viewed by 191

Hyperspectral image (HSI) classification is an important task in remote sensing with far-reaching practical significance. Most Convolutional Neural Networks (CNNs) focus only on local spatial features and ignore global spectral dependencies, making it difficult to fully extract the spectral information in HSI. In contrast, Vision Transformers (ViTs) are widely used in HSI due to their superior feature extraction capabilities. However, existing Transformer models struggle to achieve spectral–spatial feature fusion and to maintain local structural consistency, making it difficult to strike a balance between global modeling capability and local representation. To this end, we propose a Prompt-Gated Transformer with a Spatial–Spectral Enhancement (PGTSEFormer) network, which includes a Channel Hybrid Positional Attention Module (CHPA) and Prompt Cross-Former (PCFormer). The CHPA module adopts a dual-branch architecture to concurrently capture spectral and spatial positional attention, thereby enhancing the model’s discriminative capacity for complex feature categories through adaptive weight fusion. PCFormer introduces a Prompt-Gated mechanism and a grouping strategy to effectively model cross-regional contextual information while maintaining local consistency, which significantly enhances long-range dependency modeling. Experiments on five HSI datasets yielded overall accuracies of 97.91%, 98.74%, 99.48%, 99.18%, and 92.57% on the Indian Pines, Salinas, Botswana, WHU-Hi-LongKou, and WHU-Hi-HongHu datasets, respectively, demonstrating the effectiveness of the proposed approach.

(This article belongs to the Special Issue Multi-Task Remote Sensing Image Analysis: Classification, Segmentation, and Change Detection)

22 pages, 24173 KiB

Open Access Article

ScaleViM-PDD: Multi-Scale EfficientViM with Physical Decoupling and Dual-Domain Fusion for Remote Sensing Image Dehazing

by Hao Zhou, Yalun Wang, Wanting Peng, Xin Guan and Tao Tao

Remote Sens. 2025, 17(15), 2664; https://doi.org/10.3390/rs17152664 - 1 Aug 2025

Viewed by 204

Abstract

Remote sensing images are often degraded by atmospheric haze, which not only reduces image quality but also complicates information extraction, particularly in high-level visual analysis tasks such as object detection and scene classification. State-space models (SSMs) have recently emerged as a powerful paradigm for vision tasks, showing great promise due to their computational efficiency and robust capacity to model global dependencies. However, most existing learning-based dehazing methods lack physical interpretability, leading to weak generalization. Furthermore, they typically rely on spatial features while neglecting crucial frequency domain information, resulting in incomplete feature representation. To address these challenges, we propose ScaleViM-PDD, a novel network that enhances an SSM backbone with two key innovations: a Multi-scale EfficientViM with Physical Decoupling (ScaleViM-P) module and a Dual-Domain Fusion (DD Fusion) module. The ScaleViM-P module synergistically integrates a Physical Decoupling block within a Multi-scale EfficientViM architecture. This design enables the network to mitigate haze interference in a physically grounded manner at each representational scale while simultaneously capturing global contextual information to adaptively handle complex haze distributions. To further address detail loss, the DD Fusion module replaces conventional skip connections by incorporating a novel Frequency Domain Module (FDM) alongside channel and position attention. This allows for a more effective fusion of spatial and frequency features, significantly improving the recovery of fine-grained details, including color and texture information. Extensive experiments on nine publicly available remote sensing datasets demonstrate that ScaleViM-PDD consistently surpasses state-of-the-art baselines in both qualitative and quantitative evaluations, highlighting its strong generalization ability.

(This article belongs to the Special Issue Artificial Intelligence Algorithm for Remote Sensing Imagery Processing (5th Edition))

25 pages, 549 KiB

Open Access Article

CurveMark: Detecting AI-Generated Text via Probabilistic Curvature and Dynamic Semantic Watermarking

by Yuhan Zhang, Xingxiang Jiang, Hua Sun, Yao Zhang and Deyu Tong

Entropy 2025, 27(8), 784; https://doi.org/10.3390/e27080784 - 24 Jul 2025

Viewed by 347

Abstract

Large language models (LLMs) pose significant challenges to content authentication, as their sophisticated generation capabilities make distinguishing AI-produced text from human writing increasingly difficult. Current detection methods suffer from limited information capture, poor rate–distortion trade-offs, and vulnerability to adversarial perturbations. We present CurveMark, a novel dual-channel detection framework that combines probability curvature analysis with dynamic semantic watermarking, grounded in information-theoretic principles to maximize mutual information between text sources and observable features. To address the limitation of requiring prior knowledge of source models, we incorporate a Bayesian multi-hypothesis detection framework for statistical inference without prior assumptions. Our approach embeds imperceptible watermarks during generation via entropy-aware, semantically informed token selection and extracts complementary features from probability curvature patterns and watermark-specific metrics. Evaluation across multiple datasets and LLM architectures demonstrates 95.4% detection accuracy with minimal quality degradation (perplexity increase < 1.3), achieving 85–89% channel capacity utilization and robust performance under adversarial perturbations (72–94% information retention).

(This article belongs to the Section Information Theory, Probability and Statistics)

17 pages, 2072 KiB

Open Access Article

Barefoot Footprint Detection Algorithm Based on YOLOv8-StarNet

by Yujie Shen, Xuemei Jiang, Yabin Zhao and Wenxin Xie

Sensors 2025, 25(15), 4578; https://doi.org/10.3390/s25154578 - 24 Jul 2025

Viewed by 298

Abstract

This study proposes an optimized footprint recognition model based on an enhanced StarNet architecture for biometric identification in the security, medical, and criminal investigation fields. Conventional image recognition algorithms exhibit limitations in processing barefoot footprint images characterized by concentrated feature distributions and rich texture patterns. To address this, our framework integrates an improved StarNet into the backbone of the YOLOv8 architecture. Leveraging the unique advantages of element-wise multiplication, the redesigned backbone efficiently maps inputs to a high-dimensional nonlinear feature space without increasing channel dimensions, achieving enhanced representational capacity with low computational latency. Subsequently, an Encoder layer facilitates feature interaction within the backbone through multi-scale feature fusion and attention mechanisms, effectively extracting rich semantic information while maintaining computational efficiency. In the feature fusion part, a feature modulation block processes multi-scale features by synergistically combining global and local information, thereby reducing redundant computations and decreasing both parameter count and computational complexity to achieve model lightweighting. Experimental evaluations on a proprietary barefoot footprint dataset demonstrate that the proposed model exhibits significant advantages in terms of parameter efficiency, recognition accuracy, and computational complexity. The number of parameters has been reduced by 0.73 million, further improving the model’s speed. GFLOPs have been reduced by 1.5, lowering the performance requirements for computational hardware during model deployment. Recognition accuracy has reached 99.5%, with further improvements in model precision. Future research will explore how to capture shoeprint images with complex backgrounds from shoes worn at crime scenes, aiming to further enhance the model’s recognition capabilities in more forensic scenarios.
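
To make the element-wise multiplication idea mentioned in the abstract more concrete, the following is a minimal sketch of a generic "star" operation, in which two linear projections of the same input are multiplied element by element. The function names, the dimension `d`, and the random weights are assumptions for illustration only; this is not the authors' backbone.

```python
import random

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def star_op(x, w1, w2):
    """Element-wise product of two linear projections of the same input."""
    a, b = matvec(w1, x), matvec(w2, x)
    return [ai * bi for ai, bi in zip(a, b)]

random.seed(0)
d = 4  # hypothetical channel count
x = [random.gauss(0, 1) for _ in range(d)]
w1 = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
w2 = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

y = star_op(x, w1, w2)  # still d outputs: the channel width is unchanged
```

Each output `y[k]` expands to a sum over all pairwise products `x[i] * x[j]`, so the result is a quadratic function of the input even though the output width stays at `d`. This is one way to read the abstract's claim of a higher-dimensional nonlinear feature space without extra channels.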

(This article belongs to the Special Issue Transformer Applications in Target Tracking)

26 pages, 663 KiB

Open Access Article

An Information-Theoretic Framework for Retrieval-Augmented Generation Systems

by Semih Yumuşak

Electronics 2025, 14(15), 2925; https://doi.org/10.3390/electronics14152925 - 22 Jul 2025

Viewed by 355

Abstract

Retrieval-Augmented Generation (RAG) systems have emerged as a critical approach for enhancing large language models with external knowledge, yet the field lacks systematic theoretical analysis for understanding their fundamental characteristics and optimization principles. A novel information-theoretic approach for analyzing and optimizing RAG systems is introduced in this paper by modeling them as cascading information channel systems where each component (query encoding, retrieval, context integration, and generation) functions as a distinct information-theoretic channel with measurable capacity. Following established practices in information theory research, theoretical insights are evaluated through systematic experimentation on controlled synthetic datasets that enable precise manipulation of schema entropy and isolation of information flow dynamics. Through this controlled experimental approach, the following key theoretical insights are supported: (1) RAG performance is bounded by the minimum capacity across constituent channels, (2) the retrieval channel represents the primary information bottleneck, (3) errors propagate through channel-dependent mechanisms with specific interaction patterns, and (4) retrieval capacity is fundamentally limited by the minimum of embedding dimension and schema entropy. Both quantitative metrics for evaluating RAG systems and practical design principles for optimization are provided by the proposed approach. Retrieval improvements yield 58–85% performance gains and generation improvements yield 58–110% gains, substantially higher than context integration improvements (∼9%) and query encoding modifications, as shown by experimental results on controlled synthetic environments, supporting the theoretical approach. A systematic theoretical analysis for understanding RAG system dynamics is provided by this work, with real-world validation and practical implementation refinements representing natural next phases for this research. 
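
Insights (1) and (4) in the abstract are both "minimum" bounds, which can be sketched in a few lines. The stage names follow the abstract, but the capacity values, the units, and both function names are hypothetical placeholders rather than figures or code from the paper.

```python
def cascade_bottleneck(capacities):
    """Return the weakest stage of a cascade and its capacity.

    Under insight (1), end-to-end performance is bounded by this value.
    """
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]

def retrieval_capacity_bound(embedding_dim, schema_entropy):
    """Insight (4): retrieval capacity is limited by the smaller quantity."""
    return min(embedding_dim, schema_entropy)

# Hypothetical per-stage capacities (arbitrary units, for illustration only).
capacities = {
    "query_encoding": 6.0,
    "retrieval": 2.5,
    "context_integration": 4.0,
    "generation": 5.0,
}

stage, bound = cascade_bottleneck(capacities)
print(stage, bound)
```

With these made-up numbers the retrieval stage is the bottleneck, which mirrors insight (2). Raising any other stage's capacity leaves the bound unchanged.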

(This article belongs to the Special Issue Advanced Natural Language Processing Technology and Applications)

18 pages, 1956 KiB

Open Access Article

Two Novel Quantum Steganography Algorithms Based on LSB for Multichannel Floating-Point Quantum Representation of Digital Signals

by Meiyu Xu, Dayong Lu, Youlin Shang, Muhua Liu and Songtao Guo

Electronics 2025, 14(14), 2899; https://doi.org/10.3390/electronics14142899 - 20 Jul 2025

Viewed by 220

Abstract

Currently, quantum steganography schemes utilizing the least significant bit (LSB) approach are primarily optimized for fixed-point data processing, yet they encounter precision limitations when handling extended floating-point data structures owing to quantization error accumulation. To overcome precision constraints in quantum data hiding, the EPlsb-MFQS and MVlsb-MFQS quantum steganography algorithms are constructed based on the LSB approach in this study. The multichannel floating-point quantum representation of digital signals (MFQS) model enhances information hiding by augmenting the number of available channels, thereby increasing the embedding capacity of the LSB approach. Firstly, we analyze the limitations of fixed-point signal steganography schemes and propose the conventional quantum steganography scheme based on the LSB approach for the MFQS model, achieving enhanced embedding capacity. Moreover, the enhanced embedding efficiency of the EPlsb-MFQS algorithm primarily stems from the superposition probability adjustment of the LSB approach. Then, to prevent unauthorized parties from easily extracting secret messages, we utilize channel qubits and position qubits as novel carriers during quantum message encoding. The secret message is encoded into the qubits of the transmitted signal using a particular modulo value rather than through sequential embedding, thereby enhancing the security and reducing the time complexity in the MVlsb-MFQS algorithm. However, in the spatial domain this algorithm has low robustness and security. Therefore, an improved method that moves the steganographic process to the quantum Fourier transform domain to further enhance security is also proposed. This scheme establishes the essential building blocks for quantum signal processing, paving the way for advanced quantum algorithms. Compared with available quantum steganography schemes, the proposed steganography schemes achieve significant improvements in embedding efficiency and security. Finally, we theoretically delineate, in detail, the quantum circuit design and operation process.
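
For readers unfamiliar with the LSB approach the abstract builds on, here is a minimal classical sketch on integer samples. It is only a classical analogue: it involves no qubits, no MFQS representation, and no modulo-based placement, and the function names and sample values are made up for illustration.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover signal")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the message bit
    return stego

def extract_lsb(samples, n_bits):
    """Read the message back from the least significant bits."""
    return [s & 1 for s in samples[:n_bits]]

cover = [200, 13, 77, 254, 9, 128]  # hypothetical non-negative integer samples
message = [1, 0, 1, 1]

stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
```

Each carrier sample changes by at most 1, which is why LSB embedding is hard to notice. Sequential placement like this is also easy for an eavesdropper to read, which is the weakness the abstract's non-sequential embedding aims to address.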

(This article belongs to the Special Issue New Sights in Quantum Computing: Circuits, Algorithms, and Applications)

36 pages, 25361 KiB

Open Access Article

Remote Sensing Image Compression via Wavelet-Guided Local Structure Decoupling and Channel–Spatial State Modeling

by Jiahui Liu, Lili Zhang and Xianjun Wang

Remote Sens. 2025, 17(14), 2419; https://doi.org/10.3390/rs17142419 - 12 Jul 2025

Viewed by 477

Abstract

As the resolution and data volume of remote sensing imagery continue to grow, achieving efficient compression without sacrificing reconstruction quality remains a major challenge. Traditional handcrafted codecs often fail to balance rate-distortion performance and computational complexity, and although deep learning-based approaches offer superior representational capacity, they still struggle to balance fine-detail adaptation with computational efficiency. Mamba, a state–space model (SSM)-based architecture, offers linear-time complexity and excels at capturing long-range dependencies in sequences. It has been adopted in remote sensing compression tasks to model long-distance dependencies between pixels. However, despite its effectiveness in global context aggregation, Mamba’s uniform bidirectional scanning is insufficient for capturing high-frequency structures such as edges and textures. Moreover, existing visual state–space (VSS) models built upon Mamba typically treat all channels equally and lack mechanisms to dynamically focus on semantically salient spatial regions. To address these issues, we present an innovative architecture for remote sensing image compression, called the Multi-scale Channel Global Mamba Network (MGMNet). MGMNet integrates a spatial–channel dynamic weighting mechanism into the Mamba architecture, enhancing global semantic modeling while selectively emphasizing informative features. It comprises two key modules. The Wavelet Transform-guided Local Structure Decoupling (WTLS) module applies multi-scale wavelet decomposition to disentangle and separately encode low- and high-frequency components, enabling efficient parallel modeling of global contours and local textures. The Channel–Global Information Modeling (CGIM) module enhances conventional VSS by introducing a dual-path attention strategy that reweights spatial and channel information, improving the modeling of long-range dependencies and edge structures. We conducted extensive evaluations on three distinct remote sensing datasets to assess MGMNet. The results show that MGMNet outperforms current state-of-the-art (SOTA) models across various performance metrics.
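
The low/high-frequency split that the WTLS module relies on can be illustrated with a single level of the Haar transform on a 1-D signal. The choice of the Haar wavelet, the unnormalized averaging form, and the sample signal are assumptions made for this sketch; the abstract does not say which wavelet the authors use, and their module operates on images at multiple scales.

```python
def haar_1d(signal):
    """One level of an unnormalized Haar transform on an even-length list.

    Returns (low, high): pairwise averages and pairwise half-differences.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def inverse_haar_1d(low, high):
    """Rebuild the original samples from the two sub-bands."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

signal = [4, 4, 4, 10, 10, 10, 2, 2]  # flat regions with one edge inside a pair
low, high = haar_1d(signal)
restored = inverse_haar_1d(low, high)
```

The `low` band carries the coarse shape of the signal, while `high` is zero everywhere except at the pair that straddles the jump from 4 to 10. That separation is what allows contours and fine textures to be encoded along different paths.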

(This article belongs to the Special Issue New Insights in Remote Sensing Image Interpretation with Deep Learning)

11 pages, 3730 KiB

Open Access Communication

Chiral Grayscale Imaging Based on a Versatile Metasurface of Spin-Selective Manipulation

by Yue Cao, Yi-Fei Sun, Zi-Yang Zhu, Qian-Wen Luo, Bo-Xiong Zhang, Xiao-Wei Sun and Ting Song

Materials 2025, 18(13), 3190; https://doi.org/10.3390/ma18133190 - 5 Jul 2025

Viewed by 433

Abstract

Metasurface display, a unique imaging technique at the subwavelength scale, plays a key role in data storage, information processing, and optical imaging due to its high resolution, miniaturization, and integration. Recent works on grayscale imaging, a typical form of metasurface display, have showcased excellent performance for optical integrated devices in the near field. However, chiral grayscale imaging has rarely been elucidated, especially using a single structure. Here, a novel method is proposed to display continuous chiral grayscale imaging that is adjusted by a metasurface consisting of a single chiral structure with optimized geometric parameters. The simulation results show that the incident light can be nearly converted into its cross-polarized reflection when the chiral structural variable parameters are α = 80° and β = 45°. The versatile metasurface can arbitrarily and independently realize the spin-selective manipulation of the wavelength and amplitude of circularly polarized light. Owing to this manipulation ability, circularly polarized light detection and a two-channel encoded display with different operating wavelengths are presented. More importantly, this versatile metasurface can also be used to show high-resolution chiral grayscale imaging, which distinguishes it from previous grayscale imaging studies that use linearly polarized incident illumination. The proposed versatile metasurface of spin-selective manipulation, with the advantages of high resolution, large capacity, and monolithic integration, provides a novel way for polarization detection, optical display, information storage, and other relevant fields.

23 pages, 372 KiB

Open Access Article

Computability of the Zero-Error Capacity of Noisy Channels

by Holger Boche and Christian Deppe

Information 2025, 16(7), 571; https://doi.org/10.3390/info16070571 - 3 Jul 2025

Viewed by 326

Abstract

The zero-error capacity of discrete memoryless channels (DMCs), introduced by Shannon, is a fundamental concept in information theory with significant operational relevance, particularly in settings where even a single transmission error is unacceptable. Despite its importance, no general closed-form expression or algorithm is known for computing this capacity. In this work, we investigate the computability-theoretic boundaries of the zero-error capacity and establish several fundamental limitations. Our main result shows that the zero-error capacity of noisy channels is not Banach–Mazur-computable and therefore is also not Borel–Turing-computable. This provides a strong form of non-computability that goes beyond classical undecidability, capturing the inherent discontinuity of the capacity function. As a further contribution, we analyze the deep connections between (i) the zero-error capacity of DMCs, (ii) the Shannon capacity of graphs, and (iii) Ahlswede’s operational characterization via the maximum-error capacity of 0–1 arbitrarily varying channels (AVCs). We prove that key semi-decidability questions are equivalent for all three capacities, thus unifying these problems into a common algorithmic framework. While the computability status of the Shannon capacity of graphs remains unresolved, our equivalence result clarifies what makes this problem so challenging and identifies the logical barriers that must be overcome to resolve it. Together, these results chart the computational landscape of zero-error information theory and provide a foundation for further investigations into the algorithmic intractability of exact capacity computations.
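
Why zero-error capacity resists simple computation can be seen in the classic pentagon channel, where each of five inputs can be confused with its two neighbours on a 5-cycle. The sketch below is not taken from the paper: it brute-forces the best single-use code and then checks Shannon's well-known length-2 code. All names are illustrative.

```python
from itertools import combinations

N = 5  # pentagon channel: input i can be confused with i - 1 and i + 1 (mod 5)

def confusable(a, b):
    """Two distinct letters are confusable if they are adjacent on the 5-cycle."""
    return (a - b) % N in (1, N - 1)

def confusable_words(u, v):
    """Distinct words are confusable if every position is equal or confusable."""
    return u != v and all(x == y or confusable(x, y) for x, y in zip(u, v))

def is_zero_error_code(code):
    """A code is zero-error if no two codewords can be confused."""
    return all(not confusable_words(u, v) for u, v in combinations(code, 2))

# Largest zero-error code for a single channel use (brute force over letter sets).
best_single = max(
    k
    for k in range(1, N + 1)
    for letters in combinations(range(N), k)
    if is_zero_error_code([(a,) for a in letters])
)

# Shannon's construction for two channel uses: codewords (i, 2i mod 5).
two_use_code = [(i, (2 * i) % N) for i in range(N)]
```

A single use supports only 2 distinguishable letters, yet two uses support 5 codewords, more than 2 × 2 = 4. The achievable rate therefore depends on the block length, so the capacity is defined as a limit over ever longer codes instead of a single finite optimization.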

(This article belongs to the Special Issue Feature Papers in Information in 2024–2025)

16 pages, 6137 KiB

Open Access Article

DMET: Dynamic Mask-Enhanced Transformer for Generalizable Deep Image Denoising

by Tong Zhu, Anqi Li, Yuan-Gen Wang, Wenkang Su and Donghua Jiang

Mathematics 2025, 13(13), 2167; https://doi.org/10.3390/math13132167 - 2 Jul 2025

Viewed by 364

Abstract

Different types of noise are inevitably introduced by devices during image acquisition and transmission processes. Therefore, image denoising remains a crucial challenge in computer vision. Deep learning, especially recent Transformer-based architectures, has demonstrated remarkable performance for image denoising tasks. However, due to its data-driven nature, deep learning can easily overfit the training data, leading to a lack of generalization ability. In order to address this issue, we present a novel Dynamic Mask-Enhanced Transformer (DMET) to improve the generalization capacity of denoising networks. Specifically, a texture-guided adaptive masking mechanism is introduced to simulate possible noise in practical applications. Then, we apply a masked hierarchical attention block to mitigate information loss and leverage global statistics, which combines shifted window multi-head self-attention with channel attention. Additionally, an attention mask is applied during training to reduce discrepancies between training and testing. Extensive experiments demonstrate that our approach achieves better generalization performance than state-of-the-art deep learning models and can be directly applied to real-world scenarios.

(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)

26 pages, 5306 KiB

Open Access Article

Non-Hermitian Control of Tri-Photon and Quad-Photon Using Parallel Multi-Dressing Quantization

by Haitian Tang, Rui Zhuang, Jiaxuan Wei, Qingyu Chen, Sinong Liu, Guobin Liu, Zhou Feng and Yanpeng Zhang

Photonics 2025, 12(7), 653; https://doi.org/10.3390/photonics12070653 - 27 Jun 2025

Viewed by 192

Abstract

The fifth-order nonlinear polarizability has been extensively studied in the field of quantum communication due to its ease of manipulation. By adjusting the relative size of the Rabi frequency and dephasing rate of the dressing field, natural non-Hermitian exceptional points can be generated, and further evolution can be achieved by varying the types of dressing fields. However, as the demand for information capacity in quantum communication continues to increase, research on the seventh-order nonlinear polarizability, which is based on four-photon states, and on its number of coherent channels and resonance positions has gradually come to the forefront. This paper focuses on the simultaneous generation of a seventh-order nonlinear polarizability through a spontaneous eight-wave mixing (SEWM) process in an atomic medium involving four photons. Compared to the fifth-order nonlinear polarizability, the seventh-order polarizability shows an exponential increase in coherent channels and resonance positions due to its strong dressing effect. Additionally, the interaction between the four photons is stronger than that between three photons, making it possible for even the difficult-to-dress eigenvalues to be influenced by the dressing field and dephasing rate, resulting in more complex coherent channels. These appear as more complex, damped Rabi oscillations, with periods that can be controlled by the dressing field. These findings may contribute to a promising new method for quantum communication.

24 pages, 11665 KiB

Open Access Article

Error Performance Analysis and PS Factor Optimization for SWIPT AF Relaying Systems over Rayleigh Fading Channels: Interpretation SWIPT AF Relay as Non-SWIPT AF Relay

by Kyunbyoung Ko and Changick Song

Electronics 2025, 14(13), 2597; https://doi.org/10.3390/electronics14132597 - 27 Jun 2025

Viewed by 285

Abstract

This paper presents an analytical study of the bit error rate (BER) and signal-to-noise ratio (SNR) performance in simultaneous wireless information and power transfer (SWIPT) amplify-and-forward (AF) relaying systems over Rayleigh fading channels. A power-splitting (PS) protocol is employed at the energy-constrained relay to divide the received signal for concurrent energy harvesting and information processing. Closed-form and asymptotic BER expressions are derived based on exact and bounded moment-generating functions (MGFs), offering insights into how the SNR balance between the source–relay (SR) and relay–destination (RD) links influences system performance. An asymptotic BER expression further reveals that a SWIPT AF relay system can be interpreted as a generalized AF relaying model, sharing the same diversity order as conventional AF systems. Based on this interpretation, an optimization method for the PS factor is proposed, effectively reducing the BER by reinforcing the weaker link. Simulation results confirm the tightness of the derived expressions and the effectiveness of the optimization strategy. Moreover, the analytical framework is extended to multiple SWIPT relaying systems, where multiple relays operate with individually optimized PS ratios. For such configurations, approximations for the system BER, outage probability, and channel capacity are derived and validated. Results demonstrate that increasing the number of relays significantly improves system performance, and the proposed analysis accurately captures these performance gains under varying channel conditions.

(This article belongs to the Section Microwave and Wireless Communications)

21 pages, 12722 KiB

Open Access Article

PC3D-YOLO: An Enhanced Multi-Scale Network for Crack Detection in Precast Concrete Components

by Zichun Kang, Kedi Gu, Andrew Yin Hu, Haonan Du, Qingyang Gu, Yang Jiang and Wenxia Gan

Buildings 2025, 15(13), 2225; https://doi.org/10.3390/buildings15132225 - 25 Jun 2025

Viewed by 467

Abstract

Crack detection in precast concrete components aims to achieve precise extraction of crack features within complex image backgrounds. Current computer vision-based methods typically conduct limited local searches at a single scale, constraining the model’s capacity for feature extraction and fusion in information-rich environments. To address these limitations, we propose PC3D-YOLO, an enhanced framework derived from YOLOv11, which strengthens long-range dependency modeling through multi-scale feature integration, offering a novel approach for crack detection in precast concrete structures. Our methodology involves three key innovations: (1) the Multi-Dilation Spatial-Channel Fusion with Shuffling (MSFS) module, employing dilated convolutions and channel shuffling to enable global feature fusion, replaces the C3K2 bottleneck module to enhance long-distance dependency capture; (2) the AIFI_M2SA module substitutes the conventional SPPF to mitigate its restricted receptive field and information loss, incorporating multi-scale attention for improved near-far contextual integration; (3) a redesigned neck network (MSCD-Net) preserves rich contextual information across all feature scales. Experimental results demonstrate that, on the self-developed dataset, the proposed algorithm achieves a recall of 78.8%, an AP@50 of 86.3%, and an AP@50-95 of 65.6%, outperforming the YOLOv11 algorithm. Furthermore, evaluations on the CRACKS_MANISHA and DECA datasets also confirm the proposed model’s strong generalization capability across different data domains.

(This article belongs to the Section Building Materials, and Repair & Renovation)

25 pages, 528 KiB

Open AccessFeature PaperArticle

Lightweight and Security-Enhanced Key Agreement Protocol Using PUF for IoD Environments

by Sangjun Lee, Seunghwan Son and Youngho Park

Mathematics 2025, 13(13), 2062; https://doi.org/10.3390/math13132062 - 21 Jun 2025

Viewed by 358

Abstract

With the increasing demand for drones in diverse tasks, the Internet of Drones (IoD) has recently emerged as a significant technology in academia and industry. The IoD environment enables various services, such as traffic and environmental monitoring, disaster situation management, and military operations. However, IoD communication is vulnerable to security threats due to the exchange of sensitive information over insecure public channels. Moreover, public key-based cryptographic schemes are impractical for communication with resource-constrained drones due to their limited computational capability and resource capacity. Therefore, a secure and lightweight key agreement scheme must be developed while considering the characteristics of the IoD environment. In 2024, Alzahrani proposed a secure key agreement protocol for securing the IoD environment. However, Alzahrani’s protocol suffers from high computational overhead due to its reliance on elliptic curve cryptography and is vulnerable to drone and mobile user impersonation attacks and session key disclosure attacks by eavesdropping on public-channel messages. Therefore, this work proposes a lightweight and security-enhanced key agreement scheme for the IoD environment to address the limitations of Alzahrani’s protocol. The proposed protocol employs a physical unclonable function and simple cryptographic operations (XOR and hash functions) to achieve high security and efficiency. This work demonstrates the security of the proposed protocol using informal security analysis. It also presents a formal security analysis using the Real-or-Random (RoR) model, Burrows–Abadi–Needham (BAN) logic, and Automated Verification of Internet Security Protocols and Applications (AVISPA) simulation to verify the proposed protocol’s session key security, mutual authentication ability, and resistance to replay and man-in-the-middle (MITM) attacks, respectively. Furthermore, this work demonstrates that the proposed protocol offers better performance and security by comparing the computational and communication costs and security features with those of relevant protocols.
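
The abstract states only that the protocol combines a physical unclonable function (PUF) with XOR and hash operations; it does not give the message flow. The sketch below is therefore not the authors' protocol. It is a generic illustration, under stated assumptions, of how a stored challenge-response pair can mask a nonce with XOR and a hash so that both sides derive the same session key. The `SimulatedPUF` class (a keyed hash standing in for hardware), the helper names `h` and `xor`, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def h(*parts: bytes) -> bytes:
    """Hash length-prefixed parts with SHA-256 (the hash choice is an assumption)."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


class SimulatedPUF:
    """Software stand-in for a hardware PUF; a real PUF has no stored secret."""

    def __init__(self) -> None:
        self._device_randomness = secrets.token_bytes(32)

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._device_randomness, challenge, hashlib.sha256).digest()


# Enrollment (assumed to happen over a secure channel): the server stores one pair.
puf = SimulatedPUF()
challenge = secrets.token_bytes(16)
stored_response = puf.respond(challenge)

# Server side: mask a fresh nonce so only the holder of the PUF can recover it.
nonce = secrets.token_bytes(32)
masked_nonce = xor(nonce, h(stored_response, challenge))
server_key = h(nonce, stored_response)

# Drone side: regenerate the response from the challenge, unmask, derive the key.
response = puf.respond(challenge)
recovered_nonce = xor(masked_nonce, h(response, challenge))
drone_key = h(recovered_nonce, response)

if __name__ == "__main__":
    print("keys match:", hmac.compare_digest(server_key, drone_key))
```

This toy exchange omits everything a real scheme needs, such as mutual authentication, replay protection, and handling of noisy PUF responses. It only shows why XOR and hashing are cheap enough for resource-constrained devices.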

(This article belongs to the Special Issue Advanced Research on Information System Security and Privacy, 2nd Edition)

23 pages, 4973 KiB

Open AccessArticle

Detection of Electric Network Frequency in Audio Using Multi-HCNet

by Yujin Li, Tianliang Lu, Shufan Peng, Chunhao He, Kai Zhao, Gang Yang and Yan Chen

Sensors 2025, 25(12), 3697; https://doi.org/10.3390/s25123697 - 13 Jun 2025

Viewed by 564

Abstract

With the increasing application of electrical network frequency (ENF) in forensic audio and video analysis, ENF signal detection has emerged as a critical technology. However, high-pass filtering operations commonly employed in modern communication scenarios, while effectively removing infrasound to enhance communication quality at reduced costs, result in a substantial loss of fundamental frequency information, thereby degrading the performance of existing detection methods. To tackle this issue, this paper introduces Multi-HCNet, an innovative deep learning model specifically tailored for ENF signal detection in high-pass filtered environments. Specifically, the model incorporates an array of high-order harmonic filters (AFB), which compensates for the loss of fundamental frequency by capturing high-order harmonic components. Additionally, a grouped multi-channel adaptive attention mechanism (GMCAA) is proposed to precisely distinguish between multiple frequency signals, demonstrating particular effectiveness in differentiating between 50 Hz and 60 Hz fundamental frequency signals. Furthermore, a sine activation function (SAF) is utilized to better align with the periodic nature of ENF signals, enhancing the model’s capacity to capture periodic oscillations. Experimental results indicate that after hyperparameter optimization, Multi-HCNet exhibits superior performance across various experimental conditions. Compared to existing approaches, this study not only significantly improves the detection accuracy of ENF signals in complex environments, achieving a peak accuracy of 98.84%, but also maintains an average detection accuracy exceeding 80% under high-pass filtering conditions. These findings demonstrate that even in scenarios where fundamental frequency information is lost, the model remains capable of effectively detecting ENF signals, offering a novel solution for ENF signal detection under extreme conditions of fundamental frequency absence. Moreover, this study successfully distinguishes between 50 Hz and 60 Hz fundamental frequency signals, providing robust support for the practical deployment and extension of ENF signal applications.
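
The idea that harmonics can stand in for a missing fundamental can be illustrated without any learned model. The sketch below is not Multi-HCNet and not its AFB filter bank. It swaps in a plain single-bin discrete Fourier sum to compare the energy at the second and third harmonics of 50 Hz and 60 Hz. The function names, the 1000 Hz sample rate, and the synthetic test signal are all assumptions made for illustration.

```python
import cmath
import math


def bin_power(samples, freq_hz, sample_rate):
    """Squared magnitude of a single-frequency discrete Fourier sum."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
        for n, x in enumerate(samples)
    )
    return abs(acc) ** 2


def guess_nominal_frequency(samples, sample_rate, harmonics=(2, 3)):
    """Pick 50 or 60 Hz by comparing energy at harmonics only (fundamental ignored)."""
    scores = {
        nominal: sum(bin_power(samples, nominal * k, sample_rate) for k in harmonics)
        for nominal in (50, 60)
    }
    return max(scores, key=scores.get)


def harmonics_only(nominal, sample_rate=1000, seconds=1.0):
    """Synthetic hum with the fundamental removed, as after an ideal high-pass filter."""
    count = int(sample_rate * seconds)
    return [
        math.sin(2 * math.pi * 2 * nominal * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 3 * nominal * i / sample_rate)
        for i in range(count)
    ]


if __name__ == "__main__":
    for nominal in (50, 60):
        signal = harmonics_only(nominal)
        print(nominal, "->", guess_nominal_frequency(signal, sample_rate=1000))
```

Only the second and third harmonics are compared because some higher multiples coincide between the two grids (300 Hz is both 6 × 50 Hz and 5 × 60 Hz) and would not help separate them. Real recordings are noisy and the ENF drifts over time, which is where a learned detector such as the one described above would be needed.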

(This article belongs to the Section Sensor Networks)

Search Results (424)
