Search Results (6,530)

Search Parameters:
Keywords = attention-enhanced network

26 pages, 8829 KB  
Article
YOLO-MSLT: A Multimodal Fusion Network Based on Spatial Linear Transformer for Cattle and Sheep Detection in Challenging Environments
by Yixing Bai, Yongquan Li, Ruoyu Di, Jingye Liu, Xiaole Wang, Chengkai Li and Pan Gao
Agriculture 2026, 16(1), 35; https://doi.org/10.3390/agriculture16010035 (registering DOI) - 23 Dec 2025
Abstract
Accurate detection of cattle and sheep is a core task in precision livestock farming. However, the complexity of agricultural settings, where visible light images perform poorly under low-light or occluded conditions and infrared images are limited in resolution, poses significant challenges for current smart monitoring systems. To tackle these challenges, this study aims to develop a robust multimodal fusion detection network for the accurate and reliable detection of cattle and sheep in complex scenes. To achieve this, we propose YOLO-MSLT, a multimodal fusion detection network based on YOLOv10, which leverages the complementary nature of visible light and infrared data. The core of YOLO-MSLT incorporates a Cross Flatten Fusion Transformer (CFFT), composed of the Linear Cross-modal Spatial Transformer (LCST) and Deep-wise Enhancement (DWE), designed to enhance modality collaboration by performing complementary fusion at the feature level. Furthermore, a Content-Guided Attention Feature Pyramid Network (CGA-FPN) is integrated into the neck to improve the representation of multi-scale object features. Validation was conducted on a cattle and sheep dataset built from 5056 pairs of multimodal images (visible light and infrared) collected in the Manas River Basin, Xinjiang. Results demonstrate that YOLO-MSLT performs robustly in complex terrain, low-light, and occlusion scenarios, achieving an mAP@0.5 of 91.8% and a precision of 93.2%, significantly outperforming mainstream detection models. This research provides an impactful and practical solution for cattle and sheep detection in challenging agricultural environments. Full article
(This article belongs to the Section Farm Animal Production)

18 pages, 2081 KB  
Article
Breast Ultrasound Image Segmentation Integrating Mamba-CNN and Feature Interaction
by Guoliang Yang, Yuyu Zhang and Hao Yang
Sensors 2026, 26(1), 105; https://doi.org/10.3390/s26010105 - 23 Dec 2025
Abstract
The large scale and shape variation in breast lesions make their segmentation extremely challenging. A breast ultrasound image segmentation model integrating Mamba-CNN and feature interaction is proposed for breast ultrasound images with a large amount of speckle noise and multiple artifacts. The model first uses the visual state space model (VSS) as an encoder for feature extraction to better capture its long-range dependencies. Second, a hybrid attention enhancement mechanism (HAEM) is designed at the bottleneck between the encoder and the decoder to provide fine-grained control of the feature map in both the channel and spatial dimensions, so that the network captures key features and regions more comprehensively. The decoder uses transposed convolution to upsample the feature map, gradually increasing the resolution and recovering its spatial information. Finally, the cross-fusion module (CFM) is constructed to simultaneously focus on the spatial information of the shallow feature map as well as the deep semantic information, which effectively reduces the interference of noise and artifacts. Experiments are carried out on BUSI and UDIAT datasets, and the Dice similarity coefficient and HD95 indexes reach 76.04% and 20.28 mm, respectively, which show that the algorithm can effectively solve the problems of noise and artifacts in ultrasound image segmentation, and the segmentation performance is improved compared with the existing algorithms. Full article
(This article belongs to the Section Sensing and Imaging)
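The Dice similarity coefficient reported above measures the overlap between a predicted mask and a ground-truth mask. A minimal sketch, assuming binary masks stored as nested lists; the function name and the toy masks are illustrative and do not come from the article:

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Two empty masks are treated as a perfect match (a convention, not a rule).
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks: 3 predicted pixels, 3 true pixels, 2 shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
dice = dice_coefficient(pred, target)
print(round(dice, 4))
```

HD95, the other metric quoted, is a boundary-distance measure and is not covered by this sketch.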

21 pages, 2107 KB  
Article
A High-Precision Daily Runoff Prediction Model for Cross-Border Basins: RPSEMD-IMVO-CSAT Based on Multi-Scale Decomposition and Parameter Optimization
by Tianming He, Yilin Yang, Zheng Wang, Zongzheng Mo and Chu Zhang
Water 2026, 18(1), 48; https://doi.org/10.3390/w18010048 - 23 Dec 2025
Abstract
As the last critical hydrological control station on the Lancang River before it flows out of China, the daily runoff variations at the Yunjinghong Hydrological Station are directly linked to agricultural irrigation, hydropower development, and ecological security in downstream Mekong River riparian countries such as Laos, Myanmar, and Thailand. Aiming at the core issues of the runoff sequence in the Lancang–Mekong Basin, which is characterized by prominent nonlinearity, non-stationarity, and coupling of multi-scale features, this study proposes a synergistic prediction framework of “multi-scale decomposition-model improvement-parameter optimization”. Firstly, Regenerated Phase-Shifted Sine-Assisted Empirical Mode Decomposition (RPSEMD) is adopted to adaptively decompose the daily runoff data. On this basis, a Convolutional Sparse Attention Transformer (CSAT) model is constructed. A one-dimensional convolutional neural network (1D-CNN) module is embedded in the input layer to enhance local feature perception, making up for the deficiency of traditional Transformers in capturing detailed information. Meanwhile, the sparse attention mechanism replaces the multi-head attention, realizing efficient focusing on key time-step correlations and reducing computational costs. Additionally, an Improved Multi-Verse Optimizer (IMVO) is introduced, which optimizes the hyperparameters of CSAT through a spiral update mechanism, exponential Travel Distance Rate (T_DR), and adaptive compression factor, thereby improving the model’s accuracy in capturing short-term abrupt patterns such as flood peaks and drought transition points. Experiments are conducted using measured daily runoff data from 2010 to 2022, and the proposed model is compared with mainstream models such as LSTM, GRU, and standard Transformer. The results show that the RPSEMD-IMVO-CSAT model reduces the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) by 15.3–28.7% and 18.6–32.4%, respectively, compared with the comparative models. Full article
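The percentage reductions in MAE and RMSE quoted above are relative changes against each baseline's error. A small sketch of that arithmetic, assuming the usual definitions; the runoff values and both prediction series are made-up numbers for illustration, not data from the study:

```python
import math


def mae(observed, predicted):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Square Error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical daily runoff values and two hypothetical forecasts.
observed = [100.0, 120.0, 90.0, 150.0]
baseline_pred = [110.0, 110.0, 100.0, 140.0]
model_pred = [105.0, 118.0, 92.0, 146.0]

mae_gain = relative_reduction(mae(observed, baseline_pred), mae(observed, model_pred))
rmse_gain = relative_reduction(rmse(observed, baseline_pred), rmse(observed, model_pred))
print(f"MAE reduced by {mae_gain:.1f}%, RMSE reduced by {rmse_gain:.1f}%")
```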

21 pages, 1986 KB  
Article
A Comparative and Regional Study of Atmospheric Temperature in the Near-Space Environment Using Intelligent Modeling
by Zhihui Li, Zhiming Han, Huanwei Zhang and Qixiang Liao
Forecasting 2026, 8(1), 1; https://doi.org/10.3390/forecast8010001 - 23 Dec 2025
Abstract
The high-precision prediction of near-space atmospheric temperature holds significant importance for aerospace, national defense security, and climate change research. To address the deficiencies of extracting features in conventional convolutional neural networks, this paper designs a ConvLSTM hybrid model that combines the spatiotemporal feature extraction capability of 3D convolution with a residual attention mechanism, effectively capturing the dynamic evolution patterns of the near-space temperature field. The comparative analysis with various models, including GRU, shows that the proposed model demonstrates superior performance, achieving an RMSE of 2.433 K, a correlation coefficient R of 0.993, and an MRE of 0.76% on the test set. Seasonal error analysis reveals that the prediction stability is better in winter than in summer, with errors in the mesosphere primarily stemming from the complexity of atmospheric processes and limitations in data resolution. Compared to traditional CNNs and single time-series models, the proposed method significantly enhances prediction accuracy, providing a new technical approach for near-space environmental modeling. Full article
(This article belongs to the Section Weather and Forecasting)
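Two of the metrics quoted above, the correlation coefficient R and the mean relative error (MRE), can be sketched as follows. This assumes R is the Pearson coefficient and MRE is the mean of |predicted - observed| / |observed| expressed as a percentage, which the abstract does not spell out; the temperatures are hypothetical:

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mean_relative_error(observed, predicted):
    """Mean of |p - o| / |o|, in percent. Observed values must be non-zero."""
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical temperatures in kelvin.
observed = [200.0, 220.0, 240.0, 260.0]
predicted = [202.0, 218.0, 243.0, 257.0]

r = pearson_r(observed, predicted)
mre = mean_relative_error(observed, predicted)
print(f"R = {r:.3f}, MRE = {mre:.2f}%")
```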

20 pages, 1304 KB  
Article
LSDA-YOLO: Enhanced SAR Target Detection with Large Kernel and SimAM Dual Attention
by Jingtian Yang and Lei Zhu
Symmetry 2026, 18(1), 23; https://doi.org/10.3390/sym18010023 - 23 Dec 2025
Abstract
Synthetic Aperture Radar (SAR) target detection faces significant challenges including speckle noise interference, weak small object features, and multi-category imbalance. To address these issues, this paper proposes LSDA-YOLO, an enhanced SAR target detection framework built upon the YOLO architecture that integrates Large Kernel Attention and SimAM dual attention mechanisms. Our method effectively overcomes these challenges by synergistically combining global context modeling and local detail enhancement to improve robustness and accuracy. Notably, this framework leverages the inherent symmetry properties of typical SAR targets (e.g., geometric symmetry of ships and bridges) to strengthen feature consistency, thereby reducing interference from asymmetric background clutter. By replacing the baseline C2PSA module with Deformable Large Kernel Attention and incorporating parameter-free SimAM attention throughout the detection network, our approach achieves improved detection accuracy while maintaining computational efficiency. The deformable large kernel attention module expands the receptive field through synergistic integration of deformable and dilated convolutions, enhancing geometric modeling for complex-shaped targets. Simultaneously, the SimAM attention mechanism enables adaptive feature enhancement across channel and spatial dimensions based on visual neuroscience principles, effectively improving discriminability for small targets in noisy SAR environments. Experimental results on the RSAR dataset demonstrate that LSDA-YOLO achieves 80.8% mAP50, 53.2% mAP50-95, and 77.6% F1 score, with computational complexity of 7.3 GFLOPS, showing significant improvement over baseline models and other attention variants while maintaining lightweight characteristics suitable for real-time applications. Full article
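To illustrate why SimAM is described as parameter-free, the sketch below computes SimAM-style weights for a single feature channel from its own statistics, following the formulation commonly given for SimAM (squared deviation from the channel mean, normalized by variance, passed through a sigmoid). The regularizer value `e_lambda` and the toy channel are assumptions, and nothing here reflects how LSDA-YOLO places the module in its network:

```python
import math


def simam_weights(channel, e_lambda=1e-4):
    """Return SimAM-style attention weights for one 2D channel (list of rows)."""
    height, width = len(channel), len(channel[0])
    count = height * width
    if count < 2:
        raise ValueError("need at least two activations")
    mean = sum(sum(row) for row in channel) / count
    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(sum(row) for row in sq_dev) / (count - 1)
    weights = []
    for row in sq_dev:
        # Activations far from the mean get a larger sigmoid weight.
        weights.append(
            [1.0 / (1.0 + math.exp(-(d / (4.0 * (variance + e_lambda)) + 0.5)))
             for d in row]
        )
    return weights


# Hypothetical 3x3 channel with one strong response in the centre.
channel = [[1.0, 1.0, 1.0],
           [1.0, 5.0, 1.0],
           [1.0, 1.0, 1.0]]
weights = simam_weights(channel)
refined = [[v * w for v, w in zip(vals, ws)] for vals, ws in zip(channel, weights)]
print(round(weights[1][1], 3), round(weights[0][0], 3))
```

No learned parameters appear: the weights depend only on the activations and one fixed constant.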

18 pages, 3847 KB  
Article
Research on the Detection of Ocean Internal Waves Based on the Improved Faster R-CNN in SAR Images
by Gaoyuan Shen, Zhi Zeng, Hao Huang, Zhifan Jiao and Jun Song
J. Mar. Sci. Eng. 2026, 14(1), 23; https://doi.org/10.3390/jmse14010023 - 23 Dec 2025
Abstract
Ocean internal waves occur in stably stratified seawater and play a crucial role in energy cascade, material transport, and military activities. However, the complex and irregular spatial patterns of internal waves pose significant challenges for accurate detection in SAR images when using conventional convolutional neural networks, which often lack adaptability to geometric variations. To address this problem, this paper proposes a refined Faster R-CNN detection framework, termed “rFaster R-CNN”, and adopts a transfer learning strategy to enhance model generalization and robustness. In the feature extraction stage, a backbone network called “ResNet50_CDCN” that integrates the CBAM attention mechanism and DCNv2 deformable convolution is constructed to enhance the feature expression ability of key regions in the images. Experimental results show that on the internal wave dataset constructed in this paper, this network improves the detection accuracy by approximately 3% compared to the original ResNet50 network. At the region proposal stage, this paper further adds two small-scale anchors and combines the ROI Align and FPN modules, effectively enhancing the spatial hierarchical information and semantic expression ability of ocean internal waves. Compared with classical object detection algorithms such as SSD, YOLO, and RetinaNet, the proposed “rFaster R-CNN” achieves superior detection performance, showing significant improvements in both accuracy and robustness. Full article
(This article belongs to the Special Issue Artificial Intelligence and Its Application in Ocean Engineering)

25 pages, 3364 KB  
Article
A SimAM-Enhanced Multi-Resolution CNN with BiGRU for EEG Emotion Recognition: 4D-MRSimNet
by Yutao Huang and Jijie Deng
Electronics 2026, 15(1), 39; https://doi.org/10.3390/electronics15010039 - 22 Dec 2025
Abstract
This study proposes 4D-MRSimNet, a framework that employs attention mechanisms to focus on distinct dimensions. The approach enhances key responses in the spatial and spectral domains and characterizes dynamic evolution in the temporal domain, extracting and integrating complementary emotional features to facilitate final classification. At the feature level, differential entropy (DE) and power spectral density (PSD) are combined within four core frequency bands (θ, α, β, and γ). These bands are recognized as closely related to emotional processing. This integration constructs a complementary feature representation that preserves both energy distribution and entropy variability. These features are organized into a 4D representation that integrates electrode topology, frequency characteristics, and temporal dependencies inherent in EEG signals. At the network level, a multi-resolution convolutional module embedded with SimAM attention extracts spatial and spectral features at different scales and adaptively emphasizes key information. A bidirectional GRU (BiGRU) integrated with temporal attention further emphasizes critical time segments and strengthens the modeling of temporal dependencies. Experiments show that our method achieves an accuracy of 97.68% for valence and 97.61% for arousal on the DEAP dataset and 99.60% for valence and 99.46% for arousal on the DREAMER dataset. The results demonstrate the effectiveness of complementary feature fusion, multidimensional feature representation, and the complementary dual attention enhancement strategy for EEG emotion recognition. Full article
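The differential entropy (DE) feature mentioned above has a closed form when a band-filtered EEG segment is treated as approximately Gaussian: DE = 0.5 · ln(2πeσ²). The sketch below applies that formula; the Gaussian assumption is common in EEG emotion recognition but is not stated in the abstract, and the sample values are hypothetical:

```python
import math
import statistics


def differential_entropy(samples):
    """DE of a segment under a Gaussian assumption: 0.5 * ln(2 * pi * e * variance)."""
    variance = statistics.pvariance(samples)
    if variance <= 0:
        raise ValueError("segment must have non-zero variance")
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


# Hypothetical band-filtered segment from one electrode.
segment = [0.5, -1.2, 0.3, 1.1, -0.7]
scaled = [2.0 * v for v in segment]

de_segment = differential_entropy(segment)
de_scaled = differential_entropy(scaled)
print(round(de_scaled - de_segment, 4))
```

Doubling the amplitude raises DE by ln 2, which shows that under this assumption the feature is a logarithmic measure of band energy.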
23 pages, 6612 KB  
Article
Functional Connectivity of Auditory, Motor, and Reward Networks at Rest and During Music Listening
by Kai Yi (Kaye) Han, Jinyu Wang, Benjamin M. Kubit, Corinna Parrish and Psyche Loui
Brain Sci. 2026, 16(1), 15; https://doi.org/10.3390/brainsci16010015 - 22 Dec 2025
Abstract
Background/Objectives: Music engages multiple brain networks simultaneously, yet most studies examine these networks in isolation. Methods: We investigated functional connectivity among the auditory, motor, and reward networks during music listening in different contexts using fMRI data from two samples (N = 39 each): focused music listening and background music during cognitive tasks. ROI-to-ROI, seed-based, and graph theory analyses examined connectivity patterns among 46 regions spanning the three networks. Results: Both contexts showed enhanced within-auditory network connectivity compared to rest, suggesting that this is fundamental to music processing. However, between-network patterns diverged markedly. Background music listening during cognitive tasks preserved reward-motor coupling while reducing auditory-motor and auditory-reward connectivity. Focused music listening produced widespread negative correlations between motor regions and both the auditory and reward networks, potentially reflecting motor suppression in the scanner environment. Graph theory measures revealed context-specific hub reorganization: reward regions (nucleus accumbens, caudate) showed increased centrality during background music listening, while the amygdala and frontal orbital cortex were selectively enhanced during focused listening. Conclusions: These findings demonstrate that music engagement involves context-dependent network reorganization beyond simple attention effects. The same musical stimulus engages different neural mechanisms depending on concurrent cognitive demands, motor requirements, and listening goals. Enhanced within-auditory connectivity appears consistent across contexts, but between-network interactions are shaped by the broader cognitive-behavioral context. These results highlight the importance of considering ecological context when studying music processing and designing music-based interventions, as network connectivity patterns during music listening reflect complex interactions between task demands, attentional resources, and musical engagement rather than music processing alone. Full article

22 pages, 25879 KB  
Article
Multi-Scale Interactive Network with Color Attention for Low-Light Image Enhancement
by Haoxiang Lu, Changna Qian, Ziming Wang and Zhenbing Liu
Sensors 2026, 26(1), 83; https://doi.org/10.3390/s26010083 (registering DOI) - 22 Dec 2025
Abstract
Enhancing low-light images is crucial in computer vision applications. Existing learning-based models often struggle to balance light enhancement and color correction, while images typically contain different types of information at different levels. Hence, we propose a multi-scale interactive network with color attention, named MSINet, to effectively explore these different types of information for low-light image enhancement (LLIE) tasks. Specifically, the MSINet first employs a CNN-based branch built upon stacked residual channel attention blocks (RCABs) to fully explore the local image features. Meanwhile, a Transformer-based branch constructed from Transformer blocks contains cross-scale attention (CSA) and multi-head self-attention (MHSA) to mine the global features. Notably, the local and global features extracted by each RCAB and Transformer block interact through the fusion module. Additionally, the color correction branch (CCB), based upon self-attention (SA), can learn the color distribution information from the low-light input to further guarantee the color fidelity of the final output. Extensive experiments have demonstrated that our proposed MSINet outperforms state-of-the-art LLIE methods in light enhancement and color correction. Full article
(This article belongs to the Section Sensing and Imaging)

32 pages, 5024 KB  
Article
ICU-Transformer: Multi-Head Attention Expert System for ICU Resource Allocation Robust to Data Poisoning Attacks
by Manal Alghieth
Future Internet 2026, 18(1), 6; https://doi.org/10.3390/fi18010006 - 22 Dec 2025
Abstract
Intensive Care Units (ICUs) face unprecedented challenges in resource allocation, particularly during health crises in which algorithmic systems may be exposed to adversarial manipulation. A transformer-based expert system, ICU-Transformer, is presented to optimize resource allocation across 200 ICUs in Physionet while maintaining robustness against data poisoning attacks. The framework incorporates a Robust Multi-Head Attention mechanism that achieves an AUC-ROC of 0.891 in mortality prediction under 20% data contamination, outperforming conventional baselines. The system is trained and evaluated using data from the MIMIC-IV and eICU Collaborative Research Database and is deployed to manage more than 50,000 ICU admissions annually. A Resource Optimization Engine (ROE) is introduced to dynamically allocate ventilators, Extracorporeal Membrane Oxygenation (ECMO) machines, and specialized clinical staff based on predicted deterioration risk, resulting in an 18% reduction in preventable deaths. A Surge Capacity Planner (SCP) is further employed to simulate disaster scenarios and optimize cross-hospital resource distribution. Deployment across the Physionet ICU Network demonstrates improvements, including a 2.1-day reduction in average ICU bed turnover time, a 31% decrease in unnecessary admissions, and an estimated USD 142 million in annual operational savings. During the observation period, 234 algorithmic manipulation attempts were detected, with targeted disparities identified and mitigated through enhanced auditing protocols. Full article
(This article belongs to the Special Issue Artificial Intelligence-Enabled Smart Healthcare)
22 pages, 3023 KB  
Article
Enhancing Continuous Sign Language Recognition via Spatio-Temporal Multi-Scale Deformable Correlation
by Yihan Jiang, Degang Yang and Chen Chen
Appl. Sci. 2026, 16(1), 124; https://doi.org/10.3390/app16010124 - 22 Dec 2025
Abstract
Deep learning-based sign language recognition plays a pivotal role in facilitating communication for the deaf community. Current approaches, while effective, often introduce redundant information and incur excessive computational overhead through global feature interactions. To address these limitations, this paper introduces a Deformable Correlation Network (DCA) designed for efficient temporal modeling in continuous sign language recognition. The DCA integrates a Deformable Correlation (DC) module that leverages spatio-temporal driven offsets to adjust the sampling range adaptively, thereby minimizing interference. Additionally, a multi-scale local sampling strategy, guided by motion prior, enhances temporal modeling capability while reducing computational costs. Furthermore, an attention-based Correlation Matrix Filter (CMF) is proposed to suppress interference elements by accounting for feature motion patterns. A long-term temporal enhancement module, based on spatial aggregation, efficiently leverages global temporal information to model the performer’s holistic limb motion trajectories. Extensive experiments on three benchmark datasets demonstrate significant performance improvements, with a reduction in Word Error Rate (WER) of up to 7.0% on the CE-CSL dataset, showcasing the superiority and competitive advantage of the proposed DCA algorithm. Full article
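Word Error Rate (WER), the metric cited above, is the word-level edit distance between a recognized sequence and its reference, divided by the reference length. A minimal sketch with a standard dynamic-programming edit distance; the example sentences are hypothetical and not drawn from CE-CSL:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One word of a five-word reference is dropped by the recognizer.
wer = word_error_rate("the weather is nice today", "the weather nice today")
print(wer)
```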

36 pages, 2348 KB  
Article
LSTM-CA-YOLOv11: A Road Sign Detection Model Integrating LSTM Temporal Modeling and Multi-Scale Attention Mechanism
by Tianlei Ye, Yajie Pang, Yihong Li, Enming Liang, Yunfei Wang and Tong Zhou
Appl. Sci. 2026, 16(1), 116; https://doi.org/10.3390/app16010116 - 22 Dec 2025
Abstract
Traffic sign detection is crucial for intelligent transportation and autonomous driving, yet faces challenges such as illumination variations, occlusions, and scale changes that impact accuracy. To address these issues, the paper proposes the LSTM-CA-YOLOv11 model. This approach pioneers the integration of a Bi-LSTM (Bi-directional Long-Short Term Memory) into the YOLOv11 backbone network to model spatial-sequence dependencies, thereby enhancing structured feature extraction capabilities. The lightweight CA (Coordinate Attention) module encodes precise positional information by capturing horizontal and vertical features. The MSEF (Multi-Scale Enhancement Fusion) module addresses scale variations through parallel convolutional and pooling branches with adaptive fusion processing. We further introduce the SPP-Plus (Spatial Pyramid Pooling-Plus) module to expand the receptive field while preserving fine details, and employ a focus IoU (Intersection over Union) loss to prioritise challenging samples, thereby improving regression accuracy. On a private dataset comprising 10,231 images, experiments demonstrate that this model achieves a mAP@0.5 of 93.4% and a mAP@0.5:0.95 of 79.5%, representing improvements of 5.3% and 4.7% over the baseline, respectively. Furthermore, the model’s generalisation performance on the public TT100K (Tsinghua-Tencent 100K) dataset surpassed the latest YOLOv13n by 5.3% in mAP@0.5 and 3.9% in mAP@0.5:0.95, demonstrating robust cross-dataset capabilities and exceptional practical deployment feasibility. Full article
(This article belongs to the Special Issue AI in Object Detection)
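The abstract says the CA module captures horizontal and vertical features. In Coordinate Attention as usually described, this starts with two directional average poolings that yield one descriptor per row and one per column. The sketch below shows only that pooling step for a single channel; the learned convolutions and sigmoid gating that follow are omitted, and the feature map is hypothetical:

```python
def directional_pooling(feature_map):
    """Average a 2D channel along each axis, keeping position along the other axis."""
    height = len(feature_map)
    width = len(feature_map[0])
    # One value per row: averages across the width, so vertical position is kept.
    row_descriptors = [sum(row) / width for row in feature_map]
    # One value per column: averages across the height, so horizontal position is kept.
    col_descriptors = [
        sum(feature_map[i][j] for i in range(height)) / height
        for j in range(width)
    ]
    return row_descriptors, col_descriptors


feature_map = [[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]]
rows, cols = directional_pooling(feature_map)
print(rows, cols)
```

Unlike a single global average, the two descriptor vectors retain where along each axis a response occurred, which is what lets the attention weights encode position.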
18 pages, 4075 KB  
Article
An Attention-Based Hybrid CNN–Bidirectional LSTM Model for Classifying Chlorophyll-a Concentration in Coastal Waters
by Wara Taparhudee, Tanuspong Pokavanich, Manit Chansuparp, Kanokwan Khaodon, Saroj Rermdumri, Alongot Intarachart and Roongparit Jongjaraunsuk
Water 2026, 18(1), 33; https://doi.org/10.3390/w18010033 - 22 Dec 2025
Abstract
Accurate monitoring of chlorophyll-a (Chl-a) is essential for managing coastal aquaculture, as Chl-a indicates phytoplankton biomass and water quality. This study developed a hybrid deep learning model integrating convolutional neural networks (CNN), bidirectional long short-term memory (BiLSTM), and an attention mechanism (Attention) to classify Chl-a using hourly water quality datasets collected from the GOT001 station in Si Racha Bay, Eastern Gulf of Thailand (2020–2024). A random forest (RF) identified sea surface temperature (SEATEMP), dew point temperature (DEWPOINT), and turbidity (TURB) as the most influential variables, accounting for over 90% of the accuracy. Chl-a concentrations were categorized into ecological groups (low, medium, and high) using quantile-based binning and K-means clustering to support operational classification. Model performance comparison showed that the CNN–BiLSTM model achieved the highest classification accuracy (81.3%), outperforming the CNN–LSTM model (59.7%). However, adding the attention mechanism did not enhance predictive performance, likely due to the limited number of key predictive variables and their already high explanatory power. This study highlights the potential of CNN–BiLSTM as a near-real-time classification tool for Chl-a levels in highly variable coastal ecosystems, supporting aquaculture management, early warning of algal blooms or red tides, and water quality risk assessment in the Gulf of Thailand and comparable coastal regions. Full article
(This article belongs to the Section Water, Agriculture and Aquaculture)
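Quantile-based binning, one of the two labelling strategies named above, can be sketched with tercile cut points. This is an illustration under assumed choices: three equal-frequency bins, the default interpolation of Python's `statistics.quantiles`, and hypothetical concentrations. The K-means step from the abstract is not shown:

```python
import statistics


def tercile_labels(values):
    """Label each value low/medium/high using the two tercile cut points."""
    low_cut, high_cut = statistics.quantiles(values, n=3)
    labels = []
    for value in values:
        if value <= low_cut:
            labels.append("low")
        elif value <= high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels, (low_cut, high_cut)


# Hypothetical Chl-a concentrations (arbitrary units).
chl_a = [1.2, 3.4, 2.1, 8.9, 5.5, 0.8, 12.3, 4.0, 6.7]
labels, cuts = tercile_labels(chl_a)
print(labels)
```

Equal-frequency bins keep the three classes balanced, which matters when the labels are later used to train a classifier.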

23 pages, 5151 KB  
Article
Small-Target Detection Algorithm Based on Improved YOLOv11n
by Ke Zeng, Wangsheng Yu, Xianxiang Qin and Siyu Long
Sensors 2026, 26(1), 71; https://doi.org/10.3390/s26010071 (registering DOI) - 22 Dec 2025
Abstract
Target detection in UAV aerial photography scenarios faces challenges of small targets and complex backgrounds. Thus, we propose an improved YOLOv11n small-target detection algorithm. First, a detection head is added to the 160 × 160 resolution feature layer, and non-adjacent layer features are fused via an Asymptotic Feature Pyramid Network (AFPN) to alleviate feature loss caused by downsampling and reduce cross-level feature conflicts. Second, the Spatial Channel Attention SPPF (SCASPPF) module replaces the original Spatial Pyramid Pooling-Fast (SPPF) module to highlight key features and suppress irrelevant ones. Moreover, the loss function is enhanced by fusing MPDIoU and InnerIoU to boost detection accuracy. Finally, Inception Deep Convolution (IDC) is adopted to improve the C3k2 module, expanding the model’s receptive field and enhancing small-target detection performance. Experiments on the Visdrone2019 dataset show that the algorithm achieves 39.256% mAP@0.5, 6.689 percentage points higher than the 32.567% mAP@0.5 of the baseline model (YOLOv11n). Full article
(This article belongs to the Section Sensor Networks)
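As background on the loss terms named above, the sketch below computes plain IoU and an MPDIoU-style score as it is commonly formulated: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared image diagonal. This is an assumption about the formulation, the InnerIoU component is not shown, and how the article fuses the two terms is not specified in the abstract:

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, gt, img_w, img_h):
    """IoU penalized by normalized squared corner-point distances (assumed form)."""
    d_top_left = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d_bottom_right = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, gt) - d_top_left / norm - d_bottom_right / norm


# Hypothetical boxes in a 100 x 100 image.
pred_box = (10.0, 10.0, 50.0, 50.0)
gt_box = (20.0, 20.0, 60.0, 60.0)
plain = iou(pred_box, gt_box)
penalized = mpdiou(pred_box, gt_box, 100.0, 100.0)
print(round(plain, 4), round(penalized, 4))
```

A corresponding loss would typically be `1 - mpdiou(...)`, so a perfectly aligned box gives zero loss.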

21 pages, 3958 KB  
Article
Research on Efficient Calligraphy Image Classification Based on Attention Enhancement
by Yu Lei, Tianzhao Zhou and Yuankui Ma
Mathematics 2026, 14(1), 28; https://doi.org/10.3390/math14010028 - 22 Dec 2025
Abstract
As a task in the digital preservation of calligraphy stone inscriptions, an invaluable cultural heritage, style classification faces prominent challenges: insufficient feature representation of single-channel rubbings, and difficulties in effectively capturing the complex strokes and spatial layouts inherent to calligraphic works. To tackle these issues, an efficient deep learning model integrated with the dual-path attention mechanism of Bottleneck Attention Module (BAM) is proposed in this paper, which is designed to achieve accurate and efficient classification of calligraphy styles. With the lightweight network EfficientNetB2 as its backbone, this model innovatively integrates the BAM. It realizes the channel-spatial collaborative attention in calligraphy analysis, with the weight of stroke structure features increased to over 85%. Through the synergistic effect of channel attention and spatial attention, the model’s ability to extract stroke structure and spatial layout features from calligraphy images is significantly enhanced. The experimental results on the stratified sampling dataset show that the model achieves an accuracy of 98.44% on the test set, a confusion matrix recall rate of 94.80%, an F1-score of 0.9675, a precision of 0.8690, and a macro-averaged Area Under the Curve (AUC) value of 0.9694. To further validate the effectiveness of the BAM module and the necessity of its dual-path design, we conducted a systematic ablation experiment analysis. The experiment used EfficientNet-B2 as the baseline model and sequentially compared the contributions of different attention mechanisms. The experimental results show that the method proposed in this paper balances efficiency and performance, and holds practical significance in fields such as ancient book authentication and calligraphy research. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
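The abstract quotes accuracy, recall, precision, and F1 together. The sketch below shows how these quantities are usually derived from a multi-class confusion matrix, assuming rows are true classes and columns are predicted classes. The matrix is hypothetical and unrelated to the article's results:

```python
def per_class_metrics(confusion):
    """Return (precision, recall, f1) per class; confusion[i][j] = true i, predicted j."""
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, actually other
        fn = sum(confusion[k]) - tp                       # actually k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append((precision, recall, f1))
    return metrics


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical 3-style confusion matrix with 10 samples per true class.
confusion = [[8, 2, 0],
             [1, 9, 0],
             [0, 0, 10]]
metrics = per_class_metrics(confusion)
macro_f1 = sum(m[2] for m in metrics) / len(metrics)
print(round(accuracy(confusion), 3), round(macro_f1, 3))
```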
