Search Results (301)

Search Parameters:
Keywords = max pooling

27 pages, 4932 KB  
Article
Automated Facial Pain Assessment Using Dual-Attention CNN with Clinically Calibrated High-Reliability and Reproducibility Framework
by Albert Psatrick Sankoh, Ali Raza, Khadija Parwez, Wesam Shishah, Ayman Alharbi, Mubeen Javed and Muhammad Bilal
Biomimetics 2026, 11(1), 51; https://doi.org/10.3390/biomimetics11010051 - 8 Jan 2026
Abstract
Accurate and quantitative pain assessment remains a major challenge in clinical medicine, especially for patients unable to verbalize discomfort. Conventional methods based on self-reports or clinician observation are subjective and inconsistent. This study introduces a novel automated facial pain assessment framework built on a dual-attention convolutional neural network (CNN) that achieves clinically calibrated, high-reliability performance and interpretability. The architecture combines multi-head spatial attention to localize pain-relevant facial regions with an enhanced channel attention block employing triple-pooling (average, max, and standard deviation) to capture discriminative intensity features. Regularization through label smoothing (α = 0.1) and AdamW optimization ensures calibrated, stable convergence. Evaluated on a clinically annotated dataset using subject-wise stratified sampling, the proposed model achieved a test accuracy of 90.19% ± 0.94%, with an average 5-fold cross-validation accuracy of 83.60% ± 1.55%. The model further attained an F1-score of 0.90 and Cohen’s κ = 0.876, with macro- and micro-AUCs of 0.991 and 0.992, respectively. The evaluation covers five pain classes (No Pain, Mid Pain, Moderate Pain, Severe Pain, and Very Pain) using subject-wise splits comprising 5840 total images and 1160 test samples. Comparative benchmarking and ablation experiments confirm each module’s contribution, while Grad-CAM visualizations highlight physiologically relevant facial regions. The results demonstrate a robust, explainable, and reproducible framework suitable for integration into real-world automated pain-monitoring systems. Inspired by biological pain perception mechanisms and human facial muscle responses, the proposed framework aligns with biomimetic sensing principles by emulating how localized facial cues contribute to pain interpretation. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) in Biomedical Engineering: 2nd Edition)
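
The triple-pooling channel descriptor and the label-smoothing setting mentioned in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names are hypothetical, and the smoothing formula assumes the common convention of spreading α uniformly over all K classes.

```python
import math

def triple_pool(feature_map):
    """Return (average, max, std) for each channel of a C x H x W nested list."""
    descriptors = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        # Population variance over all spatial positions of the channel.
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        descriptors.append((mean, max(values), math.sqrt(variance)))
    return descriptors

def smooth_labels(true_class, num_classes, alpha=0.1):
    """Assumed convention: (1 - alpha) * one_hot + alpha / num_classes."""
    uniform = alpha / num_classes
    return [(1.0 - alpha) + uniform if k == true_class else uniform
            for k in range(num_classes)]

descriptors = triple_pool([[[1.0, 3.0], [5.0, 7.0]]])  # one 2 x 2 channel
targets = smooth_labels(true_class=2, num_classes=5)   # five pain classes
```

In a full attention block the three descriptors would feed a learned layer that produces per-channel gates; that layer is omitted here because the abstract does not describe it.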

33 pages, 40054 KB  
Article
MVDCNN: A Multi-View Deep Convolutional Network with Feature Fusion for Robust Sonar Image Target Recognition
by Yue Fan, Cheng Peng, Peng Zhang, Zhisheng Zhang, Guoping Zhang and Jinsong Tang
Remote Sens. 2026, 18(1), 76; https://doi.org/10.3390/rs18010076 - 25 Dec 2025
Viewed by 284
Abstract
Automatic Target Recognition (ATR) in single-view sonar imagery is severely hampered by geometric distortions, acoustic shadows, and incomplete target information due to occlusions and the slant-range imaging geometry, which frequently give rise to misclassification and hinder practical underwater detection applications. To address these critical limitations, this paper proposes a Multi-View Deep Convolutional Neural Network (MVDCNN) based on feature-level fusion for robust sonar image target recognition. The MVDCNN adopts a highly modular and extensible architecture consisting of four interconnected modules: an input reshaping module that adapts multi-view images to match the input format of pre-trained backbone networks via dimension merging and channel replication; a shared-weight feature extraction module that leverages Convolutional Neural Network (CNN) or Transformer backbones (e.g., ResNet, Swin Transformer, Vision Transformer) to extract discriminative features from each view, ensuring parameter efficiency and cross-view feature consistency; a feature fusion module that aggregates complementary features (e.g., target texture and shape) across views using max-pooling to retain the most salient characteristics and suppress noisy or occluded view interference; and a lightweight classification module that maps the fused feature representations to target categories. Additionally, to mitigate the data scarcity bottleneck in sonar ATR, we design a multi-view sample augmentation method based on sonar imaging geometric principles: this method systematically combines single-view samples of the same target via the combination formula and screens valid samples within a predefined azimuth range, constructing high-quality multi-view training datasets without relying on complex generative models or massive initial labeled data. 
Comprehensive evaluations on the Custom Side-Scan Sonar Image Dataset (CSSID) and Nankai Sonar Image Dataset (NKSID) demonstrate the superiority of our framework over single-view baselines. Specifically, the two-view MVDCNN achieves average classification accuracies of 94.72% (CSSID) and 97.24% (NKSID), with relative improvements of 7.93% and 5.05%, respectively; the three-view MVDCNN further boosts the average accuracies to 96.60% and 98.28%. Moreover, MVDCNN substantially elevates the precision and recall of small-sample categories (e.g., Fishing net and Small propeller in NKSID), effectively alleviating the class imbalance challenge. Mechanism validation via t-Distributed Stochastic Neighbor Embedding (t-SNE) feature visualization and prediction confidence distribution analysis confirms that MVDCNN yields more separable feature representations and more confident category predictions, with stronger intra-class compactness and inter-class discrimination in the feature space. The proposed MVDCNN framework provides a robust and interpretable solution for advancing sonar ATR and offers a technical paradigm for multi-view acoustic image understanding in complex underwater environments. Full article
(This article belongs to the Special Issue Underwater Remote Sensing: Status, New Challenges and Opportunities)
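
As a rough illustration of the two ideas in this abstract, the sketch below fuses per-view feature vectors with an element-wise max and enumerates multi-view samples from single-view ones. The function names are hypothetical, and the screening rule (azimuth spread no larger than `max_span`, with no wrap-around at 360 degrees) is an assumption, since the abstract does not give the exact criterion.

```python
from itertools import combinations

def fuse_views(view_features):
    """Element-wise max over per-view feature vectors of equal length."""
    return [max(column) for column in zip(*view_features)]

def multi_view_samples(azimuths, num_views, max_span):
    """Combine single-view samples (identified by azimuth, in degrees) into
    multi-view tuples and keep those whose azimuth spread is within max_span."""
    return [combo for combo in combinations(sorted(azimuths), num_views)
            if combo[-1] - combo[0] <= max_span]

fused = fuse_views([[0.2, 0.9, 0.1], [0.7, 0.3, 0.4]])
pairs = multi_view_samples([0, 30, 90], num_views=2, max_span=60)
```

Because the max is taken independently per feature, a view in which a feature is weak (for example, because of occlusion) cannot lower the fused value for that feature.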

22 pages, 957 KB  
Article
A Hybrid Deep Learning Model Based on Local and Global Features for Amazon Product Reviews: An Optimal ALBERT-Cascade CNN Approach
by Israa Mustafa Abbas, İsmail Atacak, Sinan Toklu, Necaattin Barışçı and İbrahim Alper Doğru
Appl. Sci. 2026, 16(1), 25; https://doi.org/10.3390/app16010025 - 19 Dec 2025
Viewed by 433
Abstract
Natural Language Processing (NLP) is a valuable technology and business topic as it helps turn data into useful information with the spread of digital information. Nevertheless, there are some difficulties in its use, including the language’s complexity and the data quality. To address these challenges, in this study, the researchers first performed a series of ablation experiments on 14 models derived from various variations in Deep Learning (DL) methods, including A Lite BERT (ALBERT) together with Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Max Pooling layer, and attention mechanism. Subsequently, they proposed an ALBERT-cascaded CNN hybrid model as an effective method to overcome the related challenges by evaluating the performance results obtained from these models. In the proposed model, a transformer architecture with parallel processing capability for both word and subword tokenization is used in addition to creating contextualized word embeddings. Local and global feature extraction was also performed using two 1-D CNN blocks before classification to improve the model performance. The model was optimized using an advanced hyperparameter optimization tool called OPTUNA. The findings of the experiment conducted with the proposed model were obtained based on Amazon Fashion 2023 data under 5-fold cross-validation conditions. The experimental results demonstrate that the proposed hybrid model exhibits good performance with average scores of 0.9308 (accuracy), 0.9296 (F1 score), 0.9412 (precision), 0.9182 (recall), and 0.9797 (AUC) in the validation dataset, and scores of 0.9313, 0.9305, 0.9414, 0.9199, and 0.9800 in the test dataset. In addition, comparisons of the model with models in studies using similar datasets support the experimental results and reveal that it can be used as a competitive approach for solving the problems encountered in the NLP field. Full article
(This article belongs to the Special Issue Applied Artificial Intelligence and Data Science)

20 pages, 1669 KB  
Article
Evaluation of Salinity Tolerance Potentials of Two Contrasting Soybean Genotypes Based on Physiological and Biochemical Responses
by Mawia Sobh, Tahoora Batool Zargar, Oqba Basal, Ayman Shehada AL-Ouda and Szilvia Veres
Plants 2026, 15(1), 10; https://doi.org/10.3390/plants15010010 - 19 Dec 2025
Viewed by 247
Abstract
Salinity stress is a major abiotic constraint limiting soybean (Glycine max L.) productivity in saline–alkali soils; however, the physiological and biochemical mechanisms underlying genotypic tolerance remain poorly understood. This study aimed to identify key traits that underpin salinity tolerance and can inform breeding and agronomic strategies to enhance soybean performance under saline conditions. Two contrasting soybean genotypes, YAKARTA and POCA, were exposed to 25, 50, 75, and 100 mM NaCl from the first to the fourth trifoliate stage (V1–V4) under controlled conditions for 30 days. YAKARTA maintained higher relative water content (75.51% vs. 66.97%), stomatal conductance (342 vs. 286 mmol H2O m−2 s−1), proline (6.15 vs. 4.36 µmol g−1 fresh weight), K+/Na+ ratio (61.8 vs. 32.2), and H2O2 (833.8 vs. 720.2 µmol g−1 fresh weight) compared with POCA, whereas POCA exhibited elevated solute leakage (87.1% vs. 79.21%), malondialdehyde (122 vs. 112 µg g−1), and ascorbic acid (334 vs. 293 µg g−1), indicating greater sensitivity. At 100 mM NaCl, relative water content, stomatal conductance, K+/Na+ ratio, and H2O2 declined by 44.5%, 81.9%, 99.8%, and 49.5%, respectively, while proline, solute leakage, malondialdehyde, and ascorbic acid increased by 56-, 1.27-, 11.6-, and 1.68-fold, respectively. The contrasting physiological and biochemical responses between these genotypes highlight key traits, such as relative water content, stomatal conductance, proline accumulation, malondialdehyde content, and the K+/Na+ ratio, as promising potential markers associated with salinity tolerance in soybean. These findings provide a foundational understanding that can guide future research to validate these markers across a wider genetic pool and under field conditions. Full article

22 pages, 8263 KB  
Article
Research on Propeller Defect Diagnosis of Rotor UAVs Based on MDI-STFFNet
by Beining Cui, Dezhi Jiang, Xinyu Wang, Lv Xiao, Peisen Tan, Yanxia Li and Zhaobin Tan
Symmetry 2026, 18(1), 3; https://doi.org/10.3390/sym18010003 - 19 Dec 2025
Viewed by 193
Abstract
To address flight safety risks from rotor defects in rotorcraft drones operating in complex low-altitude environments, this study proposes a high-precision diagnostic model based on the Multimodal Data Input and Spatio-Temporal Feature Fusion Network (MDI-STFFNet). The model uses a dual-modality coupling mechanism that integrates vibration and air pressure signals, forming a “single-path temporal, dual-path representational” framework. The one-dimensional vibration signal and the five-channel pressure array are mapped into a texture space via phase space reconstruction and color-coded recurrence plots, followed by extraction of transient spatial features using a pre-trained ResNet-18 model. Parallel LSTM networks capture long-term temporal dependencies, while a parameter-free 1D max-pooling layer compresses redundant pressure data, reducing LSTM parameter growth. The CSW-FM module enables adaptive fusion across modal scales via shared-weight mapping and learnable query vectors that dynamically assign spatiotemporal weights. Experiments on a self-built dataset with seven defect types show that the model achieves 99.01% accuracy, improving by 4.46% and 1.98% over single-modality vibration and pressure inputs. Ablation studies confirm the benefits of spatiotemporal fusion and soft weighting in accuracy and robustness. The model provides a scalable, lightweight solution for UAV power system fault diagnosis under high-noise and varying conditions. Full article
(This article belongs to the Section Engineering and Materials)
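
The step that maps a one-dimensional signal into an image via phase space reconstruction and a recurrence plot can be sketched as follows. This is a generic, binary (thresholded) recurrence plot under assumed parameters `dim`, `tau`, and `eps`; the paper's colour coding and its actual parameter values are not given in the abstract.

```python
import math

def delay_embed(signal, dim, tau):
    """Time-delay embedding: point i is (x[i], x[i + tau], ..., x[i + (dim-1)*tau])."""
    count = len(signal) - (dim - 1) * tau
    return [[signal[i + k * tau] for k in range(dim)] for i in range(count)]

def recurrence_plot(signal, dim=2, tau=1, eps=0.5):
    """R[i][j] = 1 when embedded points i and j lie within distance eps."""
    points = delay_embed(signal, dim, tau)
    return [[1 if math.dist(p, q) <= eps else 0 for q in points] for p in points]

embedded = delay_embed([1, 2, 3, 4], dim=2, tau=1)
plot = recurrence_plot([0, 1, 0, 1, 0], dim=2, tau=1, eps=0.1)
```

A periodic signal such as the alternating example produces a regular checkerboard pattern, which is the kind of texture a pre-trained image backbone can then pick up.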

27 pages, 6271 KB  
Article
A Method for Identifying Critical Control Points in Production Scheduling for Crankshaft Production Workshop by Integrating Weighted-ARM with Complex Networks
by Luwen Yuan, Ge Han and Peng Dong
Systems 2025, 13(12), 1122; https://doi.org/10.3390/systems13121122 - 15 Dec 2025
Viewed by 256
Abstract
In smart manufacturing environments, production scheduling is highly susceptible to multi-source disruptions. However, traditional methods often struggle to accurately characterize the complex interdependencies between control points and disruptions, along with their systemic propagation effects, thereby constraining the proactivity and precision of scheduling optimization. This paper proposes a novel data-driven approach that integrates Weighted Association Rule Mining (WARM) with a two-layer directed weighted complex network to achieve precise identification of critical control points in production scheduling. First, a production loss function integrating delay duration and resource idle cost is constructed, and the max-pooling method is applied to map control point weights, thereby quantifying their intrinsic importance. Subsequently, under the constraint that association rule antecedents are restricted to control points, an improved Apriori algorithm is employed to mine directed “Control Point-Disruption” association rules. These rules are then used to construct a two-layer directed weighted complex network. Furthermore, by combining weighted PageRank and edge betweenness centrality analyses, critical control points and high-risk propagation paths are identified from the dual dimensions of node influence and path propagation capability. A case study conducted in a crankshaft production workshop demonstrates that the proposed method effectively identifies low-frequency yet high-impact hidden nodes often overlooked by traditional rules. The resulting scheduling optimization scheme reduces the occurrence rate of high-impact disruptions by 53% and significantly improves key performance indicators such as on-time delivery rate and equipment utilization. This research provides new theoretical support and a technical pathway for manufacturing enterprises to suppress system disturbances through flexible interventions targeting high-betweenness paths. Full article
(This article belongs to the Special Issue Scheduling and Optimization in Production and Transportation Systems)
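
The node-influence part of the analysis relies on weighted PageRank. A minimal power-iteration sketch on a directed weighted graph is shown below; the node labels, edge weights, damping factor, and iteration count are illustrative assumptions, and the paper's two-layer network construction and edge-betweenness analysis are not reproduced.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100):
    """edges: {(source, target): positive weight}. Returns {node: score}."""
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = {node: 0.0 for node in nodes}
    for (source, _), weight in edges.items():
        out_weight[source] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (source, target), weight in edges.items():
            # Each node passes rank along its edges in proportion to edge weight.
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank

ranks = weighted_pagerank({("A", "B"): 3.0, ("A", "C"): 1.0,
                           ("B", "A"): 1.0, ("C", "A"): 1.0})
```

In this toy graph, B receives three quarters of the rank flowing out of A and therefore scores higher than C, which is how heavier rule weights translate into higher node influence.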

17 pages, 10887 KB  
Article
The Effect of Bulk Nucleation Parameters on the Solidification Structure of Large Slabs During Electroslag Remelting and Optimization of Production Process Parameters
by Qi Li, Yu Du, Zhenquan Jing and Yanhui Sun
Crystals 2025, 15(12), 1052; https://doi.org/10.3390/cryst15121052 - 11 Dec 2025
Viewed by 301
Abstract
In this paper, the moving heat transfer boundary method is adopted to establish a three-dimensional solidification microstructure model that couples the cellular automata (CA) method with the finite element (FE) method, simulate the ingot growth process, and optimize the nucleation parameters. The study also explores the influence of process parameters such as melting rate, molten pool temperature, and cooling intensity on the solidification structure of ingots, providing a theoretical basis for process optimization. The results show that the maximum nucleation undercooling and the maximum nucleation density have significant effects on different crystal regions of the ingot solidification structure, while the nucleation variance has no obvious effect on the solidification structure. When the maximum bulk nucleation undercooling ΔT_{v,max} = 4 K, the bulk nucleation standard deviation ΔT_{v,σ} = 5 K, and the maximum bulk nucleation density n_{v,max} = 3 × 10^7, the simulated solidification structure agrees well with the experimental results. As the melting rate increases, the number of grains in the ingot structure gradually increases, while the average grain area gradually decreases. The melting temperature and the intensity of side wall cooling have no obvious influence on the solidification structure of the ingot. Full article
(This article belongs to the Special Issue Crystallization of High-Performance Metallic Materials (3rd Edition))

25 pages, 4430 KB  
Article
NOVA: A Novel Multi-Scale Adaptive Vision Architecture for Accurate and Efficient Automated Diagnosis of Malaria Using Microscopic Blood Smear Images
by Md Nayeem Hosen, Md Ariful Islam Mozumder, Proloy Kumar Mondal and Hee Cheol Kim
Electronics 2025, 14(24), 4861; https://doi.org/10.3390/electronics14244861 - 10 Dec 2025
Viewed by 271
Abstract
Background: Malaria continues to be a significant global health concern, particularly in tropical and subtropical areas. Timely and accurate diagnosis is crucial in minimizing the disease’s mortality. The standard method, microscopic diagnosis, which represents the gold standard, is heavily reliant on skilled interpretation, labor-intensive, and prone to human error. Methods: To address these challenges, we propose the NOVA (Novel Multi-Scale Adaptive Vision Architecture) for the diagnosis of malaria. NOVA is based on an innovative dynamic channel attention and Learnable Temperature Spatial Pyramid Attention to achieve more powerful feature representation and better classification performance. In addition, adaptive feature refinement and enhanced transformer blocks are used to obtain multi-scale feature extraction and contextual reasoning. Furthermore, a multi-strategy pooling mechanism that fuses average, max, and attention-based aggregation is developed to enhance the model’s discriminative capability. Results: We conduct experiments on a publicly accessible dataset of 15,031 microscopic thin blood smear images to validate the effectiveness of the proposed approach. The model is assessed and compared on a benchmark malaria microscopy dataset, achieving an accuracy of 97.00%, a precision of 96.00%, and an F1-score of 97.00%, outperforming other existing models. Conclusions: The experimental results demonstrate the feasibility of the proposed approach as a potential research prototype for the automated diagnosis of malaria. Before clinical deployment, further multi-site clinical evaluation on a large patient cohort is required for validation. Full article
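
A multi-strategy pooling step that combines average, max, and attention-based aggregation might look like the sketch below. It is an illustration only: the function names are hypothetical, the attention scores would normally come from a learned layer, and returning the three values side by side is an assumed way of fusing them.

```python
import math

def attention_pool(values, scores):
    """Softmax-weighted sum of values; scores stand in for learned logits."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return sum(v * e / total for v, e in zip(values, exps))

def multi_strategy_pool(values, scores):
    """Return [average, max, attention-weighted] summaries of one feature."""
    return [sum(values) / len(values), max(values), attention_pool(values, scores)]

pooled = multi_strategy_pool([1.0, 2.0, 3.0], scores=[0.0, 0.0, 0.0])
```

With equal scores the attention term reduces to the plain average; as one score grows, it moves toward the value at that position, so it can interpolate between average-like and max-like behaviour.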

32 pages, 1317 KB  
Article
ECA110-Pooling: A Comparative Analysis of Pooling Strategies in Convolutional Neural Networks
by Doru Constantin and Costel Bălcău
Big Data Cogn. Comput. 2025, 9(12), 306; https://doi.org/10.3390/bdcc9120306 - 2 Dec 2025
Viewed by 407
Abstract
Pooling strategies are fundamental to convolutional neural networks, shaping the trade-off between accuracy, robustness to spatial variations, and computational efficiency in modern visual recognition systems. In this paper, we present and validate ECA110-Pooling, a novel rule-based pooling operator inspired by elementary cellular automata. We conduct a systematic comparative study, benchmarking ECA110-Pooling against conventional pooling methods (MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling) as well as state-of-the-art (SOTA) architectures. Experiments on three benchmark datasets—ImageNet (subset), CIFAR-10, and Fashion-MNIST—across training horizons ranging from 20 to 50,000 epochs show that ECA110-Pooling consistently achieves higher Top-1 accuracy, lower error rates, and stronger F1-scores than traditional pooling operators, while maintaining computational efficiency comparable to MaxPooling. Moreover, when compared with SOTA models, ECA110-Pooling delivers competitive accuracy with substantially fewer parameters and reduced training time. These results establish ECA110-Pooling as a principled and validated approach to image classification, bridging the gap between fixed pooling schemes and complex deep architectures. Its interpretable, rule-based design highlights both theoretical significance and practical applicability in contexts that demand a balance of accuracy, efficiency, and scalability. Full article
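
For orientation, the conventional operators used as baselines and the automaton rule that gives ECA110-Pooling its name can be sketched as below. The abstract does not say how Rule 110 is turned into a pooling operator, so `rule110_step` shows only the standard elementary cellular automaton update (with cells outside the row assumed to be 0), not the authors' operator.

```python
from statistics import median

def pool2d(image, reducer, size=2):
    """Apply reducer to each non-overlapping size x size window of a 2-D list."""
    rows, cols = len(image), len(image[0])
    return [[reducer([image[r + dr][c + dc]
                      for dr in range(size) for dc in range(size)])
             for c in range(0, cols - size + 1, size)]
            for r in range(0, rows - size + 1, size)]

def rule110_step(cells):
    """One Rule 110 update: the (left, centre, right) bits index into 110."""
    padded = [0] + list(cells) + [0]
    return [(110 >> ((padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1])) & 1
            for i in range(1, len(padded) - 1)]

image = [[1, 2, 5, 6], [3, 4, 7, 8]]
max_pooled = pool2d(image, max)
min_pooled = pool2d(image, min)
avg_pooled = pool2d(image, lambda window: sum(window) / len(window))
median_pooled = pool2d(image, median)
next_row = rule110_step([0, 0, 0, 1, 0])
```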

21 pages, 1194 KB  
Article
Retentive-HAR: Human Activity Recognition from Wearable Sensors with Enhanced Temporal and Inter-Feature Dependency Retention
by Ayokunle Olalekan Ige, Daniel Ayo Oladele and Malusi Sibiya
Appl. Sci. 2025, 15(23), 12661; https://doi.org/10.3390/app152312661 - 29 Nov 2025
Viewed by 600
Abstract
Human Activity Recognition (HAR) using wearable sensor data plays a vital role in health monitoring, context-aware computing, and smart environments. Many existing deep learning models for HAR incorporate MaxPooling layers after convolutional operations to reduce dimensionality and computational load. While this approach is effective in image-based tasks, it is less suitable for the sensor signals used in HAR. MaxPooling introduces a form of temporal downsampling that can discard subtle yet crucial temporal information. Also, traditional CNNs often struggle to capture long-range dependencies within each window due to their limited receptive fields, and they lack effective mechanisms to aggregate information across multiple windows without stacking multiple layers, which increases computational cost. In this study, we introduce Retentive-HAR, a model designed to enhance feature learning by capturing dependencies both within and across sliding windows. The proposed model intentionally omits the MaxPooling layer, thereby preserving the full temporal resolution throughout the network. The model begins with parallel dilated convolutions, which capture long-range dependencies within each window. Feature outputs from these convolutional layers are then concatenated along the feature dimension and transposed, allowing the Retentive Module to analyze dependencies across both window and feature dimensions. Additional 1D-CNN layers are then applied to the transposed feature maps to capture complex interactions across concatenated window representations before including Bi-LSTM layers. Experiments on PAMAP2, HAPT, and WISDM datasets achieve a performance of 96.40%, 94.70%, and 96.16%, respectively, which outperforms the existing methods with minimal computational cost. Full article
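
The contrast the abstract draws between max pooling and dilated convolution can be seen in a small sketch: pooling halves the number of time steps, while a zero-padded dilated convolution keeps every time step and still looks several samples apart. The kernel, dilation, and padding scheme are illustrative assumptions, not the model's actual configuration.

```python
def max_pool_1d(signal, size=2):
    """Non-overlapping 1-D max pooling; output length is len(signal) // size."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]

def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded ('same') 1-D dilated cross-correlation; length is preserved."""
    span = (len(kernel) - 1) * dilation
    left = span // 2
    padded = [0.0] * left + list(signal) + [0.0] * (span - left)
    return [sum(kernel[j] * padded[i + j * dilation] for j in range(len(kernel)))
            for i in range(len(signal))]

window = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pooled = max_pool_1d(window)
dilated = dilated_conv1d(window, kernel=[1.0, 0.0, 1.0], dilation=2)
```

Here `pooled` has three entries for six input samples, whereas `dilated` still has six, each combining samples four steps apart.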

16 pages, 1433 KB  
Article
Research on Improved Near-Infrared Fish Density Classification Method Based on ResNet18
by Xiaohong Peng, Yujie Wang and Ying Zhang
Fishes 2025, 10(12), 602; https://doi.org/10.3390/fishes10120602 - 24 Nov 2025
Viewed by 376
Abstract
Addressing the technological requirement for real-time monitoring of fish density in dim aquaculture environments, this study proposes a near-infrared (NIR) image classification method using a modified ResNet18 architecture. Initially, an NIR-Fish dataset comprising 736 high-quality annotated images (256 × 256 resolution) spanning three density scenarios (low, medium, and high density) was constructed. Contrast-Limited Adaptive Histogram Equalization (CLAHE) preprocessing was implemented with an 8 × 8 tiling strategy and clip limit = 4.0, significantly enhancing the discernibility of faint boundary features. A dual-channel attention module (DCAM) was embedded into the ResNet18 backbone, featuring a parallel architecture integrating Global Average Pooling (GAP) and Global Max Pooling (GMP). This design synergistically optimized local salient feature enhancement and global statistical feature fusion through parameter-shared fully connected layers (reduction ratio of 16:1). The experiments show that the classification accuracy of the proposed method on the independent test set is 80.57%, which is 4.34 percentage points higher than that of the original ResNet18. F1 scores for the three density levels were 0.8308 (low), 0.7674 (medium), and 0.8294 (high), respectively. Ablation studies confirmed the dual-channel design’s significant performance contribution, while the parameter-sharing mechanism effectively mitigated overfitting risks. By leveraging feature complementarity and lightweight design, this work overcomes the classification bottleneck for NIR images under low signal-to-noise conditions, providing a highly robust technical solution for intelligent aquaculture management. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence in Aquaculture)
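
A channel attention block with parallel GAP and GMP branches and parameter-shared fully connected layers can be sketched as follows. The sketch assumes the two branch outputs are summed before a sigmoid, as in CBAM-style channel attention; the abstract does not confirm how the branches are merged, and the toy weights use a 2:1 reduction instead of the 16:1 ratio reported.

```python
import math

def shared_mlp(vector, w1, w2):
    """Two fully connected layers with a ReLU in between (no biases)."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, vector))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_gates(feature_map, w1, w2):
    """Per-channel gates from GAP and GMP descriptors passed through one shared MLP."""
    flat = [[v for row in channel for v in row] for channel in feature_map]
    gap = [sum(values) / len(values) for values in flat]
    gmp = [max(values) for values in flat]
    summed = [a + m for a, m in zip(shared_mlp(gap, w1, w2), shared_mlp(gmp, w1, w2))]
    return [1.0 / (1.0 + math.exp(-s)) for s in summed]

features = [[[0.0, 2.0], [2.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]  # 2 channels, 2 x 2
w1 = [[1.0, 1.0]]    # hypothetical weights: 2 channels -> 1 hidden unit
w2 = [[1.0], [1.0]]  # hypothetical weights: 1 hidden unit -> 2 channels
gates = channel_gates(features, w1, w2)
```

Sharing `w1` and `w2` between the two branches is what keeps the parameter count the same as a single-branch block.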

17 pages, 2260 KB  
Article
CONTI-CrackNet: A Continuity-Aware State-Space Network for Crack Segmentation
by Wenjie Song, Min Zhao and Xunqian Xu
Sensors 2025, 25(22), 6865; https://doi.org/10.3390/s25226865 - 10 Nov 2025
Viewed by 764
Abstract
Crack segmentation in cluttered scenes with slender and irregular patterns remains difficult, and practical systems must balance accuracy and efficiency. We present CONTI-CrackNet, a lightweight visual state-space network that integrates a Multi-Directional Selective Scanning Strategy (MD3S). MD3S performs bidirectional scanning along the horizontal, vertical, and diagonal directions and fuses the complementary paths with a Bidirectional Gated Fusion (BiGF) module to strengthen global continuity. To preserve fine details while completing global texture, we propose a Dual-Branch Pixel-Level Global–Local Fusion (DBPGL) module that incorporates a Pixel-Adaptive Pooling (PAP) mechanism to dynamically weight max-pooled and average-pooled responses. Evaluated on two public benchmarks, the proposed method achieves an F1 score of 0.8332 and a mean Intersection over Union (mIoU) of 0.8436 on the TUT dataset, and an mIoU of 0.7760 on the CRACK500 dataset, surpassing competitive Convolutional Neural Network (CNN), Transformer, and Mamba baselines. With 512 × 512 input, the model requires 24.22 G floating point operations (GFLOPs) and 6.01 M parameters (Params), and operates at 42 frames per second (FPS) on an RTX 3090 GPU, delivering a favorable accuracy–efficiency balance. These results show that CONTI-CrackNet improves continuity and edge recovery for thin cracks while keeping parameter count and computational cost low. Full article
(This article belongs to the Section Sensor Networks)
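
One simple way to weight max-pooled against average-pooled responses per pixel is a convex blend, sketched below. This is an assumption made for illustration: the abstract states only that PAP weights the two responses dynamically, and in the model the weights would be predicted from the features instead of being supplied by hand.

```python
def pixel_adaptive_pool(max_pooled, avg_pooled, weights):
    """Per-pixel blend w * max + (1 - w) * avg, with each w expected in [0, 1]."""
    return [[w * m + (1.0 - w) * a for m, a, w in zip(max_row, avg_row, w_row)]
            for max_row, avg_row, w_row in zip(max_pooled, avg_pooled, weights)]

blended = pixel_adaptive_pool(max_pooled=[[4.0, 8.0]],
                              avg_pooled=[[2.0, 6.0]],
                              weights=[[1.0, 0.5]])
```

A weight near 1 keeps the sharp max response, which suits thin crack edges, while a weight near 0 falls back to the smoother average.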

18 pages, 2241 KB  
Article
FADP-GT: A Frequency-Adaptive and Dual-Pooling Graph Transformer Model for Device Placement in Model Parallelism
by Hao Shu, Wangli Hao, Meng Han and Fuzhong Li
Electronics 2025, 14(21), 4333; https://doi.org/10.3390/electronics14214333 - 5 Nov 2025
Viewed by 346
Abstract
The increasing scale and complexity of graph-structured data necessitate efficient parallel training strategies for graph neural networks (GNNs). The effectiveness of these strategies hinges on the quality of graph feature representation. To this end, we propose a Frequency-Adaptive Dual-Pooling Graph Transformer (FADP-GT) model to enhance feature learning for computational graphs. The model incorporates two modules: a Frequency-Adaptive Graph Attention (FAGA) module and a Dual-Pooling Feature Refinement (DPFR) module. The FAGA module adaptively filters frequency components in the spectral domain to dynamically adjust the contribution of high- and low-frequency information in attention computation, thereby enhancing the model’s ability to capture structural information and mitigating the over-smoothing problem in multi-layer network propagation. The DPFR module refines graph features through two pooling operations along the node dimension, Global Average Pooling (GAP) and Global Max Pooling (GMP), which capture both global feature distributions and salient local patterns to enrich multi-scale representations. By improving graph feature representation, FADP-GT indirectly supports the development of efficient device placement strategies, as enhanced feature extraction enables more accurate modeling of node dependencies in computational graphs. The experimental results demonstrate that FADP-GT outperforms existing methods, reducing the average computation time for device placement by 96.52% and the execution time by 9.06% to 26.48%.
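To make the dual-pooling idea concrete, here is a minimal sketch of Global Average Pooling and Global Max Pooling taken along the node dimension of a small feature matrix. Concatenating the two pooled vectors is an assumption for illustration, since the abstract does not say how DPFR combines them, and `dual_pool` is a hypothetical name.

```python
def dual_pool(node_features):
    """Pool a (num_nodes x num_features) matrix along the node dimension.

    Returns the GAP vector followed by the GMP vector. The concatenation
    is an illustrative assumption, not the published DPFR design.
    """
    num_nodes = len(node_features)
    columns = list(zip(*node_features))  # one tuple per feature channel
    gap = [sum(col) / num_nodes for col in columns]  # global average pooling
    gmp = [max(col) for col in columns]              # global max pooling
    return gap + gmp


# Three nodes with two features each.
features = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]
summary = dual_pool(features)  # [2.0, 2.0, 3.0, 4.0]
```

The average half reflects the overall feature distribution across nodes, while the max half keeps the strongest per-channel response, which matches the global versus salient distinction drawn in the abstract.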

17 pages, 448 KB  
Article
Leveraging Max-Pooling Aggregation and Enhanced Entity Embeddings for Few-Shot Knowledge Graph Completion
by Meng Zhang and Wonjun Chung
Mathematics 2025, 13(21), 3486; https://doi.org/10.3390/math13213486 - 1 Nov 2025
Viewed by 410
Abstract
Few-shot knowledge graph (KG) completion is challenged by the dynamic and long-tail nature of real-world KGs, where only a handful of relation-specific triples are available for each new relation. Existing methods often over-rely on neighbor information and use sequential LSTM aggregators that impose an inappropriate order bias on inherently unordered triples. To address these limitations, we propose a lightweight yet principled framework that (1) enhances entity representations by explicitly integrating intrinsic (self) features with attention-aggregated neighbor context, and (2) introduces a permutation-invariant max-pooling aggregator to replace the LSTM-based reference set encoder. This design faithfully respects the set-based nature of triples while preserving critical entity semantics. Extensive experiments on the standard few-shot KG completion benchmarks NELL-One and Wiki-One demonstrate that our method consistently outperforms strong baselines, including non-LSTM models such as MetaR, and delivers robust gains across multiple evaluation metrics. These results show that carefully tailored, task-aligned refinements can achieve significant improvements without increasing model complexity.
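The order-bias argument can be checked on a toy reference set. The sketch below compares element-wise max pooling with a simple running-average encoder across every ordering of three embeddings. The running average is only an order-sensitive stand-in chosen for this example, not an LSTM, and all names and values are hypothetical.

```python
from itertools import permutations


def max_pool_aggregate(reference_embeddings):
    """Element-wise max over a set of embeddings (order does not matter)."""
    return [max(dim) for dim in zip(*reference_embeddings)]


def recurrent_aggregate(reference_embeddings, decay=0.5):
    """Toy order-sensitive stand-in for a sequential encoder (not an LSTM)."""
    state = [0.0] * len(reference_embeddings[0])
    for emb in reference_embeddings:
        state = [decay * s + (1 - decay) * e for s, e in zip(state, emb)]
    return state


# Three made-up 2-dimensional embeddings of reference triples.
refs = [[0.2, -1.0], [0.9, 0.3], [-0.5, 0.8]]

# Collect the distinct outputs produced over all 6 orderings.
pooled = {tuple(max_pool_aggregate(list(p))) for p in permutations(refs)}
recurrent = {tuple(recurrent_aggregate(list(p))) for p in permutations(refs)}
```

Here `pooled` contains a single vector regardless of ordering, while `recurrent` contains several, which is the property the abstract relies on when it replaces the sequential reference-set encoder.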

22 pages, 6177 KB  
Article
Deep Q-Learning for Gastrointestinal Disease Detection and Classification
by Aini Saba, Javaria Amin and Muhammad Umair Ali
Bioengineering 2025, 12(11), 1184; https://doi.org/10.3390/bioengineering12111184 - 30 Oct 2025
Viewed by 1010
Abstract
Stomach ulcers, a common type of gastrointestinal (GI) disease, pose serious health risks if not diagnosed and treated at an early stage. This research therefore proposes a method based on two deep learning models, one for classification and one for segmentation. The classification model is based on a convolutional neural network (CNN) and incorporates Q-learning to improve learning stability and decision accuracy through reinforcement-based feedback. In this model, input images are passed through a custom seven-layer CNN, comprising convolutional, ReLU, max pooling, flattening, and fully connected layers, for feature extraction. The agent then selects an action (class) for each input and receives a reward of +1 for a correct prediction and −1 for an incorrect one. The Q-table stores a mapping between image features (states) and class predictions (actions) and is updated at each step based on the reward using the Q-learning update rule. This process runs over 1000 episodes and uses the Q-learning parameters α = 0.1, γ = 0.6, and ϵ = 0.1 to help the agent learn an optimal classification strategy. After training, the agent is evaluated on the test data using only its learned policy. The classified ulcer images are passed to the proposed attention-based U-Net model to segment the lesion regions. The model contains an encoder, a decoder, and attention layers. The encoder block extracts features through pooling and convolution layers, while the decoder block up-samples the features and reconstructs the segmentation map. The attention block highlights the important features obtained from the encoder block before passing them to the decoder block, helping the model focus on relevant spatial information. The model is trained with a batch size of 8, the Adam optimizer, and 50 epochs. The performance of the models is evaluated on Kvasir, Nerthus, CVC-ClinicDB, and a private POF dataset.
The classification framework achieves 99.08% accuracy on Kvasir and 100% accuracy on Nerthus. The segmentation framework yields 98.09% accuracy on Kvasir, 99.77% accuracy on Nerthus, 98.49% accuracy on CVC-ClinicDB, and 99.13% accuracy on the private dataset. The achieved results are superior to those of previous methods published in this domain.
(This article belongs to the Section Biosignal Processing)
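The Q-learning step described in the abstract above can be sketched with the standard tabular update rule, Q(s, a) ← Q(s, a) + α [r + γ max Q(s′, ·) − Q(s, a)], using the reported α = 0.1, γ = 0.6, and ϵ = 0.1. How the authors turn CNN features into discrete states is not stated, so the string state keys and the function names below are assumptions for illustration.

```python
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.6, 0.1  # values reported in the abstract


def select_action(q_table, state, num_classes, rng=random):
    """Epsilon-greedy choice of a class label for the given state."""
    if rng.random() < EPSILON:
        return rng.randrange(num_classes)  # explore: random class
    values = q_table.setdefault(state, [0.0] * num_classes)
    return max(range(num_classes), key=values.__getitem__)  # exploit


def q_update(q_table, state, action, reward, next_state, num_classes):
    """Apply one standard Q-learning update and return the new Q(s, a)."""
    current = q_table.setdefault(state, [0.0] * num_classes)
    next_best = max(q_table.setdefault(next_state, [0.0] * num_classes))
    td_target = reward + GAMMA * next_best
    current[action] += ALPHA * (td_target - current[action])
    return current[action]


def reward_for(predicted, label):
    """+1 for a correct prediction, -1 otherwise, as in the abstract."""
    return 1.0 if predicted == label else -1.0
```

Starting from an empty table, a correct prediction moves Q(s, a) from 0 to 0.1 × (1 + 0.6 × 0 − 0) = 0.1, and a second identical update moves it to 0.19.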