Search Results (40)

Search Parameters:
Keywords = asymmetric U-Net

15 pages, 2225 KB  
Article
An Automatic Pixel-Level Segmentation Method for Coal-Crack CT Images Based on U2-Net
by Yimin Zhang, Chengyi Wu, Jinxia Yu, Guoqiang Wang and Yingying Li
Electronics 2025, 14(21), 4179; https://doi.org/10.3390/electronics14214179 - 26 Oct 2025
Viewed by 266
Abstract
Automatically segmenting coal cracks in CT images is crucial for 3D reconstruction and the physical properties of mines. This paper proposes an automatic pixel-level deep learning method called Attention Double U2-Net to enhance the segmentation accuracy of coal cracks in CT images. Due to the lack of public datasets of coal CT images, a pixel-level labeled coal crack dataset is first established through industrial CT scanning experiments and post-processing. Then, the proposed method utilizes a Double Residual U-Block structure (DRSU) based on U2-Net to improve feature extraction and fusion capabilities. Moreover, an attention mechanism module is proposed, which is called Atrous Asymmetric Fusion Non-Local Block (AAFNB). The AAFNB module is based on the idea of Asymmetric Non-Local, which enables the collection of global information to enhance the segmentation results. Compared with previous state-of-the-art models, the proposed Attention Double U2-Net method exhibits better performance over the coal crack CT image dataset in various evaluation metrics such as PA, mPA, MIoU, IoU, Precision, Recall, and Dice scores. The crack segmentation results obtained from this method are more accurate and efficient, which provides experimental data and theoretical support to the field of CBM exploration and damage of coal. Full article
(This article belongs to the Section Artificial Intelligence)

20 pages, 5150 KB  
Article
VSM-UNet: A Visual State Space Reconstruction Network for Anomaly Detection of Catenary Support Components
by Shuai Xu, Jiyou Fei, Haonan Yang, Xing Zhao, Xiaodong Liu and Hua Li
Sensors 2025, 25(19), 5967; https://doi.org/10.3390/s25195967 - 25 Sep 2025
Viewed by 491
Abstract
Anomaly detection of catenary support components (CSCs) is an important part of railway condition monitoring systems. However, because the abnormal features of CSC loosening are not obvious, and current CNN models and visual Transformer models suffer from limited long-range modeling capabilities and quadratic computational complexity, existing deep learning anomaly detection methods struggle to perform effectively. The state space model (SSM) represented by Mamba is not only good at long-range modeling, but also maintains linear computational complexity. In this paper, using the state space model (SSM), we propose a new visual state space reconstruction network (VSM-UNet) for the detection of CSC loosening anomalies. First, based on the structure of UNet, a visual state space block (VSS block) is introduced to capture extensive contextual information and multi-scale features, and an asymmetric encoder–decoder structure is constructed through patch merging operations and patch expanding operations. Secondly, the CBAM attention mechanism is introduced between the encoder and decoder to enhance the model’s ability to focus on key abnormal features. Finally, a stable abnormality score calculation module is designed using MLP to evaluate the degree of abnormality of components. The experiment shows that the VSM-UNet model, learning strategy and anomaly score calculation method proposed in this article are effective and reasonable, and have certain advantages. Specifically, the proposed method framework can achieve an AUROC of 0.986 and an FPS of 26.56 in the anomaly detection task of looseness on positioning clamp nuts, U-shaped hoop nuts, and cotton pins. Therefore, the method proposed in this article can be effectively applied to the detection of CSC abnormalities. Full article
(This article belongs to the Special Issue AI-Enabled Smart Sensors for Industry Monitoring and Fault Diagnosis)

18 pages, 5562 KB  
Article
Symmetry-Aware Face Illumination Enhancement via Pixel-Adaptive Curve Mapping
by Jieqiong Yang, Yumeng Lu, Jiaqi Liu and Jizheng Yi
Symmetry 2025, 17(9), 1560; https://doi.org/10.3390/sym17091560 - 18 Sep 2025
Viewed by 470
Abstract
Face recognition under uneven illumination conditions presents significant challenges, as asymmetric shadows often obscure facial features while overexposed regions lose critical texture details. To address this problem, a novel symmetry-aware illumination enhancement method named face shadow detection network (FSDN) is proposed, which features a nested U-Net architecture combined with Gaussian convolution. This method enables precise illumination intensity maps for the given face images through higher-order quadratic enhancement curves, effectively extending the low-light dynamic range while preserving essential facial symmetry. Comprehensive evaluations on the Extended Yale B and CMU-PIE datasets demonstrate the superiority of the proposed FSDN over conventional approaches, achieving structural similarity (SSIM) indices of 0.48 and 0.59, respectively, along with remarkably low face recognition error rates of 1.3% and 0.2%, respectively. The key innovation of this work lies in its simultaneous optimization of illumination uniformity and facial symmetry preservation, thereby significantly improving face analysis reliability under challenging lighting conditions. Full article
(This article belongs to the Section Computer)

19 pages, 2838 KB  
Article
Cascaded Spatial and Depth Attention UNet for Hippocampus Segmentation
by Zi-Zheng Wei, Bich-Thuy Vu, Maisam Abbas and Ran-Zan Wang
J. Imaging 2025, 11(9), 311; https://doi.org/10.3390/jimaging11090311 - 11 Sep 2025
Viewed by 558
Abstract
This study introduces a novel enhancement to the UNet architecture, termed Cascaded Spatial and Depth Attention U-Net (CSDA-UNet), tailored specifically for precise hippocampus segmentation in T1-weighted brain MRI scans. The proposed architecture integrates two key attention mechanisms: a Spatial Attention (SA) module, which refines spatial feature representations by producing attention maps from the deepest convolutional layer and modulating the matching object features; and an Inter-Slice Attention (ISA) module, which enhances volumetric uniformity by integrating related information from adjacent slices, thereby reinforcing the model’s capacity to capture inter-slice dependencies. The CSDA-UNet is assessed using hippocampal segmentation data derived from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and Decathlon, two benchmark studies widely employed in neuroimaging research. The proposed model outperforms state-of-the-art methods, achieving a Dice coefficient of 0.9512 and an IoU score of 0.9345 on ADNI and Dice scores of 0.9907/0.8963 (train/validation) and an IoU score of 0.9816/0.8132 (train/validation) on the Decathlon dataset across multiple quantitative metrics. These improvements underscore the efficacy of the proposed dual-attention framework in accurately explaining small, asymmetrical structures such as the hippocampus, while maintaining computational efficiency suitable for clinical deployment. Full article
(This article belongs to the Section Medical Imaging)

22 pages, 1243 KB  
Article
ProCo-NET: Progressive Strip Convolution and Frequency-Optimized Framework for Scale-Gradient-Aware Semantic Segmentation in Off-Road Scenes
by Zihang Liu, Donglin Jing and Chenxiang Ji
Symmetry 2025, 17(9), 1428; https://doi.org/10.3390/sym17091428 - 2 Sep 2025
Viewed by 572
Abstract
In off-road scenes, segmentation targets exhibit significant scale progression due to perspective depth effects from oblique viewing angles, meaning that the size of the same target undergoes continuous, boundary-less progressive changes along a specific direction. This asymmetric variation disrupts the geometric symmetry of targets, causing traditional segmentation networks to face three key challenges: (1) inefficient capture of continuous-scale features, where pyramid structures and multi-scale kernels struggle to balance computational efficiency with sufficient coverage of progressive scales; (2) degraded intra-class feature consistency, where local scale differences within targets induce semantic ambiguity; and (3) loss of high-frequency boundary information, where feature sampling operations exacerbate the blurring of progressive boundaries. To address these issues, this paper proposes the ProCo-NET framework for systematic optimization. Firstly, a Progressive Strip Convolution Group (PSCG) is designed to construct multi-level receptive field expansion through orthogonally oriented strip convolution cascading (employing symmetric processing in horizontal/vertical directions) integrated with self-attention mechanisms, enhancing perception capability for asymmetric continuous-scale variations. Secondly, an Offset-Frequency Cooperative Module (OFCM) is developed wherein a learnable offset generator dynamically adjusts sampling point distributions to enhance intra-class consistency, while a dual-channel frequency domain filter performs adaptive high-pass filtering to sharpen target boundaries. These components synergistically solve feature consistency degradation and boundary ambiguity under asymmetric changes.
Experiments show that this framework significantly improves the segmentation accuracy and boundary clarity of multi-scale targets in off-road scene segmentation tasks: it achieves 71.22% MIoU on the standard RUGD dataset (0.84% higher than the existing optimal method) and 83.05% MIoU on the Freiburg_Forest dataset. Among them, the segmentation accuracy of key obstacle categories is significantly improved to 52.04% (2.7% higher than the sub-optimal model). This framework effectively compensates for the impact of asymmetric deformation through a symmetric computing mechanism. Full article
(This article belongs to the Section Computer)

25 pages, 11680 KB  
Article
ETAFHrNet: A Transformer-Based Multi-Scale Network for Asymmetric Pavement Crack Segmentation
by Chao Tan, Jiaqi Liu, Zhedong Zhao, Rufei Liu, Peng Tan, Aishu Yao, Shoudao Pan and Jingyi Dong
Appl. Sci. 2025, 15(11), 6183; https://doi.org/10.3390/app15116183 - 30 May 2025
Cited by 1 | Viewed by 1129
Abstract
Accurate segmentation of pavement cracks from high-resolution remote sensing imagery plays a crucial role in automated road condition assessment and infrastructure maintenance. However, crack structures often exhibit asymmetry, irregular morphology, and multi-scale variations, posing significant challenges to conventional CNN-based methods in real-world environments. Specifically, the proposed ETAFHrNet focuses on two predominant pavement-distress morphologies—linear cracks (transverse and longitudinal) and alligator cracks—and has been empirically validated on their intersections and branching patterns over both asphalt and concrete road surfaces. In this work, we present ETAFHrNet, a novel attention-guided segmentation network designed to address the limitations of traditional architectures in detecting fine-grained and asymmetric patterns. ETAFHrNet integrates Transformer-based global attention and multi-scale hybrid feature fusion, enhancing both contextual perception and detail sensitivity. The network introduces two key modules: the Efficient Hybrid Attention Transformer (EHAT), which captures long-range dependencies, and the Cross-Scale Hybrid Attention Module (CSHAM), which adaptively fuses features across spatial resolutions. To support model training and benchmarking, we also propose QD-Crack, a high-resolution, pixel-level annotated dataset collected from real-world road inspection scenarios. Experimental results show that ETAFHrNet significantly outperforms existing methods—including U-Net, DeepLabv3+, and HRNet—in both segmentation accuracy and generalization ability. These findings demonstrate the effectiveness of interpretable, multi-scale attention architectures in complex object detection and image classification tasks, making our approach relevant for broader applications, such as autonomous driving, remote sensing, and smart infrastructure systems. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)

29 pages, 1184 KB  
Review
AI-Driven Technology in Heart Failure Detection and Diagnosis: A Review of the Advancement in Personalized Healthcare
by Ikteder Akhand Udoy and Omiya Hassan
Symmetry 2025, 17(3), 469; https://doi.org/10.3390/sym17030469 - 20 Mar 2025
Cited by 6 | Viewed by 5476
Abstract
Artificial intelligence (AI) is playing a dominant role in advancing heart failure detection and diagnosis, significantly furthering personalized healthcare. This review synthesizes AI-driven innovations by examining methodologies, applications, and outcomes. We investigate the integration of machine learning algorithms, diverse datasets including electronic health records (EHRs), medical records, imaging data, and clinical notes, deep learning models, and neural networks to enhance diagnostic accuracy. Key advancements include prediction models that leverage real-time data from wearable devices alongside state-of-the-art AI systems trained on patient data from hospitals and clinics. Notably, recent studies have reported diagnostic accuracies ranging from 86.7% to as high as 99.9%, with sensitivity and specificity values often exceeding 97%, underscoring the potential of these AI systems to improve early detection and clinical decision-making substantially. Our review further explores the impact of symmetry and asymmetry in model design, highlighting that symmetric architectures like U-Net offer computational efficiency and structured feature extraction. In contrast, asymmetric models improve the sensitivity to rare conditions and subtle clinical patterns. Incorporating these deep learning (DL) methods in anomaly detection and disease progression modeling further reinforces their positive impact on diagnostic accuracy and patient outcomes. Furthermore, this review identifies challenges in current AI applications, such as data quality, algorithmic transparency, model bias, and evaluation metrics, while outlining future research directions, including integrating generative models, hybrid architectures, and explainable AI techniques to optimize clinical practice. Full article

14 pages, 1376 KB  
Article
Football Net: Leveraging the Structure of Truncated Icosahedron in Convolutional Neural Network Design
by Zhijian Zhu and Qinghui Wang
Appl. Sci. 2025, 15(3), 1369; https://doi.org/10.3390/app15031369 - 28 Jan 2025
Cited by 1 | Viewed by 994
Abstract
Deep neural networks often suffer from the degradation of fine-grained features during feature transmission. To mitigate this issue, we propose an innovative CNN architecture, Football Net, which is designed to enhance feature propagation. By introducing Asymmetric Skip Connections, Football Net effectively captures and preserves fine-grained details. Another significant challenge in deep neural networks is achieving robustness. Football Net addresses this challenge by incorporating a novel misaligned feature merging mechanism and a new homogeneous ensemble learning strategy. The experimental results indicate that this improved ensemble strategy significantly reduces both bias and variance, thereby enhancing overall classification performance. We conducted extensive experiments on the CIFAR-10, ImageNet-100, and ImageNet-1k datasets. The results demonstrate the competitiveness of Football Net in image classification tasks, achieving accuracy comparable to state-of-the-art models such as ResNet, U-Net, and U-Net++ while also improving robustness. Full article

21 pages, 71952 KB  
Article
A Hierarchical Feature-Aware Model for Accurate Tomato Blight Disease Spot Detection: Unet with Vision Mamba and ConvNeXt Perspective
by Dongyuan Shi, Changhong Li, Hui Shi, Longwei Liang, Huiying Liu and Ming Diao
Agronomy 2024, 14(10), 2227; https://doi.org/10.3390/agronomy14102227 - 27 Sep 2024
Cited by 5 | Viewed by 1657
Abstract
Tomato blight significantly threatens tomato yield and quality, making precise disease detection essential for modern agricultural practices. Traditional segmentation models often struggle with over-segmentation and missed segmentation, particularly in complex backgrounds and with diverse lesion morphologies. To address these challenges, we proposed Unet with Vision Mamba and ConvNeXt (VMC-Unet), an asymmetric segmentation model for quantitative analysis of tomato blight. Built on the Unet framework, VMC-Unet integrated a parallel feature-aware backbone combining ConvNeXt, Vision Mamba, and Atrous Spatial Pyramid Pooling (ASPP) modules to enhance spatial feature focusing and multi-scale information processing. During decoding, Vision Mamba was hierarchically embedded to accurately recover complex lesion morphologies through refined feature processing and efficient up-sampling. A joint loss function was designed to optimize the model’s performance. Extensive experiments on both tomato epidemic and public datasets demonstrated VMC-Unet’s superior performance, achieving 97.82% pixel accuracy, 87.94% F1 score, and 86.75% mIoU. These results surpassed those of classical segmentation models, underscoring the effectiveness of VMC-Unet in mitigating over-segmentation and under-segmentation while maintaining high segmentation accuracy in complex backgrounds. The consistent performance of the model across various datasets further validated its robustness and generalization potential, highlighting its applicability in broader agricultural settings. Full article

25 pages, 4887 KB  
Article
High-Resolution CAD-Based Shape Parametrisation of a U-Bend Channel
by Rejish Jesudasan and Jens-Dominik Müeller
Aerospace 2024, 11(8), 663; https://doi.org/10.3390/aerospace11080663 - 13 Aug 2024
Cited by 5 | Viewed by 1612
Abstract
The parametrisation of the geometry in shape optimisation has an important influence on the quality of the optimum and the rate of convergence of the optimiser. Refinement studies for the parametrisation are not shown in the literature, as most methods use non-orthogonal parametrisations, which cause issues with convergence when the parametrisation is refined. The NURBS-based parametrisation with complex constraints (NSPCC) is the only CAD-based parametrisation method that guarantees orthogonal shape modes by constructing an optimal basis. We conduct a parametrisation refinement study for the benchmark turbomachinery cooling bend (U-bend) geometry, an initially symmetric geometry. Using an adjoint RANS solver, we optimise for minimal total pressure drop. The results show significant effects of the control net density on the final shape, with the finest control net producing an asymmetric optimal shape resembling strakes that induces swirl ahead of the bend. These asymmetric modes have not been reported in the literature so far. We also demonstrate that the convergence rate of the optimiser is not significantly affected by the refinement of the parametrisation. The effectiveness of these shape features obtained with single-point optimisation is evaluated for a range of Reynolds numbers. It is shown that total pressure drop reduction is not sensitive to Reynolds number. Full article
(This article belongs to the Section Aeronautics)

14 pages, 2930 KB  
Article
AsymUNet: An Efficient Multi-Layer Perceptron Model Based on Asymmetric U-Net for Medical Image Noise Removal
by Yan Cui, Xiangming Hong, Haidong Yang, Zhili Ge and Jielin Jiang
Electronics 2024, 13(16), 3191; https://doi.org/10.3390/electronics13163191 - 12 Aug 2024
Cited by 1 | Viewed by 1584
Abstract
With the continuous advancement of deep learning technology, U-Net-based algorithms for image denoising play a crucial role in medical image processing. However, most U-Net-based medical image denoising algorithms have large parameter sizes, which poses significant limitations in practical applications where computational resources are limited or large-scale patient data processing is required. In this paper, we propose a medical image denoising algorithm called AsymUNet, developed using an asymmetric U-Net framework and a spatially rearranged multilayer perceptron (MLP). AsymUNet utilizes an asymmetric U-Net to reduce the computational burden, while a multiscale feature fusion module enhances the feature interaction between the encoder and decoder. To better preserve the image details, spatially rearranged MLP blocks serve as the core building blocks of AsymUNet. These blocks effectively extract both the local and global features of the image, reducing the model’s reliance on prior knowledge of the image and further accelerating the training and inference processes. Experimental results demonstrate that AsymUNet achieves superior performance metrics and visual results compared with other state-of-the-art methods. Full article
(This article belongs to the Special Issue Deep Learning in Image Processing and Segmentation)

19 pages, 2039 KB  
Article
EAD-Net: Efficiently Asymmetric Network for Semantic Labeling of High-Resolution Remote Sensing Images with Dynamic Routing Mechanism
by Qiongqiong Hu, Feiting Wang and Ying Li
Remote Sens. 2024, 16(9), 1478; https://doi.org/10.3390/rs16091478 - 23 Apr 2024
Cited by 1 | Viewed by 1592
Abstract
Semantic labeling of high-resolution remote sensing images (HRRSIs) holds a significant position in the remote sensing domain. Although numerous deep-learning-based segmentation models have enhanced segmentation precision, their complexity leads to a significant increase in parameters and computational requirements. While ensuring segmentation accuracy, it is also crucial to improve segmentation speed. To address this issue, we propose an efficient asymmetric deep learning network for HRRSIs, referred to as EAD-Net. First, EAD-Net employs ResNet50 as the backbone without pooling, instead of the RepVGG block, to extract rich semantic features while reducing model complexity. Second, a dynamic routing module is proposed in EAD-Net to adjust routing based on the pixel occupancy of small-scale objects. Concurrently, a channel attention mechanism is used to preserve their features even with minimal occupancy. Third, a novel asymmetric decoder is introduced, which uses convolutional operations while discarding skip connections. This not only effectively reduces redundant features but also allows using low-level image features to enhance EAD-Net’s performance. Extensive experimental results on the ISPRS 2D semantic labeling challenge benchmark demonstrate that EAD-Net achieves state-of-the-art (SOTA) accuracy while reducing model complexity and inference time, with mean Intersection over Union (mIoU) scores reaching 87.38% and 93.10% on the Vaihingen and Potsdam datasets, respectively. Full article

19 pages, 8487 KB  
Article
MRFA-Net: Multi-Scale Receptive Feature Aggregation Network for Cloud and Shadow Detection
by Jianxiang Wang, Yuanlu Li, Xiaoting Fan, Xin Zhou and Mingxuan Wu
Remote Sens. 2024, 16(8), 1456; https://doi.org/10.3390/rs16081456 - 20 Apr 2024
Cited by 2 | Viewed by 1663
Abstract
The effective segmentation of clouds and cloud shadows is crucial for surface feature extraction, climate monitoring, and atmospheric correction, but it remains a critical challenge in remote sensing image processing. Cloud features are intricate, with varied distributions and unclear boundaries, making accurate extraction difficult, with only a few networks addressing this challenge. To tackle these issues, we introduce a multi-scale receptive field aggregation network (MRFA-Net). The MRFA-Net comprises an MRFA-Encoder and MRFA-Decoder. Within the encoder, the net includes the asymmetric feature extractor module (AFEM) and multi-scale attention, which capture diverse local features and enhance contextual semantic understanding, respectively. The MRFA-Decoder includes the multi-path decoder module (MDM) for blending features and the global feature refinement module (GFRM) for optimizing information via learnable matrix decomposition. Experimental results demonstrate that our model excelled in generalization and segmentation performance when addressing various complex backgrounds and different category detections, exhibiting advantages in terms of parameter efficiency and computational complexity, with the MRFA-Net achieving a mean intersection over union (MIoU) of 94.12% on our custom Cloud and Shadow dataset, and 87.54% on the open-source HRC_WHU dataset, outperforming other models by at least 0.53% and 0.62%. The proposed model demonstrates applicability in practical scenarios where features are difficult to distinguish. Full article

19 pages, 4238 KB  
Article
Symmetry Breaking in the U-Net: Hybrid Deep-Learning Multi-Class Segmentation of HeLa Cells in Reflected Light Microscopy Images
by Ali Ghaznavi, Renata Rychtáriková, Petr Císař, Mohammad Mehdi Ziaei and Dalibor Štys
Symmetry 2024, 16(2), 227; https://doi.org/10.3390/sym16020227 - 13 Feb 2024
Cited by 4 | Viewed by 2853
Abstract
Multi-class segmentation of unlabelled living cells in time-lapse light microscopy images is challenging due to the temporal behaviour and changes in cell life cycles and the complexity of these images. The deep-learning-based methods achieved promising outcomes and remarkable success in single- and multi-class medical and microscopy image segmentation. The main objective of this study is to develop a hybrid deep-learning-based categorical segmentation and classification method for living HeLa cells in reflected light microscopy images. A symmetric simple U-Net and three asymmetric hybrid convolution neural networks—VGG19-U-Net, Inception-U-Net, and ResNet34-U-Net—were proposed and mutually compared to find the most suitable architecture for multi-class segmentation of our datasets. The inception module in the Inception-U-Net contained kernels with different sizes within the same layer to extract all feature descriptors. The series of residual blocks with the skip connections in each ResNet34-U-Net’s level alleviated the gradient vanishing problem and improved the generalisation ability. The m-IoU scores of multi-class segmentation for our datasets reached 0.7062, 0.7178, 0.7907, and 0.8067 for the simple U-Net, VGG19-U-Net, Inception-U-Net, and ResNet34-U-Net, respectively. For each class and the mean value across all classes, the most accurate multi-class semantic segmentation was achieved using the ResNet34-U-Net architecture (evaluated as the m-IoU and Dice metrics). Full article

17 pages, 4699 KB  
Article
Deep Learning-Based Fishing Ground Prediction Using Asymmetric Spatiotemporal Scales: A Case Study of Ommastrephes bartramii
by Mingyang Xie, Bin Liu, Xinjun Chen, Wei Yu and Jintao Wang
Fishes 2024, 9(2), 64; https://doi.org/10.3390/fishes9020064 - 4 Feb 2024
Cited by 6 | Viewed by 3512
Abstract
Selecting the optimal spatiotemporal scale in fishing ground prediction models can maximize prediction accuracy. Current research on spatiotemporal scales shows that they are symmetrically distributed, which may not capture specific oceanographic features conducive to fishing ground formation. Recent studies have shown that deep learning is a promising research direction for addressing spatiotemporal scale issues. In the era of big data, deep learning outperforms traditional methods by more accurately and efficiently mining high-value, nonlinear information. In this study, taking Ommastrephes bartramii in the Northwest Pacific as an example, we used the U-Net model with sea surface temperature (SST) as the input factor and center fishing ground as the output factor. We constructed 80 different combinations of temporal scales and asymmetric spatial scales using data in 1998–2020. By comparing the results, we found that the optimal temporal scale for the deep learning fishing ground prediction model is 15 days, and the spatial scale is 0.25° × 0.25°. Larger time scales lead to higher model accuracy, and latitude has a greater impact on the model than longitude. It further enriches and refines the criteria for selecting spatiotemporal scales. This result deepens our understanding of the oceanographic characteristics of the Northwest Pacific environmental field and lays the foundation for future artificial intelligence-based fishery research. This study provides a scientific basis for the sustainable development of efficient fishery production. Full article
(This article belongs to the Special Issue AI and Fisheries)