This study addresses common challenges in medical image segmentation and recognition, including boundary ambiguity, scale variation, and the difficulty of modeling long-range dependencies, by proposing a unified framework based on a hierarchical attention mechanism. The framework consists of a local detail attention module, a global context attention module, and a cross-scale consistency constraint module. Together, these modules enable adaptive weighting and collaborative optimization across feature levels, balancing detail preservation against global modeling. The framework was validated on multiple public datasets. On the combined dataset, the proposed method achieved Dice, IoU, Precision, Recall, and F1 scores of 0.886, 0.781, 0.898, 0.875, and 0.886, respectively, outperforming baseline models such as U-Net, Mask R-CNN, DeepLabV3+, SegNet, and TransUNet. On the BraTS dataset, it achieved a Dice score of 0.922, a Precision of 0.930, and a Recall of 0.915, exhibiting superior boundary modeling capability in complex brain MRI images. On the LIDC-IDRI dataset, the Dice score and Recall improved from 0.751 and 0.732 to 0.822 and 0.807, respectively, reducing the missed detection rate of small nodules compared to traditional convolutional models. On the ISIC dermoscopy dataset, the framework achieved a Dice score of 0.914 and a Precision of 0.922, improving the accuracy of skin lesion recognition. The ablation study further showed that local detail attention enhanced boundary and texture modeling, global context attention strengthened long-range dependency capture, and the cross-scale consistency constraint ensured the stability and coherence of predictions.
From a medical economics perspective, the proposed framework has the potential to reduce diagnostic costs and improve healthcare efficiency by enabling faster and more accurate image-based clinical decision-making. In summary, the hierarchical attention mechanism presented in this work contributes a new mathematical formulation and shows strong performance and generalization across the evaluated datasets, offering new perspectives and technical pathways for intelligent segmentation and recognition in medical imaging.
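
For readers less familiar with the reported metrics, the following is a minimal sketch of how Dice, IoU, Precision, Recall, and F1 are conventionally computed for a single pair of binary masks. It is not taken from the article: the function name `segmentation_metrics`, the toy masks, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration, and the authors' own evaluation code and averaging scheme may differ.

```python
from itertools import chain


def segmentation_metrics(pred, target):
    """Overlap metrics for two binary masks given as 2D lists of 0/1.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    p = list(chain.from_iterable(pred))
    t = list(chain.from_iterable(target))
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    # Pixel-level confusion counts for the foreground class.
    tp = sum(1 for a, b in zip(p, t) if a and b)
    fp = sum(1 for a, b in zip(p, t) if a and not b)
    fn = sum(1 for a, b in zip(p, t) if not a and b)

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy example: 3 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0]]
target = [[1, 1, 1, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 0]]
metrics = segmentation_metrics(pred, target)
print(metrics)
```

For a single pair of binary masks, Dice and F1 are algebraically the same quantity, which is consistent with both being reported as 0.886 on the combined dataset.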