Search Results (1,003)

Search Parameters:
Keywords = original pixel

19 pages, 3130 KiB  
Article
Deep Learning-Based Instance Segmentation of Galloping High-Speed Railway Overhead Contact System Conductors in Video Images
by Xiaotong Yao, Huayu Yuan, Shanpeng Zhao, Wei Tian, Dongzhao Han, Xiaoping Li, Feng Wang and Sihua Wang
Sensors 2025, 25(15), 4714; https://doi.org/10.3390/s25154714 (registering DOI) - 30 Jul 2025
Viewed by 183
Abstract
The conductors of high-speed railway OCSs (Overhead Contact Systems) are susceptible to galloping under natural forces such as strong winds, rain, and snow, resulting in conductor fatigue damage and significantly compromising train operational safety. Monitoring the galloping status of conductors is therefore crucial, and instance segmentation techniques, by delineating the pixel-level contours of each conductor, can significantly aid the identification and study of galloping phenomena. This work builds on the YOLO11-seg model and introduces an instance segmentation approach for galloping video and image sensor data of OCS conductors. Designed for the stripe-like distribution of OCS conductors in the data, the algorithm employs four-direction Sobel filters to extract edge features in horizontal, vertical, and diagonal orientations. These features are then integrated with the original convolutional branch to form the FDSE (Four Direction Sobel Enhancement) module. The model also integrates the ECA (Efficient Channel Attention) mechanism for adaptive augmentation of conductor features and uses the FL (Focal Loss) function to mitigate the class imbalance between positive and negative samples, enhancing the model's sensitivity to conductors. Finally, segmentation results from neighboring frames are compared through mask-difference analysis to automatically detect conductor galloping locations, emphasizing their contours for a clear depiction of galloping characteristics. Experimental results demonstrate that the enhanced YOLO11-seg model achieves 85.38% precision, 77.30% recall, 84.25% AP@0.5, an 81.14% F1-score, and a real-time processing speed of 44.78 FPS. Combined with the galloping visualization module, it can issue real-time alerts of conductor galloping anomalies, providing robust technical support for railway OCS safety monitoring.
(This article belongs to the Section Industrial Sensors)
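
As a rough illustration of the four-direction edge extraction behind the FDSE module, the sketch below convolves a grayscale frame with horizontal, vertical, and two diagonal Sobel-style kernels. The kernel values and the channel-stacking step are illustrative assumptions, not the authors' exact design.

```python
import numpy as np
import cv2

# Four directional 3x3 Sobel-style kernels (assumed values): horizontal,
# vertical, and the two diagonals.
KERNELS = {
    "horizontal": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], np.float32),
    "vertical":   np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], np.float32),
    "diag_45":    np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], np.float32),
    "diag_135":   np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], np.float32),
}

def four_direction_sobel(gray: np.ndarray) -> np.ndarray:
    """Stack the absolute responses of the four directional filters as channels,
    ready to be fused with an ordinary convolutional branch."""
    gray = gray.astype(np.float32)
    responses = [np.abs(cv2.filter2D(gray, -1, k)) for k in KERNELS.values()]
    return np.stack(responses, axis=-1)  # (H, W, 4) edge-feature map
```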

14 pages, 2178 KiB  
Article
State-of-the-Art Document Image Binarization Using a Decision Tree Ensemble Trained on Classic Local Binarization Algorithms and Image Statistics
by Nicolae Tarbă, Costin-Anton Boiangiu and Mihai-Lucian Voncilă
Appl. Sci. 2025, 15(15), 8374; https://doi.org/10.3390/app15158374 - 28 Jul 2025
Viewed by 208
Abstract
Image binarization algorithms reduce the original color space to just two values, black and white, and are an important preprocessing step in many computer vision applications. Binarization is typically performed by comparing each pixel against a threshold, classifying pixels as lower or higher than it. Global thresholding uses a single threshold for the entire image, whereas local thresholding assigns different thresholds to different pixels. Although slower and more complex than global thresholding, local thresholding can better classify pixels in noisy areas of an image by considering not only the pixel's value but also its surrounding neighborhood. This study introduces a local thresholding method that uses the results of several local thresholding algorithms and other image statistics to train a decision tree ensemble. Through cross-validation, we demonstrate that the model is robust and performs well on new data. Compared with state-of-the-art solutions, it delivers significant improvements in the average F-measure across all DIBCO datasets, reaching 95.8% against the previous high score of 93.1%. On the DIBCO 2019 dataset in particular, it achieved an F-measure of 95.8%, far above the previous high score of 73.8%.
(This article belongs to the Special Issue Statistical Signal Processing: Theory, Methods and Applications)
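
A minimal sketch of the training setup: per-pixel features built from the verdicts of classic local binarization algorithms plus the raw intensity, fed to a tree ensemble. A random forest stands in here for the paper's decision-tree ensemble, and the choice of three thresholding rules and the window size are assumptions.

```python
import numpy as np
from skimage.filters import threshold_sauvola, threshold_niblack, threshold_otsu
from sklearn.ensemble import RandomForestClassifier

def pixel_features(gray: np.ndarray, window: int = 25) -> np.ndarray:
    """One row per pixel: raw intensity plus the verdicts of three classic
    thresholding algorithms (the paper uses more algorithms and statistics)."""
    feats = [
        gray.astype(np.float32),
        (gray > threshold_sauvola(gray, window_size=window)).astype(np.float32),
        (gray > threshold_niblack(gray, window_size=window)).astype(np.float32),
        (gray > threshold_otsu(gray)).astype(np.float32),
    ]
    return np.stack([f.ravel() for f in feats], axis=1)

# gray / truth are assumed inputs: one training page and its 0/1 ground truth.
# clf = RandomForestClassifier(n_estimators=100).fit(pixel_features(gray), truth.ravel())
# binary = clf.predict(pixel_features(new_page)).reshape(new_page.shape)
```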

30 pages, 92065 KiB  
Article
A Picking Point Localization Method for Table Grapes Based on PGSS-YOLOv11s and Morphological Strategies
by Jin Lu, Zhongji Cao, Jin Wang, Zhao Wang, Jia Zhao and Minjie Zhang
Agriculture 2025, 15(15), 1622; https://doi.org/10.3390/agriculture15151622 - 26 Jul 2025
Viewed by 266
Abstract
During the automated picking of table grapes, the automatic recognition and segmentation of grape pedicels, along with the positioning of picking points, are vital to all subsequent operations of the harvesting robot. In a real grape plantation, however, it is extremely difficult to accurately and efficiently identify and segment grape pedicels and then reliably locate the picking points. This is due to the low distinguishability between grape pedicels and surrounding structures such as branches, the impact of conditions like weather, lighting, and occlusion, and the requirement to deploy models on edge devices with limited computing resources. To address these issues, this study proposes a novel picking point localization method for table grapes based on an instance segmentation network called Progressive Global-Local Structure-Sensitive Segmentation (PGSS-YOLOv11s) and a simple combination of morphological operators. More specifically, PGSS-YOLOv11s is composed of the original YOLOv11s-seg backbone, a spatial feature aggregation module (SFAM), an adaptive feature fusion module (AFFM), and a detail-enhanced convolutional shared detection head (DE-SCSH). PGSS-YOLOv11s was trained on a new grape segmentation dataset called Grape-⊥, which includes 4455 pixel-level grape instances annotated with ⊥-shaped regions. After PGSS-YOLOv11s segments the ⊥-shaped regions, morphological operations such as erosion, dilation, and skeletonization are combined to extract grape pedicels and locate picking points. Several experiments confirm the validity, effectiveness, and superiority of the proposed method. Compared with other state-of-the-art models, PGSS-YOLOv11s reached an F1 score of 94.6% and a mask mAP@0.5 of 95.2% on the Grape-⊥ dataset, and 85.4% and 90.0% on the Winegrape dataset. Multi-scenario tests showed a picking point positioning success rate of up to 89.44%, and real-time tests on an edge device in orchards demonstrated the method's practical performance. For grapes with short or occluded pedicels, however, the morphological algorithm sometimes failed to compute picking points. In future work, we will enrich the grape dataset with images under different lighting conditions, from various shooting angles, and of more grape varieties to improve the method's generalization.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
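
The morphological post-processing step can be pictured with standard operators. The sketch below cleans a predicted pedicel mask, skeletonizes it, and takes the skeleton midpoint as the picking point; the midpoint rule is an illustrative assumption, not the paper's full strategy for ⊥-shaped regions.

```python
import numpy as np
from skimage.morphology import binary_erosion, binary_dilation, skeletonize

def picking_point(pedicel_mask: np.ndarray) -> tuple[int, int]:
    """Locate a candidate picking point on a binary pedicel mask."""
    cleaned = binary_dilation(binary_erosion(pedicel_mask))  # morphological opening
    skeleton = skeletonize(cleaned)
    ys, xs = np.nonzero(skeleton)
    if ys.size == 0:
        raise ValueError("no pedicel pixels left after cleaning")
    mid = np.argsort(ys)[ys.size // 2]  # halfway down the skeleton (assumed rule)
    return int(ys[mid]), int(xs[mid])
```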

20 pages, 2786 KiB  
Article
Inverse Kinematics-Augmented Sign Language: A Simulation-Based Framework for Scalable Deep Gesture Recognition
by Binghao Wang, Lei Jing and Xiang Li
Algorithms 2025, 18(8), 463; https://doi.org/10.3390/a18080463 - 24 Jul 2025
Viewed by 216
Abstract
In this work, we introduce IK-AUG, a unified algorithmic framework for kinematics-driven data augmentation tailored to sign language recognition (SLR). Departing from traditional augmentation techniques that operate at the pixel or feature level, our method integrates inverse kinematics (IK) and virtual simulation to synthesize anatomically valid gesture sequences within a structured 3D environment. The proposed system begins with sparse 3D keypoints extracted via a pose estimator and projects them into a virtual coordinate space. A differentiable IK solver based on forward-and-backward constrained optimization is then employed to reconstruct biomechanically plausible joint trajectories. To emulate natural signer variability and enhance data richness, we define a set of parametric perturbation operators spanning spatial displacement, depth modulation, and solver sensitivity control. These operators are embedded into a generative loop that transforms each original gesture sample into a diverse sequence cluster, forming a high-fidelity augmentation corpus. We benchmark our method across five deep sequence models (CNN3D, TCN, Transformer, Informer, and Sparse Transformer) and observe consistent improvements in accuracy and convergence. Notably, Informer achieves 94.1% validation accuracy with IK-AUG-enhanced training, underscoring the framework's efficacy. These results suggest that algorithmic augmentation via kinematic modeling offers a scalable, annotation-free pathway for improving SLR systems and lays the foundation for future integration with multi-sensor inputs in hybrid recognition pipelines.
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
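
Two of the perturbation operators can be pictured as simple keypoint transforms. The sketch below applies spatial displacement and depth modulation to a gesture sequence before IK refinement; the magnitudes are illustrative assumptions, and solver sensitivity control is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(seq: np.ndarray, shift_sd: float = 0.01, depth_amp: float = 0.02) -> np.ndarray:
    """Perturb a (T, J, 3) keypoint sequence: Gaussian x/y displacement plus a
    global depth rescaling, prior to the IK solver rebuilding joint trajectories."""
    out = seq.copy()
    out[..., :2] += rng.normal(0.0, shift_sd, size=out[..., :2].shape)  # spatial displacement
    out[..., 2] *= 1.0 + rng.uniform(-depth_amp, depth_amp)             # depth modulation
    return out

# One original sample expands into a cluster of variants:
# cluster = [perturb(sample) for _ in range(10)]
```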

16 pages, 2308 KiB  
Article
Reconstructing of Satellite-Derived CO2 Using Multiple Environmental Variables—A Case Study in the Provinces of Huai River Basin, China
by Yuxin Zhu, Ying Zhang, Linping Zhu and Jinzong Zhang
Atmosphere 2025, 16(8), 903; https://doi.org/10.3390/atmos16080903 - 24 Jul 2025
Viewed by 193
Abstract
The introduction of the "dual carbon" target has increased demand for products that accurately measure carbon dioxide levels. Because current carbon dioxide concentration products struggle to achieve the required spatiotemporal resolution, accuracy, and spatial continuity, methods for obtaining products that are complete in space and time are needed. Based on the 2018 OCO-2 carbon dioxide products and environmental variables such as vegetation coverage (FVC, LAI), net primary productivity (NPP), relative humidity (RH), evapotranspiration (ET), temperature (T), and wind (U, V), this study constructed a multiple regression model to obtain spatially continuous carbon dioxide concentration products for the provinces of the Huai River Basin. Model performance was validated using the correlation coefficient, root mean square error (RMSE), local variance, and percentage of valid pixels. The validation results show that: (1) among the selected environmental variables, the primary factors affecting the spatiotemporal distribution of carbon dioxide concentration are ET, LAI, FVC, NPP, T, U, and RH; (2) compared with the OCO-2 products, the percentage of valid pixels in the reconstructed data increased from less than 1% to over 90%; (3) the local variance of the reconstructed data was significantly larger than that of the original OCO-2 CO2 products; and (4) the average monthly RMSE is 2.69. The model developed in this study therefore yields a carbon dioxide concentration dataset that is spatially complete, meets precision requirements, and is rich in local detail, better reflecting the spatial pattern of carbon dioxide concentration and usable for examining the carbon cycle between the terrestrial environment, biosphere, and atmosphere.
(This article belongs to the Section Air Quality)
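
The reconstruction step amounts to regressing sparse satellite retrievals on gap-free environmental predictors and then predicting everywhere. A minimal sketch, assuming the arrays below as inputs:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed inputs: X_obs (n_obs, 7) holds ET, LAI, FVC, NPP, T, U, RH at pixels
# where OCO-2 retrievals exist; y_obs the co-located CO2 values; X_all the same
# seven variables over the full, gap-free grid.
def reconstruct_co2(X_obs: np.ndarray, y_obs: np.ndarray, X_all: np.ndarray) -> np.ndarray:
    """Fit a multiple linear regression and predict a spatially complete field."""
    model = LinearRegression().fit(X_obs, y_obs)
    return model.predict(X_all)  # one CO2 estimate per grid pixel
```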

23 pages, 7709 KiB  
Article
Spatiotemporal Land Use Change Detection Through Automated Sampling and Multi-Feature Composite Analysis: A Case Study of the Ebinur Lake Basin
by Yi Yang, Liang Zhao, Ya Guo, Shihua Liu, Xiang Qin, Yixiao Li and Xiaoqiong Jiang
Sensors 2025, 25(14), 4314; https://doi.org/10.3390/s25144314 - 10 Jul 2025
Viewed by 215
Abstract
Land use change plays a pivotal role in understanding surface processes and environmental dynamics, exerting considerable influence on regional ecosystem management. Traditional monitoring approaches, which often rely on manual sampling and single spectral features, are limited in efficiency and accuracy. This study proposes a technical framework that integrates automated sample generation, multi-feature optimization, and classification model refinement to improve land use classification accuracy and enable detailed spatiotemporal analysis in the Ebinur Lake Basin. By integrating Landsat data with multi-temporal European Space Agency (ESA) products, we acquired 14,000 pixels of 2021 land use samples, with multi-temporal spectral features enabling robust sample transfer to 12,028 pixels in 2011 and 10,997 pixels in 2001. Multi-temporal composite data were reorganized and reconstructed into annual and monthly feature spaces combining spectral bands, indices, terrain, and texture information. Feature selection based on the Gini coefficient and Out-Of-Bag Error (OOBE) reduced the original 48 features to 23, and an object-oriented Gradient Boosting Decision Tree (GBDT) model performed the land use classification. A systematic evaluation confirmed the effectiveness of the framework, which achieved an overall accuracy of 93.17% and a Kappa coefficient of 92.03% while significantly reducing noise in the classification maps. Based on the classification results from the three periods, the spatial distribution and pattern changes of major land use types over the past two decades were investigated through ellipse, centroid shift, area change, and transition matrix analyses. The framework offers a high degree of automation and technical support for accurate large-area land use classification.
(This article belongs to the Special Issue Remote Sensing Technology for Agricultural and Land Management)
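
Gini-importance ranking and out-of-bag error both fall out of a random forest; the sketch below shows one way the 48-to-23 feature reduction could be computed, under the assumption that a standard forest stands in for the study's exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Assumed inputs: X (n_samples, 48) stacked spectral/index/terrain/texture
# features and y land-use labels from the transferred sample set.
def select_features(X: np.ndarray, y: np.ndarray, keep: int = 23) -> np.ndarray:
    """Rank features by Gini importance and report the out-of-bag error."""
    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
    rf.fit(X, y)
    print(f"OOB error: {1.0 - rf.oob_score_:.3f}")
    return np.argsort(rf.feature_importances_)[::-1][:keep]  # indices to keep
```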

21 pages, 5148 KiB  
Article
Research on Buckwheat Weed Recognition in Multispectral UAV Images Based on MSU-Net
by Jinlong Wu, Xin Wu and Ronghui Miao
Agriculture 2025, 15(14), 1471; https://doi.org/10.3390/agriculture15141471 - 9 Jul 2025
Viewed by 275
Abstract
Quickly and accurately identifying weed areas is of great significance for improving weeding efficiency, reducing pesticide residues, protecting the soil ecological environment, and increasing crop yield and quality. To address the low detection efficiency in complex agricultural environments and the inability of existing weed recognition methods for minor grain crops to handle multispectral input from unmanned aerial vehicles (UAVs), a semantic segmentation model for buckwheat weeds based on MSU-Net (multispectral U-shaped network) is proposed, and the influence of different band optimizations on recognition accuracy is explored. Five spectral bands—red (R), blue (B), green (G), red edge (REdge), and near-infrared (NIR)—were collected in August, when the weeds were more prominent. Building on the U-Net semantic segmentation model, the input module was improved to adaptively adjust to the input bands. Because neuron death under the original ReLU activation function may lead to misidentification, ReLU was replaced with the Swish function to improve adaptability to complex inputs. Five single-band datasets and nine multi-band combinations were input into the improved MSU-Net model to verify its performance. Experimental results show that among single bands, the B band performs best, with mean pixel accuracy (mPA), mean intersection over union (mIoU), Dice, and F1 values of 0.75, 0.61, 0.87, and 0.80, respectively. Among band combinations, R+G+B+NIR performs best, with mPA, mIoU, Dice, and F1 values of 0.76, 0.65, 0.85, and 0.78, respectively. Compared with U-Net, DenseASPP, PSPNet, and DeepLabv3, our method achieves a preferable balance between accuracy and resource consumption. These results indicate that the method adapts to multispectral input bands, performs well in weed segmentation tasks, and can serve as a reference for multispectral data analysis and semantic segmentation in minor grain crops.
(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)
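
The two modifications named above, a band-adaptive input module and Swish in place of ReLU, can be sketched in a few lines of PyTorch. Layer sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class BandAdaptiveStem(nn.Module):
    """Input block that accepts any number of spectral bands and uses Swish
    (x * sigmoid(x)) instead of ReLU to avoid dying neurons."""
    def __init__(self, n_bands: int, out_channels: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(n_bands, out_channels, kernel_size=3, padding=1)
        self.act = nn.SiLU()  # SiLU is PyTorch's built-in Swish

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.conv(x))

# stem = BandAdaptiveStem(n_bands=4)  # e.g., the R+G+B+NIR combination
```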

30 pages, 4399 KiB  
Article
Confident Learning-Based Label Correction for Retinal Image Segmentation
by Tanatorn Pethmunee, Supaporn Kansomkeat, Patama Bhurayanontachai and Sathit Intajag
Diagnostics 2025, 15(14), 1735; https://doi.org/10.3390/diagnostics15141735 - 8 Jul 2025
Viewed by 317
Abstract
Background/Objectives: In automatic medical image analysis, particularly for diabetic retinopathy, the accuracy of labeled data is crucial, as label noise can significantly complicate analysis and lead to diagnostic errors. To tackle label noise in retinal image segmentation, a label correction framework is introduced that combines Confident Learning (CL) with a human-in-the-loop re-annotation process to detect and rectify pixel-level labeling inaccuracies. Methods: Two CL-oriented strategies are assessed: Confident Joint Analysis (CJA) using DeeplabV3+ with a ResNet-50 architecture, and Prune by Noise Rate (PBNR) using ResNet-18. These methods are applied to four publicly available retinal image datasets: HRF, STARE, DRIVE, and CHASE_DB1. After the models are trained on the original labeled datasets, label noise is quantified and suspected misclassified pixels are amended before model performance is assessed. Results: Reducing label noise yielded consistent gains in accuracy, Intersection over Union (IoU), and weighted IoU across all datasets. Segmentation of small structures, such as the fovea, improved markedly after refinement. The Mean Boundary F1 Score (MeanBFScore) remained invariant, indicating that boundary integrity was maintained. CJA and PBNR showed strengths under different conditions, with performance varying by noise level and dataset characteristics. CL-based label correction combined with human refinement significantly enhanced segmentation accuracy and evaluation robustness, achieving Accuracy, IoU, and MeanBFScore values of 0.9156, 0.8037, and 0.9856, respectively, relative to the original ground truth, reflecting increases of 4.05%, 9.95%, and 1.28%, respectively. Conclusions: This methodology is a feasible and scalable solution to label noise in medical image analysis, with particular significance for real-world clinical applications.
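
Confident Learning flags labels whose model-predicted probability is inconsistent with the given class. A minimal sketch using the open-source cleanlab package (which implements CL; its use here is an assumption, not the paper's stated tooling), applied per pixel on flattened masks:

```python
import numpy as np
from cleanlab.filter import find_label_issues

# Assumed inputs: labels (n_pixels,) noisy integer classes flattened from the
# masks, and probs (n_pixels, n_classes) out-of-sample softmax predictions.
suspect = find_label_issues(labels=labels, pred_probs=probs,
                            return_indices_ranked_by="self_confidence")
# Queue the flagged pixels for human re-annotation rather than auto-relabeling,
# mirroring the human-in-the-loop step described above.
```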

19 pages, 6293 KiB  
Article
Restoring Anomalous Water Surface in DOM Product of UAV Remote Sensing Using Local Image Replacement
by Chunjie Wang, Ti Zhang, Liang Tao and Jiayuan Lin
Sensors 2025, 25(13), 4225; https://doi.org/10.3390/s25134225 - 7 Jul 2025
Viewed by 376
Abstract
In the production of a digital orthophoto map (DOM) from unmanned aerial vehicle (UAV)-acquired overlapping images, anomalies such as texture stretching or data holes frequently occur in water areas due to the lack of distinctive textural features. These anomalies seriously affect the visual quality and data integrity of the resulting DOMs. In this study, we eliminate the water surface anomalies in an example DOM by replacing the entire water area with an intact one clipped from a single UAV image. The water surface scope and boundary in the image were first precisely extracted using the multisource seed filling algorithm and a contour-finding algorithm. Next, tie points were selected from the boundaries of the normal and anomalous water surfaces and used to spatially align them via an affine plane coordinate transformation. Finally, the normal water surface was overlaid onto the DOM to replace the corresponding anomalous water surface. The restored water area had a good visual effect in terms of spectral consistency, and the texture transition with the surrounding environment was sufficiently natural. Judged by the standard deviations and mean values of the RGB pixels, the quality of the restored DOM was greatly improved compared with the original. These results demonstrate that the proposed method performs well in restoring abnormal water surfaces in a DOM, especially when the water surface is relatively small and can be contained in a single UAV image.
(This article belongs to the Special Issue Remote Sensing and UAV Technologies for Environmental Monitoring)
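
The tie-point alignment step is a standard affine estimation; a sketch with OpenCV, assuming the arrays below as inputs:

```python
import numpy as np
import cv2

# Assumed inputs: src_pts / dst_pts are (N, 2) float32 tie points sampled on the
# boundaries of the normal and anomalous water surfaces; patch is the intact
# water image; h, w are the DOM's pixel dimensions.
M, inliers = cv2.estimateAffine2D(src_pts, dst_pts)  # 2x3 affine from tie points
warped = cv2.warpAffine(patch, M, (w, h))            # patch aligned to DOM space
# The warped water surface can then be overlaid onto the DOM, replacing the
# anomalous region inside the detected water boundary.
```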

24 pages, 76230 KiB  
Article
Secure and Efficient Video Management: A Novel Framework for CCTV Surveillance Systems
by Swarnalatha Camalapuram Subramanyam, Ansuman Bhattacharya and Koushik Sinha
IoT 2025, 6(3), 38; https://doi.org/10.3390/iot6030038 - 4 Jul 2025
Viewed by 334
Abstract
This paper presents a novel video encoding and decoding method aimed at enhancing security and reducing storage requirements, particularly for CCTV systems. The technique merges two video streams of matching frame dimensions into a single stream, optimizing disk space usage without compromising video quality. The combined video is secured using an advanced encryption standard (AES)-based shift algorithm that rearranges pixel positions, preventing unauthorized access. During decoding, the AES shift is reversed, enabling precise reconstruction of the original videos. This approach provides a space-efficient and secure solution for managing multiple video feeds while ensuring accurate recovery of the original content. Experimental results demonstrate that the transmission time for the encoded video is consistently shorter than transmitting the video streams separately, which in turn yields about a 54% reduction in energy consumption across diverse outdoor and indoor video datasets, highlighting significant improvements in both transmission efficiency and energy savings.
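
One common way to build a reversible, key-dependent pixel rearrangement is to sort pixel indices by an AES keystream. The sketch below (using pycryptodome) is a generic stand-in under that assumption; the paper's exact shift algorithm is not specified here.

```python
import numpy as np
from Crypto.Cipher import AES
from Crypto.Util import Counter

def keyed_permutation(n: int, key: bytes) -> np.ndarray:
    """Derive a pseudorandom permutation of n pixel indices from AES-CTR output."""
    ctr = Counter.new(128)
    stream = AES.new(key, AES.MODE_CTR, counter=ctr).encrypt(b"\x00" * (4 * n))
    ranks = np.frombuffer(stream, dtype=np.uint32)[:n]
    return np.argsort(ranks, kind="stable")

# frame: assumed (H, W, 3) uint8 image. Shuffle, then invert exactly:
# perm = keyed_permutation(frame.shape[0] * frame.shape[1], key=b"0" * 16)
# flat = frame.reshape(-1, 3); scrambled = flat[perm]
# restored = np.empty_like(flat); restored[perm] = scrambled  # == flat
```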

16 pages, 6657 KiB  
Article
Experimental Assessment of YOLO Variants for Coronary Artery Disease Segmentation from Angiograms
by Eduardo Díaz-Gaxiola, Arturo Yee-Rendon, Ines F. Vega-Lopez, Juan Augusto Campos-Leal, Iván García-Aguilar, Ezequiel López-Rubio and Rafael M. Luque-Baena
Electronics 2025, 14(13), 2683; https://doi.org/10.3390/electronics14132683 - 2 Jul 2025
Viewed by 507
Abstract
Coronary artery disease (CAD) is one of the leading causes of mortality worldwide, highlighting the importance of developing accurate and efficient diagnostic tools. This study presents a comparative evaluation of three recent YOLO architecture versions (YOLOv8, YOLOv9, and YOLOv11) for the tasks of coronary vessel segmentation and stenosis detection using the ARCADE dataset. Two workflows were explored: one with original angiographic images and another incorporating Contrast Limited Adaptive Histogram Equalization (CLAHE) for image enhancement. Models were trained for 100 epochs using the AdamW optimizer and evaluated with precision, recall, and F1-score under a pixel-based segmentation framework. YOLOv9-E achieved the highest performance in vessel segmentation with an F1-score of 0.4524, while YOLOv11-X was most effective for stenosis detection, achieving an F1-score of 0.7826. Although CLAHE improved local contrast, it did not consistently improve segmentation results and occasionally introduced artifacts that negatively affected model performance. Compared to state-of-the-art methods, the YOLO models demonstrated competitive results, especially for large, well-defined coronary segments, but showed limitations in detecting smaller or more complex pathological structures. These findings support the use of YOLO-based architectures for real-time CAD segmentation tasks and highlight opportunities for future improvement through the integration of attention mechanisms or hybrid deep learning strategies.
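
The CLAHE preprocessing variant is a one-liner in OpenCV; the clip limit and tile grid below are common defaults, not the authors' reported settings.

```python
import cv2

# gray_angiogram: assumed uint8 grayscale angiography frame.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray_angiogram)  # locally contrast-equalized frame
```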

26 pages, 6653 KiB  
Article
Development of a Calibration Procedure of the Additive Masked Stereolithography Method for Improving the Accuracy of Model Manufacturing
by Paweł Turek, Anna Bazan, Paweł Kubik and Michał Chlost
Appl. Sci. 2025, 15(13), 7412; https://doi.org/10.3390/app15137412 - 1 Jul 2025
Viewed by 417
Abstract
The article presents a three-stage methodology for calibrating 3D printing using mSLA technology, aimed at improving dimensional accuracy and print repeatability. The proposed approach is based on procedures that enable the collection and analysis of numerical data, thereby minimizing the influence of the operator's subjective judgment, which is commonly relied upon in traditional calibration methods. In the first stage, compensation for the uneven illumination of the LCD matrix was performed by establishing a regression model that describes the relationship between UV radiation intensity and pixel brightness. Based on this model, a grayscale correction mask was developed. The second stage focused on determining the optimal exposure time, based on its effect on dimensional accuracy, detail reproduction, and model strength. The optimal exposure time is defined as the duration that provides the highest possible mechanical strength without significant loss of detail due to the light bleed phenomenon (i.e., diffusion of UV radiation beyond the mask edge). In the third stage, scale correction was applied to compensate for shrinkage and geometric distortions, further reducing the impact of light bleed on the dimensional fidelity of printed components. The proposed methodology was validated using an Anycubic Photon M3 Premium printer with Anycubic ABS-Like Resin Pro 2.0. Compensating for light intensity variation reduced the original standard deviation from 0.26 to 0.17 mW/cm², a decrease of more than one third. The methodology reduced surface displacement due to shrinkage from 0.044% to 0.003%, and the residual internal dimensional error from 0.159 mm to 0.017 mm (a 72% reduction).
(This article belongs to the Section Additive Manufacturing Technologies)
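
The grayscale correction mask from the first stage can be pictured as dimming every pixel toward the weakest-measured zone. The sketch below makes the simplifying assumption that irradiance scales linearly with pixel gray level, whereas the paper fits a regression model between UV intensity and brightness.

```python
import numpy as np

def correction_mask(uv_map: np.ndarray) -> np.ndarray:
    """Grayscale mask that evens out LCD illumination (linear-response assumption)."""
    target = uv_map.min()  # calibrate every pixel down to the dimmest region
    mask = np.clip(255.0 * target / uv_map, 0.0, 255.0)
    return mask.astype(np.uint8)

# uv_map: assumed (H, W) array of irradiance readings (mW/cm²) measured across
# the LCD at full brightness, e.g. from a radiometer sweep.
```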

22 pages, 6735 KiB  
Article
SFMattingNet: A Trimap-Free Deep Image Matting Approach for Smoke and Fire Scenes
by Shihui Ma, Zhaoyang Xu and Hongping Yan
Remote Sens. 2025, 17(13), 2259; https://doi.org/10.3390/rs17132259 - 1 Jul 2025
Viewed by 372
Abstract
Smoke and fire detection is vital for timely fire alarms, but traditional sensor-based methods are often slow to respond and costly. While deep learning-based methods show promise on aerial and surveillance images, the scarcity and limited diversity of smoke-and-fire-related image data hinder model accuracy and generalization. Alpha composition, which blends foreground and background using per-pixel alpha values (transparency parameters stored in the alpha channel alongside the RGB channels), can effectively augment smoke and fire image datasets. Since image matting algorithms compute these alpha values, the quality of the composition depends directly on the performance of smoke and fire matting methods. However, lacking smoke and fire matting datasets for training, existing image matting methods exhibit significant errors in predicting the alpha values of smoke and fire targets, leading to unrealistic composite images. To address these issues, this paper makes the following contributions: (1) construction of a high-precision, large-scale smoke and fire image matting dataset, SFMatting-800, whose images are sourced from diverse real-world scenarios and which provides precise foreground opacity values and attribute annotations; (2) evaluation of existing image matting baselines on SFMatting-800, covering traditional, trimap-based deep learning, and trimap-free deep learning methods, to identify their strengths and weaknesses and provide a benchmark for future smoke and fire matting methods; and (3) a deep learning-based trimap-free smoke and fire image matting network, SFMattingNet, which takes the original image as input without trimaps and, accounting for the unique characteristics of smoke and fire, incorporates a non-rigid object feature extraction module and a spatial awareness module to achieve improved performance. Compared to the next-best approach, MODNet, SFMattingNet reduced the average error by 12.65% on the smoke and fire matting task.
(This article belongs to the Special Issue Advanced AI Technology for Remote Sensing Analysis)
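
The alpha composition used for augmentation is the standard over-operator; a minimal sketch:

```python
import numpy as np

def alpha_composite(fg: np.ndarray, bg: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Per-pixel alpha composition: out = alpha * fg + (1 - alpha) * bg.
    fg/bg: (H, W, 3) float images; alpha: (H, W) matte in [0, 1], e.g. as
    predicted by a smoke/fire matting network."""
    a = alpha[..., None]  # broadcast the matte over the RGB channels
    return a * fg + (1.0 - a) * bg
```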

14 pages, 1438 KiB  
Article
CDBA-GAN: A Conditional Dual-Branch Attention Generative Adversarial Network for Robust Sonar Image Generation
by Wanzeng Kong, Han Yang, Mingyang Jia and Zhe Chen
Appl. Sci. 2025, 15(13), 7212; https://doi.org/10.3390/app15137212 - 26 Jun 2025
Viewed by 300
Abstract
The acquisition of real-world sonar data requires substantial investments of manpower, material resources, and capital, making it difficult to obtain sufficient authentic samples for sonar-related research. Consequently, sonar image simulation technology has become increasingly vital in sonar data analysis. Traditional sonar simulation methods focus predominantly on low-level physical modeling and often suffer from limited image controllability and diminished fidelity in multi-category and multi-background scenarios. To address these limitations, this paper proposes a Conditional Dual-Branch Attention Generative Adversarial Network (CDBA-GAN). The framework comprises three key innovations: a conditional information fusion module, a dual-branch attention feature fusion mechanism, and cross-layer feature reuse. By integrating encoded conditional information with the original input data of the generative adversarial network, the fusion module enables precise control over the generation of sonar images under specific conditions. A hierarchical attention mechanism sequentially performs channel-level and pixel-level attention operations, establishing distinct weight matrices at both granularities to strengthen the correlation between corresponding elements. The dual-branch attention features are fused via a skip-connection architecture, facilitating efficient feature reuse across network layers. Experimental results demonstrate that CDBA-GAN generates condition-specific sonar images with a significantly lower Fréchet inception distance (FID) than existing methods. Notably, the framework exhibits robust imaging performance under noisy interference and outperforms state-of-the-art models (e.g., DCGAN, WGAN, SAGAN) in fidelity across four categorical conditions, as quantified by FID.
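
Sequential channel-level then pixel-level attention can be sketched as a squeeze-excite block followed by a spatial gate; layer sizes and kernel choices below are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class ChannelThenPixelAttention(nn.Module):
    """Apply channel-level weights, then a pixel-level (spatial) weight map."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel = nn.Sequential(  # squeeze-and-excite style channel weights
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(  # one weight per pixel
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)    # channel-level attention
        return x * self.spatial(x)  # pixel-level attention
```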

14 pages, 5250 KiB  
Article
An Enhanced Siamese Network-Based Visual Tracking Algorithm with a Dual Attention Mechanism
by Xueying Cai, Sheng Feng, Varshosaz Masood, Senang Ying, Binchao Zhou, Wentao Jia, Jianing Yang, Canlin Wei and Yucheng Feng
Electronics 2025, 14(13), 2579; https://doi.org/10.3390/electronics14132579 - 26 Jun 2025
Viewed by 226
Abstract
Aiming at the problems of SiamFC, such as its shallow network architecture, fixed template, and lack of semantic understanding and temporal modeling, this paper proposes a robust target-tracking algorithm that incorporates both channel and spatial attention mechanisms. The backbone network adopts depthwise separable convolution to improve computational efficiency, adjusts the output stride and convolution kernel size to strengthen feature extraction, and optimizes the network structure through neural architecture search, enabling the extraction of deeper, richer features with stronger semantic information. In addition, channel attention is added to the target template branch after feature extraction so that it adaptively adjusts the weights of different feature channels. In the search region branch, a sequential combination of channel and spatial attention is introduced to model spatial dependencies among pixels and suppress background and distractor information. Finally, we evaluate the proposed algorithm on the OTB2015, VOT2018, and VOT2016 datasets. The results show that our method achieves a tracking precision of 0.631 and a success rate of 0.468, improving upon the original SiamFC by 3.4% and 1.2%, respectively. The algorithm ensures robust tracking in complex scenarios, maintains real-time performance, and further reduces both parameter count and overall computational complexity.
(This article belongs to the Special Issue Advances in Mobile Networked Systems)
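
Depthwise separable convolution, the efficiency trick named above, factors a standard convolution into a per-channel spatial filter plus a 1x1 pointwise mix; a minimal PyTorch sketch (channel counts are illustrative):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Per-channel spatial filtering (groups=in_ch) followed by a 1x1 pointwise
    convolution, cutting parameters and FLOPs versus a standard convolution."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 3, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel, stride,
                                   padding=kernel // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))
```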
