MDPI - Publisher of Open Access Journals

20 pages, 8158 KB

Open Access Article

IIR-PoinTr: A Framework for Enhancing Pig Body Structure in Pose Point Cloud Completion

by Faming Chang, Mengting Zhou, Zhenwei Yu, Haobo Hu, Benhai Xiong, Fuyang Tian and Xiangfang Tang

Agriculture 2026, 16(13), 1375; https://doi.org/10.3390/agriculture16131375 (registering DOI) - 24 Jun 2026

Viewed by 5

Abstract

In precision livestock farming, 3D point clouds provide important data support for analyzing pig behavior and monitoring pig health. However, due to environmental occlusions, limited sensor viewpoints, and mutual shielding between pigs, the acquired point clouds are often severely incomplete, which affects the accuracy of body shape modeling and behavior recognition. To address these challenges, this study constructed a pig pose point cloud dataset using multi-view depth camera acquisition and point cloud registration techniques. Based on this dataset, an improved point cloud completion model, IIR-PoinTr, is proposed to enhance the reconstruction of geometric and topological structures in pig bodies. By strengthening local geometric perception and high-dimensional feature representation, the model improves the reconstruction quality of partial pig point clouds and produces more structurally consistent pig body shapes. Experimental results show that, on the self-constructed pig posture dataset, the proposed method reduces Chamfer Distance (CD-L1) by 3.6%, CD-L2 by 6.9%, and Earth Mover’s Distance (EMD) by 2.0%, while improving the F-score by 5.4% compared with the baseline model. In single-view point cloud completion tasks, the method reconstructs geometrically consistent pig body structures and increases downstream classification accuracy by 34.9%. These results indicate that the proposed method can improve the reconstruction quality of partial pig point clouds and provide preliminary technical support for posture analysis under occlusion.
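
For readers unfamiliar with the CD-L1 and CD-L2 metrics quoted above, the following is a minimal brute-force sketch of a symmetric Chamfer Distance. It is not the authors' evaluation code: the function names, the toy point sets, and the normalisation (the average of the two directional means) are assumptions, and published codebases differ in how they scale the two terms.

```python
import math


def _nearest_dist(p, cloud):
    """Euclidean distance from point p to its nearest neighbour in cloud."""
    return min(math.dist(p, q) for q in cloud)


def chamfer_distance(a, b, squared=False):
    """Symmetric Chamfer Distance between two point sets.

    squared=False gives a CD-L1 style value (mean nearest-neighbour distance);
    squared=True gives a CD-L2 style value (mean squared distance).
    Assumption: the result is the average of the two directional means.
    """
    def one_way(src, dst):
        total = 0.0
        for p in src:
            d = _nearest_dist(p, dst)
            total += d * d if squared else d
        return total / len(src)

    return 0.5 * (one_way(a, b) + one_way(b, a))


# Hypothetical toy data: a "partial" cloud missing one point of the "complete" one.
partial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
complete = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

cd_l1 = chamfer_distance(partial, complete)
cd_l2 = chamfer_distance(partial, complete, squared=True)
```

The brute-force nearest-neighbour search is quadratic in the number of points, so a spatial index would be needed for clouds of realistic size.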

(This article belongs to the Special Issue Machine Learning in Precision Livestock Farming: From Animal Activity Forecasting to Environmental Control)

18 pages, 2423 KB

Open Access Article

Flexible Light Field Reconstruction: Enabling Arbitrary Sampling and Angular Resolution

by Xia Liu, Junzhen Ye, Zhangmin Wu and Qiang Fu

Electronics 2026, 15(13), 2763; https://doi.org/10.3390/electronics15132763 (registering DOI) - 23 Jun 2026

Viewed by 56

Abstract

Compared with hardware-dependent methods, light field (LF) reconstruction algorithms enable more economical and convenient acquisition of densely sampled LF (DSLF). Existing learning-based LF reconstruction methods suffer from limited flexibility, as they rely on fixed sampling patterns and predefined angular resolutions. In this paper, we propose a flexible deep learning framework that can reconstruct DSLF with arbitrary angular resolution from an arbitrary number of randomly distributed sparse input views. The proposed framework consists of two core stages, namely SAI (sub-aperture image) Synthesis and LF Refinement. The SAI Synthesis stage adopts a Plane Sweep Volume (PSV) to cope with randomly sampled input views, and leverages the Multi-Scale Attention (MSA) module to compute per-view weights for adaptive feature fusion and to support arbitrary numbers of input views. The LF Refinement stage integrates intermediate results and fully exploits LF parallax structures to further improve reconstruction quality. Experimental results demonstrate that our method achieves superior flexibility and reconstruction quality, and outperforms most state-of-the-art LF reconstruction methods.

(This article belongs to the Special Issue Computer Vision and Image Processing in Machine Learning)

37 pages, 19621 KB

Open Access Review

Unveiling the Landscape of Human Pose Estimation

by Jianjun Yang, Sankarshan Dasgupta, Wenjiao Liu, Ju Shen, Bryson R. Payne, Ying Luo, Ruixu Liu and Tam V. Nguyen

Appl. Sci. 2026, 16(12), 6242; https://doi.org/10.3390/app16126242 (registering DOI) - 22 Jun 2026

Viewed by 251

Abstract

Human pose estimation (HPE) has advanced rapidly with deep learning, enabling a transition from specialized sensing and multi-view systems toward monocular RGB-based approaches. These developments have expanded applications in healthcare, robotics, sports analytics, and human–computer interaction. However, the growing diversity of deep learning paradigms, ranging from convolutional and recurrent models to graph-based and Transformer-based approaches, has resulted in a fragmented literature, making it difficult to systematically compare methods and guide system design. This paper addresses this challenge by providing a comprehensive survey of deep learning-based monocular HPE methods published over the past decade and introducing a unified modular framework. The proposed framework organizes HPE systems into six modular estimation paradigms, including single-image-based estimation, multi-frame-based estimation, Top-Down and Bottom-Up pose estimation strategies, 2D-to-3D pose reconstruction, and direct 3D estimation. Each module is analyzed in terms of representative approaches, design trade-offs, and practical considerations, supported by algorithmic formulations that outline the computational pipeline at each stage. Unlike prior surveys that primarily catalog methods or report benchmark results in isolation, this work emphasizes how component-level design choices relate to overall system performance. The paper summarizes performance trends on benchmarks including Human3.6M, COCO, and MPII, highlighting persistent challenges such as occlusion and viewpoint variation, and outlines future research directions including interaction-aware modeling, efficient deployment, and improved robustness under real-world conditions.

(This article belongs to the Section Computing and Artificial Intelligence)

26 pages, 8518 KB

Open Access Article

CVA-Net: Multi-View 3D Reconstruction for Fringe Projection Profilometry via Cross-View Attention and Sim2Real Learning

by Zuqiong Chen, Xiaopin Zhong and Yibin Tian

Photonics 2026, 13(6), 601; https://doi.org/10.3390/photonics13060601 (registering DOI) - 21 Jun 2026

Viewed by 206

Abstract

Fringe projection profilometry (FPP) is widely used for 3D reconstruction, but conventional single-view FPP systems suffer from inherent occlusions and shadow regions, leading to incomplete surface recovery. In this study, we propose CVA-Net, an end-to-end deep learning framework with cross-view attention (CVA) that directly reconstructs dense depth maps from multi-view fringe patterns. CVA-Net simultaneously processes four fringe images acquired from orthogonal projection directions and leverages a CVA module to explicitly model inter-view dependencies, enabling adaptive fusion of complementary information. A 3D U-Net backbone with attention gates, atrous spatial pyramid pooling (ASPP), and an auxiliary parameter estimation branch further enhances reconstruction accuracy and structural consistency via multitask learning. To support Sim2Real network training, we build a Blender-based digital twin of a multi-view FPP system and generate a large-scale synthetic dataset with perfect ground truth. Extensive experiments on both synthetic and real-world objects demonstrate that CVA-Net significantly outperforms state-of-the-art single-view methods. With a symmetric four-view configuration and fringe period of 8, CVA-Net achieves an MAE of 0.0359 mm, an MSE of 0.0379 mm² and an RMSE of 0.1947 mm, reducing the MAE, MSE, and RMSE by 32.8%, 54.1%, and 32.2%, respectively, compared to the best single-view competitor. Ablation studies validate the contribution of each architectural component, while real-system experiments demonstrate the feasibility of transferring a network trained purely on synthetic data to practical FPP measurements without domain adaptation. Although further improvements are required to enhance reconstruction accuracy under real imaging conditions, the proposed framework provides an effective initial step toward bridging the gap between digital-twin-based training and real-world multi-view FPP applications. 
CVA-Net provides a robust, occlusion-aware solution for multi-view FPP reconstruction.
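
As a quick consistency check on the error figures quoted in this abstract, and assuming that the MSE and RMSE were computed over the same set of depth samples (the abstract does not say so explicitly), the two values agree to the reported precision:

```latex
\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{0.0379\ \mathrm{mm}^2} \approx 0.1947\ \mathrm{mm}
```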

(This article belongs to the Special Issue Optical Imaging for 3D Surface and Phase Recovery: Techniques and Applications)

21 pages, 20806 KB

Open Access Article

Research on Spanning Tree Topology Optimization and Pyramid-Based Fine Alignment Algorithm for Multi-View Point Cloud Registration

by Chang Deng, Pingqing Fan and Hongzhou Chen

Information 2026, 17(6), 611; https://doi.org/10.3390/info17060611 (registering DOI) - 19 Jun 2026

Viewed by 219

Abstract

Multi-view point cloud registration is a fundamental technology for 3D reconstruction and indoor robot navigation and remains a core challenge for robust environmental perception. Its key difficulty lies in achieving globally consistent alignment of multiple partially overlapping point clouds efficiently and reliably. To address the limitations of existing methods, including low registration accuracy under small overlaps, severe error accumulation in long sequences, and the difficulty of balancing computational efficiency with global consistency, this paper proposes a multi-view point cloud registration framework that integrates spanning tree-based global topology constraints with a multi-scale pyramid-based local refinement strategy, specifically validated for indoor environments. First, a Voxel-Guided Normal Consistency Keypoint Extraction (VG-NCKE) method is presented. It leverages voxel grids to guide stable computation of local geometric features and filters candidate keypoints using a neighborhood normal direction consistency metric, effectively improving keypoint repeatability and spatial uniformity on unevenly distributed point clouds. Second, a coarse registration strategy with global constraints is constructed based on the Overlap Confidence-weighted Minimum Spanning Tree (OC-WST). It quantifies inter-frame overlap reliability as edge weights and employs Prim’s algorithm to build the minimum spanning tree as the topological skeleton for global registration. By prioritizing high-overlap frame pairs, the method suppresses error propagation and reduces the complexity of multi-view registration. Additionally, a multi-scale pyramid ICP fine registration algorithm is designed. It adopts a point-to-plane error model instead of the traditional point-to-point distance metric and performs progressive optimization through a three-layer point cloud pyramid from coarse to fine. 
This expands the convergence basin and gradually improves alignment accuracy, mitigating the sensitivity of single-scale ICP to initial poses. Extensive experiments on the indoor 3DMatch dataset and real indoor LiDAR sequences demonstrate that the proposed method outperforms competing approaches in terms of registration accuracy, computational efficiency, and long-sequence robustness, validating its effectiveness for indoor multi-view point cloud registration tasks.
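
The spanning-tree step described in this abstract can be illustrated with a short sketch of Prim's algorithm over a frame graph. This is not the paper's OC-WST implementation: the function name, the toy overlap values, and the choice of edge cost as one minus the overlap confidence are all assumptions, chosen only so that high-overlap frame pairs are selected first.

```python
import heapq


def prim_mst(num_frames, overlap):
    """Return spanning-tree edges (parent, child) over frames 0..num_frames-1.

    overlap maps a frame pair (i, j) to an overlap confidence in [0, 1].
    Assumption: edge cost = 1 - confidence, so Prim's algorithm prefers
    the most reliable (highest-overlap) pairs.
    """
    adjacency = {i: [] for i in range(num_frames)}
    for (i, j), confidence in overlap.items():
        cost = 1.0 - confidence
        adjacency[i].append((cost, j))
        adjacency[j].append((cost, i))

    visited = {0}  # frame 0 is used as the reference frame (root)
    heap = [(cost, 0, j) for cost, j in adjacency[0]]
    heapq.heapify(heap)
    edges = []
    while heap and len(visited) < num_frames:
        cost, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        edges.append((u, v))
        for next_cost, w in adjacency[v]:
            if w not in visited:
                heapq.heappush(heap, (next_cost, v, w))
    return edges


# Hypothetical overlap confidences between four scans.
overlap = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.2, (2, 3): 0.7, (1, 3): 0.3}
tree = prim_mst(4, overlap)
```

In a registration pipeline, each tree edge would correspond to one pairwise alignment, and a frame's global pose would be obtained by chaining transforms along its path to the root.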

(This article belongs to the Section Information Applications)

23 pages, 11802 KB

Open Access Article

LE-DETR: A Lightweight and Efficient Model for Small-Object Detection in Remote Sensing Images

by Qi Wang, Hongyun An and Yongji Chen

Remote Sens. 2026, 18(12), 2018; https://doi.org/10.3390/rs18122018 - 17 Jun 2026

Viewed by 217

Abstract

Object detection in remote sensing imagery plays an irreplaceable role in critical fields such as military reconnaissance and disaster monitoring. However, when dealing with minute targets characterised by an extremely low pixel proportion, a lack of textural information, and severe background interference, existing algorithms still face the challenge of balancing detection accuracy with computational efficiency. To address this, this paper proposes a lightweight frequency-domain-aware end-to-end detection model, LE-DETR, based on an improved version of RT-DETR. Firstly, a Lightweight Feature Extraction Module (LFEM) is designed. Through a heterogeneous dual-path architecture and reparameterisation techniques, it significantly reduces computational complexity whilst enhancing the capture of fine-grained spatial features. Secondly, an Efficient Spatio-Frequency Fusion Module (ESFFM) is introduced. This utilises a multi-head self-attention mechanism to construct a global view whilst combining the Fourier transform to reconstruct target features from a frequency-domain perspective, thereby effectively suppressing background noise and enhancing the target’s edge signals. Finally, we propose the Efficient Frequency-Aware Fusion Feature Pyramid Network (EFAM-FPN), which utilises SPD Conv to mitigate the loss of key features during downsampling and introduces a frequency-domain attention mechanism to suppress complex background noise, thereby improving the model’s detection accuracy for extremely small objects. The experimental results show that, whilst reducing the number of parameters by 41.7% compared to the baseline model, LE-DETR achieved improvements of 2.6%, 1.7% and 2.4%, respectively, in the mAP50 metric across the three mainstream remote sensing datasets—VisDrone2019, NWPU VHR-10 and DIOR. 
This demonstrates an effective balance between detection accuracy and inference efficiency, fully validating its robustness and practical value in complex remote sensing application scenarios.

(This article belongs to the Topic Computer Vision and Image Processing, 3rd Edition)

32 pages, 8597 KB

Open Access Review

Intelligent Digital Rock Physics: Advances and Perspectives from Imaging Reconstruction to Pore-Scale Multiphase Flow Simulation

by Xue Li, Lin Zhu, Feng Gao, Xin Liang and Zhengzheng Cao

Appl. Sci. 2026, 16(12), 6118; https://doi.org/10.3390/app16126118 - 17 Jun 2026

Viewed by 246

Abstract

In characterizing unconventional reservoirs, conventional Digital Rock Physics (DRP) has long been constrained by three fundamental bottlenecks: the trade-off between imaging resolution and field of view, challenges in reconstructing multiscale pore topology, and the prohibitive computational cost of direct numerical simulation (DNS) at the pore scale. The deep integration of artificial intelligence and rock physics has given rise to a new paradigm—Intelligent Digital Rock Physics (IDRP). This paper provides a systematic review of the evolutionary trajectory of IDRP, with a focus on how machine learning is reshaping the end-to-end workflow from imaging and segmentation to reconstruction and simulation. First, we survey image super-resolution and 3D pore structure generation techniques based on convolutional neural networks (CNNs), generative adversarial networks (GANs), and diffusion models, elucidating their mechanisms for surpassing optical diffraction limits and incorporating macroscopic petrophysical constraints. Second, we outline algorithmic strategies for fusing multi-source heterogeneous data (e.g., Micro-CT and SEM) and representing dual-porosity or multi-continuum systems. Third, we critically examine the application of machine learning surrogates in single- and multiphase flow prediction, highlighting how physics-informed machine learning (PIML) and reinforcement learning (RL)—by embedding governing equations such as Navier–Stokes or Muskat–Leverett into loss functions—achieve both computational acceleration and physical consistency. We further identify key limitations of current IDRP approaches, including insufficient validation of generated topological realism, narrow generalization across lithologies, inadequate representation of dynamic wettability, and limited model interpretability. 
Finally, we propose a forward-looking roadmap centered on multimodal foundation models for rocks, coupled with neural operators and uncertainty quantification frameworks, emphasizing the critical pathways for translating IDRP into engineering digital twins for unconventional hydrocarbon development, coalbed methane production enhancement, Enhanced Geothermal Systems, and geological CO₂ storage. This review offers a comprehensive reference for researchers at the intersection of geophysics, rock mechanics, and artificial intelligence.

(This article belongs to the Section Civil Engineering)

17 pages, 1555 KB

Open Access Review

Whole-Body Dynamic Positron Emission and Computed Tomography (WBD-PET/CT): Latest Developments, Challenges and Opportunities

by Anastasios Vatalis, Dimitra Tsivaka, Varvara Valotassiou, Emmanouil Panagiotidis, Panagiotis Georgoulias, Nicolas A. Karakatsanis and Ioannis Tsougos

Diagnostics 2026, 16(12), 1866; https://doi.org/10.3390/diagnostics16121866 - 16 Jun 2026

Viewed by 263

Abstract

Whole-body dynamic positron emission tomography/computed tomography (WBD-PET/CT) has transformed medical imaging by enabling the fusion of (i) detailed anatomical maps of the human body and (ii) quantitative multi-parametric functional maps of specific biochemical and physiological processes across the human body, beyond the semi-quantitative limitations of static PET/CT imaging. Recent developments in system hardware, particularly the introduction of long-axial-field-of-view (LAFOV) and Time-of-Flight (TOF) PET scanners and low-dose CT scanners, and in data analysis, primarily direct parametric PET image reconstruction and Artificial Intelligence, offer unprecedented opportunities for wide clinical adoption of the superior quantitative accuracy and precision of WBD-PET/CT imaging by overcoming current challenges such as data acquisition complexity and long scan durations. This review aims to summarize the latest developments, current challenges, and emerging opportunities in WBD-PET/CT, emphasizing its potential to broaden the diagnostic and theranostic role of PET/CT in clinical practice.

(This article belongs to the Special Issue Whole-Body PET/CT: From Diagnosis to Prognosis)

22 pages, 1854 KB

Open Access Article

Efficient HDR Image Reconstruction: A ResNet Approach with Enhanced Data Augmentation

by Ting-Wei He, Pei-Chi Chen and Tzung-Her Chen

Electronics 2026, 15(12), 2595; https://doi.org/10.3390/electronics15122595 - 12 Jun 2026

Viewed by 192

Abstract

High dynamic range (HDR) image reconstruction from a single low dynamic range (LDR) input remains an important problem for computational photography, particularly when practical deployment on consumer-grade hardware is considered. With the increasing availability of hardware supporting HDR, public demand for capturing and viewing HDR images has grown significantly. Recent research has explored deep learning-based approaches to reconstruct HDR images from low dynamic range (LDR) inputs by extracting regional pixel features or leveraging the camera response function (CRF) for model training. Many of these approaches employ Convolutional Neural Network (CNN) architectures and utilize skip connections to preserve learned information. Nevertheless, the configuration-level effects of data augmentation in HDR reconstruction remain insufficiently discussed. Existing CNN-based approaches, such as HDRCNN, HDRUNet, and ExpandNet, have demonstrated promising reconstruction ability, but they may involve a heavy backbone architecture, a long training time, or a limited discussion of how preprocessing configurations affect reconstruction performance. This study presents an engineering-oriented HDR reconstruction framework derived from HDRCNN, focusing on practical efficiency, structural fidelity, and training feasibility. The proposed framework introduces three modifications: (1) a configuration-level comparison of composite data augmentation settings, including unsharp masking, denoising, Gaussian blur, and brightness–contrast adjustment; (2) the replacement of the original VGG16 backbone with a ResNet50-based encoder enhanced with attention blocks and squeeze-and-excitation (SE) blocks for improved multi-scale feature extraction and channel-wise recalibration; and (3) the integration of mixed-precision training with cosine annealing learning-rate scheduling to reduce computational cost. 
Experimental results on the SI-HDR dataset show that the best composite augmentation configuration improves PSNR from 19.05 dB to 22.10 dB and SSIM from 0.6444 to 0.7714 without increasing the training time. Compared with the original VGG16-based HDRCNN setting, the ResNet50-based model reduces training time while improving SSIM from 0.2705 to 0.8512. Under the adopted comparison protocol, the proposed model achieves the shortest training time and slightly higher PSNR than HDRUNet, while HDRUNet retains a higher SSIM. This indicates a trade-off among pixel-wise fidelity, structural similarity, and computational efficiency. The current evaluation is limited by a small test setting, composite rather than operation-level augmentation analysis, and the use of PSNR and SSIM only; therefore, future work should include full benchmark evaluation, additional perceptual/HDR-specific metrics, and controlled component-level ablation studies.
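
To make the PSNR figures above easier to interpret, here is a minimal sketch of the standard PSNR definition. It is not the authors' evaluation script: the function name, the flat pixel lists, and the peak value of 1.0 are assumptions, and HDR evaluations often compute PSNR after tone mapping, which this sketch does not do.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixel values are normalised so that `peak` is the maximum
    possible value (1.0 here).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical four-pixel example.
reference = [0.0, 0.5, 1.0, 0.5]
estimate = [0.1, 0.5, 0.9, 0.5]
value_db = psnr(reference, estimate)
```

Under this definition, the reported gain from 19.05 dB to 22.10 dB (3.05 dB) corresponds to reducing the mean squared error by a factor of roughly 2.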

(This article belongs to the Special Issue Computer Vision and Image Processing in Machine Learning)

32 pages, 25468 KB

Open Access Article

MLE-ResUNet: SWIR Image Super-Resolution Using Along-Track Oversampling and Visible-Light-Guided Deep Learning

by Yongqian Zhu, Bo Cheng, Qianmin Liu, Zhijing He, Tianzhen Ma, Chen Cao, Bangjian Zhao, Miao Hu, Xianqiang He and Chunlai Li

Remote Sens. 2026, 18(12), 1922; https://doi.org/10.3390/rs18121922 (registering DOI) - 10 Jun 2026

Viewed by 166

Abstract

Shortwave infrared (SWIR) imagery plays an important role in land–water boundary delineation, coastal monitoring, and complex aquatic environment observation. However, the spatial resolution of SWIR bands is usually lower than that of visible bands, which limits their capability to represent fine-scale targets and boundary structures. To address this problem, this study proposes MLE-ResUNet, a SWIR image super-resolution method that integrates along-track oversampling with visible-light-guided deep learning. The proposed method first exploits dual-view SWIR observations with sub-pixel displacement generated by increasing the sampling line rate in the push-broom imaging process. A maximum likelihood estimation (MLE)-based physical prior module is then introduced to transform multi-view degraded observations into a physically consistent latent high-resolution prior. Finally, high-resolution visible images are used to provide edge, texture, and structural guidance, and a ResUNet-based network is employed for multi-source feature fusion and residual reconstruction. Based on multi-region measured data acquired by the LHRSI (Lightweight High-Resolution Spectral Imager) payload onboard the BlueCarbon-1A satellite, a SWIR super-resolution dataset covering typical urban, farmland, and coastal scenarios was constructed. Comparative experiments were conducted against PCA, BDSD, PanNet, GPPNN, and two additional lightweight-guided deep learning baselines, namely LGPConv and a CANConv-style visible-guided baseline. The results show that MLE-ResUNet achieves the best performance across different scenarios and consistently outperforms the comparison methods in terms of SSIM, SAM, ERGAS, and Q-index. The proposed method effectively enhances spatial detail recovery while maintaining favorable spectral consistency. 
Ablation experiments further demonstrate that both along-track oversampling information and the MLE-based physical prior contribute to improved reconstruction quality and more stable training convergence. These findings indicate that the proposed method can enhance fine-scale SWIR observation capability without substantially increasing hardware complexity, providing an effective technical solution for shoreline identification, land–water boundary extraction, and complex surface target monitoring.

(This article belongs to the Special Issue Advanced Object Detection, Classification and Recognition in VIR Optical and SAR Remote Sensing Imagery)

21 pages, 5157 KB

Open Access Article

3D Quantitative Modeling for Stone Fruit Quality Assessment by LF-NMRI

by Kang Wang, Bing Li, Shan Zeng, Wei Tao, Ke Yang and Zhiguang Yang

Foods 2026, 15(11), 2012; https://doi.org/10.3390/foods15112012 - 4 Jun 2026

Viewed by 264

Abstract

The core volume ratio (CVR) is a key indicator for evaluating the proportion of edible fraction in stone fruits. Traditionally, CVR is determined through destructive sampling by separately measuring the masses of the core and entire fruit. Recently, low-field nuclear magnetic resonance imaging (LF-NMRI) has been introduced as a non-destructive alternative, but its sparse sampling limits the ability to achieve accurate spatial and volumetric quantification of fruit quality. To address this limitation, we propose a novel method for high-precision three-dimensional (3D) modeling of stone fruits. The method acquires tomographic LF-NMRI sequences along three orthogonal axes. Each sequence is segmented into pulp and core regions using a SwinUNet deep learning model and converted into point clouds for each view. Point clouds from the three orthogonal views are registered via a genetic algorithm to align structural information from complementary perspectives and fused into a unified 3D model through Poisson surface reconstruction. Using prunes as a representative case, the method enables accurate quantification of core and entire fruit volumes, achieving a CVR estimation with a mean absolute error of 0.13% compared to manual measurements. The proposed three-view reconstruction strategy yields a volumetric error of only 0.73%, significantly outperforming single-view (4.57%) and dual-view (3.73%) approaches. This technology provides a robust and accurate non-destructive solution for 3D internal quality analysis of fruits.

(This article belongs to the Special Issue Rapid and Non-Destructive Detection Technology for Food Quality and Safety)

21 pages, 1323 KB

Open Access Article

Global-Local Complementary Fusion: Unsupervised Graph Anomaly Detection via Diffusion Reconstruction and Contrastive Learning

by Ruibin Hu, Qian Chen, Huiying Xu, Ruidong Wang, Huazhen Jin, Xiao Huang and Xinzhong Zhu

Symmetry 2026, 18(6), 968; https://doi.org/10.3390/sym18060968 - 3 Jun 2026

Viewed by 179

Abstract

Anomaly detection on attributed graphs is essential for scientific integrity, cybersecurity, and financial oversight, where abnormal patterns often manifest as breaks in structure or attributes. However, existing unsupervised methods struggle to combine global and local perspectives when detecting anomalies. To address this issue, we propose DCGAD, a unified unsupervised framework that captures anomalies by fusing global reconstruction error and local view inconsistency. Our model leverages diffusion reconstruction to strengthen global semantic information, employing two parallel autoencoders to reconstruct the graph structure based on the original features and diffusion-enhanced features, respectively, to capture global structural differences. Complementarily, the model samples two local subgraph views per target node and uses multi-view contrastive learning to evaluate local contextual inconsistencies. By jointly optimizing these two complementary objectives, the proposed model uses local and global information collaboratively. Extensive experiments on six real-world graph datasets show that DCGAD outperforms other state-of-the-art approaches, achieving excellent scores on citation networks and significant gains on social and collaborative platforms.

(This article belongs to the Section Computer)

16 pages, 4116 KB

Open Access Article

Repowering Without Removal: Field-Verified Multi-Year Outdoor Storage of Damaged Photovoltaic Modules on Agricultural Land in Czechia

by Martin Kozelka, Vladislav Poulek, Václav Beránek and Tomáš Finsterle

Sustainability 2026, 18(11), 5632; https://doi.org/10.3390/su18115632 - 2 Jun 2026

Viewed by 359

Abstract

Ground-mounted photovoltaic (PV) plants generate discrete end-of-life waste streams during repowering/revamping, yet damaged modules do not always leave the site. We document two field-verified case studies from Czechia, in which damaged PV modules remained stored outdoors on agricultural land after repowering/revamping. The two sites are treated as illustrative, field-verified cases rather than as a statistically representative sample of PV plants in Czechia or Europe. The sites were first identified during field visits in summer 2025, and a retrospective review of public CUZK orthophoto time series was then used to reconstruct when the stockpiles first became visible and whether they were still present in the latest available imagery. The stored module piles first became visible in 2022 and 2021 at the two sites, and were still present in summer 2025, corresponding to a minimum confirmed persistence of about 3 and 4 years, respectively. Orthophoto-based GIS supported by field photographs was used to quantify the land parcel area (19,560 and 22,100 m²), PV plan-view area (4960 and 5080 m²), storage footprint (109 and 100 m²), approximate stored module count (~1800 and ~2000), and stored mass (39.6 and 36.0 t). Using site-specific module footprints and a representative 30-module stack, the local stack-based pressures were calculated to be 3.92 and 3.26 kPa, respectively. Soil chemistry, leachate, and groundwater were not measured; therefore, the environmental implications should be interpreted as precautionary risk and as a need for monitoring, not as measured contamination at the two sites. The study shows that repowering/revamping can create a multi-year gap between module replacement and actual site clearance, during which recycling and final disposal are effectively delayed.
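The stack-pressure figures can be cross-checked from the numbers in the abstract. The sketch below assumes pressure is the weight of a 30-module stack divided by one module footprint, and uses standard gravity of 9.81 m/s²; neither assumption is stated in the abstract. Because the module footprints are not quoted, the code back-calculates the footprint implied by each reported pressure. The module counts are approximate, so the results are approximate as well.

```python
G = 9.81  # m/s^2, standard gravity (assumed; the abstract does not state it)


def implied_module_footprint(stored_mass_t, module_count, stack_size, pressure_kpa):
    """Return (mass per module in kg, implied footprint in m^2).

    Assumes pressure = stack weight / footprint of a single module.
    """
    module_mass_kg = stored_mass_t * 1000.0 / module_count
    stack_weight_n = module_mass_kg * stack_size * G
    footprint_m2 = stack_weight_n / (pressure_kpa * 1000.0)
    return module_mass_kg, footprint_m2


# First-listed site: 39.6 t, ~1800 modules, 3.92 kPa.
site_a = implied_module_footprint(39.6, 1800, 30, 3.92)
# Second-listed site: 36.0 t, ~2000 modules, 3.26 kPa.
site_b = implied_module_footprint(36.0, 2000, 30, 3.26)

for label, (mass, area) in (("site A", site_a), ("site B", site_b)):
    print(f"{label}: {mass:.1f} kg per module, implied footprint {area:.2f} m^2")
```

Under these assumptions both sites imply a module footprint between 1.6 and 1.7 m², which suggests the reported masses, counts, and pressures are mutually consistent.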

(This article belongs to the Section Environmental Sustainability and Applications)

16 pages, 1930 KB

Open Access Article

Optimal Camera Positioning for Single-View 3D Foot Scan Completion: Evaluation Using Deep Learning-Based Reconstruction

by Matthias Jäger, Jörg Eberhardt and Douglas W. Cunningham

Appl. Syst. Innov. 2026, 9(6), 119; https://doi.org/10.3390/asi9060119 - 2 Jun 2026

Viewed by 462

Abstract

Shoes are increasingly bought online without being tried on in person as internet shopping gains popularity. As a result, returns have increased significantly, with negative effects on the economy and the environment. Numerous technologies are available to measure foot size precisely at home or in-store in order to address this problem. People can identify their correct shoe size and avoid needless returns by taking accurate foot measurements. To make the system as easy as possible for the user, a single image should be enough to measure the foot. This is accomplished by using point clouds from one side of the foot, which are produced by capturing a depth image. To optimise the reconstruction of partial data, this study investigates the impact of the acquisition position of a single partial foot scan on reconstruction quality and measurement accuracy when a state-of-the-art network is employed for completion. To this end, task-specific partial foot datasets were created with varying camera positions and foot orientations to determine the optimal conditions for depth map acquisition. Trained and evaluated on this foot dataset, the network was able to generate accurate reconstructions. These reconstructions allowed for the estimation of shoe size in accordance with the European sizing system. The method reconstructs the foot with sufficient precision in all tested positions. However, position 5 in our multi-view setup, which views the foot from a lower angle, led to the best reconstruction results. Additionally, input data that show more of the forefoot than the heel area proved advantageous. This suggests that the forefoot provides more information on the overall geometry and should be the focus of single-shot procedures.

27 pages, 39300 KB

Open Access Article

Multi-Frame Temporal Integration for 3-D Shape Measurement of Freely Falling Small Objects Using a High-Speed Camera Array

by Hao Duan, Shaopeng Hu, Feiyue Wang, Kohei Shimasaki and Idaku Ishii

Sensors 2026, 26(11), 3457; https://doi.org/10.3390/s26113457 - 30 May 2026

Viewed by 269

Abstract

Dynamic three-dimensional (3-D) reconstruction of small objects moving at high speed is fundamentally limited by the number of viewpoints that a fixed camera array can provide at any single time instant. When the camera count is insufficient, single-frame multi-view stereo produces incomplete or inaccurate geometry. This paper proposes a multi-frame temporal integration approach that overcomes this limitation by exploiting the rigid-body assumption: because a falling object maintains its shape across consecutive frames, images captured at different time instants can be combined into a single, viewpoint-enriched reconstruction. A three-layer circular array of 32 synchronized RGB cameras captures 1440 × 1080 images at 160 fps, and a free-fall-oriented algorithm automatically detects active frames, selects informative temporal windows, and feeds the accumulated multi-frame images into a structure-from-motion and multi-view stereo (SfM-MVS) pipeline, effectively multiplying the number of viewpoints without additional hardware. The algorithm simultaneously recovers the 6-DOF pose trajectory of each object from the SfM-estimated camera parameters. Progressive accumulation experiments on freely falling soybeans (approximately 9–10 mm diameter) show that a single 32-camera frame already achieves an F-score exceeding 0.97 at a 0.5 mm threshold against an industrial structured-light scanner reference, and that accumulating additional temporal frames converges to a stable plateau, with both objects reaching an F-score of 0.984. Beyond approximately one to two accumulated frames, additional frames yield diminishing returns, confirming that a small number of temporal frames is sufficient for convergent sub-millimeter accuracy. Across 30 independent free-fall trials with three objects, the system achieves an overall mean error of $0.146 \pm 0.033$ mm (a mean relative error of approximately 1.6% on 8–10 mm targets) and an overall F-score of $0.980 \pm 0.006$, and fine surface features such as structural cracks are resolved at a fidelity sufficient for visual defect identification. These results establish rigid-body multi-frame temporal integration as an effective strategy for high-throughput, non-contact 3-D inspection of small objects in motion.
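The abstract reports F-scores at a 0.5 mm threshold without giving the formula. The sketch below uses the definition common in reconstruction benchmarks, the harmonic mean of precision and recall over nearest-neighbour distances; it is an assumption that the paper uses this exact form. The brute-force search and the toy coordinates are for illustration only and would be too slow for real scans.

```python
import math


def nearest_distance(point, cloud):
    """Euclidean distance from point to its nearest neighbour in cloud."""
    return min(math.dist(point, other) for other in cloud)


def f_score(reconstructed, reference, threshold):
    """Harmonic mean of precision and recall at a distance threshold."""
    # Precision: share of reconstructed points close to the reference.
    precision = sum(
        nearest_distance(p, reference) <= threshold for p in reconstructed
    ) / len(reconstructed)
    # Recall: share of reference points covered by the reconstruction.
    recall = sum(
        nearest_distance(q, reconstructed) <= threshold for q in reference
    ) / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy clouds in millimetres (made-up values); one reconstructed point is 0.9 mm off.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
reconstructed = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.2), (2.0, 0.0, 0.9), (3.0, 0.0, 0.3)]

score = f_score(reconstructed, reference, threshold=0.5)
```

In this toy case three of four points pass in each direction, so precision and recall are both 0.75 and the F-score is 0.75.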

(This article belongs to the Special Issue Visual Sensing Methods for 3D Object Detection, Tracking, and Quantification)

Search Results (554)
