MDPI - Publisher of Open Access Journals

26 pages, 477 KB

Open Access Article

A Low-Cost RGB-D Sensing Front-End for Stable 3D Hand Landmark Reconstruction Using MediaPipe and ZED2 Stereo Depth

by Laixin Peng, Tiansheng Liu and Bingwei He

Sensors 2026, 26(12), 3730; https://doi.org/10.3390/s26123730 - 11 Jun 2026

Viewed by 218

Stable three-dimensional hand landmark reconstruction using low-cost RGB-D sensors is important for human–computer interaction, robot teleoperation, and vision-based motion analysis. RGB-based hand landmark detectors provide stable semantic 2D landmarks, but their depth output is not a metric measurement in the physical camera coordinate system. Stereo cameras can provide metric depth, but direct landmark-level back-projection is sensitive to invalid pixels, local depth holes, boundary noise, and partial occlusion. To address these problems, this paper presents a lightweight RGB-D sensing front-end that combines MediaPipe semantic hand landmarks with ZED2 stereo depth. The proposed pipeline detects 21 semantic hand landmarks in the RGB image, obtains landmark-level metric depth from the aligned ZED2 depth map using local median sampling, reconstructs 3D landmarks by camera back-projection, and further applies exponential moving average filtering and a bone-length consistency constraint. Experiments were conducted on a self-collected SVO dataset containing 13 hand actions and 26 recorded sequences, and an additional checkerboard-based reference-distance validation was performed to evaluate the metric depth sampling and 3D back-projection component. Compared with single-pixel sampling, the $5 \times 5$ local median strategy slightly increased the valid-depth ratio from 0.9731 to 0.9738 and reduced the temporal smoothness metric from 1.7163 mm to 1.6902 mm. To further justify the temporal filtering choice, an additional comparison with the 1 Euro Filter was conducted using the reconstructed win5 trajectories. The 1 Euro Filter produced stronger smoothing, reducing the temporal smoothness metric to 0.196 mm, but also reduced the path-length ratio to 0.484, indicating substantial motion attenuation. EMA0.7 was therefore retained as a more balanced setting, reducing the temporal smoothness metric to 0.826 mm while maintaining a path-length ratio of 0.803. The BL0.5 bone-length constraint reduced the bone-length standard deviation from 2.0727 mm to 1.1995 mm with limited trajectory modification. The final configuration provides a practical low-cost RGB-D front-end for stable 3D hand landmark reconstruction under controlled indoor conditions. Full article

(This article belongs to the Section Physical Sensors)
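
The sampling, back-projection, and smoothing steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nested-list depth map, the invalid-pixel rule (NaN, infinity, or non-positive), the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, and the convention that `alpha` weights the newest sample are all assumptions made for illustration.

```python
import math
from statistics import median


def sample_depth_median(depth, u, v, win=5):
    """Median of the valid depths in a win x win window centred on pixel (u, v).

    `depth` is a list of rows in metres. Pixels that are NaN, infinite, or
    non-positive are treated as invalid (an assumed rule). Returns None when
    the window contains no valid pixel.
    """
    half = win // 2
    rows, cols = len(depth), len(depth[0])
    valid = []
    for y in range(max(0, v - half), min(rows, v + half + 1)):
        for x in range(max(0, u - half), min(cols, u + half + 1)):
            z = depth[y][x]
            if math.isfinite(z) and z > 0:
                valid.append(z)
    return median(valid) if valid else None


def back_project(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth z into camera coordinates."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def ema_update(prev, new, alpha=0.7):
    """One exponential-moving-average step per coordinate.

    Here alpha weights the new sample; whether the paper's "EMA0.7" uses this
    convention is an assumption.
    """
    if prev is None:
        return new
    return tuple(alpha * n + (1 - alpha) * p for n, p in zip(new, prev))


# Usage: a depth hole exactly at the landmark pixel is bridged by its neighbours.
patch = [[0.50] * 5 for _ in range(5)]
patch[2][2] = float("nan")
z = sample_depth_median(patch, 2, 2)
point = back_project(2, 2, z, fx=700.0, fy=700.0, cx=2.0, cy=2.0)
smoothed = ema_update(None, point)
```

Single-pixel sampling would return an invalid value at the hole in this example, whereas the window median still yields a depth, which is the failure mode the abstract attributes to direct landmark-level back-projection.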

12 pages, 4149 KB

Open Access Review

Projected Augmented Reality in Surgery: History, Validation, and Future Applications

by Nikhil Dipak Shah, Lohrasb Sayadi, Peyman Kassani and Raj Vyas

J. Clin. Med. 2025, 14(22), 8246; https://doi.org/10.3390/jcm14228246 - 20 Nov 2025

Viewed by 1427

Abstract

Background/Objectives: Projected augmented reality (PAR) enables real-time projection of digital surgical information directly onto the operative field. This offers a hands-free, headset-free platform that is universally visible to all members of the surgical team. Compared to head-mounted display systems, which are limited by restricted fields of view, ergonomic challenges, and user exclusivity, PAR provides a more intuitive and collaborative surgical interface. When paired with artificial intelligence (AI), PAR has the potential to automate aspects of surgical planning and deliver high-precision guidance in both high-resource and global health settings. Our team is working on the development and validation of a PAR platform to dynamically project surgical and anatomic markings directly onto patients intraoperatively. Methods: We developed a PAR system using a structured light scanner and depth camera to generate digital 3D surface reconstructions of a patient’s anatomy. Surgical markings were then made digitally, and a projector was used to precisely project these points directly onto the patient’s skin. We also trained a machine learning model that detects cleft lip landmarks and automatically designs surgical markings, with the plan to integrate this into our PAR system. Results: The PAR system accurately projected surgeon- and AI-generated surgical markings onto anatomical models with sub-millimeter precision. Projections remained aligned during movement and were clearly visible to the entire surgical team without requiring wearable hardware. Conclusions: PAR integrated with AI provides accurate, real-time, and shared intraoperative guidance. This platform improves surgical precision and has broad potential for remote mentorship and global surgical training. Full article

(This article belongs to the Special Issue Plastic Surgery: Challenges and Future Directions)

34 pages, 11523 KB

Open Access Article

Hand Kinematic Model Construction Based on Tracking Landmarks

by Yiyang Dong and Shahram Payandeh

Appl. Sci. 2025, 15(16), 8921; https://doi.org/10.3390/app15168921 - 13 Aug 2025

Cited by 4 | Viewed by 2861

Abstract

Visual body-tracking techniques have seen widespread adoption in applications such as motion analysis, human–machine interaction, tele-robotics and extended reality (XR). These systems typically provide 2D landmark coordinates corresponding to key limb positions. However, to construct a meaningful 3D kinematic model for body joint reconstruction, a mapping must be established between these visual landmarks and the underlying joint parameters of individual body parts. This paper presents a method for constructing a 3D kinematic model of the human hand using calibrated 2D landmark-tracking data augmented with depth information. The proposed approach builds a hierarchical model in which the palm serves as the root coordinate frame, and finger landmarks are used to compute both forward and inverse kinematic solutions. Through step-by-step examples, we demonstrate how measured hand landmark coordinates are used to define the palm reference frame and solve for joint angles for each finger. These solutions are then used in a visualization framework to qualitatively assess the accuracy of the reconstructed hand motion. As future work, the proposed model offers a foundation for model-based hand kinematic estimation and has utility in scenarios involving occlusion or missing data. In such cases, the hierarchical structure and kinematic solutions can be used as generative priors in an optimization framework to estimate unobserved landmark positions and joint configurations. The novelty of this work lies in its model-based approach using real sensor data, without relying on wearable devices or synthetic assumptions. Although current validation is qualitative, the framework provides a foundation for future robust estimation under occlusion or sensor noise. It may also serve as a generative prior for optimization-based methods and be quantitatively compared with joint measurements from wearable motion-capture systems. Full article

(This article belongs to the Special Issue Human Activity Recognition (HAR) in Healthcare, 3rd Edition)
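
The idea of a palm-rooted coordinate frame with joint angles derived from finger landmarks, as described in the abstract above, can be sketched as follows. This is only one plausible construction, not the authors' model: the choice of wrist, index MCP, and pinky MCP as the defining landmarks, the axis conventions, and all function names are assumptions.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    n = math.sqrt(dot(a, a))
    if n == 0:
        raise ValueError("zero-length vector")
    return tuple(x / n for x in a)


def palm_frame(wrist, index_mcp, pinky_mcp):
    """Right-handed orthonormal frame rooted at the wrist (assumed convention).

    x points from the wrist to the index MCP, z is the palm normal, and
    y completes the frame inside the palm plane.
    """
    x_axis = normalize(sub(index_mcp, wrist))
    z_axis = normalize(cross(x_axis, sub(pinky_mcp, wrist)))
    y_axis = cross(z_axis, x_axis)
    return x_axis, y_axis, z_axis


def to_palm(point, wrist, frame):
    """Express a camera-space point in palm-frame coordinates."""
    d = sub(point, wrist)
    return tuple(dot(d, axis) for axis in frame)


def flexion_angle(parent, joint, child):
    """Angle in degrees between the bone entering a joint and the bone leaving it."""
    a = normalize(sub(joint, parent))
    b = normalize(sub(child, joint))
    c = max(-1.0, min(1.0, dot(a, b)))  # clamp against rounding error
    return math.degrees(math.acos(c))


# Usage with made-up landmark coordinates (metres, camera frame).
wrist, index_mcp, pinky_mcp = (0.0, 0.0, 0.5), (0.08, 0.0, 0.5), (0.06, 0.05, 0.5)
frame = palm_frame(wrist, index_mcp, pinky_mcp)
tip_in_palm = to_palm((0.15, 0.01, 0.48), wrist, frame)
```

A single angle between consecutive bones covers only flexion at hinge-like joints; separating flexion from abduction at the MCP joints would require projecting the bone vectors onto the palm-frame planes.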

15 pages, 1597 KB

Open Access Review

A 10-Year Retrospective Review of Prenatal Applications, Current Challenges and Future Prospects of Three-Dimensional Sonoangiography

by Tuangsit Wataganara, Thanapa Rekhawasin, Nalat Sompagdee, Sommai Viboonchart, Nisarat Phithakwatchara and Katika Nawapun

Diagnostics 2021, 11(8), 1511; https://doi.org/10.3390/diagnostics11081511 - 21 Aug 2021

Cited by 7 | Viewed by 4103

Abstract

Realistic reconstruction of angioarchitecture within the morphological landmark with three-dimensional sonoangiography (three-dimensional power Doppler; 3D PD) may augment standard prenatal ultrasound and Doppler assessments. This study aimed to (a) present a technical overview, (b) determine additional advantages, (c) identify current challenges, and (d) predict trajectories of 3D PD for prenatal assessments. PubMed and Scopus databases for the last decade were searched. Although 307 publications addressed our objectives, their heterogeneity was too broad for statistical analyses. Important findings are therefore presented in descriptive format and supplemented with the authors’ 3D PD images. Acquisition, analysis, and display techniques need to be personalized to improve the quality of flow-volume data. While 3D PD indices of the first-trimester placenta may improve the prediction of preeclampsia, research is needed to standardize the measurement protocol. In highly experienced hands, the unique 3D PD findings improve the diagnostic accuracy of placenta accreta spectrum. A lack of quality assurance is the central challenge to incorporating 3D PD in prenatal care. Machine learning may broaden clinical translations of prenatal 3D PD. Due to its operator dependency, 3D PD has low reproducibility. Until standardization and quality assurance protocols are established, its use as a stand-alone clinical or research tool cannot be recommended. Full article

(This article belongs to the Special Issue Application of 3D-Imaging in Diagnosis)

28 pages, 1401 KB

Open Access Article

An Efficient 3D Human Pose Retrieval and Reconstruction from 2D Image-Based Landmarks

by Hashim Yasin and Björn Krüger

Sensors 2021, 21(7), 2415; https://doi.org/10.3390/s21072415 - 1 Apr 2021

Cited by 7 | Viewed by 6843

Abstract

We propose an efficient and novel architecture for 3D articulated human pose retrieval and reconstruction from 2D landmarks extracted from a 2D synthetic image, an annotated 2D image, an in-the-wild real RGB image or even a hand-drawn sketch. Given 2D joint positions in a single image, we devise a data-driven framework to infer the corresponding 3D human pose. To this end, we first normalize 3D human poses from a Motion Capture (MoCap) dataset by eliminating translation, orientation, and the skeleton size discrepancies from the poses and then build a knowledge base by projecting a subset of joints of the normalized 3D poses onto 2D image-planes by fully exploiting a variety of virtual cameras. With this approach, we not only transform 3D pose space to the normalized 2D pose space but also resolve the 2D-3D cross-domain retrieval task efficiently. The proposed architecture searches for poses from a MoCap dataset that are near to a given 2D query pose in a definite feature space made up of specific joint sets. These retrieved poses are then used to construct a weak perspective camera and a final 3D posture under the camera model that minimizes the reconstruction error. To estimate unknown camera parameters, we introduce a nonlinear, two-fold method. We exploit the retrieved similar poses and the viewing directions at which the MoCap dataset was sampled to minimize the projection error. Finally, we evaluate our approach thoroughly on a large number of heterogeneous 2D examples generated synthetically, 2D images with ground truth, a variety of real in-the-wild internet images, and a proof of concept using 2D hand-drawn sketches of human poses. We conduct a pool of experiments to perform a quantitative study on the PARSE dataset. We also show that the proposed system yields competitive, convincing results in comparison to other state-of-the-art methods. Full article

(This article belongs to the Special Issue Sensors for Posture and Human Motion Recognition)
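
The normalization, projection, and retrieval steps mentioned in the abstract above can be illustrated with a simplified sketch. It is not the authors' architecture: orientation normalization and the multiple virtual cameras are omitted, scaling by the largest root-to-joint distance stands in for skeleton-size normalization, retrieval is a brute-force nearest-neighbour search, and every function name is an assumption.

```python
import math


def normalize_pose(joints, root=0):
    """Remove translation and size from a 3D pose.

    The pose is centred on the root joint and divided by the largest
    root-to-joint distance (an assumed stand-in for skeleton-size
    normalization). Orientation is left untouched in this sketch.
    """
    rx, ry, rz = joints[root]
    centred = [(x - rx, y - ry, z - rz) for x, y, z in joints]
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centred)
    if scale == 0:
        raise ValueError("degenerate pose: all joints coincide with the root")
    return [(x / scale, y / scale, z / scale) for x, y, z in centred]


def weak_perspective(joints, s=1.0, tx=0.0, ty=0.0):
    """Weak perspective projection: drop depth, then scale and translate uniformly."""
    return [(s * x + tx, s * y + ty) for x, y, _ in joints]


def nearest_pose(query_2d, database_3d):
    """Index of the database pose whose projection is closest to the 2D query."""
    def squared_error(pose):
        projected = weak_perspective(pose)
        return sum((qx - px) ** 2 + (qy - py) ** 2
                   for (qx, qy), (px, py) in zip(query_2d, projected))
    return min(range(len(database_3d)), key=lambda i: squared_error(database_3d[i]))


# Usage with two toy two-joint "poses".
database = [normalize_pose([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]),
            normalize_pose([(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)])]
query = [(0.0, 0.0), (0.1, 0.9)]
best = nearest_pose(query, database)
```

A linear scan is adequate only for small databases; a large MoCap collection would call for a spatial index over the projected joint features.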

8 pages, 8106 KB

Open Access Technical Note

Thoracic, Lumbar, and Sacral Pedicle Screw Placement Using Stryker-Ziehm Virtual Screw Technology and Navigated Stryker Cordless Driver 3: Technical Note

by Praveen Satarasinghe, Kojo D. Hamilton, Michael J. Tarver, Robert J. Buchanan and Michael T. Koltz

J. Clin. Med. 2018, 7(4), 84; https://doi.org/10.3390/jcm7040084 - 17 Apr 2018

Cited by 1 | Viewed by 6895

Abstract

Object. Utilization of pedicle screws (PS) for spine stabilization is common in spinal surgery. With reliance on visual inspection of anatomical landmarks prior to screw placement, the free-hand technique requires a high level of surgeon skill and precision. Three-dimensional (3D), computer-assisted virtual neuronavigation improves the precision of PS placement and minimization steps. Methods. Twenty-three patients with degenerative, traumatic, or neoplastic pathologies received treatment via a novel three-step PS technique that utilizes a navigated power driver in combination with virtual screw technology. (1) Following visualization of neuroanatomy using intraoperative CT, a navigated 3-mm match stick drill bit was inserted at an anatomical entry point with a screen projection showing a virtual screw. (2) A Navigated Stryker Cordless Driver with an appropriate tap was used to access the vertebral body through a pedicle with a screen projection again showing a virtual screw. (3) A Navigated Stryker Cordless Driver with an actual screw was used with a screen projection showing the same virtual screw. One hundred and forty-four consecutive screws were inserted using this three-step, navigated driver, virtual screw technique. Results. Only 1 screw needed intraoperative revision after insertion using the three-step, navigated driver, virtual PS technique. This amounts to a 0.69% revision rate. One hundred percent of patients had intraoperative CT reconstructed images taken to confirm hardware placement. Conclusions. Pedicle screw placement utilizing the Stryker-Ziehm neuronavigation virtual screw technology with a three-step, navigated power drill technique is safe and effective. Full article

(This article belongs to the Section Nuclear Medicine & Radiology)

Search Results (6)
