89 Results Found

  • Article
  • Open Access
4,010 Views
18 Pages

17 September 2023

Realistic fluid models play an important role in computer graphics applications. However, efficiently reconstructing volumetric fluid flows from monocular videos remains challenging. In this work, we present a novel approach for reconstructing 3D flo...

  • Article
  • Open Access
13 Citations
4,212 Views
19 Pages

Feasibility of 3D Body Tracking from Monocular 2D Video Feeds in Musculoskeletal Telerehabilitation

  • Carolina Clemente,
  • Gonçalo Chambel,
  • Diogo C. F. Silva,
  • António Mesquita Montes,
  • Joana F. Pinto and
  • Hugo Plácido da Silva

29 December 2023

Musculoskeletal conditions affect millions of people globally; however, conventional treatments pose challenges concerning price, accessibility, and convenience. Many telerehabilitation solutions offer an engaging alternative but rely on complex hard...

  • Article
  • Open Access
501 Views
26 Pages

Semantic-Guided Spatial and Temporal Fusion Framework for Enhancing Monocular Video Depth Estimation

  • Hyunsu Kim,
  • Yeongseop Lee,
  • Hyunseong Ko,
  • Junho Jeong and
  • Yunsik Son

24 December 2025

Despite advancements in deep learning-based Monocular Depth Estimation (MDE), applying these models to video sequences remains challenging due to geometric ambiguities in texture-less regions and temporal instability caused by independent per-frame i...

  • Article
  • Open Access
1 Citation
1,147 Views
24 Pages

6 October 2025

Self-supervised monocular depth estimation from oblique UAV videos is crucial for enabling autonomous navigation and large-scale mapping. However, existing self-supervised monocular depth estimation methods face key challenges in UAV oblique video sc...

  • Article
  • Open Access
2 Citations
3,416 Views
17 Pages

17 June 2022

Monocular depth estimation is a fundamental yet challenging task in computer vision as depth information will be lost when 3D scenes are mapped to 2D images. Although deep learning-based methods have led to considerable improvements for this task in...

  • Article
  • Open Access
2 Citations
1,988 Views
22 Pages

24 September 2025

Multi-view data, captured from various perspectives, is crucial for training view-invariant human action recognition models, yet its acquisition is hindered by spatio-temporal constraints and high costs. This study aims to develop the Pose Scene Ever...

  • Article
  • Open Access
3,379 Views
22 Pages

29 September 2025

The rapid evolution of autonomous vehicle technologies has amplified the need for crash detection that operates robustly under complex traffic conditions with minimal latency. We propose a hybrid temporal hierarchy that augments a Region-based Convol...

  • Article
  • Open Access
719 Views
21 Pages

8 November 2025

Thermal cameras are known for their ability to overcome lighting constraints and provide reliable thermal radiation images. This capability facilitates methods for depth and ego-motion estimation, enabling efficient learning of poses and scene struct...

  • Article
  • Open Access
4 Citations
2,715 Views
14 Pages

Learning Effective Geometry Representation from Videos for Self-Supervised Monocular Depth Estimation

  • Hailiang Zhao,
  • Yongyi Kong,
  • Chonghao Zhang,
  • Haoji Zhang and
  • Jiansen Zhao

Recent studies on self-supervised monocular depth estimation have achieved promising results, which are mainly based on the joint optimization of depth and pose estimation via high-level photometric loss. However, how to learn the latent and benefici...

  • Feature Paper
  • Article
  • Open Access
1 Citation
3,781 Views
18 Pages

27 August 2021

MonoMR is a system that synthesizes pseudo-2.5D content from monocular videos for mixed reality (MR) head-mounted displays (HMDs). Unlike conventional systems that require multiple cameras, the MonoMR system can be used by casual end-users to generat...

  • Article
  • Open Access
1 Citation
1,682 Views
14 Pages

Improving Monocular Camera Localization for Video-Based Three-Dimensional Outer Ear Reconstruction Tasks

  • Mantas Tamulionis,
  • Artūras Serackis,
  • Kęstutis Bartnykas,
  • Darius Miniotas,
  • Šarūnas Mikučionis,
  • Raimond Laptik,
  • Andrius Ušinskas and
  • Dalius Matuzevičius

28 July 2023

This work addresses challenges related to camera 3D localization while reconstructing a 3D model of an ear. This work explores the potential solution of using a cap, specifically designed not to obstruct the ear, and its efficiency in enhancing the c...

  • Article
  • Open Access
31 Citations
5,827 Views
16 Pages

4 August 2020

Modern visual SLAM (vSLAM) algorithms take advantage of computer vision developments in image processing and in interest point detectors to create maps and trajectories from camera images. Different feature detectors and extractors have been evaluate...

  • Article
  • Open Access
1,100 Views
24 Pages

Milepost-to-Vehicle Monocular Depth Estimation with Boundary Calibration and Geometric Optimization

  • Enhua Zhang,
  • Tao Ma,
  • Handuo Yang,
  • Jiaqi Li,
  • Zhiwei Xie and
  • Zheng Tong

29 August 2025

Milepost-assisted positioning estimates the distance between a vehicle-mounted camera and a milepost as a reference position for autonomous driving. However, the accuracy of monocular metric depth estimation is compromised by camera installation angl...

  • Article
  • Open Access
3 Citations
2,603 Views
16 Pages

An Unsupervised Monocular Visual Odometry Based on Multi-Scale Modeling

  • Henghui Zhi,
  • Chenyang Yin,
  • Huibin Li and
  • Shanmin Pang

11 July 2022

Unsupervised deep learning methods have shown great success in jointly estimating camera pose and depth from monocular videos. However, previous methods mostly ignore the importance of multi-scale information, which is crucial for pose estimation and...

  • Article
  • Open Access
4 Citations
2,155 Views
13 Pages

Monocular 3D Tooltip Tracking in Robotic Surgery—Building a Multi-Stage Pipeline

  • Sanjeev Narasimhan,
  • Mehmet Kerem Turkcan,
  • Mattia Ballo,
  • Sarah Choksi,
  • Filippo Filicori and
  • Zoran Kostic

Tracking the precise movement of surgical tools is essential for enabling automated analysis, providing feedback, and enhancing safety in robotic-assisted surgery. Accurate 3D tracking of surgical tooltips is challenging to implement when using monoc...

  • Article
  • Open Access
23 Citations
645 Views
14 Pages

Video eye trackers rely on the position of the pupil centre. However, the pupil centre can shift when the pupil size changes. This pupillary artefact is investigated for binocular vergence accuracy (i.e., fixation disparity) in near vision where the...

  • Article
  • Open Access
360 Views
24 Pages

26 November 2025

Three-dimensional (3D) reconstruction is increasingly being adopted in construction site management. While most existing studies rely on auxiliary equipment such as LiDAR and depth cameras, monocular depth estimation offers broader applicability unde...

  • Article
  • Open Access
13 Citations
10,082 Views
18 Pages

Accuracy Evaluation of 3D Pose Reconstruction Algorithms Through Stereo Camera Information Fusion for Physical Exercises with MediaPipe Pose

  • Sebastian Dill,
  • Arjang Ahmadi,
  • Martin Grimmer,
  • Dennis Haufe,
  • Maurice Rohr,
  • Yanhua Zhao,
  • Maziar Sharbafi and
  • Christoph Hoog Antink

4 December 2024

In recent years, significant research has been conducted on video-based human pose estimation (HPE). While monocular two-dimensional (2D) HPE has been shown to achieve high performance, monocular three-dimensional (3D) HPE poses a more challenging pr...

  • Article
  • Open Access
715 Views
18 Pages

Adaptive Measurement of Space Target Separation Velocity Based on Monocular Vision

  • Haifeng Zhang,
  • Han Ai,
  • Zeyu He,
  • Delian Liu,
  • Jianzhong Cao and
  • Chao Mei

Spacecraft separation safety is the key characteristic of flight safety. Obtaining the velocity and distance curves of spacecraft and booster at the separation time is at the core of separation safety analysis. In order to solve the separation veloci...

  • Article
  • Open Access
1 Citation
3,224 Views
15 Pages

Lightweight Three-Dimensional Pose and Joint Center Estimation Model for Rehabilitation Therapy

  • Yeonggwang Kim,
  • Giwon Ku,
  • Chulseung Yang,
  • Jeonggi Lee and
  • Jinsul Kim

16 October 2023

In this study, we proposed a novel transformer-based model with independent tokens for estimating three-dimensional (3D) human pose and shape from monocular videos, specifically focusing on its application in rehabilitation therapy. The main objectiv...

  • Article
  • Open Access
8 Citations
6,074 Views
23 Pages

24 February 2016

A new approach to the monocular simultaneous localization and mapping (SLAM) problem is presented in this work. Data obtained from additional bearing-only sensors deployed as wearable devices is fully fused into an Extended Kalman Filter (EKF). The w...

  • Article
  • Open Access
4 Citations
3,864 Views
18 Pages

Real-Time Interaction for 3D Pixel Human in Virtual Environment

  • Haoke Deng,
  • Qimeng Zhang,
  • Hongyu Jin and
  • Chang-Hun Kim

11 January 2023

Conducting realistic interactions while communicating efficiently in online conferences is highly desired but challenging. In this work, we propose a novel pixel-style virtual avatar for interacting with virtual objects in virtual conferences that ca...

  • Article
  • Open Access
5 Citations
3,776 Views
22 Pages

G2O-Pose: Real-Time Monocular 3D Human Pose Estimation Based on General Graph Optimization

  • Haixun Sun,
  • Yanyan Zhang,
  • Yijie Zheng,
  • Jianxin Luo and
  • Zhisong Pan

30 October 2022

Monocular 3D human pose estimation is used to calculate a 3D human pose from monocular images or videos. It still faces some challenges due to the lack of depth information. Traditional methods have tried to disambiguate it by building a pose diction...

  • Article
  • Open Access
9 Citations
4,133 Views
15 Pages

As difficult vision-based tasks like object detection and monocular depth estimation are making their way in real-time applications and as more light weighted solutions for autonomous vehicles navigation systems are emerging, obstacle detection and c...

  • Article
  • Open Access
18 Citations
4,353 Views
13 Pages

18 April 2019

Pedestrian flow statistics and analysis in public places is an important means to ensure urban safety. However, in recent years, a video-based pedestrian flow statistics algorithm mainly relies on binocular vision or a vertical downward camera, which...

  • Article
  • Open Access
47 Citations
8,167 Views
20 Pages

Analysis of Statistical and Artificial Intelligence Algorithms for Real-Time Speed Estimation Based on Vehicle Detection with YOLO

  • Héctor Rodríguez-Rangel,
  • Luis Alberto Morales-Rosales,
  • Rafael Imperial-Rojo,
  • Mario Alberto Roman-Garay,
  • Gloria Ekaterine Peralta-Peñuñuri and
  • Mariana Lobato-Báez

11 March 2022

Automobiles have increased urban mobility, but traffic accidents have also increased. Therefore, road safety is a significant concern involving academics and government. Transit studies are the main supply for studying road accidents, congestion, and...

  • Article
  • Open Access
18 Citations
4,744 Views
27 Pages

11 April 2022

Accurate and reliable tracking of multi-pedestrian is of great importance for autonomous driving, human-robot interaction and video surveillance. Since different scenarios have different best-performing sensors, sensor fusion perception plans are bel...

  • Article
  • Open Access
3 Citations
7,028 Views
22 Pages

Motion Capture Research: 3D Human Pose Recovery Based on RGB Video Sequences

  • Xin Min,
  • Shouqian Sun,
  • Honglie Wang,
  • Xurui Zhang,
  • Chao Li and
  • Xianfu Zhang

2 September 2019

Using video sequences to restore 3D human poses is of great significance in the field of motion capture. This paper proposes a novel approach to estimate 3D human action via end-to-end learning of deep convolutional neural network to calculate the pa...

  • Article
  • Open Access
1 Citation
2,220 Views
25 Pages

Three-dimensional human pose estimation from monocular video remains challenging for clinical gait analysis due to high computational cost and the need for temporal consistency. We present Pose3DM, a bidirectional Mamba-based state-space framework th...

  • Article
  • Open Access
573 Views
21 Pages

E-Sem3DGS: Monocular Human and Scene Reconstruction via Event-Aided Semantic 3DGS

  • Xiaoting Yin,
  • Hao Shi,
  • Kailun Yang,
  • Jiajun Zhai,
  • Shangwei Guo and
  • Kaiwei Wang

27 December 2025

Reconstructing animatable humans, together with their surrounding static environments, from monocular, motion-blurred videos is still challenging for current neural rendering methods. Existing monocular human reconstruction approaches achieve impress...

  • Article
  • Open Access
32 Citations
9,090 Views
29 Pages

Monocular Stereo Measurement Using High-Speed Catadioptric Tracking

  • Shaopeng Hu,
  • Yuji Matsumoto,
  • Takeshi Takaki and
  • Idaku Ishii

9 August 2017

This paper presents a novel concept of real-time catadioptric stereo tracking using a single ultrafast mirror-drive pan-tilt active vision system that can simultaneously switch between hundreds of different views in a second. By accelerating video-sh...

  • Article
  • Open Access
4 Citations
3,674 Views
17 Pages

Unsupervised Learning of Depth from Monocular Videos Using 3D-2D Corresponding Constraints

  • Fusheng Jin,
  • Yu Zhao,
  • Chuanbing Wan,
  • Ye Yuan and
  • Shuliang Wang

1 May 2021

Depth estimation can provide tremendous help for object detection, localization, path planning, etc. However, the existing methods based on deep learning have high requirements on computing power and often cannot be directly applied to autonomous mov...

  • Article
  • Open Access
16 Citations
6,576 Views
18 Pages

Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos

  • Amal El Kaid,
  • Denis Brazey,
  • Vincent Barra and
  • Karim Baïna

28 May 2022

Two-dimensional (2D) multi-person pose estimation and three-dimensional (3D) root-relative pose estimation from a monocular RGB camera have made significant progress recently. Yet, real-world applications require depth estimations and the ability to...

  • Article
  • Open Access
14 Citations
448 Views
18 Pages

The inability of current video-based eye trackers to reliably detect very small eye movements has led to confusion about the prevalence or even the existence of monocular microsaccades (small, rapid eye movements that occur in only one eye at a time)...

  • Article
  • Open Access
11 Citations
3,589 Views
12 Pages

Farm Vehicle Following Distance Estimation Using Deep Learning and Monocular Camera Images

  • Saeed Arabi,
  • Anuj Sharma,
  • Michelle Reyes,
  • Cara Hamann and
  • Corinne Peek-Asa

2 April 2022

This paper presents a comprehensive solution for distance estimation of the following vehicle solely based on visual data from a low-resolution monocular camera. To this end, a pair of vehicles were instrumented with real-time kinematic (RTK) GPS, an...

  • Article
  • Open Access
12 Citations
5,199 Views
13 Pages

A 3DCNN-LSTM Multi-Class Temporal Segmentation for Hand Gesture Recognition

  • Letizia Gionfrida,
  • Wan M. R. Rusli,
  • Angela E. Kedgley and
  • Anil A. Bharath

This paper introduces a multi-class hand gesture recognition model developed to identify a set of hand gesture sequences from two-dimensional RGB video recordings, using both the appearance and spatiotemporal parameters of consecutive frames. The cla...

  • Article
  • Open Access
16 Citations
8,853 Views
34 Pages

10 November 2022

Player pose estimation is particularly important for sports because it provides more accurate monitoring of athlete movements and performance, recognition of player actions, analysis of techniques, and evaluation of action execution accuracy. All of...

  • Article
  • Open Access
12 Citations
6,765 Views
21 Pages

13 June 2024

Human pose estimation (HPE) is a technique used in computer vision and artificial intelligence to detect and track human body parts and poses using images or videos. Widely used in augmented reality, animation, fitness applications, and surveillance,...

  • Article
  • Open Access
10 Citations
4,550 Views
19 Pages

RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry

  • Claudio Cimarelli,
  • Hriday Bavle,
  • Jose Luis Sanchez-Lopez and
  • Holger Voos

30 March 2022

Unsupervised learning for monocular camera motion and 3D scene understanding has gained popularity over traditional methods, which rely on epipolar geometry or non-linear optimization. Notably, deep learning can overcome many issues of monocular visi...

  • Article
  • Open Access
4 Citations
2,890 Views
25 Pages

11 December 2023

One motivation for studying semi-supervised techniques for human pose estimation is to compensate for the lack of variety in curated 3D human pose datasets by combining labeled 3D pose data with readily available unlabeled video data—effectivel...

  • Article
  • Open Access
122 Citations
21,590 Views
39 Pages

Human Pose Estimation from Monocular Images: A Comprehensive Survey

  • Wenjuan Gong,
  • Xuena Zhang,
  • Jordi Gonzàlez,
  • Andrews Sobral,
  • Thierry Bouwmans,
  • Changhe Tu and
  • El-hadi Zahzah

25 November 2016

Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation ca...

  • Article
  • Open Access
8 Citations
6,150 Views
14 Pages

21 September 2018

We present an occlusion-aware unsupervised neural network for jointly learning three low-level vision tasks from monocular videos: depth, optical flow, and camera motion. The system consists of three different predicting sub-networks simultaneously c...

  • Article
  • Open Access
12 Citations
4,667 Views
14 Pages

Deep Forest-Based Monocular Visual Sign Language Recognition

  • Qifan Xue,
  • Xuanpeng Li,
  • Dong Wang and
  • Weigong Zhang

12 May 2019

Sign language recognition (SLR) is a bridge linking the hearing impaired and the general public. Some SLR methods using wearable data gloves are not portable enough to provide daily sign language translation service, while visual SLR is more flexible...

  • Article
  • Open Access
7 Citations
2,035 Views
20 Pages

23 May 2024

Rockfall intrusion detection is crucial for the safety management of railway operations, and video detection methods help reduce deployment costs and improve detection efficiency. Mainstream neural network-based video detection methods have rapidly e...

  • Review
  • Open Access
19 Citations
16,949 Views
38 Pages

12 December 2023

Three-dimensional human pose estimation has made significant advancements through the integration of deep learning techniques. This survey provides a comprehensive review of recent 3D human pose estimation methods, with a focus on monocular images, v...

  • Article
  • Open Access
836 Views
16 Pages

Toward an Augmented Reality Representation of Collision Risks in Harbors

  • Mario Miličević,
  • Igor Vujović,
  • Miro Petković and
  • Ana Kuzmanić Skelin

22 August 2025

In ports with a significant density of non-AIS vessels, there is an increased risk of collisions. This is because physical limitations restrict the maneuverability of AIS vessels, while small vessels that do not have AIS are unpredictable. To help wi...

  • Article
  • Open Access
4 Citations
1,740 Views
16 Pages

AffectiVR: A Database for Periocular Identification and Valence and Arousal Evaluation in Virtual Reality

  • Chaelin Seok,
  • Yeongje Park,
  • Junho Baek,
  • Hyeji Lim,
  • Jong-hyuk Roh,
  • Youngsam Kim,
  • Soohyung Kim and
  • Eui Chul Lee

18 October 2024

This study introduces AffectiVR, a dataset designed for periocular biometric authentication and emotion evaluation in virtual reality (VR) environments. To maximize immersion in VR environments, interactions must be seamless and natural, with unobtru...

  • Article
  • Open Access
1,994 Views
15 Pages

2 November 2023

We present a synthetic augmentation approach towards improving monocular face presentation–attack–detection (PAD) robustness to real-world noise additions. Face PAD algorithms secure authentication systems against spoofing attacks, such a...

  • Article
  • Open Access
1 Citation
2,848 Views
19 Pages

27 February 2025

In the rapidly evolving field of computer vision and machine learning, 3D skeleton estimation is critical for applications such as motion analysis and human–computer interaction. While stereo cameras are commonly used to acquire 3D skeletal dat...

  • Article
  • Open Access
1,547 Views
15 Pages

18 November 2025

Three-dimensional Human Reconstruction from Monocular Vision is a key technology in Virtual Reality and digital humans. It aims to recover the 3D structure and pose of the human body from 2D images or video. Current methods for dynamic 3D reconstruct...
