
Advanced 2D/3D Computer Vision Technology and Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 December 2023)

Special Issue Editor


Dr. Yongwei Nie
Guest Editor
School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
Interests: computer graphics; digital image/video processing; computer vision (artificial intelligence)

Special Issue Information

Dear Colleagues,

In recent years, many powerful computer vision techniques have been developed and successfully applied in real-world applications, thanks to great advances in deep learning theory and its implementations.

This Special Issue aims to collect submissions about advanced 2D/3D techniques in computer vision and their applications in different fields. The topics include, but are not limited to, the following areas:

  • Object detection and tracking in public traffic;
  • Human pose/action estimation and prediction for behavior analysis;
  • Video parsing for anomaly detection;
  • Three-dimensional reconstruction by SLAM/SfM/NeRF;
  • Computer vision technology in biomedical engineering;
  • Computer vision methods for indoor scene understanding;
  • Low-level image parsing and analysis with modern deep models;
  • Deep networks for autonomous driving;
  • Vision applications in human face editing and image style transfer;
  • New trends in AIGC (artificial intelligence-generated content).

Dr. Yongwei Nie
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep networks
  • computer vision
  • object detection and analysis
  • scene understanding
  • AIGC
  • autonomous driving
  • traffic
  • biomedical engineering
  • low-level image processing

Published Papers (6 papers)


Research

17 pages, 5994 KiB  
Article
Micro-Gear Point Cloud Segmentation Based on Multi-Scale Point Transformer
by Yizhou Su, Xunwei Wang, Guanghao Qi and Baozhen Lei
Appl. Sci. 2024, 14(10), 4271; https://doi.org/10.3390/app14104271 - 17 May 2024
Abstract
To address the challenges that existing point cloud datasets pose for industrial precision component detection, this research collects and constructs a point cloud dataset comprising 1101 models of miniature gears; the data collection and processing procedures are described in detail. In response to the segmentation issues encountered in point clouds of small industrial components, a novel Point Transformer network incorporating a multiscale feature fusion strategy is proposed. This network extends the original Point Transformer architecture by integrating multiple global feature extraction modules and employing an upsampling module for contextual information fusion, thereby enhancing its modeling capabilities for intricate point cloud structures. The network is trained and tested on the self-constructed gear dataset, yielding promising results. Comparative analysis with the baseline Point Transformer network indicates a notable improvement of 1.1% in mean Intersection over Union (mIoU), substantiating the efficacy of the proposed approach. Several ablation experiments further demonstrate that the introduced modules each contribute to segmentation accuracy, and a comparative evaluation against various state-of-the-art point cloud segmentation networks reveals the superior performance of the proposed methodology. This research not only aids in quality control, structural detection, and optimization of precision industrial components but also provides a scalable network architecture design paradigm for related point cloud processing tasks.
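As a point of reference for the mIoU figure reported above, here is a minimal sketch of how mean Intersection over Union is typically computed for per-point part labels (illustrative arrays only, not the authors' code):

    import numpy as np

    def mean_iou(pred, gt, num_classes):
        # Average the per-class IoU over classes present in prediction or ground truth.
        ious = []
        for c in range(num_classes):
            inter = np.sum((pred == c) & (gt == c))
            union = np.sum((pred == c) | (gt == c))
            if union > 0:  # skip classes absent from both
                ious.append(inter / union)
        return float(np.mean(ious))

    # Hypothetical example: 6 points, 2 part classes
    pred = np.array([0, 0, 1, 1, 1, 0])
    gt = np.array([0, 1, 1, 1, 0, 0])
    print(mean_iou(pred, gt, num_classes=2))  # 0.5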

13 pages, 1571 KiB  
Article
R-PointNet: Robust 3D Object Recognition Network for Real-World Point Clouds Corruption
by Zhongyuan Zhang, Lichen Lin and Xiaoli Zhi
Appl. Sci. 2024, 14(9), 3649; https://doi.org/10.3390/app14093649 - 25 Apr 2024
Abstract
Point clouds obtained with 3D scanners in realistic scenes inevitably contain corruption, including noise and outliers. Traditional algorithms for cleaning point cloud corruption require the selection of appropriate parameters based on the characteristics of the scene, data, and algorithm, which means that their performance depends heavily on experience and on how well the algorithm suits the application. Three-dimensional object recognition networks for real-world recognition tasks can take the raw point cloud as input and output the recognition results directly. Current 3D object recognition networks generally acquire uniform sampling points by farthest point sampling (FPS) to extract features. However, defective points sampled by FPS lower the recognition accuracy by affecting the aggregated global feature. To deal with this issue, we design a compensation module, named offset-adjustment (OA). It can adaptively adjust the coordinates of sampled defective points based on their neighbors and improve local feature extraction to enhance network robustness. Furthermore, we employ the OA module to build an end-to-end network based on the PointNet++ framework for robust point cloud recognition, named R-PointNet. Experiments show that R-PointNet reaches state-of-the-art performance with 92.5% recognition accuracy on ModelNet40, and significantly outperforms previous networks by 3–7.7% on the corruption dataset ModelNet40-C for the robustness benchmark.
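Farthest point sampling, whose sensitivity to corrupted points motivates the OA module above, can be sketched in its standard form as follows (a textbook formulation, not the authors' implementation):

    import numpy as np

    def farthest_point_sampling(points, k, seed=0):
        # points: (N, 3) array; greedily pick k indices, each new pick being
        # the point farthest from everything already selected.
        rng = np.random.default_rng(seed)
        n = points.shape[0]
        selected = [int(rng.integers(n))]        # arbitrary starting point
        dist = np.full(n, np.inf)                # distance to nearest selected point
        for _ in range(k - 1):
            d = np.linalg.norm(points - points[selected[-1]], axis=1)
            dist = np.minimum(dist, d)           # refresh nearest-selected distances
            selected.append(int(np.argmax(dist)))
        return np.array(selected)

Because an outlier lies far from every inlier, plain FPS tends to select exactly the corrupted points; this is the failure mode the OA module is designed to compensate for.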

12 pages, 3132 KiB  
Article
Depth Estimation from a Hierarchical Baseline Stereo with a Developed Light Field Camera
by Fei Liu and Guangqi Hou
Appl. Sci. 2024, 14(2), 550; https://doi.org/10.3390/app14020550 - 8 Jan 2024
Cited by 1
Abstract
This paper presents a hierarchical baseline stereo-matching framework for depth estimation using a newly developed light field camera. The imaging process of a micro-lens array-based light field camera is derived. A macro-pixel map is constructed by treating each micro-lens as one macro-pixel in the light field's raw image. For each macro-pixel, a feature vector is built by leveraging texture and gradient cues over the surrounding ring of neighboring macro-pixels. Next, the micro-lenses containing edges are detected on the macro-pixel map. Hierarchical baseline stereo-matching is performed by macro-pixel-wise coarse matching followed by pixel-wise fine matching, effectively eliminating matching ambiguities. Finally, a post-processing step is applied to improve accuracy. The lab-designed light field camera's imaging performance is evaluated in terms of accuracy and processing speed by capturing real-world scenes under studio lighting conditions. In addition, an experiment using rendered synthetic samples is conducted for quantitative evaluation, showing that depth maps with local details can be accurately recovered.
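As a rough illustration of the macro-pixel construction described above, the sketch below averages each micro-lens block of the raw image into one macro-pixel; the square, axis-aligned lens grid is an assumption (real micro-lens arrays are often hexagonal and require calibration):

    import numpy as np

    def macro_pixel_map(raw, lens_px):
        # raw: (H, W) raw light field image; lens_px: micro-lens pitch in pixels.
        # Collapse each lens_px x lens_px block into one macro-pixel by averaging.
        h, w = raw.shape[0] // lens_px, raw.shape[1] // lens_px
        blocks = raw[:h * lens_px, :w * lens_px].reshape(h, lens_px, w, lens_px)
        return blocks.mean(axis=(1, 3))          # (h, w) macro-pixel map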

18 pages, 3433 KiB  
Article
Attentional Keypoint Detection on Point Clouds for 3D Object Part Segmentation
by Feng Zhou, Qi Zhang, He Zhu, Shibo Liu, Na Jiang, Xingquan Cai, Qianfang Qi and Yong Hu
Appl. Sci. 2023, 13(23), 12537; https://doi.org/10.3390/app132312537 - 21 Nov 2023
Abstract
In the field of computer vision, segmenting a 3D object into its component parts is crucial to understanding its structure and characteristics. Much work has focused on 3D object part segmentation directly from point clouds, and significant progress has been made in this area. This paper proposes a novel 3D object part segmentation method built around three key modules: a keypoint-aware module, a feature extension module, and an attention-aware module. Our approach starts by detecting keypoints, which provide the global feature of the inner shape that serves as the basis for segmentation. Subsequently, the feature extension module expands the feature dimensions and obtains a local representation, providing a richer object representation and improving segmentation accuracy. Furthermore, we introduce an attention-aware module that effectively combines the global and local features of objects to enhance the segmentation process. To validate the proposed model, we also conduct experiments on the point cloud classification task. The experimental results demonstrate the effectiveness of our method, which outperforms several state-of-the-art methods in 3D object part segmentation and classification.
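One plausible way to combine global and local point features in the spirit of the attention-aware module described above is a learned per-point gate; the sketch below is an assumption about the design (the layer sizes and gating scheme are not taken from the paper):

    import torch
    import torch.nn as nn

    class AttentionFusion(nn.Module):
        # Fuse per-point local features with a shared global feature via a
        # learned per-point gate in [0, 1].
        def __init__(self, dim):
            super().__init__()
            self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 1), nn.Sigmoid())

        def forward(self, local_feat, global_feat):
            # local_feat: (B, N, D); global_feat: (B, D), broadcast to every point
            g = global_feat.unsqueeze(1).expand_as(local_feat)
            a = self.gate(torch.cat([local_feat, g], dim=-1))  # (B, N, 1)
            return a * local_feat + (1 - a) * g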

15 pages, 40620 KiB  
Article
A Person Re-Identification Method Based on Multi-Branch Feature Fusion
by Xuefang Wang, Xintong Hu, Peishun Liu and Ruichun Tang
Appl. Sci. 2023, 13(21), 11707; https://doi.org/10.3390/app132111707 - 26 Oct 2023
Abstract
Because they lack a specific design for scenarios such as scale change, illumination difference, and occlusion, current person re-identification methods are difficult to put into practice. A Multi-Branch Feature Fusion Network (MFFNet) is proposed: Shallow Feature Extraction (SFF) and Multi-scale Feature Fusion (MFF) are used to obtain robust global feature representations, while the Hybrid Attention Module (HAM) and Anti-erasure Federated Block Network (AFBN) address the problems of scale change, illumination difference, and occlusion in scenes. Finally, multiple loss functions are used to efficiently converge the model parameters and enhance the information interaction between the branches. The experimental results show that our method achieves significant improvements on Market-1501, DukeMTMC-reID, and MSMT17. In particular, on the MSMT17 dataset, which is close to real-world scenarios, MFFNet improves Rank-1 and mAP by 1.3% and 1.8%, respectively.
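For reference, the Rank-1 metric quoted above can be computed from a query-gallery distance matrix as follows (a simplified sketch that omits the usual same-camera filtering used in re-ID evaluation):

    import numpy as np

    def rank1_accuracy(dist, query_ids, gallery_ids):
        # dist: (Q, G) distance matrix between query and gallery embeddings.
        # Rank-1 = fraction of queries whose nearest gallery entry shares their ID.
        nearest = np.argmin(dist, axis=1)
        return float(np.mean(gallery_ids[nearest] == query_ids))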

11 pages, 7598 KiB  
Article
Fast 3D Reconstruction of UAV Images Based on Neural Radiance Field
by Cancheng Jiang and Hua Shao
Appl. Sci. 2023, 13(18), 10174; https://doi.org/10.3390/app131810174 - 10 Sep 2023
Cited by 1
Abstract
Traditional 3D reconstruction of unmanned aerial vehicle (UAV) images often relies on classical multi-view reconstruction techniques. This classical approach involves a sequential pipeline of feature extraction, matching, depth fusion, point cloud integration, and mesh creation. However, these steps, particularly feature extraction and matching, are intricate and time-consuming, and as the number of steps increases, cumulative error grows accordingly. Additionally, these methods typically use an explicit representation, which can lead to model discontinuity and missing data during reconstruction. To address the high time cost, missing elements, and fragmented models inherent in 3D reconstruction from UAV imagery, an alternative approach is introduced: the neural radiance field. This method uses a neural network to fit the spatial information of the scene, thereby streamlining the reconstruction steps and rectifying model deficiencies. The neural radiance field method employs a fully connected neural network to model object surfaces and directly generate the 3D object model, simplifying the intricacies of conventional 3D reconstruction pipelines. Because it encapsulates scene characteristics implicitly, the neural radiance field allows iterative refinement of the network parameters via volume rendering. Experimental results substantiate the efficacy of this approach, demonstrating that it completes scene reconstruction within 5 minutes, reducing reconstruction time by 90% while markedly enhancing reconstruction quality.
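The volume rendering step mentioned above follows the standard NeRF quadrature; below is a minimal sketch of the per-ray color accumulation (the standard formulation, not the authors' specific implementation):

    import numpy as np

    def render_ray(sigma, color, deltas):
        # sigma: (S,) densities; color: (S, 3) RGB; deltas: (S,) sample spacings.
        alpha = 1.0 - np.exp(-sigma * deltas)                          # per-segment opacity
        trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha)))[:-1]  # transmittance
        weights = trans * alpha                                        # per-sample contribution
        return (weights[:, None] * color).sum(axis=0)                  # composited RGB

Training minimizes the difference between colors rendered this way and the captured UAV pixels, which is how the network parameters are iteratively refined.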
