New Insights into Computer Vision and Graphics

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 October 2024 | Viewed by 2314

Special Issue Editor


Dr. Yuanyuan Liu
Guest Editor
Department of Information Engineering, China University of Geosciences, Wuhan 430075, China
Interests: computer vision; deep learning; image and video understanding

Special Issue Information

Dear Colleagues,

Application trends, device technologies, and the blurring of boundaries between disciplines are propelling information technology forward. This poses new challenges for the study of visual computing-based interactive graphics processing technology. Therefore, this Special Issue intends to present new ideas and experimental findings in the field of computer vision and graphics, from design, services, and theory to applications.

Computer vision and graphics focus on the computational processing and applications of visual data. Areas relevant to computer vision and graphics include, but are not limited to, robotics, medical imaging, security and surveillance, gaming and entertainment, education and training, art and design, and environmental monitoring. Topics of interest include high-speed processing techniques and real-time performance, the development and refinement of deep learning techniques for computer vision and graphics applications, and explainable AI techniques that improve the transparency and interpretability of AI models.

This Special Issue will publish high-quality, original research papers in overlapping fields, including the following:

  • Image processing/analysis;
  • Computer vision theory and application;
  • Video and audio encoding;
  • Motion detection and tracking;
  • Reconstruction and representation;
  • Facial and hand gesture recognition;
  • Rendering techniques;
  • Matching, inference, and recognition;
  • Geometric modeling;
  • 3D vision;
  • Graph-based learning and applications.

Dr. Yuanyuan Liu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing/analysis
  • computer vision theory and application
  • video and audio encoding
  • motion detection and tracking
  • reconstruction and representation
  • facial and hand gesture recognition
  • rendering techniques
  • matching, inference, and recognition
  • geometric modeling
  • 3D vision
  • graph-based learning and applications

Published Papers (3 papers)


Research

18 pages, 1493 KiB  
Article
Hypergraph Position Attention Convolution Networks for 3D Point Cloud Segmentation
by Yanpeng Rong, Liping Nong, Zichen Liang, Zhuocheng Huang, Jie Peng and Yiping Huang
Appl. Sci. 2024, 14(8), 3526; https://doi.org/10.3390/app14083526 - 22 Apr 2024
Viewed by 415
Abstract
Point cloud segmentation, as the basis for 3D scene understanding and analysis, has made significant progress in recent years. Graph-based modeling and learning methods have played an important role in point cloud segmentation. However, due to the inherent complexity of point cloud data, it is difficult to capture higher-order and complex features of 3D data using graph learning methods. In addition, how to quickly and efficiently extract important features from point clouds also poses a great challenge to current research. To address these challenges, we propose a new framework, called hypergraph position attention convolution networks (HGPAT), for point cloud segmentation. Firstly, we use a hypergraph to model the higher-order relationships within the point cloud. Secondly, in order to effectively learn the feature information of point cloud data, a hyperedge position attention convolution module is proposed, which utilizes the hyperedge–hyperedge propagation pattern to extract and aggregate more important features. Finally, we design a ResNet-like module to reduce the computational complexity of the network and improve its efficiency. We conducted point cloud segmentation experiments on the ShapeNet Part and S3DIS datasets, and the experimental results demonstrate the effectiveness of the proposed method compared with state-of-the-art ones.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
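
To make the hypergraph convolution idea above concrete, the following is a minimal PyTorch sketch of attention-weighted aggregation over hyperedges, given a point feature matrix and a 0/1 incidence matrix. It is an illustration only, not the authors' HGPAT code: the class name HyperedgeAttentionConv, the incidence-matrix interface, and the toy data are assumptions, and it performs a simple point-to-hyperedge attention step rather than the paper's hyperedge–hyperedge propagation pattern.

```python
# Illustrative sketch (not the authors' code): attention-weighted aggregation of
# point features over hyperedges, the basic operation behind hypergraph convolutions.
# Hyperedges are given as a 0/1 incidence matrix H (n_points x n_edges);
# all names here are hypothetical.
import torch
import torch.nn as nn

class HyperedgeAttentionConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)   # per-point feature transform
        self.attn = nn.Linear(2 * out_dim, 1)    # scores a (point, hyperedge) pair

    def forward(self, x, H):
        # x: (n_points, in_dim) point features; H: (n_points, n_edges) incidence
        x = self.proj(x)
        # hyperedge features = mean of member point features
        deg_e = H.sum(dim=0).clamp(min=1)                    # (n_edges,)
        edge_feat = (H.t() @ x) / deg_e.unsqueeze(1)         # (n_edges, out_dim)
        # attention between each point and each hyperedge it belongs to
        scores = self.attn(torch.cat([
            x.unsqueeze(1).expand(-1, H.shape[1], -1),
            edge_feat.unsqueeze(0).expand(H.shape[0], -1, -1)], dim=-1)).squeeze(-1)
        scores = scores.masked_fill(H == 0, float('-inf'))
        weights = torch.softmax(scores, dim=1)
        weights = torch.nan_to_num(weights)                  # points in no hyperedge
        # aggregate hyperedge features back to points
        return weights @ edge_feat

# toy usage: 128 points with 16-dim features, 8 random hyperedges
x = torch.randn(128, 16)
H = (torch.rand(128, 8) > 0.7).float()
out = HyperedgeAttentionConv(16, 32)(x, H)
print(out.shape)  # torch.Size([128, 32])
```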

13 pages, 12039 KiB  
Article
Camera Path Generation for Triangular Mesh Using Toroidal Patches
by Jinyoung Choi, Kangmin Kim, Seongil Kim, Minseok Kim, Taekgwan Nam and Youngjin Park
Appl. Sci. 2024, 14(2), 490; https://doi.org/10.3390/app14020490 - 5 Jan 2024
Viewed by 637
Abstract
Triangular mesh data structures are fundamental in computer graphics, serving as the foundation for many 3D models. To effectively utilize these 3D models across diverse industries, it is important to understand a model's overall shape and geometric features thoroughly. In this work, we introduce a novel method for generating camera paths that emphasize the model's local geometric characteristics. The method uses a toroidal patch-based spatial data structure that approximates the mesh's faces within a predetermined tolerance ϵ, encapsulating their geometric intricacies. This facilitates the determination of the camera position and gaze path, ensuring the mesh's key characteristics are captured. During path construction, we create a bounding cylinder for the mesh, project the mesh's faces and associated toroidal patches onto the cylinder's lateral surface, and sequentially select the grid cells of the cylinder containing the highest number of toroidal patches as we traverse the lateral surface. The centers of the selected grid cells are used as control points for a periodic B-spline curve, which serves as our foundational path. After the initial curve is generated, we derive the camera position and gaze paths from it by applying scaling factors to ensure a uniform camera amplitude. We applied our method to ten triangular mesh models, demonstrating its effectiveness and adaptability across various mesh configurations.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
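
The path-construction steps summarized in the abstract (bounding cylinder, projection, grid voting, periodic B-spline) can be sketched roughly as follows. This is a simplified approximation under stated assumptions, not the authors' method: face centers stand in for the toroidal patches, the projection is a plain cylindrical unwrap, SciPy's splprep/splev provide the periodic B-spline, and the function name and grid resolutions are made up for illustration.

```python
# Rough sketch: vote on a cylindrical grid with projected face centers, then fit a
# periodic B-spline through the winning cell centers as a camera path.
import numpy as np
from scipy.interpolate import splprep, splev

def camera_path(face_centers, n_theta=36, n_z=8, n_samples=200):
    c = face_centers - face_centers.mean(axis=0)            # center the model
    theta = np.arctan2(c[:, 1], c[:, 0])                    # angle around the axis
    z = c[:, 2]
    z_range = np.ptp(z) + 1e-9
    radius = np.linalg.norm(c[:, :2], axis=1).max() * 1.5   # bounding cylinder radius
    t_bin = np.clip(((theta + np.pi) / (2 * np.pi) * n_theta).astype(int), 0, n_theta - 1)
    z_bin = np.clip(((z - z.min()) / z_range * n_z).astype(int), 0, n_z - 1)

    # for each angular slice, pick the height cell with the most projected faces
    control = []
    for t in range(n_theta):
        counts = np.bincount(z_bin[t_bin == t], minlength=n_z)
        zc = z.min() + (counts.argmax() + 0.5) / n_z * z_range
        ang = -np.pi + (t + 0.5) / n_theta * 2 * np.pi
        control.append([radius * np.cos(ang), radius * np.sin(ang), zc])
    control.append(control[0])                 # close the loop for a periodic spline
    control = np.array(control).T              # shape (3, n_theta + 1)

    tck, _ = splprep(control, s=0, per=1)      # periodic B-spline through the cell centers
    u = np.linspace(0, 1, n_samples, endpoint=False)
    return np.array(splev(u, tck)).T           # (n_samples, 3) camera positions

# toy usage: random "face centers" on a unit sphere
pts = np.random.randn(5000, 3)
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
path = camera_path(pts)
print(path.shape)  # (200, 3)
```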

28 pages, 4448 KiB  
Article
ED2IF2-Net: Learning Disentangled Deformed Implicit Fields and Enhanced Displacement Fields from Single Images Using Pyramid Vision Transformer
by Xiaoqiang Zhu, Xinsheng Yao, Junjie Zhang, Mengyao Zhu, Lihua You, Xiaosong Yang, Jianjun Zhang, He Zhao and Dan Zeng
Appl. Sci. 2023, 13(13), 7577; https://doi.org/10.3390/app13137577 - 27 Jun 2023
Viewed by 857
Abstract
Substantial research has emerged on single-view 3D reconstruction, and the majority of state-of-the-art implicit methods employ CNNs as the backbone network. On the other hand, transformers have shown remarkable performance in many vision tasks. However, it is still unknown whether transformers are suitable for single-view implicit 3D reconstruction. In this paper, we propose the first end-to-end single-view 3D reconstruction network based on the Pyramid Vision Transformer (PVT), called ED2IF2-Net, which disentangles the reconstruction of an implicit field into the reconstruction of topological structures and the recovery of surface details to achieve high-fidelity shape reconstruction. ED2IF2-Net uses a Pyramid Vision Transformer encoder to extract multi-scale hierarchical local features and a global vector from the input single image, which are fed into three separate decoders. A coarse shape decoder reconstructs a coarse implicit field based on the global vector, a deformation decoder iteratively refines the coarse implicit field using the pixel-aligned local features to obtain a deformed implicit field through multiple implicit field deformation blocks (IFDBs), and a surface detail decoder predicts an enhanced displacement field using the local features with hybrid attention modules (HAMs). The final output is a fusion of the deformed implicit field and the enhanced displacement field, with four loss terms applied to reconstruct the coarse implicit field, structure details through a novel deformation loss, the overall shape after fusion, and surface details via a Laplacian loss. The quantitative results obtained on the ShapeNet dataset validate the exceptional performance of ED2IF2-Net. Notably, ED2IF2-Net-L stands out as the top-performing variant, exhibiting the best mean IoU, CD, EMD, ECD-3D, and ECD-2D scores, reaching values of 61.1, 7.26, 2.51, 6.08, and 1.84, respectively. The extensive experimental evaluations consistently demonstrate the state-of-the-art capabilities of ED2IF2-Net in reconstructing topological structures and recovering surface details, all while maintaining competitive inference time.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
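
A schematic of the encoder-decoder data flow described in the abstract is sketched below, purely to illustrate how a global code, pixel-aligned local features, and a fused implicit field fit together. All module names are hypothetical, a tiny convolutional encoder stands in for the Pyramid Vision Transformer backbone, each decoder is collapsed to a small MLP, the projection used for pixel-aligned features is assumed orthographic, and the fusion is modeled as simple addition of the deformed implicit field and the displacement field, which may differ from the paper's actual design.

```python
# Schematic sketch only: global code -> coarse field, local features -> deformation
# and displacement, final field = fused result at query points. Names are assumptions.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

class SingleViewImplicitNet(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(                # placeholder for the PVT backbone
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU())
        self.coarse = mlp(feat_dim + 3, 1)           # global code + query -> coarse field
        self.deform = mlp(feat_dim + 3, 1)           # local feature + query -> residual
        self.displace = mlp(feat_dim + 3, 1)         # local feature + query -> detail offset

    def forward(self, image, points):
        # image: (B, 3, H, W); points: (B, N, 3) query points in [-1, 1]^3
        fmap = self.encoder(image)                   # (B, C, h, w) local feature map
        g = fmap.mean(dim=(2, 3))                    # global vector
        # pixel-aligned features: sample the map at each point's projected (x, y)
        grid = points[:, :, None, :2]                # orthographic projection assumption
        local = nn.functional.grid_sample(fmap, grid, align_corners=True).squeeze(-1).transpose(1, 2)
        gq = torch.cat([g[:, None, :].expand(-1, points.shape[1], -1), points], dim=-1)
        lq = torch.cat([local, points], dim=-1)
        coarse = self.coarse(gq)                     # coarse implicit field
        deformed = coarse + self.deform(lq)          # deformed implicit field
        return deformed + self.displace(lq)          # fuse with the displacement field

net = SingleViewImplicitNet()
sdf = net(torch.randn(2, 3, 64, 64), torch.rand(2, 1024, 3) * 2 - 1)
print(sdf.shape)  # torch.Size([2, 1024, 1])
```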
