Advances in Computer Vision and Machine Learning, 2nd Edition

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 31 August 2024

Special Issue Editors


Guest Editor
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
Interests: cross-domain scene classification; multi-modal image analysis; cross-modal image interpretation

Guest Editor
School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China
Interests: Information and communication engineering; satellite communication and satellite navigation; machine learning; pattern recognition

Special Issue Information

Dear Colleagues,

Computer vision focuses on the theories and practices that give rise to semantically meaningful interpretations of the visual world. Mathematical models and tools provide enormous opportunities for developing intelligent algorithms that extract useful information from visual data, such as a single image, a video sequence, or even a multi-/hyperspectral image cube. In recent years, a number of emerging machine learning techniques have been applied to visual perception tasks such as camera imaging geometry, camera calibration, image stabilization, multiview geometry, feature learning, image classification, and object recognition and tracking. However, it is still challenging to provide theoretical explanations of the underlying learning processes, especially when using deep neural networks, where several questions remain open, including the design principles, the optimal architecture, the number of required layers, the sample complexity, and the optimization algorithms.

This Special Issue focuses on recent advances in computer vision and machine learning. The topics of interest include, but are not limited to, the following:

  • Pattern recognition and machine learning for computer vision;
  • Feature learning for computer vision;
  • Self-supervised/weakly supervised/unsupervised learning;
  • Image processing and analysis;
  • Deep neural networks in computer vision;
  • Graph neural networks;
  • Optimization methods for machine learning;
  • Evolutionary computation and optimization problems;
  • Emerging applications.

Dr. Xiangtao Zheng
Prof. Dr. Jinchang Ren
Prof. Dr. Ling Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • computer vision
  • pattern recognition
  • statistical learning
  • data mining
  • deep learning

Published Papers (3 papers)


Research

14 pages, 6180 KiB  
Article
High-Efficiency and High-Precision Ship Detection Algorithm Based on Improved YOLOv8n
by Kun Lan, Xiaoliang Jiang, Xiaokang Ding, Huan Lin and Sixian Chan
Mathematics 2024, 12(7), 1072; https://doi.org/10.3390/math12071072 - 02 Apr 2024
Abstract
With the development of the intelligent vision industry, ship detection and identification technology has gradually become a research hotspot in the field of marine insurance and port logistics. However, due to the interference of rain, haze, waves, light, and other bad weather, the robustness and effectiveness of existing detection algorithms remain a continuous challenge. For this reason, an improved YOLOv8n algorithm is proposed for the detection of ship targets under unforeseen environmental conditions. In the proposed method, the efficient multi-scale attention module (C2f_EMAM) is introduced to integrate the context information of different scales so that the convolutional neural network can generate better pixel-level attention to high-level feature maps. In addition, a fully-concatenate bi-directional feature pyramid network (Concatenate_FBiFPN) is adopted to replace the simple superposition/addition of feature map, which can better solve the problem of feature propagation and information flow in target detection. An improved spatial pyramid pooling fast structure (SPPF2+1) is also designed to emphasize low-level pooling features and reduce the pooling depth to accommodate the information characteristics of the ship. A comparison experiment was conducted between other mainstream methods and our proposed algorithm. Results showed that our proposed algorithm outperformed other models by achieving 99.4% of accuracy, 98.2% of precision, 98.5% of recall, 99.1% of mAP@0.5, and 85.4% of mAP@0.5:.95 on the SeaShips dataset.
(This article belongs to the Special Issue Advances in Computer Vision and Machine Learning, 2nd Edition)

18 pages, 5379 KiB  
Article
Tensor-Based Sparse Representation for Hyperspectral Image Reconstruction Using RGB Inputs
by Yingtao Duan, Nan Wang, Yifan Zhang and Chao Song
Mathematics 2024, 12(5), 708; https://doi.org/10.3390/math12050708 - 28 Feb 2024
Abstract
Hyperspectral image (HSI) reconstruction from RGB input has drawn much attention recently and plays a crucial role in further vision tasks. However, current sparse coding algorithms often take each single pixel as the basic processing unit during the reconstruction process, which ignores the strong similarity and relation between adjacent pixels within an image or scene, leading to an inadequate learning of spectral and spatial features in the target hyperspectral domain. In this paper, a novel tensor-based sparse coding method is proposed to integrate both spectral and spatial information represented in tensor forms, which is capable of taking all the neighboring pixels into account during the spectral super-resolution (SSR) process without breaking the semantic structures, thus improving the accuracy of the final results. Specifically, the proposed method recovers the unknown HSI signals using sparse coding on the learned dictionary pairs. Firstly, the spatial information of pixels is used to constrain the sparse reconstruction process, which effectively improves the spectral reconstruction accuracy of pixels. In addition, the traditional two-dimensional dictionary learning is further extended to the tensor domain, by which the structure of inputs can be processed in a more flexible way, thus enhancing the spatial contextual relations. To this end, a rudimentary HSI estimation acquired in the sparse reconstruction stage is further enhanced by introducing the regression method, aiming to eliminate the spectral distortion to some extent. Abundant experiments are conducted on two public datasets, indicating the considerable availability of the proposed framework.
(This article belongs to the Special Issue Advances in Computer Vision and Machine Learning, 2nd Edition)
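
As a rough illustration of what "sparse coding on dictionary pairs" means in the abstract above, the following sketch shows the per-pixel baseline that the abstract contrasts against, not the paper's tensor-based method. It assumes a hand-written toy pair of dictionaries (`rgb_atoms`, `hsi_atoms`) instead of learned ones, and it swaps in plain greedy matching pursuit as the sparse solver; all function and variable names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse code of `signal` over unit-norm `atoms`.

    Returns a dict {atom_index: coefficient}. This is plain matching
    pursuit, chosen here for brevity (an assumption, not the paper's solver).
    """
    residual = list(signal)
    code = {}
    for _ in range(n_nonzero):
        scores = [dot(residual, atom) for atom in atoms]
        j = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[j]) < 1e-12:  # residual already explained
            break
        code[j] = code.get(j, 0.0) + scores[j]
        residual = [r - scores[j] * a for r, a in zip(residual, atoms[j])]
    return code


def reconstruct_spectrum(rgb_pixel, rgb_atoms, hsi_atoms, n_nonzero=2):
    """Code the RGB pixel over the RGB dictionary, then reuse the same
    coefficients with the paired hyperspectral atoms."""
    code = matching_pursuit(rgb_pixel, rgb_atoms, n_nonzero)
    spectrum = [0.0] * len(hsi_atoms[0])
    for j, coeff in code.items():
        for band, value in enumerate(hsi_atoms[j]):
            spectrum[band] += coeff * value
    return spectrum


# Toy dictionary pair: atom j in rgb_atoms is paired with atom j in hsi_atoms.
rgb_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hsi_atoms = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

spectrum = reconstruct_spectrum([2.0, 0.0, 0.5], rgb_atoms, hsi_atoms)
print(spectrum)
```

Because each pixel is coded independently here, neighboring pixels never influence one another, which is the limitation the abstract says the tensor formulation is meant to address.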

20 pages, 4383 KiB  
Article
Gradual OCR: An Effective OCR Approach Based on Gradual Detection of Texts
by Youngki Park and Youhyun Shin
Mathematics 2023, 11(22), 4585; https://doi.org/10.3390/math11224585 - 09 Nov 2023
Abstract
In this paper, we present a novel approach to optical character recognition that incorporates various supplementary techniques, including the gradual detection of texts and gradual filtering of inaccurately recognized texts. To minimize false negatives, we attempt to detect all text by incrementally lowering the relevant thresholds. To mitigate false positives, we implement a novel filtering method that dynamically adjusts based on the confidence levels of recognized texts and their corresponding detection thresholds. Additionally, we use straightforward yet effective strategies to enhance the optical character recognition accuracy and speed, such as upscaling, link refinement, perspective transformation, the merging of cropped images, and simple autoregression. Given our focus on Korean chart data, we compile a mix of real-world and artificial Korean chart datasets for experimentation. Our experimental results show that our approach outperforms Tesseract by approximately 7 to 15 times and EasyOCR by 3 to 5 times in accuracy, as measured using a Jaccard similarity-based error rate on our datasets.
(This article belongs to the Special Issue Advances in Computer Vision and Machine Learning, 2nd Edition)
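
The ideas named in the abstract above (detecting at progressively lower thresholds, filtering by confidence relative to the detection threshold, and scoring with a Jaccard similarity-based error rate) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the detector is a stand-in callable, the filtering rule (`min_confidence` plus a linear `penalty`) is a hypothetical choice, and the error rate is assumed to be one minus the Jaccard similarity between sets of recognized and ground-truth strings.

```python
def gradual_detect(detect, thresholds):
    """Run `detect(threshold)` from the highest threshold to the lowest.

    `detect` is assumed to return (text, recognition_confidence) pairs.
    Each text is recorded with the highest threshold at which it appeared.
    """
    found = {}
    for t in sorted(thresholds, reverse=True):
        for text, confidence in detect(t):
            found.setdefault(text, (t, confidence))
    return found


def filter_texts(found, min_confidence=0.5, penalty=1.0):
    """Keep a text only if its confidence clears a bar that rises as the
    detection threshold that produced it falls (hypothetical rule)."""
    if not found:
        return set()
    top = max(t for t, _ in found.values())
    kept = set()
    for text, (t, confidence) in found.items():
        required = min_confidence + penalty * (top - t)
        if confidence >= required:
            kept.add(text)
    return kept


def jaccard_error_rate(predicted, expected):
    """1 - |A ∩ B| / |A ∪ B| over sets of strings (assumed definition)."""
    a, b = set(predicted), set(expected)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Stand-in detector: (text, detection_score, recognition_confidence).
CANDIDATES = [("Sales", 0.9, 0.95), ("2023", 0.6, 0.9), ("nois3", 0.3, 0.4)]


def fake_detect(threshold):
    return [(text, conf) for text, score, conf in CANDIDATES if score >= threshold]


found = gradual_detect(fake_detect, [0.7, 0.5, 0.3])
kept = filter_texts(found)
error = jaccard_error_rate(kept, {"Sales", "2023", "Q4"})
print(kept, error)
```

Lowering the threshold surfaces the low-scoring candidate "nois3", and the stricter confidence bar at that threshold then rejects it, which mirrors the trade-off between false negatives and false positives that the abstract describes.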