Information Theory in Computer Vision and Pattern Recognition

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Signal and Data Analysis".

Deadline for manuscript submissions: closed (31 January 2023) | Viewed by 9784

Special Issue Editors


Dr. Ming-Sui Lee
Guest Editor
Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan
Interests: multimedia signal processing; digital image processing; computer vision; machine learning; deep learning

Prof. Dr. Chiou-Shann Fuh
Guest Editor
Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan
Interests: computer vision; digital image processing; digital still camera/digital video camcorder/camera module/surveillance camera/internet protocol (IP) camera; AOI: automatic optical inspection; defect inspection; automatic measurement; industrial automation; multi-touch panel with camera

Special Issue Information

Dear Colleagues,

Computer vision and pattern recognition is a challenging and rapidly developing field with applications spanning multiple industries, including transportation, manufacturing, and healthcare. Technologies such as autonomous driving, pedestrian and vehicle detection, and road condition and traffic flow analysis provide a safer and more intelligent driving environment. Quality control and automation solutions for packaging and assembly significantly reduce human resource requirements and improve efficiency and accuracy. For medical images such as X-rays, ultrasound images, CT scans, and MRI, the related technologies provide automatic annotation and detection of abnormal areas, pathological analysis, and early advice and treatment. Although this field has been developed for decades, several issues remain challenging.

In recent years, the application of information-theoretic concepts to computer vision and pattern recognition has drawn significant attention. It opens new possibilities and has brought significant advances in segmentation, classification, object detection, saliency detection, representation learning, and framework design for deep learning. This Special Issue aims to provide a forum for discussing challenging issues and trends in computer vision and pattern recognition. Original contributions to the theories, approaches, findings, and applications of computer vision and pattern recognition are all welcome.

Dr. Ming-Sui Lee
Prof. Dr. Chiou-Shann Fuh
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Information theory
  • Computer vision
  • Pattern recognition
  • Segmentation
  • Classification
  • Object detection
  • Saliency detection
  • Representation learning
  • Applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information is available in MDPI's Special Issue policies.

Published Papers (4 papers)

Research

14 pages, 4521 KiB  
Article
Visual Sorting of Express Packages Based on the Multi-Dimensional Fusion Method under Complex Logistics Sorting
by Chuanxiang Ren, Haowei Ji, Xiang Liu, Juan Teng and Hui Xu
Entropy 2023, 25(2), 298; https://doi.org/10.3390/e25020298 - 5 Feb 2023
Cited by 3 | Viewed by 2207
Abstract
Visual sorting of express packages faces many problems, such as varied package types, complex package status, and a changeable detection environment, resulting in low sorting efficiency. In order to improve the sorting efficiency of packages under complex logistics sorting, a multi-dimensional fusion method (MDFM) for visual sorting in actual complex scenes is proposed. In MDFM, the Mask R-CNN is designed and applied to detect and recognize different kinds of express packages in complex scenes. Combined with the boundary information of 2D instance segmentation from Mask R-CNN, the 3D point cloud data of the grasping surface is accurately filtered and fitted to determine the optimal grasping position and sorting vector. Images of boxes, bags, and envelopes, which are the most common types of express packages in logistics transportation, are collected to build the dataset. Experiments with Mask R-CNN and robot sorting were carried out. The results show that Mask R-CNN achieves better results in object detection and instance segmentation on the express packages, and the robot sorting success rate by the MDFM reaches 97.2%, an improvement of 2.9, 7.5, and 8.0 percentage points, respectively, compared to baseline methods. The MDFM is suitable for complex and diverse actual logistics sorting scenes, and improves the efficiency of logistics sorting, which has great application value.
(This article belongs to the Special Issue Information Theory in Computer Vision and Pattern Recognition)

20 pages, 1893 KiB  
Article
A Conceptual Multi-Layer Framework for the Detection of Nighttime Pedestrian in Autonomous Vehicles Using Deep Reinforcement Learning
by Muhammad Shoaib Farooq, Haris Khalid, Ansif Arooj, Tariq Umer, Aamer Bilal Asghar, Jawad Rasheed, Raed M. Shubair and Amani Yahyaoui
Entropy 2023, 25(1), 135; https://doi.org/10.3390/e25010135 - 9 Jan 2023
Cited by 17 | Viewed by 2726
Abstract
The major challenge faced by autonomous vehicles today is driving through busy roads without getting into an accident, especially with a pedestrian. To avoid collision with pedestrians, the vehicle requires the ability to communicate with a pedestrian to understand their actions. The most challenging task in research on computer vision is to detect pedestrian activities, especially at nighttime. Advanced Driver-Assistance Systems (ADAS) have been developed for driving and parking support, enabling vehicles to visualize, sense, send, and receive information from the environment, but they lack the ability to detect nighttime pedestrian actions. This article proposes a framework based on Deep Reinforcement Learning (DRL) using Scale Invariant Faster Region-based Convolutional Neural Networks (SIFRCNN) technologies to efficiently detect pedestrian operations, through which vehicles, as agents, train themselves from the environment and are forced to maximize the reward. The SIFRCNN has reduced the running time of detecting pedestrian operations from road images by incorporating Region Proposal Network (RPN) computation. Furthermore, we have used Reinforcement Learning (RL) for optimizing the Q-values and training itself to maximize the reward after getting the state from the SIFRCNN. In addition, the latest incarnation of SIFRCNN achieves near-real-time object detection from road images. The proposed SIFRCNN has been tested on KAIST, City Person, and Caltech datasets. The experimental results show an average improvement of 2.3% in the miss rate of pedestrian detection at nighttime compared to the other CNN-based pedestrian detectors.

13 pages, 2383 KiB  
Article
Multiscale Hybrid Convolutional Deep Neural Networks with Channel Attention
by Hua Yang, Ming Yang, Bitao He, Tao Qin and Jing Yang
Entropy 2022, 24(9), 1180; https://doi.org/10.3390/e24091180 - 24 Aug 2022
Cited by 2 | Viewed by 1790
Abstract
Attention mechanisms can improve the performance of neural networks, but the recent attention networks bring a greater computational overhead while improving network performance. How to maintain model performance while reducing complexity is a hot research topic. In this paper, a lightweight Mixture Attention (MA) module is proposed to improve network performance and reduce the complexity of the model. Firstly, the MA module uses a multi-branch architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Secondly, in order to reduce the number of parameters, each branch uses group convolution independently, and the feature maps extracted by different branches are fused along the channel dimension. Finally, the fused feature maps are processed using the channel attention module to extract statistical information on the channels. The proposed method is efficient yet effective, e.g., the network parameters and computational cost are reduced by 9.86% and 7.83%, respectively, and the Top-1 performance is improved by 1.99% compared with ResNet50. Experimental results on commonly used benchmarks, including CIFAR-10 for classification and PASCAL-VOC for object detection, demonstrate that the proposed MA outperforms the current SOTA methods significantly by achieving higher accuracy while having lower model complexity.

18 pages, 2895 KiB  
Article
An Improved Tiered Head Pose Estimation Network with Self-Adjust Loss Function
by Xiaoliang Zhu, Qiaolai Yang, Liang Zhao, Zhicheng Dai, Zili He, Wenting Rong, Junyi Sun and Gendong Liu
Entropy 2022, 24(7), 974; https://doi.org/10.3390/e24070974 - 14 Jul 2022
Cited by 5 | Viewed by 1914
Abstract
As an important task in computer vision, head pose estimation has been widely applied in both academia and industry. However, there remain two challenges in the field of head pose estimation: (1) even given the same task (e.g., tiredness detection), the existing algorithms usually consider the estimation of the three angles (i.e., roll, yaw, and pitch) as separate facets, which disregard their interplay as well as differences and thus share the same parameters for all layers; and (2) the discontinuity in angle estimation definitely reduces the accuracy. To solve these two problems, a THESL-Net (tiered head pose estimation with self-adjust loss network) model is proposed in this study. Specifically, first, an idea of stepped estimation using distinct network layers is proposed, gaining greater freedom during angle estimation. Furthermore, the reasons for the discontinuity in angle estimation are revealed, including not only labeling the dataset with quaternions or Euler angles, but also the loss function that simply adds the classification and regression losses. Subsequently, a self-adjustment constraint on the loss function is applied, making the angle estimation more consistent. Finally, to examine the influence of different angle ranges on the proposed model, experiments are conducted on three popular public benchmark datasets, BIWI, AFLW2000, and UPNA, demonstrating that the proposed model outperforms the state-of-the-art approaches.