Machine Perception in Intelligent Systems

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 April 2021) | Viewed by 4257

Special Issue Editor


Prof. Dr. hab. Włodzimierz Kasprzak
Guest Editor
Faculty of Electronics and Information Technology, Institute of Control and Computation Engineering, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
Interests: pattern recognition and artificial intelligence; computer vision; speech analysis; robot vision; biometrics; video analysis

Special Issue Information

Dear Colleagues,

Rapid developments in machine perception and machine learning, among other fields, are closing the so-called “semantic gap” between low-level signal and image analysis and high-level symbolic modeling and reasoning. An "intelligent system" is understood to have learning as its primary capability: it can acquire new knowledge (a model of its environment or application domain) by generalizing from observations and/or improve its decision strategy. In AI theory, such a system constitutes an autonomous agent; in technical applications it takes the form of a softbot (a computer program with control and perception subsystems but virtually no effectors) or an autonomous robot or device (which, unlike a softbot, has true effectors such as manipulators and navigation subsystems).

In this Special Issue, we are particularly interested in machine perception (encompassing various aspects, e.g., methodologies, algorithms, design structures, and particular solutions for signal and image analysis) seen as a subsystem of an autonomously acting agent (softbot, robot, vehicle, drone, ship, etc.). This association requires that perception reach up to the symbolic representation level, directly addressing how to close the semantic gap between sensor data and semantic models. The perception system should therefore provide at least object detection and recognition, if not go beyond them to more abstract levels of data interpretation. We expect contributions drawing on various methodologies and techniques originating from pattern recognition, machine learning, knowledge engineering, and related fields. A minimal sketch of this agent view of perception is given below.
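To make this scope concrete, the following sketch (in Python, with hypothetical class and function names that are not taken from any particular system) illustrates the intended role of perception inside such an agent: raw sensor data are mapped to symbolic object hypotheses, which a decision subsystem then reasons over before acting.

    # Minimal sketch (hypothetical names) of an agent whose perception subsystem
    # delivers symbolic object hypotheses to a decision module, illustrating the
    # "sensor data -> symbols -> reasoning" chain discussed above.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ObjectHypothesis:          # symbolic output of perception
        label: str                   # e.g. "car", "pedestrian"
        confidence: float            # recognition score in [0, 1]
        position: tuple              # rough location in the scene

    class PerceptionSubsystem:
        def interpret(self, sensor_frame) -> List[ObjectHypothesis]:
            """Map raw signal/image data to symbolic object hypotheses
            (detection + recognition); a learned model would go here."""
            raise NotImplementedError

    class DecisionSubsystem:
        def act(self, hypotheses: List[ObjectHypothesis]):
            """Reason over the symbols and choose an action for the effectors."""
            raise NotImplementedError

    def agent_step(perception: PerceptionSubsystem,
                   decision: DecisionSubsystem, sensor_frame):
        # one perceive-decide-act cycle of an autonomous agent
        symbols = perception.interpret(sensor_frame)
        return decision.act(symbols)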

Prof. Dr. hab. Włodzimierz Kasprzak
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • AI agents
  • Autonomous robots, drones, vehicles
  • Knowledge engineering
  • Machine learning techniques
  • Object recognition
  • Pattern recognition techniques
  • Perception in autonomous systems
  • Semantic gap
  • Semantic models
  • Signal and image recognition
  • Softbots

Published Papers (2 papers)


Research

14 pages, 4484 KiB  
Article
Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
by Paweł Piwowarski and Włodzimierz Kasprzak
Appl. Sci. 2021, 11(13), 5863; https://doi.org/10.3390/app11135863 - 24 Jun 2021
Cited by 2 | Viewed by 1180
Abstract
We consider the problem of image set comparison, i.e., to determine whether two image sets show the same unique object (approximately) from the same viewpoints. We propose to solve it by a multi-stream fusion of several image recognition paths. Immediate applications of this method can be found in fraud detection, deduplication procedures, or visual searching. The contribution of this paper is a novel distance measure for the similarity of image sets and the experimental evaluation of several streams for the considered problem of same-car image set recognition. To determine a similarity score of image sets (this score expresses the certainty level that both sets represent the same object visible from the same set of views), we adapted a measure commonly applied in blind signal separation (BSS) evaluation. This measure is independent of the number of images in a set and the order of views in it. Separate streams for object classification (where a class represents either a car type or a car model-and-view) and object-to-object similarity evaluation (based on object features obtained alternatively by a convolutional neural network (CNN) or image keypoint descriptors) were designed. A late fusion by a fully-connected neural network (NN) completes the solution. The implementation has a modular structure: for semantic segmentation we use a Mask-RCNN (Mask regions with CNN features) with ResNet 101 as a backbone network; image feature extraction is based either on the DeepRanking neural network or on classic keypoint descriptors (e.g., scale-invariant feature transform (SIFT)); and object classification is performed by two Inception V3 deep networks trained for car type-and-view and car model-and-view classification (4 views, 9 car types, and 197 car models are considered). Experiments conducted on the Stanford Cars dataset led to the selection of a best system configuration that outperforms a base approach, allowing for a 67.7% GAR (genuine acceptance rate) at 3% FAR (false acceptance rate).
(This article belongs to the Special Issue Machine Perception in Intelligent Systems)
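As a rough illustration of the late-fusion step mentioned in the abstract (and not the authors' code; the stream names and weights below are placeholders), the following sketch shows how several per-stream similarity scores for a pair of image sets could be combined by a small fully-connected network into a single same-object certainty.

    # Minimal sketch of late fusion over several recognition streams.
    import numpy as np

    def fuse_stream_scores(stream_scores, W1, b1, W2, b2):
        """stream_scores: per-stream similarities, e.g. [cnn_sim, sift_sim, class_sim].
        W1, b1, W2, b2: weights of a small trained fully-connected fusion network."""
        x = np.asarray(stream_scores, dtype=float)
        h = np.tanh(W1 @ x + b1)             # hidden layer
        logit = W2 @ h + b2                  # single output unit
        return 1.0 / (1.0 + np.exp(-logit))  # certainty that both sets match

    # Illustrative use with random (untrained) weights for three streams:
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(size=4), 0.0
    print(fuse_stream_scores([0.9, 0.7, 0.8], W1, b1, W2, b2))

In practice the fusion weights would be trained on labeled pairs of image sets, and the per-stream scores would come from the classification and similarity streams described in the abstract.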

13 pages, 3314 KiB  
Article
Low-Complexity Pupil Tracking for Sunglasses-Wearing Faces for Glasses-Free 3D HUDs
by Dongwoo Kang and Hyun Sung Chang
Appl. Sci. 2021, 11(10), 4366; https://doi.org/10.3390/app11104366 - 11 May 2021
Cited by 5 | Viewed by 2394
Abstract
This study proposes a pupil-tracking method applicable to drivers both with and without sunglasses on, which has greater compatibility with augmented reality (AR) three-dimensional (3D) head-up displays (HUDs). Performing real-time pupil localization and tracking is complicated by drivers wearing facial accessories such as masks, caps, or sunglasses. The proposed method fulfills two key requirements: low complexity and algorithm performance. Our system assesses both bare and sunglasses-wearing faces by first classifying images according to these modes and then assigning the appropriate eye tracker. For bare faces with unobstructed eyes, we applied our previous regression-algorithm-based method that uses scale-invariant feature transform features. For eyes occluded by sunglasses, we propose an eye position estimation method: our eye tracker uses nonoccluded face area tracking and a supervised regression-based pupil position estimation method to locate pupil centers. Experiments showed that the proposed method achieved high accuracy and speed, with a precision error of <10 mm in <5 ms for bare and sunglasses-wearing faces for both a 2.5 GHz CPU and a commercial 2.0 GHz CPU vehicle-embedded system. Coupled with its performance, the low CPU consumption (10%) demonstrated by the proposed algorithm highlights its promise for implementation in AR 3D HUD systems.
(This article belongs to the Special Issue Machine Perception in Intelligent Systems)
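The two-mode design described in the abstract can be summarized with the following sketch (hypothetical function names, not the authors' implementation): each face image is first classified as bare or sunglasses-wearing and then routed to the matching pupil tracker.

    # Minimal sketch of mode classification followed by tracker dispatch.
    def classify_face_mode(face_image) -> str:
        """Return 'bare' or 'sunglasses'; in the paper this is a learned classifier."""
        raise NotImplementedError

    def track_pupils_bare(face_image):
        """Regression-based pupil localization on unobstructed eyes
        (the paper uses scale-invariant feature transform features)."""
        raise NotImplementedError

    def track_pupils_sunglasses(face_image):
        """Estimate pupil centers from non-occluded face regions
        via supervised regression."""
        raise NotImplementedError

    def locate_pupils(face_image):
        # dispatch to the tracker that matches the detected mode
        if classify_face_mode(face_image) == "sunglasses":
            return track_pupils_sunglasses(face_image)
        return track_pupils_bare(face_image)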
