Computer Vision for Security Applications

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 28 February 2025 | Viewed by 69244

Special Issue Editors


Guest Editor
Department of Computer Science, Sapienza University of Rome, 00185 Rome, Italy
Interests: computer vision (feature extraction and pattern analysis); scene and event understanding (by people and/or vehicles and/or objects); human–computer interaction (pose estimation and gesture recognition by hands and/or body); sketch-based interaction (handwriting and freehand drawing); human–behaviour recognition (actions, emotions, feelings, affects, and moods by hands, body, facial expressions, and voice); biometric analysis (person re-identification by body visual features and/or gait and/or posture/pose); artificial intelligence (machine/deep learning); medical image analysis (MRI, ultrasound, X-rays, PET, and CT); multimodal fusion models; brain–computer interfaces (interaction and security systems); signal processing; visual cryptography (by RGB images); smart environments and natural interaction (with and without virtual/augmented reality); robotics (monitoring and surveillance systems with PTZ cameras, UAVs, AUVs, rovers, and humanoids)

Guest Editor
Department of Computer Science, Sapienza University of Rome, 00185 Rome, Italy
Interests: computer vision (feature extraction and pattern analysis); scene and event understanding (by people and/or vehicles and/or objects); human–behaviour recognition (actions, emotions, feelings, affects, and moods by hands, body, facial expressions, and voice); biometric analysis (person re-identification by body visual features and/or gait and/or posture/pose); artificial intelligence (machine/deep learning); brain–computer interfaces (interaction and security systems); signal processing; visual cryptography (by RGB images); robotics (monitoring and surveillance systems by PTZ cameras, UAVs, AUVs, rovers, and humanoids)

Guest Editor
Department of Computer Science, Sapienza University of Rome, 00185 Rome, Italy
Interests: computer vision (feature extraction and pattern analysis); scene and event understanding (by people and/or vehicles and/or objects); human–behaviour recognition (actions, emotions, feelings, affects, and moods by hands, body, facial expressions, and voice); biometric analysis (person re-identification by body visual features and/or gait and/or posture/pose); artificial intelligence (machine/deep learning); medical image analysis (MRI, ultrasound, X-rays, PET, and CT); multimodal fusion models; signal processing; robotics (monitoring and surveillance systems by PTZ cameras, UAVs, AUVs, rovers, and humanoids)

Special Issue Information

Dear Colleagues,

The MDPI Information journal invites submissions to a Special Issue on “Computer Vision for Security Applications”.
Automated security systems are gaining ever-increasing interest for both indoor and outdoor applications as a consequence of technological advances in acquisition devices. Specifically, video sequences, as well as static images, are exploited to obtain innovative solutions to heterogeneous problems such as area surveillance and affect recognition. In particular, these applications are also employed in real-life security scenarios, including Unmanned Aerial Vehicles (UAVs), disaster monitoring, train stations and airports, in multi-view camera surveillance for human action recognition, and high-stakes situations (e.g., interrogations and court trials) for affect recognition, highlighting the importance of reliable and well-performing systems. Moreover, while computer vision has significantly benefitted from deep learning algorithms and neural networks, well-known issues afflicting automated video-based security applications, such as different viewing angles, illumination changes, background clutter, occlusions, and long-term re-identification, remain open. Therefore, innovative solutions are required to further advance the available systems for their employment in real-life contexts.

This Special Issue is concerned with ground-breaking approaches addressing common issues of computer vision-based methods, with a particular emphasis on security applications.

Dr. Danilo Avola
Dr. Daniele Pannone
Dr. Alessio Fagioli
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • drone patrolling
  • distributed smart cameras for person re-identification
  • image analysis for action recognition
  • machine/deep learning models in surveillance systems
  • automatic scene understanding
  • skeleton-based methods for person re-identification
  • methods and models for affect recognition
  • soft and hard biometrics
  • soft computing for visual security applications
  • application of 3D vision in security
  • image sensor fusion

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)

Research

23 pages, 9815 KiB  
Article
Military Decision-Making Process Enhanced by Image Detection
by Nikola Žigulić, Matko Glučina, Ivan Lorencin and Dario Matika
Information 2024, 15(1), 11; https://doi.org/10.3390/info15010011 - 24 Dec 2023
Cited by 1 | Viewed by 2870
Abstract
This study delves into the vital missions of the armed forces, encompassing the defense of territorial integrity, sovereignty, and support for civil institutions. Commanders grapple with crucial decisions, where accountability underscores the imperative for reliable field intelligence. Harnessing artificial intelligence, specifically, the YOLO version five detection algorithm, ensures a paradigm of efficiency and precision. The presentation of trained models, accompanied by pertinent hyperparameters and dataset specifics derived from public military insignia videos and photos, reveals a nuanced evaluation. Results scrutinized through precision, recall, mAP at two IoU threshold settings, and F1 score illuminate the supremacy of the model employing Stochastic Gradient Descent at 640 × 640 resolution: 0.966, 0.957, 0.979, 0.830, and 0.961. Conversely, the suboptimal performance of the model using the Adam optimizer registers metrics of 0.818, 0.762, 0.785, 0.430, and 0.789. These outcomes underscore the model’s potential for military object detection across diverse terrains, with future prospects considering the implementation on unmanned aerial vehicles to amplify and deploy the model effectively.
(This article belongs to the Special Issue Computer Vision for Security Applications)
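The F1 scores quoted in the abstract above can be cross-checked from the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state how the authors computed it, and the function name is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall values quoted in the abstract.
sgd_f1 = f1_score(0.966, 0.957)   # model trained with SGD
adam_f1 = f1_score(0.818, 0.762)  # model trained with Adam

print(f"SGD:  {sgd_f1:.3f}")   # 0.961
print(f"Adam: {adam_f1:.3f}")  # 0.789
```

Under that assumption, both computed values agree with the F1 scores listed in the abstract (0.961 and 0.789).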

27 pages, 3278 KiB  
Article
Deep Learning Approach for Human Action Recognition Using a Time Saliency Map Based on Motion Features Considering Camera Movement and Shot in Video Image Sequences
by Abdorreza Alavigharahbagh, Vahid Hajihashemi, José J. M. Machado and João Manuel R. S. Tavares
Information 2023, 14(11), 616; https://doi.org/10.3390/info14110616 - 15 Nov 2023
Cited by 3 | Viewed by 2375
Abstract
In this article, a hierarchical method for action recognition based on temporal and spatial features is proposed. In current HAR methods, camera movement, sensor movement, sudden scene changes, and scene movement can increase motion feature errors and decrease accuracy. Another important aspect to take into account in a HAR method is the required computational cost. The proposed method provides a preprocessing step to address these challenges. As a preprocessing step, the method uses optical flow to detect camera movements and shots in input video image sequences. In the temporal processing block, the optical flow technique is combined with the absolute value of frame differences to obtain a time saliency map. The detection of shots, cancellation of camera movement, and the building of a time saliency map minimise movement detection errors. The time saliency map is then passed to the spatial processing block to segment the moving persons and/or objects in the scene. Because the search region for spatial processing is limited based on the temporal processing results, the computations in the spatial domain are drastically reduced. In the spatial processing block, the scene foreground is extracted in three steps: silhouette extraction, active contour segmentation, and colour segmentation. Key points are selected at the borders of the segmented foreground. The final features used are the intensity and angle of the optical flow of the detected key points. Using key point features for action detection reduces the computational cost of the classification step and the required training time. Finally, the features are submitted to a Recurrent Neural Network (RNN) to recognise the involved action. The proposed method was tested using four well-known action datasets: KTH, Weizmann, HMDB51, and UCF101, and its efficiency was evaluated. Since the proposed approach segments salient objects based on motion, edges, and colour features, it can be added as a preprocessing step to most current HAR systems to improve performance.
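One of the two cues mentioned in this abstract, the absolute value of frame differences, can be illustrated in a few lines. This is a minimal NumPy sketch under stated assumptions, not the authors' method: it omits the optical flow term, camera-motion cancellation, and shot detection, and the function name and toy clip are hypothetical.

```python
import numpy as np


def frame_difference_saliency(frames: np.ndarray) -> np.ndarray:
    """Accumulate absolute differences between consecutive frames.

    frames: array of shape (T, H, W) holding a grayscale clip.
    Returns an (H, W) map scaled to [0, 1]; larger values mean more motion.
    """
    frames = frames.astype(np.float64)
    diffs = np.abs(np.diff(frames, axis=0))  # shape (T - 1, H, W)
    saliency = diffs.sum(axis=0)
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency


# Toy clip: a bright 2x2 block that moves one pixel to the right per frame.
clip = np.zeros((4, 8, 8))
for t in range(4):
    clip[t, 3:5, t:t + 2] = 1.0

saliency = frame_difference_saliency(clip)
print(saliency[3])  # non-zero only in the columns the block passed through
```

Pixels that never change stay at zero, so a later spatial stage could restrict its search to the non-zero region, which is the kind of saving the abstract attributes to the temporal block.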

18 pages, 4388 KiB  
Article
An Automated Precise Authentication of Vehicles for Enhancing the Visual Security Protocols
by Kumarmangal Roy, Muneer Ahmad, Norjihan Abdul Ghani, Jia Uddin and Jungpil Shin
Information 2023, 14(8), 466; https://doi.org/10.3390/info14080466 - 18 Aug 2023
Viewed by 1999
Abstract
The movement of vehicles in and out of the predefined enclosure is an important security protocol that we encounter daily. Identification of vehicles is a very important factor for security surveillance. In a smart campus concept, thousands of vehicles access the campus every day, resulting in massive carbon emissions. Automated monitoring of both aspects (pollution and security) is an essential element for an academic institution. Among the reported methods, the automated identification of number plates is the best way to streamline vehicles. The performances of most of the previously designed similar solutions suffer in the context of light exposure, stationary backgrounds, indoor area, specific driveways, etc. We propose a new hybrid single-shot object detector architecture based on the Haar cascade and MobileNet-SSD. In addition, we adopt a new optical character reader mechanism for character identification on number plates. We prove that the proposed hybrid approach is robust and works well on live object detection. The existing research focused on the prediction accuracy, which in most state-of-the-art methods (SOTA) is very similar. Thus, the precision among several use cases is also a good evaluation measure that was ignored in the existing research. It is evident that the performance of prediction systems suffers due to adverse weather conditions stated earlier. In such cases, the precision between events of detection may result in high variance that impacts the prediction of vehicles in unfavorable circumstances. The performance assessment of the proposed solution yields a precision of 98% on real-time data for Malaysian number plates, which can be generalized in the future to all sorts of vehicles around the globe.

16 pages, 1430 KiB  
Article
A Video Question Answering Model Based on Knowledge Distillation
by Zhuang Shao, Jiahui Wan and Linlin Zong
Information 2023, 14(6), 328; https://doi.org/10.3390/info14060328 - 12 Jun 2023
Cited by 1 | Viewed by 2706
Abstract
Video question answering (QA) is a cross-modal task that requires understanding the video content to answer questions. Current techniques address this challenge by employing stacked modules, such as attention mechanisms and graph convolutional networks. These methods reason about the semantics of video features and their interaction with text-based questions, yielding excellent results. However, these approaches often learn and fuse features representing different aspects of the video separately, neglecting the intra-interaction and overlooking the latent complex correlations between the extracted features. Additionally, the stacking of modules introduces a large number of parameters, making model training more challenging. To address these issues, we propose a novel multimodal knowledge distillation method that leverages the strengths of knowledge distillation for model compression and feature enhancement. Specifically, the fused features in the larger teacher model are distilled into knowledge, which guides the learning of appearance and motion features in the smaller student model. By incorporating cross-modal information in the early stages, the appearance and motion features can discover their related and complementary potential relationships, thus improving the overall model performance. Despite its simplicity, our extensive experiments on the widely used video QA datasets, MSVD-QA and MSRVTT-QA, demonstrate clear performance improvements over prior methods. These results validate the effectiveness of the proposed knowledge distillation approach.

23 pages, 27370 KiB  
Article
Recognizing the Wadi Fluvial Structure and Stream Network in the Qena Bend of the Nile River, Egypt, on Landsat 8-9 OLI Images
by Polina Lemenkova and Olivier Debeir
Information 2023, 14(4), 249; https://doi.org/10.3390/info14040249 - 20 Apr 2023
Cited by 7 | Viewed by 2736
Abstract
With methods for processing remote sensing data becoming widely available, the ability to quantify changes in spatial data and to evaluate the distribution of diverse landforms across target areas in datasets becomes increasingly important. One way to approach this problem is through satellite image processing. In this paper, we primarily focus on the methods of the unsupervised classification of the Landsat OLI/TIRS images covering the region of the Qena governorate in Upper Egypt. The Qena Bend of the Nile River presents a remarkable morphological feature in Upper Egypt, including a dense drainage network of wadi aquifer systems and plateaus largely dissected by numerous valleys of dry rivers. To identify the fluvial structure and stream network of the Wadi Qena region, this study addresses the problem of interpreting the relevant space-borne data using R, with an aim to visualize the land surface structures corresponding to various land cover types. To this effect, high-resolution 2D and 3D topographic and geologic maps were used for the analysis of the geomorphological setting of the Qena region. The information was extracted from the space-borne data for the comparative analysis of the distribution of wadi streams in the Qena Bend area over several years: 2013, 2015, 2016, 2019, 2022, and 2023. Six images were processed using computer vision methods made available by R libraries. The results of the k-means clustering of each scene retrieved from the multi-temporal images covering the Qena Bend of the Nile River were thus compared to visualize changes in landforms caused by the cumulative effects of geomorphological disasters and climate–environmental processes. The proposed method, tied together through the use of R scripts, runs effectively and performs favorably in computer vision tasks aimed at geospatial image processing and the analysis of remote sensing data.
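The unsupervised classification step named in this abstract, k-means clustering of image pixels, can be sketched as follows. The paper reports using R libraries; this Python/NumPy version is only an illustration of plain Lloyd-style k-means on a toy two-band image, and the function name, parameters, and data are assumptions rather than the authors' scripts.

```python
import numpy as np


def kmeans_pixels(pixels: np.ndarray, k: int, n_iter: int = 10, seed: int = 0):
    """Plain k-means on an (N, bands) array of pixel vectors.

    Returns (labels, centroids). Initial centroids are k pixels drawn at random.
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(pixels), size=k, replace=False)
    centroids = pixels[start].astype(np.float64)

    def assign(cents):
        # Distance from every pixel to every centroid, then nearest centroid.
        dists = np.linalg.norm(pixels[:, None, :] - cents[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return assign(centroids), centroids


# Toy 4x4 scene with two spectral bands: left half dark, right half bright.
scene = np.zeros((4, 4, 2))
scene[:, :2] = [0.10, 0.05]
scene[:, 2:] = [0.80, 0.70]

labels, centroids = kmeans_pixels(scene.reshape(-1, 2), k=2)
label_map = labels.reshape(4, 4)
print(label_map)
```

Running the same clustering on scenes from different years and comparing the resulting label maps is the kind of multi-temporal comparison the abstract describes.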

11 pages, 1259 KiB  
Article
Improved Feature Extraction and Similarity Algorithm for Video Object Detection
by Haotian You, Yufang Lu and Haihua Tang
Information 2023, 14(2), 115; https://doi.org/10.3390/info14020115 - 12 Feb 2023
Viewed by 2308
Abstract
Video object detection is an important research direction of computer vision. The task of video object detection is to detect and classify moving objects in a sequence of images. Based on the static image object detector, most of the existing video object detection methods use the unique temporal correlation of video to solve the problem of missed detection and false detection caused by moving object occlusion and blur. Another video object detection model guided by an optical flow network is widely used. Feature aggregation of adjacent frames is performed by estimating the optical flow field. However, there are many redundant computations for feature aggregation of adjacent frames. To begin with, this paper improved Faster R-CNN by Feature Pyramid and Dynamic Region Aware Convolution. Then the S-SELSA module is proposed from the perspective of semantic and feature similarity. Feature similarity is obtained by a modified SSIM algorithm. The module can aggregate the features of frames globally to avoid redundancy. Finally, the experimental results on the ImageNet VID and DET datasets show that the mAP of the method proposed in this paper is 83.55%, which is higher than the existing methods.
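For orientation, the standard (unmodified) SSIM index that this abstract builds on can be written compactly. The sketch below computes SSIM from global statistics over two whole arrays instead of the usual sliding windows, with the conventional constants K1 = 0.01 and K2 = 0.03; the paper's modification is not described in the abstract, so nothing here should be read as the S-SELSA similarity itself, and the function name is hypothetical.

```python
import numpy as np


def global_ssim(x, y, data_range: float = 1.0) -> float:
    """Standard SSIM formula evaluated on whole-array statistics."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilises the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


# Toy "feature maps": a smooth ramp and its inverted copy.
feat_a = np.linspace(0.0, 1.0, 64).reshape(8, 8)
feat_b = 1.0 - feat_a

same = global_ssim(feat_a, feat_a)     # identical inputs score (close to) 1
flipped = global_ssim(feat_a, feat_b)  # anti-correlated inputs score below 0
print(same, flipped)
```

A score of this kind can serve as a weight when aggregating features across frames: similar frames contribute strongly, dissimilar ones weakly.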

15 pages, 7815 KiB  
Article
Deep Learning and Vision-Based Early Drowning Detection
by Maad Shatnawi, Frdoos Albreiki, Ashwaq Alkhoori and Mariam Alhebshi
Information 2023, 14(1), 52; https://doi.org/10.3390/info14010052 - 16 Jan 2023
Cited by 8 | Viewed by 10366
Abstract
Drowning is one of the top five causes of death for children aged 1–14 worldwide. According to data from the World Health Organization (WHO), drowning is the third most common reason for unintentional fatalities. Designing a drowning detection system is becoming increasingly necessary in order to ensure the safety of swimmers, particularly children. This paper presents a computer vision and deep learning-based early drowning detection approach. We utilized five convolutional neural network models and trained them on our data. These models are SqueezeNet, GoogleNet, AlexNet, ShuffleNet, and ResNet50. ResNet50 showed the best performance, as it achieved 100% prediction accuracy with a reasonable training time. When compared to other approaches, the proposed approach performed exceptionally well in terms of prediction accuracy and computational cost.

13 pages, 846 KiB  
Article
Zero-Shot Blind Learning for Single-Image Super-Resolution
by Kazuhiro Yamawaki and Xian-Hua Han
Information 2023, 14(1), 33; https://doi.org/10.3390/info14010033 - 5 Jan 2023
Viewed by 2380
Abstract
Deep convolutional neural networks (DCNNs) have manifested significant performance gains for single-image super-resolution (SISR) in the past few years. Most of the existing methods are generally implemented in a fully supervised way using large-scale training samples and only learn the SR models restricted to specific data. Thus, the adaptation of these models to real low-resolution (LR) images captured under uncontrolled imaging conditions usually leads to poor SR results. This study proposes a zero-shot blind SR framework via leveraging the power of deep learning, but without requiring prior training on predefined image samples. It is well known that there are two unknown data: the underlying target high-resolution (HR) images and the degradation operations in the imaging procedure hidden in the observed LR images. With these in mind, we specifically employed two deep networks for respectively modeling the priors of both the target HR image and its corresponding degradation kernel and designed a degradation block to realize the observation procedure of the LR image. Via formulating the loss function as the approximation error of the observed LR image, we established a completely blind end-to-end zero-shot learning framework for simultaneously predicting the target HR image and the degradation kernel without any external data. In particular, we adopted a multi-scale encoder–decoder subnet to serve as the image prior learning network, a simple fully connected subnet to serve as the kernel prior learning network, and a specific depthwise convolutional block to implement the degradation procedure. We conducted extensive experiments on several benchmark datasets and manifested the great superiority and high generalization of our method over both SOTA supervised and unsupervised SR methods.

26 pages, 8005 KiB  
Article
Intelligent Video Surveillance Systems for Vehicle Identification Based on Multinet Architecture
by Jacobo González-Cepeda, Álvaro Ramajo and José María Armingol
Information 2022, 13(7), 325; https://doi.org/10.3390/info13070325 - 6 Jul 2022
Cited by 5 | Viewed by 4092
Abstract
Security cameras have been proven to be particularly useful in preventing and combating crime through identification tasks. Here, two areas can be mainly distinguished: person and vehicle identification. Automatic license plate readers are the most widely used tool for vehicle identification. Although these systems are very effective, they are not reliable enough in certain circumstances. For example, due to traffic jams, vehicle position or weather conditions, the sensors cannot capture an image of the entire license plate. However, there is still a lot of additional information in the image which may also be of interest, and that needs to be analysed quickly and accurately. The correct use of the processing mechanisms can significantly reduce analysis time, increasing the efficiency of video cameras significantly. To solve this problem, we have designed a solution based on two technologies: license plate recognition and vehicle re-identification. For its development and testing, we have also created several datasets recreating a real environment. In addition, this article reviews some of the main artificial intelligence techniques for these technologies, as they have served as the starting point for this research.

12 pages, 2302 KiB  
Article
Multi-Attention Module for Dynamic Facial Emotion Recognition
by Junnan Zhi, Tingting Song, Kang Yu, Fengen Yuan, Huaqiang Wang, Guangyang Hu and Hao Yang
Information 2022, 13(5), 207; https://doi.org/10.3390/info13050207 - 19 Apr 2022
Cited by 8 | Viewed by 3115
Abstract
Video-based dynamic facial emotion recognition (FER) is a challenging task, as one must capture and distinguish tiny facial movements representing emotional changes while ignoring the facial differences of different subjects. Recent state-of-the-art studies have usually adopted more complex methods to solve this task, such as large-scale deep learning models or multimodal analysis with reference to multiple sub-models. According to the characteristics of the FER task and the shortcomings of existing methods, in this paper we propose a lightweight method and design three attention modules that can be flexibly inserted into the backbone network. The key information for the three dimensions of space, channel, and time is extracted by means of convolution layer, pooling layer, multi-layer perceptron (MLP), and other approaches, and attention weights are generated. By sharing parameters at the same level, the three modules do not add too many network parameters while enhancing the focus on specific areas of the face, effective feature information of static images, and key frames. The experimental results on CK+ and eNTERFACE’05 datasets show that this method can achieve higher accuracy.

21 pages, 36143 KiB  
Article
Low-Altitude Aerial Video Surveillance via One-Class SVM Anomaly Detection from Textural Features in UAV Images
by Danilo Avola, Luigi Cinque, Angelo Di Mambro, Anxhelo Diko, Alessio Fagioli, Gian Luca Foresti, Marco Raoul Marini, Alessio Mecca and Daniele Pannone
Information 2022, 13(1), 2; https://doi.org/10.3390/info13010002 - 22 Dec 2021
Cited by 22 | Viewed by 5385
Abstract
In recent years, small-scale Unmanned Aerial Vehicles (UAVs) have been used in many video surveillance applications, such as vehicle tracking, border control, dangerous object detection, and many others. Anomaly detection can represent a prerequisite of many of these applications thanks to its ability to identify areas and/or objects of interest without knowing them a priori. In this paper, a One-Class Support Vector Machine (OC-SVM) anomaly detector based on customized Haralick textural features for aerial video surveillance at low-altitude is presented. The use of a One-Class SVM, which is notoriously a lightweight and fast classifier, enables the implementation of real-time systems even when these are embedded in low-computational small-scale UAVs. At the same time, the use of textural features allows a vision-based system to detect micro and macro structures of an analyzed surface, thus allowing the identification of small and large anomalies, respectively. The latter aspect plays a key role in aerial video surveillance at low-altitude, i.e., 6 to 15 m, where the detection of common items, e.g., cars, is as important as the detection of little and undefined objects, e.g., Improvised Explosive Devices (IEDs). Experiments obtained on the UAV Mosaicking and Change Detection (UMCD) dataset show the effectiveness of the proposed system in terms of accuracy, precision, recall, and F1-score, where the model achieves a 100% precision, i.e., never misses an anomaly, but at the expense of a reasonable trade-off in its recall, which still manages to reach up to a 71.23% score. Moreover, when compared to classical Haralick textural features, the model obtains significantly higher performance, i.e., ≈20% on all metrics, further demonstrating the approach's effectiveness.
(This article belongs to the Special Issue Computer Vision for Security Applications)

Review

17 pages, 2481 KiB  
Review
Facial Emotion Recognition Using Conventional Machine Learning and Deep Learning Methods: Current Achievements, Analysis and Remaining Challenges
by Amjad Rehman Khan
Information 2022, 13(6), 268; https://doi.org/10.3390/info13060268 - 25 May 2022
Cited by 55 | Viewed by 24562
Abstract
Facial emotion recognition (FER) is an emerging and significant research area in the pattern recognition domain. In daily life, the role of non-verbal communication is significant, accounting for around 55% to 93% of overall communication. Facial emotion analysis is used effectively in surveillance videos, expression analysis, gesture recognition, smart homes, computer games, depression treatment, patient monitoring, anxiety, detecting lies, psychoanalysis, paralinguistic communication, detecting operator fatigue, and robotics. In this paper, we present a detailed review of FER. The literature is collected from reputable research published during the current decade. This review covers conventional machine learning (ML) and various deep learning (DL) approaches. Further, publicly available FER datasets and evaluation metrics are discussed and compared with benchmark results. This paper provides a holistic review of FER using traditional ML and DL methods to highlight the remaining gaps in this domain for new researchers. Finally, this review serves as a guidebook for young researchers in the FER area, providing a general understanding and basic knowledge of the current state-of-the-art methods, and for experienced researchers looking for productive directions for future work.
(This article belongs to the Special Issue Computer Vision for Security Applications)

Planned Papers

The list below represents only planned manuscripts. Some of these manuscripts have not been received by the Editorial Office yet. Papers submitted to MDPI journals are subject to peer review.

Title: Privacy-focused Evaluation Framework for Contact Tracing Mobile Apps: A Case Study of COVID-19 Pandemic
Authors: Ashad Kabir
Affiliation: School of Computing and Mathematics, Charles Sturt University, Panorama Ave, Bathurst, NSW 2795, Australia
Abstract: Digital contact tracing apps were used during the COVID-19 outbreak to track individuals who had direct contact with a diagnosed patient. This study's primary goal is to propose and implement an evaluation framework that can be used to evaluate privacy concerns and examine the feasibility and effectiveness of contact tracing applications in terms of privacy issues. A comparative analysis was conducted on 70 contact tracing apps from 44 countries, with data collected from the app documentation, app stores, and related websites to point out their data privacy risks and safety status. A dataset was also created to build a scoring model based on their privacy risk and safety scores. Additionally, we evaluated those apps based on user ratings and comments found in the app stores. After studying their behavior, we identified a set of privacy risks and designed an evaluation framework along with a privacy concern guideline for future app designers. The comparative analysis demonstrates that not all contact tracing apps are risk-free in terms of identifiable data collection and data privacy policies. Future research will need to address the analysis of security mechanisms and cultural aspects for the development of a better contact tracing application.

Title: Automated Vehicle Identification by License Plate Recognition
Authors: Kumarmangal Roy1, Muneer Ahmad2, Norjihan Binti Abdul Ghani1, Jia Uddin2, Jungpil SHIN3 (Corresponding author)*
Affiliation: Malaysia, Korea, Malaysia, Korea, Japan
Abstract: The movement of vehicles in and out of a predefined enclosure is an important security protocol that we encounter on a daily basis. Identification of vehicles is a very important factor for security surveillance. Manually identifying such vehicles and managing a database of them is tedious. This kind of vehicle management for many vehicles is not only inconvenient but also time-consuming, and it requires the physical intervention of security personnel, which is not well suited to the present situation under the impact of COVID-19. Moreover, handling multiple entry and exit points is more error-prone. A contactless option for a smart vehicle management system is more suitable. The best way to streamline such a process is through the automatic identification of unique license number plates. In this research report, we propose a license number plate recognition approach using Haar Cascade object detectors. We also compare the performance of Haar Cascade for number plate detection with that of MobileNet-SSD (a light deep neural network architecture integrated as the base network with a single-shot object detector architecture). Once the license plate is detected, we use OCR for character identification. Most previous works on license recognition systems have limitations such as light exposure, stationary backgrounds, indoor areas, and specific driveways. Our current approach is robust and works well on live object detection. The main objective of this project is to create an intelligent pipeline for an automatic vehicle license number plate detection system that will provide smart authentication based on the legitimacy of vehicles, showcasing its performance on real-time data for Malaysian number plates.
