Machine Learning for Pattern Recognition

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Evolutionary Algorithms and Machine Learning".

Deadline for manuscript submissions: closed (15 March 2024) | Viewed by 16762

Special Issue Editors


Guest Editor
Graduate Institute of Intelligent Robotics, Hwa Hsia University of Technology, New Taipei City 235, Taiwan
Interests: artificial intelligence; machine learning; image processing; biometrics; pattern recognition

Guest Editor
Department of Computer and Communication Engineering, Ming Chuan University, Taoyuan 333, Taiwan
Interests: multimedia network services; computer network; wireless communication and network; image/video processing

Guest Editor
Department of Electronic Engineering, Chung Yuan Christian University, Taoyuan City 32023, Taiwan
Interests: wireless multimedia communication; digital signal processing; pattern recognition; voice, image, video and biomedical signal processing

Guest Editor
Electrical Engineering, Fu Jen Catholic University, New Taipei 24205, Taiwan
Interests: intelligent video surveillance; face recognition; deep learning for object detection; robotic vision; embedded computer vision; sleep healthcare; neuromorphic computing

Special Issue Information

Dear Colleagues,

In the field of artificial intelligence, machine learning is a well-known framework for pattern recognition. Machine learning has made significant advances in pattern recognition due to the Big Data revolution and developments in parallel processing units. Pattern recognition has been applied in a wide range of real-world domains, such as face detection/recognition, facial expression recognition, medical image analysis/recognition, gesture recognition, behavior recognition, and advanced driver assistance systems (ADAS). This Special Issue aims to provide a platform that brings together high-quality research, theories, algorithms, innovative ideas, and applications in the above-mentioned areas, among others.

Prof. Dr. Chih-Lung Lin
Prof. Dr. Bor-Jiunn Hwang
Prof. Dr. Shaou-Gang Miaou
Prof. Dr. Yuan-Kai Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • machine learning
  • deep learning
  • neural network
  • biometrics
  • pattern recognition
  • image/video processing
  • speech recognition
  • computer vision

Published Papers (11 papers)

Research

16 pages, 4854 KiB  
Article
Point-Sim: A Lightweight Network for 3D Point Cloud Classification
by Jiachen Guo and Wenjie Luo
Algorithms 2024, 17(4), 158; https://doi.org/10.3390/a17040158 - 15 Apr 2024
Viewed by 348
Abstract
Analyzing point clouds with neural networks is a current research hotspot. In order to analyze the 3D geometric features of point clouds, most neural networks improve the network performance by adding local geometric operators and trainable parameters. However, deep learning usually requires a large amount of computational resources for training and inference, which poses challenges to hardware devices and energy consumption. Therefore, some researchers have started to use a nonparametric approach to extract features. Point-NN combines nonparametric modules to build a nonparametric network for 3D point cloud analysis, and the nonparametric components include operations such as trigonometric embedding, farthest point sampling (FPS), k-nearest neighbor (k-NN), and pooling. However, Point-NN has some blindness in feature embedding using the trigonometric function during feature extraction. To eliminate this blindness as much as possible, we utilize a nonparametric energy function-based attention mechanism (ResSimAM). The energy of the embedded features is calculated by the energy function, and ResSimAM then uses this energy to reweight the embedded features, enhancing them without adding any parameters to the original network. Point-NN also needs to compute the similarity between each feature at the naive feature similarity matching stage; however, the magnitude difference of the features in vector space during the feature extraction stage may affect the final matching result. We use the Squash operation to squeeze the features. This nonlinear operation can squeeze the features to a certain range without changing the original direction in the vector space, thus eliminating the effect of feature magnitude, and we can ultimately better complete the naive feature matching in the vector space.
We insert these modules into the network and build a nonparametric network, Point-Sim, which performs well in 3D classification tasks. Based on this, we extend the lightweight neural network Point-SimP by adding some trainable parameters for the point cloud classification task, which requires only 0.8 M parameters for high performance analysis. Experimental results demonstrate the effectiveness of our proposed algorithm in the point cloud shape classification task. The corresponding results on ModelNet40 and ScanObjectNN are 83.9% and 66.3% for 0 M parameters (without any training) and 93.3% and 86.6% for 0.8 M parameters. The Point-SimP reaches a test speed of 962 samples per second on the ModelNet40 dataset. The experimental results show that our proposed method effectively improves the performance on point cloud classification networks.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)
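The abstract above describes a Squash operation that bounds feature magnitude while preserving direction. The sketch below illustrates that idea with the squash function known from capsule networks; whether the paper uses exactly this form is an assumption, and the names `squash` and `eps` are hypothetical.

```python
import math

def squash(vec, eps=1e-12):
    """Rescale vec so its norm lies in [0, 1) while keeping its direction.

    Assumed form (capsule-network squash): v = (|s|^2 / (1 + |s|^2)) * s / |s|.
    """
    sq_norm = sum(x * x for x in vec)
    norm = math.sqrt(sq_norm)
    # eps guards against division by zero for an all-zero vector.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in vec]

feature = [3.0, 4.0]  # norm 5
squashed = squash(feature)
squashed_norm = math.sqrt(sum(x * x for x in squashed))
```

Because every component is multiplied by the same positive scalar, cosine similarity between two squashed features equals that of the originals; only the magnitudes are compressed.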

20 pages, 11925 KiB  
Article
A New Algorithm for Detecting GPN Protein Expression and Overexpression of IDC and ILC Her2+ Subtypes on Polyacrylamide Gels Associated with Breast Cancer
by Jorge Juarez-Lucero, Maria Guevara-Villa, Anabel Sanchez-Sanchez, Raquel Diaz-Hernandez and Leopoldo Altamirano-Robles
Algorithms 2024, 17(4), 149; https://doi.org/10.3390/a17040149 - 02 Apr 2024
Viewed by 566
Abstract
Sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) is used to identify protein presence, absence, or overexpression, and its results are usually interpreted visually. Some published methods can localize the position of proteins using image analysis on images of SDS-PAGE gels. However, they cannot automatically determine a particular protein band’s concentration or molecular weight. In this article, a new methodology to identify the number of samples present in an SDS-PAGE gel and the molecular weight of the recombinant protein is developed. SDS-PAGE images of different concentrations of pure GPN protein were created to produce homogeneous gels. Then, these images were analyzed using the developed methodology called Image Profile Based on Binarized Image Segmentation (IPBBIS). It is based on detecting the maximum intensity values of the analyzed bands and produces the segmentation of images filtered by a binary mask. The IPBBIS was developed to identify the number of samples in an SDS-PAGE gel and the molecular weight of the recombinant protein of interest, with a margin of error of 3.35%. An accuracy of 0.9850521 was achieved for homogeneous gels and 0.91736 for heterogeneous gels of low quality.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

22 pages, 2044 KiB  
Article
Relational Fisher Analysis: Dimensionality Reduction in Relational Data with Global Convergence
by Li-Na Wang, Guoqiang Zhong, Yaxin Shi and Mohamed Cheriet
Algorithms 2023, 16(11), 522; https://doi.org/10.3390/a16110522 - 15 Nov 2023
Viewed by 1335
Abstract
Most dimensionality reduction algorithms assume that data are independent and identically distributed (i.i.d.). In real-world applications, however, there sometimes exist relationships between data. Some relational learning methods have been proposed, but those with discriminative relationship analysis are still lacking, as important supervisory information is usually ignored. In this paper, we propose a novel and general framework, called relational Fisher analysis (RFA), which successfully integrates relational information into the dimensionality reduction model. For nonlinear data representation learning, we apply the kernel trick to RFA and propose the kernelized RFA (KRFA). In addition, the convergence of the RFA optimization algorithm is proved theoretically. By leveraging suitable strategies to construct the relational matrix, we conduct extensive experiments to demonstrate the superiority of our RFA and KRFA methods over related approaches.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

19 pages, 4191 KiB  
Article
Parkinson’s Disease Classification Framework Using Vocal Dynamics in Connected Speech
by Sai Bharadwaj Appakaya, Ruchira Pratihar and Ravi Sankar
Algorithms 2023, 16(11), 509; https://doi.org/10.3390/a16110509 - 04 Nov 2023
Viewed by 1379
Abstract
Parkinson’s disease (PD) classification through speech has been an advancing field of research because of its ease of acquisition and processing. The minimal infrastructure requirements of the system have also made it suitable for telemonitoring applications. Researchers have studied the effects of PD on speech from various perspectives using different speech tasks. Typical speech deficits due to PD include voice monotony (e.g., monopitch), breathy or rough quality, and articulatory errors. In connected speech, these symptoms are more emphatic, which is also the basis for speech assessment in popular rating scales used for PD, like the Unified Parkinson’s Disease Rating Scale (UPDRS) and Hoehn and Yahr (HY). The current study introduces an innovative framework that integrates pitch-synchronous segmentation and an optimized set of features to investigate and analyze continuous speech from both PD patients and healthy controls (HC). Comparison of the proposed framework against existing methods has shown its superiority in classification performance and mitigation of overfitting in machine learning models. A set of optimal classifiers with unbiased decision-making was identified after comparing several machine learning models. The outcomes yielded by the classifiers demonstrate that the framework effectively learns the intrinsic characteristics of PD from connected speech, which can potentially offer valuable assistance in clinical diagnosis.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

18 pages, 7313 KiB  
Article
FenceTalk: Exploring False Negatives in Moving Object Detection
by Yun-Wei Lin, Yuh-Hwan Liu, Yi-Bing Lin and Jian-Chang Hong
Algorithms 2023, 16(10), 481; https://doi.org/10.3390/a16100481 - 17 Oct 2023
Viewed by 1336
Abstract
Deep learning models are often trained with a large amount of labeled data to improve the accuracy for moving object detection in new fields. However, the model may not be robust enough due to insufficient training data in the new field, resulting in some moving objects not being successfully detected. Training with data that is not successfully detected by the pre-trained deep learning model can effectively improve the accuracy for the new field, but it is costly to retrieve the image data containing the moving objects from millions of images per day to train the model. Therefore, we propose FenceTalk, a moving object detection system, which compares the difference between the current frame and the background image based on the structural similarity index measure (SSIM). FenceTalk automatically selects suspicious images with moving objects that are not successfully detected by the Yolo model, so that the training data can be selected at a lower labor cost. FenceTalk can effectively define and update the background image in the field, reducing the misjudgment caused by changes in light and shadow, and selecting images containing moving objects with an optimal threshold. Our study has demonstrated its performance and generality using real data from different fields. For example, compared with the pre-trained Yolo model using the MS COCO dataset, the overall recall of FenceTalk increased from 72.36% to 98.39% for the model trained with the data picked out by SSIM. The recall of FenceTalk, combined with Yolo and SSIM, can reach more than 99%.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)
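The frame-versus-background comparison described in the abstract above can be illustrated with a minimal SSIM sketch. This is a simplification and not FenceTalk's implementation: it computes one global SSIM value over the whole image instead of the usual sliding windows, and the function names and the 0.8 threshold are hypothetical.

```python
def global_ssim(frame, background, dynamic_range=255.0):
    """Single-window SSIM between two equally sized grayscale images.

    Images are flat lists of pixel intensities. Population statistics
    (divide by n) are used for the variances and the covariance.
    """
    if len(frame) != len(background) or not frame:
        raise ValueError("images must be non-empty and the same size")
    n = len(frame)
    mu_x = sum(frame) / n
    mu_y = sum(background) / n
    var_x = sum((p - mu_x) ** 2 for p in frame) / n
    var_y = sum((q - mu_y) ** 2 for q in background) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(frame, background)) / n
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

def looks_like_motion(frame, background, threshold=0.8):
    """Flag a frame whose similarity to the background falls below threshold."""
    return global_ssim(frame, background) < threshold

background = [100] * 16               # flat 4x4 background patch
moving = [100] * 12 + [250] * 4       # same patch with a bright object
```

A frame identical to the background scores 1.0, while the frame with the bright region scores far lower and is flagged as a candidate for labeling.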

14 pages, 2181 KiB  
Article
Neural Network Based Approach to Recognition of Meteor Tracks in the Mini-EUSO Telescope Data
by Mikhail Zotov, Dmitry Anzhiganov, Aleksandr Kryazhenkov, Dario Barghini, Matteo Battisti, Alexander Belov, Mario Bertaina, Marta Bianciotto, Francesca Bisconti, Carl Blaksley, Sylvie Blin, Giorgio Cambiè, Francesca Capel, Marco Casolino, Toshikazu Ebisuzaki, Johannes Eser, Francesco Fenu, Massimo Alberto Franceschi, Alessio Golzio, Philippe Gorodetzky, Fumiyoshi Kajino, Hiroshi Kasuga, Pavel Klimov, Massimiliano Manfrin, Laura Marcelli, Hiroko Miyamoto, Alexey Murashov, Tommaso Napolitano, Hiroshi Ohmori, Angela Olinto, Etienne Parizot, Piergiorgio Picozza, Lech Wiktor Piotrowski, Zbigniew Plebaniak, Guillaume Prévôt, Enzo Reali, Marco Ricci, Giulia Romoli, Naoto Sakaki, Kenji Shinozaki, Christophe De La Taille, Yoshiyuki Takizawa, Michal Vrábel and Lawrence Wiencke
Algorithms 2023, 16(9), 448; https://doi.org/10.3390/a16090448 - 19 Sep 2023
Cited by 1 | Viewed by 1041
Abstract
Mini-EUSO is a wide-angle fluorescence telescope that registers ultraviolet (UV) radiation in the nocturnal atmosphere of Earth from the International Space Station. Meteors are among multiple phenomena that manifest themselves not only in the visible range but also in the UV. We present two simple artificial neural networks that allow for recognizing meteor signals in the Mini-EUSO data with high accuracy in terms of a binary classification problem. We expect that similar architectures can be effectively used for signal recognition in other fluorescence telescopes, regardless of the nature of the signal. Due to their simplicity, the networks can be implemented in onboard electronics of future orbital or balloon experiments.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

13 pages, 4198 KiB  
Article
Automated Segmentation of Optical Coherence Tomography Images of the Human Tympanic Membrane Using Deep Learning
by Thomas P. Oghalai, Ryan Long, Wihan Kim, Brian E. Applegate and John S. Oghalai
Algorithms 2023, 16(9), 445; https://doi.org/10.3390/a16090445 - 17 Sep 2023
Viewed by 1326
Abstract
Optical Coherence Tomography (OCT) is a light-based imaging modality that is used widely in the diagnosis and management of eye disease, and it is starting to be used to evaluate ear disease. However, manual image analysis to interpret the anatomical and pathological findings in the images it provides is complicated and time-consuming. To streamline data analysis and image processing, we applied a machine learning algorithm to identify and segment the key anatomical structure of interest for medical diagnostics, the tympanic membrane. Using 3D volumes of the human tympanic membrane, we used thresholding and contour finding to locate a series of objects. We then applied TensorFlow deep learning algorithms to identify the tympanic membrane within the objects using a convolutional neural network. Finally, we reconstructed the 3D volume to selectively display the tympanic membrane. The algorithm was able to correctly identify the tympanic membrane with an accuracy of ~98% while removing most of the artifacts within the images, caused by reflections and signal saturations. Thus, the algorithm significantly improved visualization of the tympanic membrane, which was our primary objective. Machine learning approaches, such as this one, will be critical to allowing OCT medical imaging to become a convenient and viable diagnostic tool within the field of otolaryngology.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)
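The first stage described in the abstract above, thresholding followed by locating candidate objects, can be sketched as follows. This sketch substitutes 4-connected component labeling for contour finding, works on a single 2D slice, and uses a hypothetical `find_objects` helper and threshold; it is not the authors' pipeline.

```python
from collections import deque

def find_objects(image, threshold):
    """Group pixels with intensity >= threshold into 4-connected objects.

    image is a list of rows; each returned object is a list of (row, col).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(blob)
    return objects

image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 7],
    [0, 0, 0, 0, 7],
]
objects = find_objects(image, threshold=5)
```

In a pipeline like the one described, each such object would then be passed to a classifier that decides whether it is the structure of interest or an artifact.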

12 pages, 2021 KiB  
Article
A Pattern Recognition Analysis of Vessel Trajectories
by Paolo Massimo Buscema, Giulia Massini, Giovanbattista Raimondi, Giuseppe Caporaso, Marco Breda and Riccardo Petritoli
Algorithms 2023, 16(9), 414; https://doi.org/10.3390/a16090414 - 29 Aug 2023
Cited by 1 | Viewed by 989
Abstract
The automatic identification system (AIS) facilitates the monitoring of ship movements and provides essential input parameters for traffic safety. Previous studies have employed AIS data to detect behavioral anomalies and classify vessel types using supervised and unsupervised algorithms, including deep learning techniques. The approach proposed in this work focuses on the recognition of vessel types through the “Take One Class at a Time” (TOCAT) classification strategy. This approach pivots on a collection of adaptive models rather than a single intricate algorithm. Using radar data, these models are trained by taking into account aspects such as identifiers, position, velocity, and heading. However, it purposefully excludes positional data to counteract the inconsistencies stemming from route variations and irregular sampling frequencies. Using the given data, we achieved a mean accuracy of 83% on a 6-class classification task.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

15 pages, 350 KiB  
Article
Efficient DNN Model for Word Lip-Reading
by Taiki Arakane and Takeshi Saitoh
Algorithms 2023, 16(6), 269; https://doi.org/10.3390/a16060269 - 27 May 2023
Cited by 5 | Viewed by 1755
Abstract
This paper studies various deep learning models for word-level lip-reading technology, one of the tasks in the supervised learning of video classification. Several public datasets have been published in the lip-reading research field. However, few studies have investigated lip-reading techniques using multiple datasets. This paper evaluates deep learning models using four publicly available datasets, namely Lip Reading in the Wild (LRW), OuluVS, CUAVE, and Speech Scene by Smart Device (SSSD), which are representative datasets in this field. LRW, released in 2016, is one of the large-scale public datasets and targets 500 English words. Initially, the recognition accuracy of LRW was 66.1%, but many research groups have been working on it. The current state of the art (SOTA) has achieved 94.1% with 3D-Conv + ResNet18 + {DC-TCN, MS-TCN, BGRU} + knowledge distillation + word boundary. Regarding the SOTA model, in this paper, we combine existing models such as ResNet, WideResNet, EfficientNet, MS-TCN, Transformer, ViT, and ViViT, and investigate the effective models for word lip-reading tasks using six deep learning models with modified feature extractors and classifiers. Through recognition experiments, we show that similar model structures of 3D-Conv + ResNet18 for feature extraction and MS-TCN model for inference are valid for four datasets with different scales.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

18 pages, 10352 KiB  
Article
Human Body Shapes Anomaly Detection and Classification Using Persistent Homology
by Steve de Rose, Philippe Meyer and Frédéric Bertrand
Algorithms 2023, 16(3), 161; https://doi.org/10.3390/a16030161 - 15 Mar 2023
Viewed by 2413
Abstract
Accurate sizing systems of a population permit the minimization of the production costs of the textile apparel industry and allow firms to satisfy their customers. Hence, information about human body shapes needs to be extracted in order to examine, compare and classify human morphologies. In this paper, we use topological data analysis to study human body shapes. Persistence theory applied to anthropometric point clouds together with clustering algorithms shows that relevant information about shapes is extracted by persistent homology. In particular, the homologies of human body points have interesting interpretations in terms of human anatomy. In the first place, anomalies of scans are detected using complete-linkage hierarchical clusterings. Then, a discrimination index shows which type of clustering separates gender accurately and if it is worth restricting to body trunks or not. Finally, Ward-linkage hierarchical clusterings with Davies–Bouldin, Dunn and Silhouette indices are used to define eight male morphotypes and seven female morphotypes, which are different in terms of weight classes and ratios between bust, waist and hip circumferences. The techniques used in this work permit us to classify human bodies and detect scan anomalies directly on the full human body point clouds rather than the usual methods involving the extraction of body measurements from individuals or their scans.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)

25 pages, 642 KiB  
Article
Generalizing the Alpha-Divergences and the Oriented Kullback–Leibler Divergences with Quasi-Arithmetic Means
by Frank Nielsen
Algorithms 2022, 15(11), 435; https://doi.org/10.3390/a15110435 - 17 Nov 2022
Viewed by 1987
Abstract
The family of α-divergences including the oriented forward and reverse Kullback–Leibler divergences is often used in signal processing, pattern recognition, and machine learning, among others. Choosing a suitable α-divergence can either be done beforehand according to some prior knowledge of the application domains or directly learned from data sets. In this work, we generalize the α-divergences using a pair of strictly comparable weighted means. Our generalization allows us to obtain in the limit case α → 1 the 1-divergence, which provides a generalization of the forward Kullback–Leibler divergence, and in the limit case α → 0, the 0-divergence, which corresponds to a generalization of the reverse Kullback–Leibler divergence. We then analyze the condition for a pair of weighted quasi-arithmetic means to be strictly comparable and describe the family of quasi-arithmetic α-divergences including its subfamily of power homogeneous α-divergences. In particular, we study the generalized quasi-arithmetic 1-divergences and 0-divergences and show that these counterpart generalizations of the oriented Kullback–Leibler divergences can be rewritten as equivalent conformal Bregman divergences using strictly monotone embeddings. Finally, we discuss the applications of these novel divergences to k-means clustering by studying the robustness property of the centroids.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition)
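For orientation, the ordinary α-divergence that the abstract above takes as its starting point is commonly written as follows for positive densities p and q with respect to a measure μ. This is the standard form from the literature, not the paper's quasi-arithmetic generalization, and the symbols I_α, p, q and μ are notation assumed here.

```latex
\begin{align}
I_\alpha(p:q) &= \frac{1}{\alpha(1-\alpha)} \int \left( \alpha\, p + (1-\alpha)\, q - p^{\alpha} q^{1-\alpha} \right) \mathrm{d}\mu,
  \qquad \alpha \notin \{0, 1\}, \\
\lim_{\alpha \to 1} I_\alpha(p:q) &= \int \left( p \log\frac{p}{q} + q - p \right) \mathrm{d}\mu
  \qquad \text{(forward Kullback--Leibler)}, \\
\lim_{\alpha \to 0} I_\alpha(p:q) &= \int \left( q \log\frac{q}{p} + p - q \right) \mathrm{d}\mu
  \qquad \text{(reverse Kullback--Leibler)}.
\end{align}
```

The term α p + (1 − α) q is a weighted arithmetic mean and p^α q^{1−α} is a weighted geometric mean, which is the pair of means the abstract proposes to replace with other strictly comparable weighted means.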
