Topic Editors

Prof. Dr. Andrea Prati
Associate Professor, Department of Engineering and Architecture, University of Parma, Parco Area delle Scienze 181/A, 43124 Parma, Italy
Dr. Luis Javier García Villalba
Department of Software Engineering and Artificial Intelligence (DISIA), Faculty of Computer Science and Engineering, Office 431, Universidad Complutense de Madrid (UCM), 28040 Madrid, Spain
Prof. Dr. Vincent A. Cicirello
Professor of Computer Science, Stockton University, Galloway, NJ 08205, USA

Machine and Deep Learning

Abstract submission deadline
31 December 2022
Manuscript submission deadline
31 March 2023
Viewed by
127,494

Topic Information

Dear Colleagues,

Our society is facing a new era of automation, not only in industry but also in daily life. Computers are everywhere, and their use is no longer confined to industry and work; it extends to entertainment and leisure. Computing and artificial intelligence are not simply scientific lab experiments for publishing papers in major journals and conferences, but opportunities to make our lives better.

Among the different fields of artificial intelligence, machine learning is certainly one of the most studied in recent years. The last few decades have seen a gigantic shift due to the birth of deep learning, which has opened unprecedented theoretical and application-based opportunities.

In this context, advances in machine and deep learning are made on a daily basis, but much remains to be learned. For instance, the inner workings of deep learning architectures are still partially obscure, and explaining them will foster new applications, algorithms, and architectures. While deep learning is considered the hottest topic in artificial intelligence today, “traditional” machine learning still attracts much interest, especially in (but not limited to) new learning paradigms, scalability to big/huge data applications, and optimization.

Even more widespread are the (new) applications of machine and deep learning: finance, healthcare, sustainability, climate science, and neuroscience, to name a few. Continued and improved research in machine and deep learning will not only be a chance for surprising new discoveries, but also a way to contribute to our wellbeing and economic growth.

Prof. Dr. Andrea Prati
Dr. Luis Javier García Villalba
Prof. Dr. Vincent A. Cicirello
Topic Editors

Keywords

  • machine learning
  • deep learning
  • natural language processing
  • text mining
  • active learning
  • clustering
  • regression
  • data mining
  • web mining
  • online learning
  • ranking in machine learning
  • reinforcement learning
  • transfer learning
  • semi-supervised learning
  • zero- and few-shot learning
  • time series analysis
  • unsupervised learning
  • deep learning architectures
  • generative models
  • deep reinforcement learning
  • learning theory (bandits, game theory, statistical learning theory, etc.)
  • optimization (convex and non-convex optimization, matrix/tensor methods, sparsity, etc.)
  • probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
  • probabilistic inference (Bayesian methods, graphical models, Monte Carlo methods, etc.)
  • evolution-based methods
  • explanation-based learning
  • multi-agent learning
  • neuroscience and cognitive science (e.g., neural coding, brain–computer interfaces)
  • trustworthy machine learning (accountability, causality, fairness, privacy, robustness, etc.)
  • applications (e.g., speech processing, computational biology, computer vision, NLP)

Participating Journals

Journal Name | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC
Applied Sciences (applsci) | 2.838 | 3.7 | 2011 | 17.4 days | 2300 CHF
Big Data and Cognitive Computing (BDCC) | - | 6.1 | 2017 | 17 days | 1400 CHF
Mathematics (mathematics) | 2.592 | 2.9 | 2013 | 17.8 days | 1800 CHF
Electronics (electronics) | 2.690 | 3.7 | 2012 | 16.6 days | 2000 CHF
Entropy (entropy) | 2.738 | 4.4 | 1999 | 18.7 days | 1800 CHF

Preprints is a platform dedicated to making early versions of research outputs permanently available and citable. MDPI journals allow posting on preprint servers such as Preprints.org prior to publication. For more details about preprints, please visit https://www.preprints.org.

Published Papers (201 papers)

Article
Combination of Deep Cross-Stage Partial Network and Spatial Pyramid Pooling for Automatic Hand Detection
Big Data Cogn. Comput. 2022, 6(3), 85; https://doi.org/10.3390/bdcc6030085 - 09 Aug 2022
Abstract
The human hand is involved in many computer vision tasks, such as hand posture estimation, hand movement identification, and human activity analysis, in which hand detection is an important preprocessing step. It is still difficult to correctly detect hands in cluttered environments because of the complex appearance variations of agile human hands and their wide range of motion. In this study, we provide a brief assessment of CNN-based object detection algorithms, specifically Densenet Yolo V2, Densenet Yolo V2 CSP, Densenet Yolo V2 CSP SPP, Resnet 50 Yolo V2, Resnet 50 CSP, Resnet 50 CSP SPP, Yolo V4 SPP, Yolo V4 CSP SPP, and Yolo V5. The advantages of CSP and SPP are thoroughly examined and described in detail for each algorithm. Our experiments show that Yolo V4 CSP SPP provides the best precision, and that the CSP and SPP layers help improve the test accuracy of the CNN models. Our proposed method, Yolo V4 CSP SPP, which leverages the advantages of both CSP and SPP, outperformed previous research results by an average of 8.88%, with an improvement from 87.6% to 96.48%.
(This article belongs to the Topic Machine and Deep Learning)
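
To make the SPP idea above concrete, here is a minimal PyTorch sketch of a YOLO-style spatial pyramid pooling block; the kernel sizes and channel counts are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (PyTorch) of a YOLO-style Spatial Pyramid Pooling (SPP) block.
import torch
import torch.nn as nn

class SPPBlock(nn.Module):
    """Pools the same feature map at several kernel sizes and concatenates the results."""
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        # stride=1 with symmetric padding keeps the spatial resolution unchanged
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, x):
        # concatenate the original map with its multi-scale pooled versions
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

x = torch.randn(1, 512, 13, 13)   # a dummy backbone feature map
print(SPPBlock()(x).shape)        # torch.Size([1, 2048, 13, 13])
```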

Article
RSS-Based Wireless LAN Indoor Localization and Tracking Using Deep Architectures
Big Data Cogn. Comput. 2022, 6(3), 84; https://doi.org/10.3390/bdcc6030084 - 08 Aug 2022
Abstract
Wireless Local Area Network (WLAN) positioning is a challenging task indoors due to environmental constraints and the unpredictable behavior of signal propagation, even at a fixed location. The aim of this work is to develop deep learning-based approaches for indoor localization and tracking by utilizing Received Signal Strength (RSS). The study proposes Multi-Layer Perceptron (MLP), One- and Two-Dimensional Convolutional Neural Network (1D CNN and 2D CNN), and Long Short-Term Memory (LSTM) deep network architectures for WLAN indoor positioning based on data obtained from actual RSS measurements of an existing WLAN infrastructure in a mobile user scenario. Results are presented for the different deep architectures (MLP, CNNs, and LSTMs) alongside existing WLAN algorithms, with the Root Mean Square Error (RMSE) as the assessment criterion. The proposed LSTM Model 2 achieved a dynamic positioning RMSE of 1.73 m, which outperforms probabilistic WLAN algorithms such as Memoryless Positioning (RMSE: 10.35 m) and the Nonparametric Information (NI) filter with variable acceleration (RMSE: 5.2 m) in the same experimental environment.
(This article belongs to the Topic Machine and Deep Learning)
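
The following is a minimal PyTorch sketch of an LSTM that maps a window of RSS readings to a 2-D position, the general shape of the approach described above; the layer sizes, window length, and number of access points are assumptions, not the paper's "Model 2".

```python
# A minimal sketch (PyTorch) of an RSS-window-to-position LSTM regressor.
import torch
import torch.nn as nn

class RSSLocalizer(nn.Module):
    def __init__(self, n_access_points=10, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_access_points, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # regress a 2-D (x, y) position

    def forward(self, rss_window):         # (batch, time, n_access_points)
        out, _ = self.lstm(rss_window)
        return self.head(out[:, -1])       # position estimate at the last time step

batch = torch.randn(8, 20, 10)             # 8 trajectories, 20 scans, 10 APs
print(RSSLocalizer()(batch).shape)         # torch.Size([8, 2])
```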

Article
Progressively Discriminative Transfer Network for Cross-Corpus Speech Emotion Recognition
Entropy 2022, 24(8), 1046; https://doi.org/10.3390/e24081046 - 29 Jul 2022
Abstract
Cross-corpus speech emotion recognition (SER) is a challenging task whose difficulty lies in the mismatch between the feature distributions of the training (source domain) and testing (target domain) data, which leads to performance degradation when the model deals with data from a new domain. Previous works explore domain adaptation (DA) to eliminate the domain shift between the source and target domains and have achieved promising performance in SER. However, these methods treat the cross-corpus task simply as a DA problem, directly aligning the distributions across domains in a common feature space. Excessively narrowing the domain distance in this way impairs the emotion discrimination of speech features, since it is difficult to maintain the completeness of the emotion space with an emotion classifier alone. To overcome this issue, we propose a progressively discriminative transfer network (PDTN) for cross-corpus SER, which enhances the emotion discrimination ability of speech features while eliminating the mismatch between the source and target corpora. In detail, we design two special losses in the feature layers of PDTN: an emotion discriminant loss L_d and a distribution alignment loss L_a. By incorporating prior knowledge of speech emotion into feature learning (i.e., high- and low-valence speech emotion features have their respective cluster centers), we integrate a valence-aware center loss L_v and an emotion-aware center loss L_c into L_d to guarantee discriminative learning of speech emotions beyond the emotion classifier alone. Furthermore, a multi-layer distribution alignment loss L_a is adopted to more precisely eliminate the discrepancy between the feature distributions of the source and target domains. Finally, by optimizing PDTN with the combination of three losses, i.e., the cross-entropy loss L_e, L_d, and L_a, we gradually eliminate the domain mismatch between the source and target corpora while maintaining the emotion discrimination of speech features. Extensive experimental results on six cross-corpus tasks over three datasets, i.e., Emo-DB, eNTERFACE, and CASIA, reveal that the proposed PDTN outperforms state-of-the-art methods.
(This article belongs to the Topic Machine and Deep Learning)
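
As a concrete illustration of the center-loss ingredient behind L_v and L_c, here is a minimal PyTorch sketch that pulls each feature toward the center of its class; it shows the generic technique, not the authors' exact losses.

```python
# A minimal sketch of a center loss: mean squared distance to class centers.
import torch

def center_loss(features, labels, centers):
    """features: (N, D); labels: (N,); centers: (num_classes, D)."""
    return ((features - centers[labels]) ** 2).sum(dim=1).mean()

features = torch.randn(16, 128)
labels = torch.randint(0, 4, (16,))        # e.g., 4 emotion classes
centers = torch.zeros(4, 128, requires_grad=True)  # learned jointly with the network
print(center_loss(features, labels, centers))
```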
Article
Questioning the Anisotropy of Pedestrian Dynamics: An Empirical Analysis with Artificial Neural Networks
Appl. Sci. 2022, 12(15), 7563; https://doi.org/10.3390/app12157563 - 27 Jul 2022
Abstract
Identifying the factors that control the dynamics of pedestrians is a crucial step towards modeling and building various pedestrian-oriented simulation systems. In this article, we empirically explore the factors that influence the single-file movement of pedestrians and their impact. Our goal is to apply feed-forward neural networks to predict and understand individual speeds at different pedestrian densities. With artificial neural networks, we can approximate the fitting function that describes pedestrians' movement without introducing modeling bias. Our analysis focuses on the distances and range of interactions across neighboring pedestrians. As indicated by previous research, we find that the speed of a pedestrian depends on the distance to the predecessor. Yet, in contrast to classical purely anisotropic approaches—which are based on vision fields and assume that the interaction mainly depends on the distance in front—our results demonstrate that the distance to the follower also significantly influences movement. Using the distance to the follower combined with the subject pedestrian's headway distance to predict the speed improves the estimation by 18% compared to a prediction using the space in front alone.
(This article belongs to the Topic Machine and Deep Learning)
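
A minimal scikit-learn sketch of the kind of feed-forward regression described above, predicting speed from the distance to the predecessor and to the follower; the data here are synthetic stand-ins for real single-file trajectories.

```python
# A minimal sketch of a feed-forward network regressing speed on two distances.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
headway = rng.uniform(0.3, 3.0, 1000)     # distance to predecessor (m)
follower = rng.uniform(0.3, 3.0, 1000)    # distance to follower (m)
# toy relation standing in for observed pedestrian behavior
speed = 0.5 * headway + 0.2 * follower + rng.normal(0, 0.05, 1000)

X = np.column_stack([headway, follower])
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(X, speed)
print(model.score(X, speed))              # R^2 of the fit
```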

Article
Deep Compressive Sensing on ECG Signals with Modified Inception Block and LSTM
Entropy 2022, 24(8), 1024; https://doi.org/10.3390/e24081024 - 25 Jul 2022
Abstract
In practical electrocardiogram (ECG) monitoring, there are challenges in reducing the data burden and energy costs. Therefore, compressed sensing (CS), which can conduct under-sampling and reconstruction at the same time, is adopted in ECG monitoring applications. Recently, deep learning applied to CS methods has improved reconstruction performance significantly and can remove some of the constraints of traditional CS. In this paper, we propose a deep compressive-sensing scheme for ECG signals based on a modified Inception block and long short-term memory (LSTM). The framework comprises four modules: preprocessing, compression, initial reconstruction, and final reconstruction. We adaptively compressed the normalized ECG signals using three sequential convolutional layers and reconstructed the signals with a modified Inception block and LSTM. We conducted our experiments on the MIT-BIH Arrhythmia Database and the Non-Invasive Fetal ECG Arrhythmia Database to validate the robustness of our model, adopting the Signal-to-Noise Ratio (SNR) and percentage Root-mean-square Difference (PRD) as evaluation metrics. Our scheme achieved the lowest PRD and the highest SNR at all sensing rates in our experiments on both databases, and when the sensing rate was higher than 0.5, the PRD was lower than 2%, a significant improvement in reconstruction performance over the comparative methods. Our method also showed good recovery quality on noisy data.
(This article belongs to the Topic Machine and Deep Learning)

Article
EEG-Based Schizophrenia Diagnosis through Time Series Image Conversion and Deep Learning
Electronics 2022, 11(14), 2265; https://doi.org/10.3390/electronics11142265 - 20 Jul 2022
Abstract
Schizophrenia, a mental disorder experienced by more than 20 million people worldwide, is emerging as a serious issue in society. Currently, schizophrenia is diagnosed only through assessment by a psychiatrist or mental health professional using the DSM-5, the Diagnostic and Statistical Manual of Mental Disorders. Furthermore, schizophrenia is difficult to diagnose in countries with insufficient access to healthcare, and early diagnosis is even more problematic. While various studies are being conducted to address the challenges of schizophrenia diagnosis, the methodology remains limited and diagnostic accuracy needs to be improved. In this study, a new approach using EEG data and deep learning is proposed to increase the objectivity and efficiency of schizophrenia diagnosis. Existing deep learning studies classify schizophrenic patients and healthy subjects by learning EEG in the form of graphs or tables. In this study, however, the EEG time series were converted into images to improve classification accuracy, and the images were then learned by deep learning models. We used EEG data from 81 people, for which the difference in the N100 component between schizophrenic patients and healthy subjects had been analyzed in prior research. The EEGs were converted into images using the time series image conversion algorithms Recurrence Plot (RP) and Gramian Angular Field (GAF), and the converted EEG images were used to train Convolutional Neural Network (CNN) models built on VGGNet. When the trained deep learning models were applied to the same data as in the prior research, classification accuracy improved compared with previous studies. Of the two image conversion algorithms, the deep learning model trained on GAF images showed significantly higher classification accuracy. These results suggest that using GAF images and CNN models on EEG data can be an effective way to increase the objectivity and efficiency of diagnosing various mental disorders, including schizophrenia.
(This article belongs to the Topic Machine and Deep Learning)
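
For readers unfamiliar with the conversion step, here is a minimal NumPy sketch of the Gramian Angular (summation) Field, the transform named above; it illustrates the general encoding, not the paper's exact pipeline.

```python
# A minimal sketch of the Gramian Angular Summation Field (GASF) encoding.
import numpy as np

def gramian_angular_field(x):
    # rescale the series to [-1, 1], encode values as angles, build the Gram matrix
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    return np.cos(phi[:, None] + phi[None, :])   # GASF: cos(phi_i + phi_j)

signal = np.sin(np.linspace(0, 8 * np.pi, 128))  # stand-in for one EEG channel
image = gramian_angular_field(signal)
print(image.shape)                               # (128, 128) image for a CNN
```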

Article
Machine Learning Sorting Method of Bauxite Based on SE-Enhanced Network
Appl. Sci. 2022, 12(14), 7178; https://doi.org/10.3390/app12147178 - 16 Jul 2022
Abstract
A fast and accurate bauxite recognition method combining an attention module and a clustering algorithm is proposed in this paper. By introducing the K-means clustering algorithm into the YOLOv4 network and embedding the SE attention module, we calculate appropriate anchor box values, enhance the network's ability to learn bauxite features, automatically learn the importance of different channel features, and improve the accuracy of bauxite target detection. In the experiment, 2189 bauxite photos were taken and screened as the target detection dataset, with targets divided into four categories: No. 55, No. 65, No. 70, and Nos. 72–73. Using a class-balanced dataset, the optimal YOLOv4 network model was obtained after 7000 training iterations; the average accuracy of bauxite sorting reached 99%, and the inference time was under 0.05 s. High-speed, high-precision sorting of bauxite greatly improves the mining efficiency and accuracy of the bauxite industry. At the same time, the model provides key technical support for practical applications of the same type.
(This article belongs to the Topic Machine and Deep Learning)
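
A minimal sketch of deriving anchor boxes by K-means over labeled box sizes, the role K-means plays above. It uses the common simplification of Euclidean clustering on (width, height) pairs with synthetic boxes; YOLO pipelines often use an IoU-based distance instead.

```python
# A minimal sketch of K-means anchor-box estimation for a YOLO-style detector.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# (width, height) of ground-truth boxes, normalized to the input size
boxes = rng.uniform(0.05, 0.6, size=(2189, 2))

kmeans = KMeans(n_clusters=9, n_init=10, random_state=0).fit(boxes)
# sort the nine cluster centers by area so small anchors come first
anchors = kmeans.cluster_centers_[np.argsort(kmeans.cluster_centers_.prod(axis=1))]
print(np.round(anchors, 3))   # nine (w, h) anchors ready for the detector config
```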

Article
Multiscale Dense U-Net: A Fast Correction Method for Thermal Drift Artifacts in Laboratory NanoCT Scans of Semi-Conductor Chips
Entropy 2022, 24(7), 967; https://doi.org/10.3390/e24070967 - 13 Jul 2022
Abstract
The resolution of 3D structures reconstructed by laboratory nanoCT is often affected by changes in ambient temperature. Although correction methods based on projection alignment have been widely used, they are time-consuming and complex. Especially for piecewise samples (e.g., chips), the existing methods are only semi-automatic because the projections lose attenuation information at some rotation angles. Herein, we propose a fast correction method that directly processes the reconstructed slices, thus addressing the limitations of the existing methods. The method, named multiscale dense U-Net (MD-Unet), is based on MIMO-Unet and achieves state-of-the-art artifact correction performance in nanoCT. Experiments show that MD-Unet can significantly boost correction performance (e.g., a three-orders-of-magnitude improvement in correction speed over traditional methods), and MD-Unet+ improves on MIMO-Unet by 0.92 dB on the chip dataset.
(This article belongs to the Topic Machine and Deep Learning)

Article
Domain Adaptation with Data Uncertainty Measure Based on Evidence Theory
Entropy 2022, 24(7), 966; https://doi.org/10.3390/e24070966 - 13 Jul 2022
Abstract
Domain adaptation aims to learn a classifier for a target domain task by using related labeled data from the source domain. Because the source domain data and the target domain task may be mismatched, the source domain data carry uncertainty with respect to the target domain task. Ignoring this uncertainty may lead to models with unreliable and suboptimal classification results for the target domain task. However, most previous works focus on reducing the gap in data distribution between the source and target domains; they do not consider the uncertainty of source domain data about the target domain task and cannot exploit this uncertainty to learn an adaptive classifier. To address this problem, we revisit domain adaptation from the perspective of source domain data uncertainty based on evidence theory, and thereby devise an adaptive classifier with an uncertainty measure. Based on evidence theory, we first design an evidence net to estimate the uncertainty of source domain data about the target domain task. Second, we design a general loss function with the uncertainty measure for the adaptive classifier and extend the loss function to support vector machines. Finally, numerical experiments on simulated datasets and real-world applications comprehensively demonstrate the effectiveness of the adaptive classifier with the uncertainty measure.
(This article belongs to the Topic Machine and Deep Learning)

Article
Attention-Shared Multi-Agent Actor–Critic-Based Deep Reinforcement Learning Approach for Mobile Charging Dynamic Scheduling in Wireless Rechargeable Sensor Networks
Entropy 2022, 24(7), 965; https://doi.org/10.3390/e24070965 - 12 Jul 2022
Abstract
The breakthrough of wireless energy transmission (WET) technology has greatly promoted wireless rechargeable sensor networks (WRSNs). A promising way to overcome the energy constraint problem in WRSNs is mobile charging, which employs a mobile charger to charge sensors via WET. Recently, more and more studies have been conducted on mobile charging scheduling under dynamic charging environments, but they ignore the joint optimal design of charging sequence scheduling and charging ratio control (JSSRC). This paper proposes a novel attention-shared multi-agent actor–critic-based deep reinforcement learning approach for JSSRC (AMADRL-JSSRC). In AMADRL-JSSRC, we employ two heterogeneous agents, a charging sequence scheduler and a charging ratio controller, each with an independent actor network and critic network, and we design a reward function for each by considering the tour length and the number of dead sensors. AMADRL-JSSRC trains decentralized policies in multi-agent environments, using a centralized critic network with a shared attention mechanism that selects relevant policy information for each agent at every charging decision. Simulation results demonstrate that the proposed AMADRL-JSSRC can efficiently prolong the lifetime of the network and reduce the number of dead sensors compared with the baseline algorithms.
(This article belongs to the Topic Machine and Deep Learning)

Article
Multi-Level Credit Assignment for Cooperative Multi-Agent Reinforcement Learning
Appl. Sci. 2022, 12(14), 6938; https://doi.org/10.3390/app12146938 - 08 Jul 2022
Abstract
Multi-agent reinforcement learning (MARL) has become more and more popular over recent decades, and the need for high-level cooperation is increasing every day because of the complexity of real-world environments. However, the multi-agent credit assignment problem, the main obstacle to high-level coordination, is still not addressed properly. Although many methods have been proposed, none of them performs credit assignment across multiple levels. In this paper, we propose an approach that learns a better credit assignment scheme by assigning credit across multiple levels. First, we propose a hierarchical model consisting of a manager level and a worker level: the manager level incorporates a dilated Gated Recurrent Unit (GRU) to focus on high-level plans, and the worker level uses a GRU to execute primitive actions conditioned on those plans. One centralized critic is then designed for each level to learn that level's credit assignment scheme. From these components, we construct a novel hierarchical MARL algorithm, named MLCA, which achieves multi-level credit assignment. We conduct experiments on three classical and challenging tasks to compare the proposed algorithm against three baseline methods. The results show that our method gains substantial performance improvements across all maps that require high-level cooperation.
(This article belongs to the Topic Machine and Deep Learning)

Article
We Know You Are Living in Bali: Location Prediction of Twitter Users Using BERT Language Model
Big Data Cogn. Comput. 2022, 6(3), 77; https://doi.org/10.3390/bdcc6030077 - 07 Jul 2022
Abstract
Twitter user location data provide essential information that can be used for various purposes. However, user location is not easy to identify, because many profiles omit this information or users enter data that do not correspond to their actual locations. Several related works have attempted to predict location from English-language tweets. In this study, we attempted to predict the location of Indonesian tweets. We utilized machine learning approaches, i.e., long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT), to infer Twitter users' home locations using the display name in the profile, the user description, and the user's tweets. By concatenating the display name, description, and aggregated tweets, the model achieved a best accuracy of 0.77. The IndoBERT model outperformed several baseline models.
(This article belongs to the Topic Machine and Deep Learning)
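
A minimal sketch, using the Hugging Face Transformers API, of classifying a user's location from concatenated profile fields. The checkpoint name, label count, and field separator are assumptions for illustration, not the paper's exact setup.

```python
# A minimal sketch of BERT-style location classification from profile fields.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "indobenchmark/indobert-base-p1"   # assumed IndoBERT checkpoint name
tokenizer = AutoTokenizer.from_pretrained(MODEL)
# the classification head is randomly initialized here; in practice it would be
# fine-tuned on profiles labeled with home locations (34 labels is a placeholder)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=34)

# concatenate display name, description, and an aggregated tweet (toy example)
text = " [SEP] ".join(["Made W.", "photographer in paradise", "sunset di pantai Kuta"])
inputs = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(-1))                   # index of the predicted home location
```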

Article
Optical Flow-Aware-Based Multi-Modal Fusion Network for Violence Detection
Entropy 2022, 24(7), 939; https://doi.org/10.3390/e24070939 - 06 Jul 2022
Abstract
Violence detection aims to locate violent content in video frames. Improving the accuracy of violence detection is of great importance for security. However, current methods do not make full use of multi-modal visual and audio information, which affects detection accuracy. We found that the violence detection accuracy for different kinds of videos is related to changes in optical flow. With this in mind, we propose an optical flow-aware-based multi-modal fusion network (OAMFN) for violence detection. Specifically, we use three different fusion strategies to fully integrate the multi-modal features. First, the main branch concatenates RGB features with audio features, while the optical flow branch concatenates optical flow features with RGB features and with audio features, respectively. Then, a cross-modal information fusion module integrates the features of the different combinations and weights them to capture cross-modal information in audio and video. After that, a channel attention module extracts valuable information by weighting the integrated features. Furthermore, an optical flow-aware-based score fusion strategy is introduced to fuse features of different modalities from the two branches. On the XD-Violence dataset, our multi-modal fusion network achieves an AP of 83.09% in offline detection, 1.4% higher than the state-of-the-art methods, and 78.09% in online detection, 4.42% higher than the state-of-the-art methods.
(This article belongs to the Topic Machine and Deep Learning)

Article
Automatic Rice Disease Detection and Assistance Framework Using Deep Learning and a Chatbot
Electronics 2022, 11(14), 2110; https://doi.org/10.3390/electronics11142110 - 06 Jul 2022
Abstract
Agriculture not only supplies food but is also a source of income for a vast portion of the world's population. Paddy plants produce a brown husk on top, and their seed, after de-husking and processing, yields edible rice, a major cereal crop and staple food that forms the cornerstone of food security for half the world's people. However, with increasing climate change and global warming, rice quality and production are severely degraded by common diseases of rice plants caused by bacteria and fungi (such as sheath rot, leaf blast, leaf smut, brown spot, and bacterial blight). Accurate recognition and classification of these diseases at an early stage is therefore in high demand. Hence, the present work proposes an automatic system in the form of a smartphone application (E-crop doctor) that detects diseases from paddy leaves and can also suggest pesticides to farmers. The application also has a chatbot named “docCrop”, which provides 24 × 7 support to farmers. The efficiency of the two most popular object detection algorithms for smartphone applications (YOLOv3 tiny and YOLOv4 tiny) was analysed for the detection of three diseases: brown spot, leaf blast, and hispa. The results reveal that YOLOv4 tiny achieved a mAP of 97.36%, higher than YOLOv3 tiny by a margin of 17.59%. Hence, YOLOv4 tiny was deployed in the development of the mobile application.
(This article belongs to the Topic Machine and Deep Learning)

Article
Research on Computer-Aided Diagnosis Method Based on Symptom Filtering and Weighted Network
Entropy 2022, 24(7), 931; https://doi.org/10.3390/e24070931 - 05 Jul 2022
Abstract
In disease identification, as the number of diseases increases, the collections of both diseases and symptoms become larger. However, existing computer-aided diagnosis systems do not fully address the curse of dimensionality caused by these growing data sets. To address this problem, we propose symptom filtering and a weighted network, with the goal of processing the collected symptom information more deeply. Symptom filtering is similar to a filter in signal transmission: it filters the collected symptom information, further reducing the dimensionality of the system and making important symptoms more prominent. The weighted network, in turn, mines deeper disease information by modeling the channels of symptom information, amplifying important information and suppressing unimportant information. Compared with existing hierarchical reinforcement learning models, the feature extraction methods proposed in this paper can help existing models improve their accuracy by more than 10%.
(This article belongs to the Topic Machine and Deep Learning)

Article
Topological Data Analysis Helps to Improve Accuracy of Deep Learning Models for Fake News Detection Trained on Very Small Training Sets
Big Data Cogn. Comput. 2022, 6(3), 74; https://doi.org/10.3390/bdcc6030074 - 05 Jul 2022
Abstract
Topological data analysis has recently found applications in various areas of science, such as computer vision and the study of protein folding. However, applications of topological data analysis to natural language processing remain under-researched. This study applies topological data analysis to a particular natural language processing task: fake news detection. We found that deep learning models are more accurate on this task than topological data analysis alone. However, combining a deep learning model with topological data analysis significantly improves the model's accuracy when the available training set is very small.
(This article belongs to the Topic Machine and Deep Learning)
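
A minimal sketch of the generic TDA-plus-classifier recipe, assuming the `ripser` package: compute persistence diagrams from point-cloud embeddings and summarize them as features. This illustrates the general approach, not the paper's model.

```python
# A minimal sketch of extracting topological features for a downstream classifier.
import numpy as np
from ripser import ripser   # assumes the `ripser` package is installed

rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(100, 8))       # stand-in for text embeddings

diagrams = ripser(doc_embeddings)['dgms']        # persistence diagrams (H0, H1)
h1 = diagrams[1]
lifetimes = h1[:, 1] - h1[:, 0]                  # persistence of 1-dim features
# simple vectorization: summary statistics usable as classifier inputs
features = np.array([len(h1), lifetimes.max(initial=0), lifetimes.sum()])
print(features)
```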

Article
Joint Entity and Relation Extraction Network with Enhanced Explicit and Implicit Semantic Information
Appl. Sci. 2022, 12(12), 6231; https://doi.org/10.3390/app12126231 - 19 Jun 2022
Abstract
The main purpose of joint entity and relation extraction is to extract entities from unstructured text and, at the same time, the relations between the labeled entities. At present, most existing joint entity and relation extraction networks ignore explicit semantic information and explore implicit semantic information insufficiently. In this paper, we propose a Joint Entity and Relation Extraction Network with Enhanced Explicit and Implicit Semantic Information (EINET). First, on top of a pre-trained model, we introduce explicit semantics from Semantic Role Labeling (SRL), which contains rich semantic features about entity types and relations between entities. Then, to enhance the implicit semantic information and extract richer features of the entities and the local context, we adopt separate Bi-directional Long Short-Term Memory (Bi-LSTM) networks to encode entities and local contexts. In addition, we integrate global semantic information and a local context length representation into relation extraction to further improve model performance. Our model achieves competitive results on three publicly available datasets. Compared with the baseline model on CoNLL04, EINET obtains improvements of 2.37% in F1 for named entity recognition and 3.43% in F1 for relation extraction.
(This article belongs to the Topic Machine and Deep Learning)

Article
An fMRI Sequence Representation Learning Framework for Attention Deficit Hyperactivity Disorder Classification
Appl. Sci. 2022, 12(12), 6211; https://doi.org/10.3390/app12126211 - 18 Jun 2022
Abstract
For attention deficit hyperactivity disorder (ADHD), a common neurological disease, accurate identification is the basis for treatment. In this paper, a novel end-to-end representation learning framework for ADHD classification from functional magnetic resonance imaging (fMRI) sequences is proposed. With this framework, the complexity of the sequence representation learning network decreases, the overfitting problem of deep learning in small-sample cases is addressed effectively, and superior classification performance is achieved. Specifically, a data conversion module was designed to convert a two-dimensional sequence into a three-dimensional image, which expands the modeling area and greatly reduces the computational complexity. Transfer learning was utilized to freeze or fine-tune the parameters of the pre-trained neural network to reduce the risk of overfitting with small samples. Hierarchical feature extraction is performed automatically by combining the sequence representation learning modules with a weighted cross-entropy loss. Experiments were conducted on individual imaging sites and on their combination; the average classification accuracies with the proposed framework were 73.73% and 72.02%, respectively, much higher than those of existing methods.
(This article belongs to the Topic Machine and Deep Learning)

Article
A Multi-Lingual Speech Recognition-Based Framework to Human-Drone Interaction
Electronics 2022, 11(12), 1829; https://doi.org/10.3390/electronics11121829 - 09 Jun 2022
Abstract
In recent years, human–drone interaction has received increasing interest from the scientific community. When interacting with a drone, humans assume a variety of roles, the nature of which is determined by the drone's application and degree of autonomy. Common methods of controlling drone movements include RF remote controls and ground control stations. These devices are often difficult to manipulate and may even require some training. An alternative is to use innovative methods, called natural user interfaces, that allow users to interact with drones intuitively, for example using speech. However, supporting only one interaction language may limit the number of users, especially if different languages are spoken in the same region. Moreover, environmental and propeller noise makes speech recognition complicated. The goal of this work is to use a multilingual speech recognition system covering English, Arabic, and Amazigh to control the movement of drones. These languages were selected because they are widely spoken in many regions, particularly in the Middle East and North Africa (MENA) zone. To achieve this goal, a two-stage approach is proposed. In the first stage, a deep learning-based model for multilingual speech recognition is designed; the developed model is then deployed in real settings on a quadrotor UAV. The network was trained on 38,850 recordings of commands and unknown words mixed with noise to improve robustness, achieving an average class accuracy of more than 93%. Experiments were then conducted with 16 participants giving voice commands to test the efficiency of the designed system; the achieved accuracy was about 93.76% for English and 88.55% and 82.31% for Arabic and Amazigh, respectively. Finally, the designed system was implemented in hardware on a quadrotor UAV. Real-time tests have shown the approach to be very promising as an alternative form of human–drone interaction, offering the benefit of control simplicity.
(This article belongs to the Topic Machine and Deep Learning)

Article
P2P Lending Default Prediction Based on AI and Statistical Models
Entropy 2022, 24(6), 801; https://doi.org/10.3390/e24060801 - 08 Jun 2022
Abstract
Peer-to-peer lending (P2P lending) has proliferated in recent years thanks to Fintech and big data advancements. However, P2P lending platforms are not yet tightly governed by relevant laws, as their development has far outpaced regulation. P2P lending operations are therefore still subject to risks. This paper proposes prediction models to mitigate the risks of default and asymmetric information on P2P lending platforms. Specifically, we designed careful procedures to pre-process large-scale data extracted from Lending Club for 2018 Q3–2019 Q2. Three statistical models, namely Logistic Regression, a Bayesian Classifier, and Linear Discriminant Analysis (LDA), and five AI models, namely Decision Tree, Random Forest, LightGBM, Artificial Neural Network (ANN), and Convolutional Neural Network (CNN), were then utilized for data analysis, with the loan statuses of Lending Club's customers suitably classified. To evaluate the models, we adopted confusion-matrix-based metrics, the AUC-ROC curve, the Kolmogorov–Smirnov chart (KS), and Student's t-test. Empirical studies show that LightGBM produces the best performance, being 2.91% more accurate than the other models, which translates into a revenue improvement of nearly USD 24 million for Lending Club. Student's t-test confirms that the differences between models are statistically significant.
(This article belongs to the Topic Machine and Deep Learning)
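
A minimal sketch of the LightGBM default-prediction setup described above, with synthetic data standing in for the pre-processed Lending Club table.

```python
# A minimal sketch of gradient-boosted default prediction evaluated with AUC.
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))                  # pre-processed loan features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=5000) > 0).astype(int)  # toy default flag

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = LGBMClassifier(n_estimators=300, learning_rate=0.05).fit(X_tr, y_tr)
print(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```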

Article
ConAs-GRNs: Sentiment Classification with Construction-Assisted Multi-Scale Graph Reasoning Networks
Electronics 2022, 11(12), 1825; https://doi.org/10.3390/electronics11121825 - 08 Jun 2022
Abstract
Traditional neural networks have limited capabilities in modeling the refined global and contextual semantics of emotional texts, and they usually ignore the dependencies between different emotional words. To address this limitation, this paper proposes a construction-assisted multi-scale graph reasoning network (ConAs-GRNs), which explores the details of contextual semantics as well as the emotional dependencies between emotional texts from multiple aspects by focusing on salient emotional information. In this network, an emotional construction-based multi-scale topological graph describes multiple aspects of emotional dependency, and a sentence dependency tree is used to build a relationship graph over emotional words and texts. Transfer learning and pooling learning are then performed on the topological graph. In our case, a weighted edge reduction strategy aggregates the adjacency information, which enables the internal transfer of semantic information within a single graph. Moreover, to implement the inter-graph transfer of semantic information, we rely on the construction structure to coordinate the heterogeneous graph information. Extensive experiments conducted on two baseline datasets, SemEval 2014 and ACL-14, demonstrate that the proposed ConAs-GRNs can effectively coordinate and integrate heterogeneous information from within constructions.
(This article belongs to the Topic Machine and Deep Learning)

Article
An Active Learning Algorithm Based on the Distribution Principle of Bhattacharyya Distance
Mathematics 2022, 10(11), 1927; https://doi.org/10.3390/math10111927 - 04 Jun 2022
Abstract
Active learning actively selects the most informative examples from a large pool of unlabeled samples and queries experts to label them, so as to obtain a high-precision classifier from a small number of samples. Most current research optimizes the classifier at each iteration, but the batch queried with the largest amount of information in each round does not necessarily represent the overall distribution of the samples; that is, the method may optimize locally while ignoring the whole, which can reduce accuracy. To solve this problem, a special distance measure, the Bhattacharyya distance, is used in this paper. By using this distance and designing a new set of query decision logic, we improve the accuracy of the model. Our method queries the samples that are both most representative of the distribution and most informative, realizing the classification task from a small number of samples. We provide theoretical proofs and experimental analysis. Finally, we evaluate the performance and efficiency of our algorithm on different data sets against other classification algorithms.
(This article belongs to the Topic Machine and Deep Learning)
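
For reference, a minimal NumPy sketch of the Bhattacharyya distance between two multivariate normal distributions, the quantity the query logic above is built around.

```python
# A minimal sketch of the Bhattacharyya distance between two Gaussians.
import numpy as np

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    cov = (cov1 + cov2) / 2
    diff = mu1 - mu2
    term1 = diff @ np.linalg.solve(cov, diff) / 8          # Mahalanobis-like term
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

mu1, mu2 = np.array([0.0, 0.0]), np.array([1.0, 1.0])
cov = np.eye(2)
print(bhattacharyya_gaussian(mu1, cov, mu2, cov))          # 0.25 for these parameters
```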

Article
A List-Ranking Framework Based on Linear and Non-Linear Fusion for Recommendation from Implicit Feedback
Entropy 2022, 24(6), 778; https://doi.org/10.3390/e24060778 - 31 May 2022
Abstract
Although most list-ranking frameworks are based on multilayer perceptrons (MLP), they still face two limitations in the field of recommender systems: (1) MLPs suffer from overfitting when dealing with sparse vectors, and the model tends to learn in-depth features of user–item interaction behavior while ignoring the low-rank, shallow information present in the matrix. (2) Existing ranking methods cannot effectively handle ranking between items with the same rating value or the problem of inconsistent independence in reality. We propose a list-ranking framework based on linear and non-linear fusion for recommendation from implicit feedback, named RBLF. First, the model uses dense vectors to represent users and items through one-hot encoding and embedding. Second, to jointly learn shallow and deep user–item interactions, an interaction-grabbing layer captures the interaction behavior from the dense user and item vectors. Finally, RBLF uses Bayesian collaborative ranking to better fit the characteristics of implicit feedback. Experiments show that RBLF achieves significant performance improvements.
(This article belongs to the Topic Machine and Deep Learning)
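
A minimal PyTorch sketch of a Bayesian personalized ranking (BPR) objective, a standard pairwise ranking loss for implicit feedback related to the Bayesian collaborative ranking used above; it is illustrative, not the paper's exact objective.

```python
# A minimal sketch of the BPR pairwise ranking loss for implicit feedback.
import torch
import torch.nn.functional as F

def bpr_loss(pos_scores, neg_scores):
    # push each observed (positive) item above a sampled unobserved (negative) one
    return -F.logsigmoid(pos_scores - neg_scores).mean()

pos = torch.randn(32, requires_grad=True)   # scores of interacted items
neg = torch.randn(32)                       # scores of sampled negatives
print(bpr_loss(pos, neg))
```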

Article
A Fast Multi-Scale Generative Adversarial Network for Image Compressed Sensing
Entropy 2022, 24(6), 775; https://doi.org/10.3390/e24060775 - 31 May 2022
Abstract
Recently, deep neural network-based image compressed sensing methods have achieved impressive reconstruction quality. However, these methods (1) have limitations in their sampling patterns and (2) usually suffer from high computational complexity. To this end, a fast multi-scale generative adversarial network (FMSGAN) is implemented in this paper. Specifically, (1) an effective multi-scale sampling structure is proposed. It contains four kernels of varying sizes that decompose and sample images effectively, capturing different levels of spatial features at multiple scales. (2) An efficient, lightweight multi-scale residual structure for deep image reconstruction is proposed to balance receptive field size and computational complexity. The key idea is to apply smaller convolution kernels in the multi-scale residual structure to reduce the number of operations while maintaining the receptive field, and a channel attention structure is employed to enrich useful information. Moreover, perceptual loss is combined with MSE loss and adversarial loss as the optimization objective to recover finer images. Numerous experiments show that our FMSGAN achieves state-of-the-art image reconstruction quality with low computational complexity.
(This article belongs to the Topic Machine and Deep Learning)

Article
Automatic Classification of 15 Leads ECG Signal of Myocardial Infarction Using One Dimension Convolutional Neural Network
Appl. Sci. 2022, 12(11), 5603; https://doi.org/10.3390/app12115603 - 31 May 2022
Cited by 1
Abstract
Impaired blood flow caused by coronary artery occlusion due to a thrombus can damage the heart muscle, a condition often called myocardial infarction (MI). To avoid complications of MI, such as heart failure or arrhythmias that can cause death, early diagnosis and detection are necessary. The electrocardiogram (ECG) signal is a diagnostic medium that can be used to detect acute MI, and data science is very useful for detecting MI in ECG signals. The purpose of this study is to propose an automatic classification framework for MI using 15-lead ECG signals, consisting of the 12 standard leads and 3 Frank leads. This research contributes an improvement in classification performance over 10 MI classes plus the normal class. The PTB dataset trained with the proposed 1D-CNN architecture produced average accuracy, sensitivity, specificity, precision, and F1-score of 99.98%, 99.91%, 99.99%, 99.91%, and 99.91%, respectively. From these evaluation results, it can be concluded that the proposed 1D-CNN architecture provides excellent performance in detecting MI.
(This article belongs to the Topic Machine and Deep Learning)
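
A minimal PyTorch sketch of a 1D-CNN over a 15-lead ECG window; the depths, widths, and window length are illustrative assumptions, not the paper's exact architecture.

```python
# A minimal sketch of a 1D-CNN classifier over multi-lead ECG windows.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(15, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, 11),                    # 10 MI classes + normal
)

ecg = torch.randn(4, 15, 1000)            # 4 windows, 15 leads, 1000 samples each
print(model(ecg).shape)                   # torch.Size([4, 11])
```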

Article
Specific Emitter Identification Based on Ensemble Neural Network and Signal Graph
Appl. Sci. 2022, 12(11), 5496; https://doi.org/10.3390/app12115496 - 28 May 2022
Abstract
Specific emitter identification (SEI) is a technology for extracting fingerprint features from a signal and identifying its emitter. In this paper, we propose an SEI method based on ensemble neural networks (ENN) and signal graphs, with the following innovations. First, a signal graph is used to represent signal data in a non-Euclidean space: sequence signal data are constructed into a signal graph, transforming the sequence signal from a Euclidean to a non-Euclidean space, so that the graph features (features of the non-Euclidean space) of the signal can be extracted from the signal graph. Second, the ensemble neural network integrates a graph feature extractor and a sequence feature extractor, making it possible to extract graph and sequence features simultaneously. The ensemble network also fuses the graph features with the sequence features, yielding an ensemble feature that covers both the Euclidean and non-Euclidean spaces and therefore contains more effective information for identifying the emitter. The study results demonstrate that this SEI method has higher accuracy and robustness than traditional machine learning methods and common deep learning methods.
(This article belongs to the Topic Machine and Deep Learning)

Article
GTAD: Graph and Temporal Neural Network for Multivariate Time Series Anomaly Detection
Entropy 2022, 24(6), 759; https://doi.org/10.3390/e24060759 - 27 May 2022
Cited by 1
Abstract
The rapid development of smart factories, combined with the increasing complexity of production equipment, has resulted in a large number of multivariate time series that can be recorded by sensors during the manufacturing process. Anomalous patterns of industrial production may be hidden in these time series. Previous LSTM-based and machine-learning-based approaches have made fruitful progress in anomaly detection; however, these multivariate time series anomaly detection algorithms do not take into account the correlations and time dependencies between the sequences. In this study, we propose a new algorithmic framework, the graph attention network and temporal convolutional network for multivariate time series anomaly detection (GTAD), to address this problem. Specifically, we first utilize temporal convolutional networks, including causal convolution and dilated convolution, to capture temporal dependencies, and then use graph neural networks to obtain the correlations between sensors. Finally, we conducted extensive experiments on three public benchmark datasets; the results show that the proposed method outperforms the baseline methods, achieving F1 scores higher than 95% on all datasets.
(This article belongs to the Topic Machine and Deep Learning)
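
A minimal PyTorch sketch of the dilated causal convolution mentioned above: left-only padding keeps the layer causal, so the output at time t depends only on inputs up to t. The channel count and dilation are illustrative.

```python
# A minimal sketch of a dilated causal 1-D convolution for time series.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation     # pad only on the left (past)
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                           # (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))
        return self.conv(x)                         # output at t sees only inputs <= t

series = torch.randn(2, 8, 100)                     # 2 windows, 8 sensors, 100 steps
print(CausalConv1d(8)(series).shape)                # torch.Size([2, 8, 100])
```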
Show Figures

Figure 1

Article
A Novel Hierarchical Adaptive Feature Fusion Method for Meta-Learning
Appl. Sci. 2022, 12(11), 5458; https://doi.org/10.3390/app12115458 - 27 May 2022
Abstract
Meta-learning aims to teach the machine how to learn. Embedding-model-based meta-learning performs well in solving the few-shot problem. These methods use an embedding model, usually a convolutional neural network, to extract features from samples and use a classifier to measure the features extracted at a particular stage of the embedding model. However, features at the lower stages of the embedding model contain richer visual information, while features at the higher stages contain richer semantic information. Existing methods fail to consider the impact of the information carried by features at different stages on classifier performance. Therefore, we propose a meta-learning method based on adaptive feature fusion and weight optimization. Its main innovations are as follows: firstly, a feature fusion strategy fuses the features from each stage of the embedding model according to stage-specific weights, effectively exploiting the information carried by the different stages. Secondly, a particle swarm optimization algorithm optimizes these fusion weights, determining each stage feature's contribution to the fused representation. The method outperforms current mainstream baseline methods on multiple few-shot image recognition benchmarks.
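A minimal sketch of the two ingredients named above, weighted feature fusion and particle swarm optimization of the fusion weights, follows. The objective function and all dimensions are illustrative stand-ins for the real few-shot validation error.

```python
# Hedged sketch: fuse per-stage embedding features with weights found by a
# minimal particle swarm optimizer. All names here are illustrative.
import numpy as np

def fuse(features, w):
    # features: list of (n, d) stage features; w: raw weights.
    w = np.abs(w) / (np.abs(w).sum() + 1e-12)   # normalize to a convex sum
    return sum(wi * f for wi, f in zip(w, features))

def val_error(w, features):
    # Stand-in objective: in practice, run the few-shot classifier on the
    # fused features and return its validation error.
    return float(np.linalg.norm(fuse(features, w) - features[-1]))

def pso(obj, dim, n_particles=20, iters=50, inertia=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                        # particle velocities
    pbest, pbest_val = x.copy(), np.array([obj(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()        # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = inertia * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.array([obj(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g

stages = [np.random.randn(5, 64) for _ in range(4)]   # 4 stage features
best_w = pso(lambda w: val_error(w, stages), dim=4)
print(np.round(np.abs(best_w) / np.abs(best_w).sum(), 3))
```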

Article
Surrogate Model-Based Parameter Tuning of Simulated Annealing Algorithm for the Shape Optimization of Automotive Rubber Bumpers
Appl. Sci. 2022, 12(11), 5451; https://doi.org/10.3390/app12115451 - 27 May 2022
Abstract
A design engineer has to deal with increasingly complex design tasks on a daily basis, for which the available design time is shrinking. Market competitiveness can be improved by using optimization if the design process can be automated. If there is limited information about the behavior of the objective function, global search methods such as simulated annealing (SA) should be used. This algorithm requires a number of parameters to be selected according to the task. We developed a procedure for reducing the time spent on tuning the SA algorithm for computationally expensive, simulation-driven optimization tasks. The applicability of the method was demonstrated by solving a shape optimization problem for a rubber bumper built into the air spring structures of lorries. Due to the time-consuming objective function calls, a support vector regression (SVR) surrogate model was used to test the performance of the optimization algorithm. To perform the SVR training, samples were taken using a maximin Latin hypercube design. The SA algorithm was implemented with an adaptive search space and different cooling schedules, and the SA parameters were subsequently fine-tuned using the trained SVR surrogate model. An optimal design was found using the adapted SA algorithm, with an error that is negligible from a technical standpoint.
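A compact sketch of the surrogate-assisted loop described above: sample the design space with a Latin hypercube, fit an SVR surrogate, and run simulated annealing against the cheap surrogate. The objective function here is a hypothetical stand-in for the FEM simulation, and the SA settings are illustrative.

```python
# Hedged sketch: SVR surrogate + simulated annealing on the surrogate.
import numpy as np
from scipy.stats import qmc
from sklearn.svm import SVR

def expensive_fem(x):                  # stand-in for the rubber-bumper FEM run
    return np.sum((x - 0.3) ** 2, axis=-1)

# 1) Sample the design space with a Latin hypercube design.
sampler = qmc.LatinHypercube(d=2, seed=0)
X = sampler.random(n=60)
y = expensive_fem(X)

# 2) Fit the SVR surrogate on the sampled designs.
surrogate = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)

# 3) Simulated annealing against the surrogate (cheap to evaluate).
rng = np.random.default_rng(1)
x = rng.random(2)
fx = surrogate.predict(x[None])[0]
T = 1.0
for step in range(2000):
    cand = np.clip(x + rng.normal(0, 0.1, size=2), 0, 1)
    fc = surrogate.predict(cand[None])[0]
    if fc < fx or rng.random() < np.exp((fx - fc) / T):
        x, fx = cand, fc
    T *= 0.995                          # geometric cooling schedule
print("best design:", x.round(3), "surrogate value:", round(float(fx), 4))
```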

Article
Bayesian Network Model Averaging Classifiers by Subbagging
Entropy 2022, 24(5), 743; https://doi.org/10.3390/e24050743 - 23 May 2022
Abstract
When applied to classification problems, Bayesian networks are often used to infer a class variable from given feature variables. Earlier reports have described that the classification accuracy of Bayesian network structures achieved by maximizing the marginal likelihood (ML) is lower than that achieved by maximizing the conditional log likelihood (CLL) of a class variable given the feature variables. Nevertheless, because ML has asymptotic consistency, the performance of structures learned by maximizing ML is not necessarily worse than that of structures learned by maximizing CLL for large data. However, the error of learning structures by maximizing the ML becomes much larger for small sample sizes, and that large error degrades the classification accuracy. To resolve this shortcoming, model averaging has been proposed to marginalize the class variable posterior over all structures. However, the posterior standard error of each structure in the model averaging becomes large as the sample size becomes small, which also degrades the classification accuracy. The main idea of this study is to improve the classification accuracy using subbagging, a modified bagging that uses random sampling without replacement, to reduce the posterior standard error of each structure in model averaging. Moreover, to guarantee asymptotic consistency, we use the K-best method with the ML score. The experimentally obtained results demonstrate that our proposed method provides more accurate classification than earlier Bayesian network classifier (BNC) methods and other state-of-the-art ensemble methods.
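A minimal sketch of the subbagging step, i.e., bagging with random sampling without replacement, follows. A decision tree stands in for the paper's K-best Bayesian network classifiers purely for illustration.

```python
# Hedged sketch of subbagging: subsample *without* replacement, fit one model
# per subsample, then average the class-posterior estimates.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
n, frac, n_models = len(X), 0.6, 25
models = []
for _ in range(n_models):
    idx = rng.choice(n, size=int(frac * n), replace=False)  # no replacement
    models.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))

# Average the class-posterior estimates over the ensemble.
proba = np.mean([m.predict_proba(X) for m in models], axis=0)
print("subbagging accuracy:", (proba.argmax(axis=1) == y).mean())
```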

Article
Classification of Defective Fabrics Using Capsule Networks
Appl. Sci. 2022, 12(10), 5285; https://doi.org/10.3390/app12105285 - 23 May 2022
Cited by 1
Abstract
Fabric quality plays an important role in the textile sector, and fabric defects, a highly important factor influencing fabric quality, are something researchers strive to minimize. Due to the limited capacity of human resources, human-based defect detection suffers from low performance and significant loss of time. To overcome this limitation, computer-vision-based methods have emerged, and with successive refinements, fabric defect detection methods have begun to approach near-perfect performance. Convolutional Neural Networks (CNNs) play a leading role in this success. However, CNNs lose information in the pooling process, and Capsule Networks are a useful technique for minimizing that information loss. This paper proposes Capsule Networks, a new-generation method that represents an alternative to CNNs for deep learning tasks, for fabric defect classification. The TILDA dataset is employed as source data for the training and testing phases, and the model is trained for 100, 200, and 270 epochs. Model performance is evaluated with accuracy, recall, and precision metrics. Compared with mainstream deep learning algorithms, the method offers improved accuracy, achieving 98.7% under different experimental settings. The main contributions of this study are the application of Capsule Networks to the fabric defect detection domain and the significant performance obtained.
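At the core of capsule networks is the squash non-linearity, which bounds the length of a capsule's output vector while preserving its orientation, the property that lets capsules keep information that pooling would discard. A minimal sketch (after Sabour et al.'s formulation) follows.

```python
# Hedged sketch of the capsule "squash" non-linearity.
import torch

def squash(s, dim=-1, eps=1e-8):
    # v = (|s|^2 / (1 + |s|^2)) * (s / |s|): short vectors shrink toward 0,
    # long vectors saturate toward unit length, direction is preserved.
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

caps = torch.randn(32, 10, 16)       # (batch, n_capsules, capsule_dim)
v = squash(caps)
print(v.norm(dim=-1).max())          # all output lengths are below 1
```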

Review
Negation and Speculation in NLP: A Survey, Corpora, Methods, and Applications
Appl. Sci. 2022, 12(10), 5209; https://doi.org/10.3390/app12105209 - 21 May 2022
Abstract
Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as those for opinion mining and information retrieval, especially in biomedical data. In this article, we review the corpora annotated with negation and speculation in various natural languages and domains. Furthermore, we discuss ongoing research into recent rule-based, supervised, and transfer learning techniques for the detection of negated and speculative content. Many English corpora for various domains are now annotated with negation and speculation, and the availability of annotated corpora in other languages has started to increase. However, this growth is insufficient to address these important phenomena in languages with limited resources; the use of cross-lingual models and translation from well-resourced languages are acceptable alternatives. We also highlight the lack of consistent annotation guidelines and the shortcomings of existing techniques, and suggest alternatives that may speed up progress in this research direction. Adding more syntactic features may alleviate the limitations of existing techniques, such as cue ambiguity and the detection of discontinuous scopes. In some NLP applications, including a system that is negation- and speculation-aware improves performance, yet this aspect is still not addressed or is not considered an essential step.

Article
A Discriminative-Based Geometric Deep Learning Model for Cross Domain Recommender Systems
Appl. Sci. 2022, 12(10), 5202; https://doi.org/10.3390/app12105202 - 20 May 2022
Cited by 1
Abstract
Recommender systems (RS) have been widely deployed in many real-world applications but usually suffer from the long-standing user/item cold-start problem. As a promising approach, cross-domain recommendation (CDR), which has attracted a surge of interest, aims to transfer the user preferences observed in a source domain to make recommendations in a target domain. Traditional machine learning and deep learning methods are not designed to learn from complex data representations such as graphs, manifolds, and 3D objects, yet current trends in data generation include exactly these representations. In addition, existing research does not consider the complex dimensions and the locality structure of items, which contain discriminative information essential for improving recommendation accuracy. Furthermore, the similarity between test samples and their neighboring training data in the kernel space is not fully exploited across recommended objects of the same category, so the embedded discriminative information is not captured effectively. These challenges leave the sparsity and cold-start problems of items/users unsolved, impeding the performance of cross-domain recommender systems and causing them to suggest less relevant and undistinguished items. To handle these challenges, we propose a novel deep learning (DL) method, Discriminative Geometric Deep Learning (D-GDL), for cross-domain recommender systems. In the proposed D-GDL, a discriminative function based on sparse local sensitivity is introduced into the DL network, and a local representation learner (a local-sensitivity-based deep convolutional belief network) captures the local geometric and visual information of the recommended 3D objects. A kernel-based method (a local-sensitivity deep belief network) is also incorporated into the DL framework to map the complex structure of recommended objects into a high-dimensional feature space and achieve effective recognition. An improved kernel density estimator serves as the weighting function in building this feature space, making the method more resistant to geometric noise and improving computational performance. The experimental results show that the proposed D-GDL significantly outperforms state-of-the-art methods in both sparse and dense settings for cross-domain recommendation tasks.

Article
Novel Exploit Feature-Map-Based Detection of Adversarial Attacks
Appl. Sci. 2022, 12(10), 5161; https://doi.org/10.3390/app12105161 - 20 May 2022
Abstract
In machine learning (ML), adversarial attacks (targeted or untargeted) in the presence of noise disturb model predictions. This research suggests that adversarial perturbations on images lead to noise in the features constructed by any network; as a result, adversarial attacks against image classification systems present both obstacles and opportunities for studying convolutional neural networks (CNNs). Motivated by this observation, we developed a novel exploit feature map that characterizes adversarial attacks through per-object feature-map visual descriptions. Specifically, a novel detection algorithm calculates each object's class activation map weights and builds a combined activation map. When evaluated with different networks such as VGGNet19 and ResNet50, in both white-box and black-box attack settings, the exploit feature map significantly improves the state of the art in adversarial resilience. Furthermore, it clearly exposes attacks on ImageNet under algorithms such as the Fast Gradient Sign Method (FGSM), DeepFool, Projected Gradient Descent (PGD), and Backward Pass Differentiable Approximation (BPDA).
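The detection algorithm builds on class activation maps. The following sketch computes a plain CAM with torchvision's ResNet50 as an example backbone; the per-object combination rule is not reproduced here, and all settings are illustrative.

```python
# Hedged sketch of a class activation map (CAM), the building block the
# detection algorithm combines per object.
import torch
from torchvision.models import resnet50

model = resnet50(weights=None).eval()      # example backbone, random weights
feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(last=o))

x = torch.randn(1, 3, 224, 224)            # stand-in input image
with torch.no_grad():
    logits = model(x)
cls = logits.argmax(dim=1).item()

# CAM: weight the last conv feature maps by the FC weights of the class.
w = model.fc.weight[cls]                               # (2048,)
cam = torch.einsum("c,chw->hw", w, feats["last"][0])   # (7, 7)
cam = torch.relu(cam)
cam = cam / (cam.max() + 1e-8)             # normalize to [0, 1]
print(cam.shape)
```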

Article
Random Noise vs. State-of-the-Art Probabilistic Forecasting Methods: A Case Study on CRPS-Sum Discrimination Ability
Appl. Sci. 2022, 12(10), 5104; https://doi.org/10.3390/app12105104 - 19 May 2022
Abstract
Recent developments in the machine learning domain have enabled the development of complex multivariate probabilistic forecasting models, and evaluating the predictive power of these complex methods requires a precise evaluation method. Several evaluation metrics have been proposed in the past (such as the energy score, the Dawid–Sebastiani score, and the variogram score); however, these cannot reliably measure the performance of a probabilistic forecaster. Recently, CRPS-Sum has gained prominence as a reliable metric for multivariate probabilistic forecasting. This paper presents a systematic evaluation of CRPS-Sum to understand its discrimination ability. We show that the statistical properties of the target data affect the discrimination ability of CRPS-Sum. Furthermore, we highlight that the CRPS-Sum calculation overlooks the model's performance on each individual dimension. These flaws can lead to an incorrect assessment of model performance. Finally, with experiments on real-world datasets, we demonstrate that the shortcomings of CRPS-Sum give a misleading indication of a method's probabilistic forecasting performance: a dummy model that amounts to random noise can easily achieve a better CRPS-Sum than a state-of-the-art method.
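For reference, here is a minimal sketch of the sample-based CRPS estimator and a simplified, unnormalized CRPS-Sum. Because the metric is computed on the series summed across dimensions, per-dimension performance can be washed out, which is the flaw highlighted above.

```python
# Hedged sketch of CRPS from forecast samples and a simplified CRPS-Sum.
import numpy as np

def crps_samples(samples, y):
    # CRPS(F, y) ~ E|X - y| - 0.5 * E|X - X'|, with X, X' ~ F.
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

def crps_sum(sample_paths, y_true):
    # sample_paths: (n_samples, time, dim); y_true: (time, dim).
    agg_samples = sample_paths.sum(axis=-1)   # sum over dimensions first
    agg_true = y_true.sum(axis=-1)
    return np.mean([crps_samples(agg_samples[:, t], agg_true[t])
                    for t in range(agg_true.shape[0])])

rng = np.random.default_rng(0)
paths = rng.normal(0, 1, size=(200, 24, 5))   # forecast sample paths
truth = rng.normal(0, 1, size=(24, 5))
print(round(crps_sum(paths, truth), 4))
```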

Article
A Triple Relation Network for Joint Entity and Relation Extraction
Electronics 2022, 11(10), 1535; https://doi.org/10.3390/electronics11101535 - 11 May 2022
Abstract
Recent methods for extracting relational triples mainly focus on the overlapping problem and achieve considerable performance. Most previous approaches extract triples solely conditioned on context words but ignore the potential relations among the extracted entities, which causes incompleteness in subsequent Knowledge Graph (KG) construction. Since relevant triples give a clue for establishing implicit connections among entities, we propose a Triple Relation Network (Trn) to jointly extract triples, with particular attention to extracting implicit triples. Specifically, we design an attention-based entity pair encoding module to identify all normal entity pairs directly. To construct implicit connections among the extracted entities, our triple reasoning module calculates the relevance between pairs of triples. We then select the top-K relevant triple pairs and transform them into implicit entity pairs to predict the corresponding implicit relations. A bipartite matching objective matches normal triples and implicit triples with the corresponding labels. Extensive experiments on two public benchmarks demonstrate the effectiveness of the proposed method, and our model significantly outperforms previous strong baselines.

Article
A Novel Two-Stage Deep Learning Structure for Network Flow Anomaly Detection
Electronics 2022, 11(10), 1531; https://doi.org/10.3390/electronics11101531 - 11 May 2022
Cited by 1
Abstract
Unknown cyber-attacks appear constantly, and several anomaly detection techniques based on semi-supervised learning have been proposed to detect them. Among these, the Denoising Auto-Encoder (DAE) scheme performs better than others in accuracy but falls short in precision. This paper proposes a novel two-stage deep learning structure for network flow anomaly detection that combines a Gated Recurrent Unit (GRU) model with a DAE. By using supervised anomaly detection with a selection mechanism to assist semi-supervised anomaly detection, both the precision and the accuracy of the anomaly detection system are improved. In the proposed structure, the GRU model first analyzes the network flow, and the outcome of its Softmax function is taken as a confidence score. When the score is greater than or equal to a predefined confidence threshold, the GRU model outputs the flow as a positive result, whether the flow is classified as normal or abnormal. When the score is below the threshold, the GRU model outputs the flow as a negative result and passes it to the DAE model for classification. The DAE determines a reconstruction error threshold by learning the pattern of normal flows; a flow is then judged normal or abnormal depending on whether its reconstruction error falls under or over this threshold. A comparative experiment was performed using the NSL-KDD dataset as a benchmark. The results reveal that the precision of the proposed scheme is 0.83% better than that of the DAE, and its accuracy is 90.21%, better than Random Forest, Naïve Bayes, a one-dimensional Convolutional Neural Network, a two-stage Auto-Encoder, and others. In addition, the proposed approach was applied to a software-defined network (SDN) environment, where it significantly improves precision and F-measure.
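A minimal sketch of the two-stage decision rule described above, with the trained GRU and DAE replaced by toy callables; the thresholds are illustrative.

```python
# Hedged sketch: trust the supervised GRU when its softmax confidence is
# high; otherwise defer to the DAE's reconstruction-error test.
import numpy as np

def two_stage_detect(flow, gru_predict, dae_reconstruct,
                     conf_threshold=0.9, recon_threshold=0.05):
    probs = gru_predict(flow)                 # softmax over {normal, attack}
    if probs.max() >= conf_threshold:
        return "attack" if probs.argmax() == 1 else "normal"
    # Low confidence: fall back to the semi-supervised DAE stage.
    recon = dae_reconstruct(flow)
    error = float(np.mean((flow - recon) ** 2))
    return "attack" if error > recon_threshold else "normal"

# Toy stand-ins for the trained models:
gru_predict = lambda f: np.array([0.55, 0.45])     # uncertain GRU output
dae_reconstruct = lambda f: f + 0.3                # poor reconstruction
print(two_stage_detect(np.zeros(41), gru_predict, dae_reconstruct))  # attack
```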

Article
STAGCN: Spatial–Temporal Attention Graph Convolution Network for Traffic Forecasting
Mathematics 2022, 10(9), 1599; https://doi.org/10.3390/math10091599 - 08 May 2022
Abstract
Traffic forecasting plays an important role in intelligent transportation systems. However, the prediction task is highly challenging due to the mixture of global and local spatiotemporal dependencies involved in traffic data. Existing graph neural networks (GNNs) typically capture spatial dependencies with a predefined or learnable static graph structure, ignoring the hidden dynamic patterns in traffic networks. Meanwhile, most recurrent neural networks (RNNs) and convolutional neural networks (CNNs) cannot effectively capture temporal correlations, especially long-term temporal dependencies. In this paper, we propose a spatial–temporal attention graph convolution network (STAGCN), which acquires a static graph and a dynamic graph from the data without any prior knowledge. The static graph models global spatial adaptability, while the dynamic graph captures local dynamics in the traffic network. A gated temporal attention module is further introduced for long-term temporal dependencies, with a causal-trend attention mechanism proposed to increase awareness of causality and local trends in the time series. Extensive experiments on four real-world traffic flow datasets demonstrate that STAGCN achieves a substantial prediction accuracy improvement over existing solutions.
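The abstract does not spell out how the static graph is acquired. One common construction for a learnable adjacency, sketched below with illustrative dimensions, derives it from trainable node embeddings (as in Graph WaveNet-style models).

```python
# Hedged sketch of learning a graph from data via node embeddings; details
# here are assumptions, not STAGCN's exact formulation.
import torch
import torch.nn as nn

class AdaptiveAdjacency(nn.Module):
    def __init__(self, n_nodes, emb_dim=10):
        super().__init__()
        self.e1 = nn.Parameter(torch.randn(n_nodes, emb_dim))
        self.e2 = nn.Parameter(torch.randn(n_nodes, emb_dim))

    def forward(self):
        # Dense similarity, sparsified by ReLU, normalized row-wise.
        logits = torch.relu(self.e1 @ self.e2.t())
        return torch.softmax(logits, dim=1)     # (n_nodes, n_nodes)

adj = AdaptiveAdjacency(n_nodes=207)()          # e.g., 207 traffic sensors
x = torch.randn(32, 207, 64)                    # (batch, nodes, features)
out = torch.einsum("nm,bmf->bnf", adj, x)       # one graph convolution step
print(out.shape)                                # torch.Size([32, 207, 64])
```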

Article
A Novel Anti-Risk Method for Portfolio Trading Using Deep Reinforcement Learning
Electronics 2022, 11(9), 1506; https://doi.org/10.3390/electronics11091506 - 07 May 2022
Cited by 1
Abstract
In the past decade, the application of deep reinforcement learning (DRL) to portfolio management has attracted extensive attention. However, most classical RL algorithms do not consider the exogenous factors and noise of financial time series data, which may lead to treacherous trading decisions. To address this issue, we propose a novel anti-risk portfolio trading method based on DRL. It consists of a stacked sparse denoising autoencoder (SSDAE) network and an actor–critic based reinforcement learning (RL) agent. The SSDAE is trained off-line first, and the decoder is then used for on-line feature extraction in each state; the SSDAE network provides noise-resistance training for the financial data. The actor–critic algorithm we use is advantage actor–critic (A2C) and consists of two networks: the actor network learns and implements an investment policy, which is then evaluated by the critic network to determine the best action plan, continuously redistributing the portfolio assets with the Sharpe ratio as the optimization objective. Extensive experiments show that our proposed method is effective and superior to the Dow Jones Industrial Average index (DJIA), several variants of our method, and a state-of-the-art (SOTA) method.
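A minimal sketch of a Sharpe-ratio-based reward of the kind such an agent could optimize; the window length and risk-free rate are assumptions.

```python
# Hedged sketch: reward the risk-adjusted return of the portfolio over a
# trailing window of portfolio values.
import numpy as np

def sharpe_reward(portfolio_values, window=30, risk_free=0.0, eps=1e-8):
    # Simple returns over the trailing window of portfolio values.
    v = np.asarray(portfolio_values[-(window + 1):], dtype=float)
    rets = np.diff(v) / v[:-1]
    excess = rets - risk_free / 252          # daily risk-free rate
    return float(np.mean(excess) / (np.std(excess) + eps))

values = 100 * np.cumprod(1 + np.random.default_rng(0).normal(5e-4, 0.01, 250))
print(round(sharpe_reward(values), 3))
```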

Article
Multi-Scale Upsampling GAN Based Hole-Filling Framework for High-Quality 3D Cultural Heritage Artifacts
Appl. Sci. 2022, 12(9), 4581; https://doi.org/10.3390/app12094581 - 30 Apr 2022
Abstract
With the rapid development of 3D scanners, cultural heritage artifacts can be stored as point clouds and displayed through the Internet. However, due to natural and human factors, many cultural relics had surface damage when excavated, and the holes caused by this damage persist in the resulting point cloud models. This work proposes a multi-scale upsampling GAN (MU-GAN)-based framework for completing these holes. Firstly, a 3D mesh model is reconstructed from the original point cloud, and a method for detecting holes is presented. Secondly, a point cloud patch containing the hole region is extracted from the point cloud and fed into the MU-GAN to generate a high-quality dense point cloud. Finally, the empty areas of the original point cloud are filled with the generated dense point cloud patches. A series of experiments on real scan data demonstrates that the proposed framework can fill the holes of 3D heritage models with fine-grained details. We hope that our work can provide a useful tool for cultural heritage protection.

Article
Visual and Phonological Feature Enhanced Siamese BERT for Chinese Spelling Error Correction
Appl. Sci. 2022, 12(9), 4578; https://doi.org/10.3390/app12094578 - 30 Apr 2022
Abstract
Chinese Spelling Check (CSC) aims to detect and correct spelling errors in Chinese. Most CSC models rely on human-defined confusion sets to narrow the search space and fail to resolve errors outside the confusion set. Moreover, most spelling errors in current benchmark datasets are character pairs with similar pronunciations; errors with similar shapes, and errors that are visually and phonologically irrelevant, are not considered. Furthermore, the automatically generated training data widely used in CSC tasks leads to label leakage and unfair comparisons between methods. In this work, we propose a visually and phonologically feature-enhanced siamese BERT to (1) correct spelling errors without using confusion sets; (2) integrate phonological and visual features for CSC via a glyph graph; and (3) improve performance on unseen spelling errors. To evaluate CSC methods fairly and comprehensively, we build a large-scale CSC dataset in which each error type contains the same number of samples. The experimental results show that the proposed approach achieves better performance than previous state-of-the-art methods on three benchmark datasets and on the new error-type-balanced dataset.

Review
A Systematic Literature Review of Learning-Based Traffic Accident Prediction Models Based on Heterogeneous Sources
Appl. Sci. 2022, 12(9), 4529; https://doi.org/10.3390/app12094529 - 29 Apr 2022
Abstract
Statistics affirm that almost half of all traffic accident deaths are vulnerable road users, such as pedestrians, cyclists, and motorcyclists. Despite the efforts in technological infrastructure and traffic policies, the number of victims remains high and beyond expectation. Recent research establishes that determining the causes of traffic accidents is not an easy task, because their occurrence depends on one or many factors: mechanical problems, adverse weather conditions, mental and physical fatigue, negligence, and potholes in the road, among others. At present, learning-based prediction models are in real use as mechanisms to reduce the number of traffic accidents, and their success depends mainly on how data from different sources can be integrated and correlated. This study reports models, algorithms, data sources, attributes, data collection services, driving simulators, evaluation metrics, percentages of data for training/validation/testing, and other aspects of the literature. We found that the performance of a prediction model depends mainly on the quality of its data and on a proper data split configuration, and that the use of real data predominates over data generated by simulators. This work also establishes that future research should develop traffic accident prediction models that use deep learning, explore data sources such as driver data and light conditions, and solve issues specific to this type of solution, such as high data dimensionality and information imbalance.

Article
Credit Card Fraud Detection Using a New Hybrid Machine Learning Architecture
Mathematics 2022, 10(9), 1480; https://doi.org/10.3390/math10091480 - 28 Apr 2022
Abstract
The negative effect of financial crimes on financial institutions has grown dramatically over the years. To detect crimes such as credit card fraud, several single and hybrid machine learning approaches have been used. However, these approaches have significant limitations, as different hybrid algorithms have not been investigated further on a given dataset. This research proposes and investigates seven hybrid machine learning models to detect fraudulent activities on a real-world dataset. The developed hybrid models consist of two phases: state-of-the-art machine learning algorithms are first used to detect credit card fraud; hybrid methods are then constructed based on the best single algorithm from the first phase. Our findings indicate that the hybrid model AdaBoost + LGBM is the champion model, as it displays the highest performance. Future studies should examine different types of hybridization and algorithms in the credit card domain.
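The paper does not detail its hybridization scheme; one plausible construction, sketched below, boosts a small LightGBM base learner with AdaBoost. This assumes scikit-learn >= 1.2 (for the `estimator` argument) and the lightgbm package, and the dataset is a synthetic stand-in for real fraud data.

```python
# Hedged sketch of one way to hybridize AdaBoost with LightGBM.
from lightgbm import LGBMClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Imbalanced toy data standing in for a real credit card fraud dataset.
X, y = make_classification(n_samples=5000, weights=[0.98], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

hybrid = AdaBoostClassifier(
    estimator=LGBMClassifier(n_estimators=25, verbose=-1),  # base learner
    n_estimators=10, random_state=0)
hybrid.fit(X_tr, y_tr)
print(classification_report(y_te, hybrid.predict(X_te), digits=3))
```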

Article
Belief Entropy Tree and Random Forest: Learning from Data with Continuous Attributes and Evidential Labels
Entropy 2022, 24(5), 605; https://doi.org/10.3390/e24050605 - 26 Apr 2022
Cited by 1
Abstract
As well-known machine learning methods, decision trees are widely applied to classification and recognition problems. In this paper, with the uncertainty of labels handled by belief functions, a new decision tree method based on belief entropy is proposed and then extended to a random forest. Using a Gaussian mixture model, the tree method is able to deal with continuous attribute values directly, without a discretization pretreatment. Specifically, the method adopts belief entropy, an uncertainty measure based on the basic belief assignment, as a new attribute selection tool. To improve the classification performance, we construct a random forest based on these trees and discuss different prediction combination strategies. Numerical experiments on UCI machine learning datasets indicate the good classification accuracy of the proposed method in different situations, especially on data with high uncertainty.
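A minimal sketch of belief entropy (Deng entropy), the attribute selection measure named above: for a basic belief assignment m, E_d(m) = -sum_A m(A) log2(m(A) / (2^|A| - 1)).

```python
# Hedged sketch of belief (Deng) entropy for a basic belief assignment.
import math

def deng_entropy(bba):
    # bba: dict mapping a focal element (frozenset of labels) to its mass.
    total = 0.0
    for focal, mass in bba.items():
        if mass > 0:
            total -= mass * math.log2(mass / (2 ** len(focal) - 1))
    return total

# A singleton-only BBA reduces to Shannon entropy ...
print(deng_entropy({frozenset("a"): 0.5, frozenset("b"): 0.5}))   # 1.0
# ... while mass on larger focal sets raises the measured uncertainty.
print(deng_entropy({frozenset("ab"): 0.5, frozenset("c"): 0.5}))  # ~1.79
```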

Article
Research on Modulation Signal Recognition Based on CLDNN Network
Electronics 2022, 11(9), 1379; https://doi.org/10.3390/electronics11091379 - 26 Apr 2022
Abstract
Modulated signal recognition and classification occupy an important position in electronic information warfare, intelligent wireless communication, and fast modulation and demodulation. To address the shortcomings of existing recognition methods, such as heavy manual involvement, few recognized signal types, and a low recognition rate under a low signal-to-noise ratio, we propose an attention-mechanism short-link convolutional long short-term memory deep neural network (ASCLDNN) recognition model. The network is optimized for modulated signal recognition and incorporates an attention mechanism that achieves higher accuracy by weighting important signals. The experimental results show that ASCLDNN can recognize 11 signal modulations with high accuracy at a low signal-to-noise ratio and without confusion for specific signals.

Communication
Learning Local Distribution for Extremely Efficient Single-Image Super-Resolution
Electronics 2022, 11(9), 1348; https://doi.org/10.3390/electronics11091348 - 24 Apr 2022
Abstract
Achieving a balance between efficiency and performance is a key problem for convolutional neural network (CNN)-based single-image super-resolution (SISR) algorithms. Existing methods tend to directly output high-resolution (HR) pixels or residuals to reconstruct the HR image and devote much attention to designing powerful CNN backbones. However, this reconstruction approach requires the CNN backbone to fit the mapping from low-resolution (LR) pixels to HR pixels well, which holds these methods back from achieving extreme efficiency and from working in embedded environments. In this work, we propose a novel distribution learning architecture that estimates the local distribution and reconstructs HR pixels by sampling the local distribution with the corresponding 2D coordinates. We also improve the backbone structure to better support the proposed distribution learning architecture. The experimental results demonstrate that the proposed method achieves state-of-the-art performance for extremely efficient SISR and exhibits a good balance between efficiency and performance.

Article
Investigating How Reproducibility and Geometrical Representation in UMAP Dimensionality Reduction Impact the Stratification of Breast Cancer Tumors
Appl. Sci. 2022, 12(9), 4247; https://doi.org/10.3390/app12094247 - 22 Apr 2022
Abstract
Advances in next-generation sequencing have provided high-dimensional RNA-seq datasets, allowing the stratification of some tumor patients based on their transcriptomic profiles. Machine learning methods have been used to reduce and cluster high-dimensional data. Recently, uniform manifold approximation and projection (UMAP) was applied to project genomic datasets into a low-dimensional Euclidean latent space. Here, we evaluated how different representations of the UMAP embedding can impact the analysis of breast cancer (BC) stratification. We projected BC RNA-seq data onto Euclidean, spherical, and hyperbolic spaces and stratified BC patients via clustering algorithms. We also propose a pipeline that yields more reproducible clustering outputs. The results show that the choice of latent space can affect downstream stratification results, and suggest exploring different geometrical representations to probe the data structure and the samples' relationships.
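A minimal sketch of the projection-and-clustering step with umap-learn follows. The output_metric values for spherical and hyperbolic embeddings follow the umap-learn documentation on non-Euclidean embeddings; the data and cluster count are illustrative stand-ins.

```python
# Hedged sketch: project expression data into different output geometries
# with UMAP, then cluster each embedding. Data here is synthetic.
import numpy as np
import umap
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2000))            # stand-in for RNA-seq profiles

embeddings = {
    "euclidean": umap.UMAP(n_components=2, random_state=0).fit_transform(X),
    # Output metrics for spherical / hyperbolic latent spaces, per the
    # umap-learn documentation on embedding to non-Euclidean spaces.
    "spherical": umap.UMAP(output_metric="haversine",
                           random_state=0).fit_transform(X),
    "hyperbolic": umap.UMAP(output_metric="hyperboloid",
                            random_state=0).fit_transform(X),
}
for space, emb in embeddings.items():
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(emb)
    print(space, "cluster sizes:", np.bincount(labels))
```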

Article
Semi-Supervised Cross-Subject Emotion Recognition Based on Stacked Denoising Autoencoder Architecture Using a Fusion of Multi-Modal Physiological Signals
Entropy 2022, 24(5), 577; https://doi.org/10.3390/e24050577 - 20 Apr 2022
Abstract
In recent decades, emotion recognition has received considerable attention. As enthusiasm has shifted toward physiological patterns, a wide range of elaborate physiological features for emotion data have emerged and been combined with various classification models to detect emotional states. To circumvent the labor of manually designing features, we propose to learn affective and robust representations automatically through a Stacked Denoising Autoencoder (SDA) architecture with unsupervised pre-training followed by supervised fine-tuning. In this paper, we compare the performance of different features and models on three binary classification tasks based on the Valence-Arousal-Dominance (VAD) affection model. Decision fusion and feature fusion of electroencephalogram (EEG) and peripheral signals are performed on hand-engineered features, while data-level fusion is performed for the deep learning methods. The fused data turn out to perform better than either single modality. To exploit the deep learning algorithms, we augment the original data and feed it directly into our training model, using two deep architectures and a generative stacked semi-supervised architecture as references for comparison. The results reveal that our scheme slightly outperforms the other three deep feature extractors and surpasses the state of the art for hand-engineered features.

Article
Arabic Language Opinion Mining Based on Long Short-Term Memory (LSTM)
Appl. Sci. 2022, 12(9), 4140; https://doi.org/10.3390/app12094140 - 20 Apr 2022
Cited by 1
Abstract
Arabic is one of the official languages recognized by the United Nations (UN) and is widely used in the Middle East and parts of Asia and Africa, among other regions. Social media activity currently dominates textual communication on the Internet and potentially represents people's views about specific issues. Opinion mining is an important task for understanding the polarity of public opinion towards an issue, and understanding public opinion leads to better decisions in many fields, such as public services and business. Language background plays a vital role in understanding opinion polarity; variation comes not only from the vocabulary but also from the cultural background. A sentence is a sequential signal, so word order carries significant information about the meaning of the text. A recurrent neural network (RNN) is a deep learning variant in which the sequence is taken into account, and long short-term memory (LSTM) is an RNN implementation with particular gates to keep or discard specific word signals during a sequence of inputs. Text is unstructured data and cannot be processed by a machine unless an algorithm transforms it into a machine-readable format, i.e., a vector of numerical values. Transformation algorithms range from the Term Frequency–Inverse Document Frequency (TF-IDF) transform to advanced word embeddings such as GloVe, word2vec, BERT, and fastText. This research experiments with these algorithms to vectorize an Arabic text dataset: it implements and compares the GloVe and fastText word embeddings together with LSTMs in single-, double-, and triple-layer architectures, and compares their opinion mining accuracy on the ASAD dataset of 55,000 annotated tweets in three classes. The dataset was augmented to achieve equal proportions of positive, negative, and neutral classes. According to the evaluation results, the triple-layer LSTM with fastText word embedding achieved the best testing accuracy, at 90.9%, surpassing all other experimental scenarios.
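A minimal Keras sketch of a triple-layer LSTM classifier of the kind that performed best above. The vocabulary size, dimensions, and sequence length are illustrative, and the embedding weights would in practice be pre-trained fastText vectors.

```python
# Hedged sketch of a triple-layer LSTM text classifier in Keras.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB, EMB_DIM, SEQ_LEN = 50_000, 300, 60   # illustrative settings

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    # In the best-performing setup the weights would be pre-trained
    # fastText vectors, passed via the `weights=` argument.
    layers.Embedding(VOCAB, EMB_DIM),
    layers.LSTM(128, return_sequences=True),   # layer 1
    layers.LSTM(128, return_sequences=True),   # layer 2
    layers.LSTM(128),                          # layer 3
    layers.Dense(3, activation="softmax"),     # positive / negative / neutral
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```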

Article
BTENet: Back-Fat Thickness Estimation Network for Automated Grading of the Korean Commercial Pig
Electronics 2022, 11(9), 1296; https://doi.org/10.3390/electronics11091296 - 19 Apr 2022
Abstract
For the automated grading of Korean commercial pigs, we propose a deep neural network called the back-fat thickness estimation network (BTENet). The proposed BTENet contains segmentation and thickness estimation modules that simultaneously perform back-fat area segmentation and thickness estimation. The segmentation module estimates a back-fat area mask from an input image; using both the input image and the estimated mask, the thickness estimation module predicts the real back-fat thickness in millimeters by effectively analyzing the back-fat area. To train BTENet, we also built a large-scale pig image dataset called PigBT. Experimental results validate that the proposed BTENet achieves reliable thickness estimation (Pearson's correlation coefficient: 0.915; mean absolute error: 1.275 mm; mean absolute percentage error: 6.4%). We therefore expect BTENet to accelerate a new phase of automated grading systems for the Korean commercial pig.

Article
Multiview Clustering of Adaptive Sparse Representation Based on Coupled P Systems
Entropy 2022, 24(4), 568; https://doi.org/10.3390/e24040568 - 18 Apr 2022
Abstract
Multiview clustering (MVC) has become a significant technique for handling data mining problems. Most existing studies on this topic adopt a fixed number of neighbors when constructing the similarity matrix of each view, as in single-view clustering; however, this may reduce the clustering effect due to the diversity of multiview data sources. Moreover, most MVC methods utilize iterative optimization to obtain clustering results, which consumes a significant amount of time. Therefore, this paper proposes a multiview clustering of adaptive sparse representation based on coupled P systems (MVCS-CP), which requires no iteration; the whole algorithm runs within the coupled P system. Firstly, the parameter-free natural neighbor search algorithm automatically determines the number of neighbors for each view. In turn, manifold learning and sparse representation are employed to construct the similarity matrix, which preserves the internal geometry of the views. Next, a soft thresholding operator is introduced to form the unified graph that yields the clustering results. Experimental results on nine real datasets indicate that MVCS-CP outperforms other state-of-the-art comparison algorithms.

Article
Fault Diagnosis of Induction Motors with Imbalanced Data Using Deep Convolutional Generative Adversarial Network
Appl. Sci. 2022, 12(8), 4080; https://doi.org/10.3390/app12084080 - 18 Apr 2022
Cited by 1
Abstract
A homemade defective model of an induction motor was created by the laboratory team to acquire vibration acceleration signals for five operating states of an induction motor under different loads. Two major learning models, a deep convolutional generative adversarial network (DCGAN) and a convolutional neural network, were applied to fault diagnosis of the induction motor under the problem of an imbalanced training dataset. Two settings were studied and analyzed: a sufficient and balanced training dataset, and an insufficient and imbalanced one. When the training dataset was adequate and balanced, time–frequency analysis was advantageous for fault diagnosis at different loads, with the diagnostic accuracy reaching 95.06% and 96.38%. For the insufficient and imbalanced training dataset, regardless of the signal preprocessing method, the more imbalanced the training dataset, the lower the diagnostic accuracy on the testing dataset. Samples generated by the DCGAN were found, through comparison, to exhibit 80% similarity with the actual data. By oversampling the imbalanced dataset, the DCGAN achieved a 90% diagnostic accuracy, close to the accuracy achieved with a balanced dataset. Among all oversampling techniques, the pro-balanced method yielded the optimal result. The diagnostic accuracy reached 85% in the cross-load test, indicating that the generated data had successfully learned the different fault features and validating the DCGAN's ability to learn parts of the input signals.

Article
MSPNet: Multi-Scale Strip Pooling Network for Road Extraction from Remote Sensing Images
Appl. Sci. 2022, 12(8), 4068; https://doi.org/10.3390/app12084068 - 18 Apr 2022
Abstract
Extracting roads from remote sensing images can support a range of geo-information applications. However, the task is challenging due to factors such as the complex distribution of ground objects and occlusion by buildings, trees, shadows, etc. Pixel-wise classification often fails to predict road connectivity and thus produces fragmented road segments. In this paper, we propose a multi-scale strip pooling network (MSPNet) to learn the linear features of roads. Motivated by strip pooling being well aligned with the long-span, narrow shape of roads, we develop a multi-scale strip pooling (MSP) module that utilizes strip pooling layers with long but narrow kernels to capture multi-scale long-range context in the horizontal and vertical directions. The proposed MSP module focuses on establishing relationships along the road region to guarantee road connectivity. Considering the complex distribution of ground objects, spatial pyramid pooling is applied to enhance the learning of complex features in different sub-regions. In addition, to alleviate the imbalanced distribution of road and non-road pixels, we jointly train the model with binary cross-entropy and dice-coefficient loss functions, performing ablation experiments to adjust the loss contributions to the road extraction task. Comparative experiments on the popular DeepGlobe benchmark demonstrate that our proposed MSPNet establishes new competitive results in both IoU and F1-score.
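A minimal PyTorch sketch of a strip pooling unit in the spirit of the MSP module: pooling along whole rows and whole columns captures long-range context in the two directions. Channel counts and the fusion rule are illustrative (after Hou et al.'s strip pooling design), not MSPNet's exact module.

```python
# Hedged sketch of a strip pooling unit with 1xW and Hx1 pooling kernels.
import torch
import torch.nn as nn

class StripPooling(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # H x 1: column context
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # 1 x W: row context
        self.conv_h = nn.Conv1d(channels, channels, 3, padding=1)
        self.conv_w = nn.Conv1d(channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                              # x: (B, C, H, W)
        sh = self.conv_h(self.pool_h(x).squeeze(-1)).unsqueeze(-1)  # (B,C,H,1)
        sw = self.conv_w(self.pool_w(x).squeeze(-2)).unsqueeze(-2)  # (B,C,1,W)
        attn = torch.sigmoid(self.fuse(sh.expand_as(x) + sw.expand_as(x)))
        return x * attn                                 # re-weight features

sp = StripPooling(32)
print(sp(torch.randn(2, 32, 64, 64)).shape)  # torch.Size([2, 32, 64, 64])
```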

Article
Amodal Segmentation Just Like Doing a Jigsaw
Appl. Sci. 2022, 12(8), 4061; https://doi.org/10.3390/app12084061 - 17 Apr 2022
Abstract
Amodal segmentation is a new direction in instance segmentation that considers the segmentation of both the visible and occluded parts of an instance. The existing state-of-the-art method uses multi-task branches to predict the amodal part and the visible part separately and subtracts the visible part from the amodal part to obtain the occluded part. However, since the amodal part contains the visible information, this separate prediction generates duplicate information. In contrast, we propose an amodal segmentation method based on the idea of a jigsaw: multi-task branches predict the two naturally decoupled parts, visible and occluded, like obtaining two matching jigsaw pieces, and the two pieces are then put together to obtain the amodal part. This lets each branch focus on modeling its own part of the object. We also observe that occlusion relationships in the real world follow certain rules, a kind of occlusion context information; the jigsaw method can better model these occlusion relationships and exploit this context, which is important for amodal segmentation. Experiments on two widely used amodally annotated datasets show that our method exceeds existing state-of-the-art methods; in particular, on the amodal mask metric, it outperforms the baseline by 5 percentage points on the COCOA cls dataset and 2 percentage points on the KINS dataset. The source code of this work will be made public soon.

Article
Automatic Classification of Normal–Abnormal Heart Sounds Using Convolution Neural Network and Long-Short Term Memory
Electronics 2022, 11(8), 1246; https://doi.org/10.3390/electronics11081246 - 14 Apr 2022
Abstract
The phonocardiogram (PCG) is an important analysis method for the diagnosis of cardiovascular disease, usually performed by experienced medical experts. Due to the high ratio of patients to doctors, there is a pressing need for a real-time automated phonocardiogram classification system for diagnosing cardiovascular disease. This paper proposes a deep neural network structure based on a one-dimensional convolutional neural network (1D-CNN) and a long short-term memory network (LSTM) that can directly classify unsegmented PCG to identify abnormal signals. The PCG data were filtered and fed into the model for analysis. A total of 3099 heart-sound recordings were used, and heart-sound data from another 100 patients, collected by our group and diagnosed by doctors, were used to test and verify the model. Results show that the CNN-LSTM model provided a good overall balanced accuracy of 0.86 ± 0.01, with a sensitivity of 0.87 ± 0.02 and a specificity of 0.89 ± 0.02. The F1-score was 0.91 ± 0.01, and the receiver-operating characteristic (ROC) curve produced an area under the curve (AUC) of 0.92 ± 0.01. The sensitivity, specificity, and accuracy on the 100 patients' data were 0.83 ± 0.02, 0.80 ± 0.02, and 0.85 ± 0.03, respectively. The proposed model requires neither feature engineering nor heart-sound segmentation, shows reliable performance in classifying abnormal PCG, and is fast and suitable for real-time diagnosis applications.

Article
Partial Atrous Cascade R-CNN
Electronics 2022, 11(8), 1241; https://doi.org/10.3390/electronics11081241 - 14 Apr 2022
Abstract
Deep-learning-based segmentation methods have achieved excellent results. As two main tasks in computer vision, instance segmentation and semantic segmentation are closely related and mutually beneficial, and spatial context information from semantic features can improve the accuracy of instance segmentation. Inspired by this, we propose a novel instance segmentation framework named partial atrous cascade R-CNN (PAC), which effectively improves the accuracy of segmentation boundaries. The proposed network innovates in two aspects: (1) a semantic branch with a partial atrous spatial pyramid extraction (PASPE) module. The module consists of atrous convolution layers with multiple dilation rates; by expanding the receptive field of the convolutional layers, it greatly enriches multi-scale semantic features, and experiments show that the new branch obtains more accurate segmentation contours. (2) A mask quality (MQ) module that scores the intersection over union (IoU) between the predicted mask and the ground truth mask. Benefiting from the modified mask quality score, the quality of the segmentation results is judged credibly. Our network is trained and tested on the MS COCO dataset; compared with the benchmark, it brings consistent and noticeable improvements when using the same backbone.
(This article belongs to the Topic Machine and Deep Learning)
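The core of the PASPE idea, parallel atrous convolutions with several dilation rates fused into one feature map, can be sketched as below; the dilation rates and channel counts are assumed values, and this is not the paper's exact module.

```python
# A minimal sketch of a multi-dilation ("partial atrous") feature pyramid.
import torch
import torch.nn as nn

class AtrousPyramid(nn.Module):
    def __init__(self, in_ch=256, out_ch=256, rates=(1, 2, 4)):
        super().__init__()
        # parallel 3x3 convolutions with growing dilation enlarge the
        # receptive field without reducing spatial resolution
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]       # multi-scale context
        return self.fuse(torch.cat(feats, dim=1))   # fused semantic features

y = AtrousPyramid()(torch.randn(1, 256, 32, 32))    # -> (1, 256, 32, 32)
```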

Article
Detecting Deepfake Voice Using Explainable Deep Learning Techniques
Appl. Sci. 2022, 12(8), 3926; https://doi.org/10.3390/app12083926 - 13 Apr 2022
Abstract
Fake media, generated by methods such as deepfakes, have become indistinguishable from real media, but their detection has not improved at the same pace. Furthermore, the absence of interpretability in deepfake detection models makes their reliability questionable. In this paper, we present interpretability at the level of human perception for deepfake audio detection. Based on their characteristics, we apply several explainable artificial intelligence (XAI) methods designed for image classification to an audio task. In addition, by examining the human cognitive process of XAI on image classification, we suggest the use of a corresponding data format for providing interpretability. Using this concept, a fresh interpretation based on attribution scores can be provided.
(This article belongs to the Topic Machine and Deep Learning)
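One of the simplest image-style attribution methods that transfers to spectrogram inputs is plain gradient saliency. The sketch below uses a placeholder detector and is only meant to show the attribution mechanics, not the authors' XAI pipeline.

```python
# A minimal sketch of gradient-based attribution over a mel-spectrogram.
import torch
import torch.nn as nn

detector = nn.Sequential(  # stand-in for a trained deepfake-voice detector
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)

spec = torch.randn(1, 1, 80, 200, requires_grad=True)  # (batch, ch, mel, time)
score = detector(spec)[0, 1]         # logit of the "fake" class
score.backward()
attribution = spec.grad.abs()[0, 0]  # per-bin relevance on the spectrogram,
                                     # viewable like an image-domain saliency map
```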

Article
Express Construction for GANs from Latent Representation to Data Distribution
Appl. Sci. 2022, 12(8), 3910; https://doi.org/10.3390/app12083910 - 13 Apr 2022
Cited by 1
Abstract
Generative Adversarial Networks (GANs) are powerful generative models for numerous tasks and datasets. However, most existing models suffer from mode collapse. The most recent research indicates that this is because the optimal transportation map from random noise to the data distribution is discontinuous, while deep neural networks (DNNs) can only approximate continuous maps. The latent representation is better raw material than random noise for constructing a transportation map to the data distribution: because it is a low-dimensional mapping related to the data distribution, the construction procedure resembles expansion rather than starting from scratch, and it also admits more transportation maps with smoother transformations. We therefore propose a new training methodology for GANs, named Express Construction, which searches for more transportation maps and speeds up training. The key idea is to train GANs in two independent phases that successively yield the latent representation and the data distribution. To this end, an Auto-Encoder is trained to map the real data into the latent space, and two pairs of generators and discriminators are used to produce them. To the best of our knowledge, we are the first to decompose the training procedure of GAN models into two simpler phases, thus tackling the mode collapse problem without much additional computational cost. We also provide theoretical steps toward understanding the training dynamics of this procedure and prove the underlying assumptions. No extra hyper-parameters are used in the proposed method, so Express Construction can be applied to train any GAN model. Extensive experiments are conducted to verify the performance of realistic image generation and the resistance to mode collapse. The results show that the proposed method is lightweight, effective, and less prone to mode collapse.
(This article belongs to the Topic Machine and Deep Learning)
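A highly simplified sketch of the two-phase decomposition follows (optimizer steps omitted): phase one fits an Auto-Encoder so real data obtains a latent representation, and phase two trains a generator adversarially in that latent space before decoding. All layer shapes are assumptions, not the paper's networks.

```python
# A minimal sketch of two-phase GAN training via a latent space.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(784, 64))             # encoder: data -> latent
dec = nn.Sequential(nn.Linear(64, 784), nn.Tanh())  # decoder: latent -> data
gen = nn.Sequential(nn.Linear(16, 64))              # noise -> latent
disc = nn.Sequential(nn.Linear(64, 1))              # latent-space critic

x = torch.randn(32, 784)                 # stand-in batch of real data

# Phase 1: reconstruction loss trains enc/dec to learn the latent space
recon_loss = ((dec(enc(x)) - x) ** 2).mean()

# Phase 2: adversarial loss in latent space trains gen against disc
z_fake = gen(torch.randn(32, 16))
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    disc(z_fake), torch.ones(32, 1))     # generator wants disc to say "real"

samples = dec(gen(torch.randn(32, 16)))  # generation: noise -> latent -> data
```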

Article
Breast and Lung Anticancer Peptides Classification Using N-Grams and Ensemble Learning Techniques
Big Data Cogn. Comput. 2022, 6(2), 40; https://doi.org/10.3390/bdcc6020040 - 12 Apr 2022
Abstract
Anticancer peptides (ACPs) are short protein sequences that perform functions inside the body much like some hormones and enzymes. The role of any protein or peptide is related to its structure and the sequence of amino acids that make it up. There are 20 types of amino acids in humans, each with particular characteristics according to its chemical structure. Machine and deep learning models have been used to classify ACPs; however, these models have neglected Amino Acid Repeats (AARs), which play an essential role in the function and structure of peptides. Therefore, this paper pursues a promising route to novel anticancer peptides by extracting AARs based on N-grams and k-mers from two peptide datasets. These datasets target breast and lung cancer cells and were assembled and curated manually from the Cancer Peptide and Protein Database (CancerPPD). Each dataset consists of peptide sequences together with their synthesis and anticancer activity on breast and lung cancer cell lines. Five different feature selection methods were used to improve classification performance and reduce experimental costs. ACPs were then classified using four classifiers, namely AdaBoost, Random Forest Tree (RFT), Multi-class Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP), evaluated with five well-known metrics. Experimental results showed that breast and lung ACP classification reached accuracies of 89.25% and 92.56%, respectively, and AUCs of 95.35% and 96.92%. The classifiers performed roughly equally in AUC, accuracy, precision, F-measure, and recall, except for Multi-class SVM-based feature selection, which showed superior performance. As a result, this paper significantly improves the predictive performance, effectively distinguishing ACPs as virtually inactive, experimentally inactive, moderately active, and very active.
(This article belongs to the Topic Machine and Deep Learning)
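Extracting amino-acid k-mers as character n-grams and feeding them to an ensemble classifier can be sketched with scikit-learn as below; the four peptides and labels are fabricated purely for illustration, and only one of the paper's four classifiers (Random Forest) is shown.

```python
# A minimal sketch of k-mer (character n-gram) features + an ensemble model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

peptides = ["FLPIVGKLLSGLL", "GLFDIVKKVVGAL", "KWKLFKKIGAVLK", "ALWKTMLKKLGTM"]
activity = ["very_active", "moderately_active", "inactive", "very_active"]

# character n-grams of length 2-3 act as amino-acid k-mer counts (AARs)
featurizer = CountVectorizer(analyzer="char", ngram_range=(2, 3))
clf = make_pipeline(featurizer, RandomForestClassifier(n_estimators=100))
clf.fit(peptides, activity)
print(clf.predict(["GLLSVLGSVAKHV"]))  # predicted activity class
```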

Article
Criteria Selection Using Machine Learning (ML) for Communication Technology Solution of Electrical Distribution Substations
Appl. Sci. 2022, 12(8), 3878; https://doi.org/10.3390/app12083878 - 12 Apr 2022
Abstract
In the future, as populations grow and more end-user applications become available, the current traditional electrical distribution substation will not be able to fully accommodate new applications that may arise. Consequently, there will be numerous difficulties, including network congestion, latency, jitter, and, in the worst-case scenario, network failure. Thus, the purpose of this study is to assist decision makers in selecting the most appropriate communication technologies for an electrical distribution substation through an examination of the criteria's influence on the selection process. In this study, nine technical criteria were selected and processed using machine learning (ML) software, RapidMiner, to find the optimal technical criteria. Several ML techniques were studied, and Naïve Bayes was chosen, as it showed the highest performance among them. The criteria were then ranked from most to least important based on the average value obtained from the output. Seven technical criteria were identified as important and should be evaluated to determine the most appropriate communication technology solution for electrical distribution substations.
(This article belongs to the Topic Machine and Deep Learning)
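A rough analogue of the criteria-ranking workflow can be sketched with scikit-learn's Gaussian Naïve Bayes plus permutation importance, standing in for the paper's RapidMiner pipeline; the criterion names and data below are invented for illustration.

```python
# A minimal sketch of ranking technical criteria with a Naive Bayes model.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
criteria = ["latency", "bandwidth", "jitter", "range", "cost",
            "security", "scalability", "reliability", "maturity"]
X = rng.random((200, len(criteria)))
y = (X[:, 0] + X[:, 5] > 1.0).astype(int)  # toy "suitable technology" label

nb = GaussianNB().fit(X, y)
imp = permutation_importance(nb, X, y, n_repeats=10, random_state=0)
# rank criteria from most to least important by mean importance
for name, score in sorted(zip(criteria, imp.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```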

Article
Entropy-Enhanced Attention Model for Explanation Recommendation
Entropy 2022, 24(4), 535; https://doi.org/10.3390/e24040535 - 11 Apr 2022
Abstract
Most existing recommendation systems using deep learning are based on Recurrent Neural Networks (RNNs). However, due to some inherent defects of RNNs, such systems are not only very time consuming but also unable to capture long-range dependencies between user comments. Sentiment analysis of user comments can better capture the characteristics of user interest, and information entropy can reduce the adverse impact of noise words on their construction: the information content of each user is analyzed, and users with low information entropy are filtered out to remove noisy data. A self-attention recommendation model based on entropy regularization is proposed to analyze the emotional polarity of the dataset. Specifically, a multi-head self-attention network is introduced to model the mixed interactions in user comments. The loss function of the model is designed to make the recommendations interpretable. The experimental results show that our model outperforms the baseline methods in terms of MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain) on several datasets, and it achieves good interpretability.
(This article belongs to the Topic Machine and Deep Learning)
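The two ingredients (an information-entropy filter over user tokens, and multi-head self-attention over comment embeddings) can be sketched as below; the 1.0-bit threshold and all sizes are assumptions for illustration.

```python
# A minimal sketch: Shannon entropy as a noise filter + self-attention.
import math
from collections import Counter
import torch
import torch.nn as nn

def token_entropy(tokens):
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

comments = {"user_a": "great great great great".split(),
            "user_b": "camera battery screen price shipping".split()}
kept = {u: t for u, t in comments.items() if token_entropy(t) > 1.0}
print(list(kept))  # user_a (entropy 0.0) is dropped as uninformative

attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
x = torch.randn(1, 5, 32)      # 5 comment-token embeddings
out, weights = attn(x, x, x)   # mixed interactions across tokens
```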

Article
On the Black-Box Challenge for Fraud Detection Using Machine Learning (II): Nonlinear Analysis through Interpretable Autoencoders
Appl. Sci. 2022, 12(8), 3856; https://doi.org/10.3390/app12083856 - 11 Apr 2022
Cited by 1
Abstract
The use of artificial intelligence (AI) in the global economy has recently intensified due to the great competence it has demonstrated for analysis and modeling in many disciplines. This situation is accelerating the shift towards a more automated society, where these new techniques can be consolidated as a valid tool to face the difficult challenge of credit fraud detection (CFD). However, tight regulations make it hard for financial entities to remain compliant while using modern techniques. From a methodological perspective, autoencoders have demonstrated their effectiveness in discovering nonlinear features across several problem domains. However, autoencoders are opaque and often seen as black boxes. In this work, we propose an interpretable and agnostic methodology for CFD. This type of approach offers a double advantage: on the one hand, it can be applied together with any machine learning (ML) technique, and on the other hand, it offers the necessary traceability between inputs and outputs, hence escaping the black-box model. We first applied the state-of-the-art feature selection technique defined in the companion paper. Second, we proposed a novel technique, based on autoencoders, capable of evaluating the relationship between the input and output of a sophisticated ML model for every sample submitted to the analysis, through a single transaction-level explanation (STE) approach. This technique analyzes each instance individually by applying small fluctuations to the input space and evaluating how the output reacts, thereby shedding light on the underlying dynamics of the model. Based on this, an individualized transaction ranking (ITR) can be formulated, leveraging the contributions of each feature through STE. These rankings represent a close estimate of the most important features playing a role in the decision process. The results obtained in this work are consistent with previously published papers and show that certain features, such as living beyond one's means, a lack or absence of a transaction trail, and car loans, have a strong influence on the model outcome. Additionally, this proposal using the latent space outperformed, in terms of accuracy, our previous results, which already improved on prior published papers, by 5.5% and 1.5% for the datasets under study, from baselines of 76% and 93%. The contribution of this paper is twofold: a new, better-performing CFD classification model is presented, and at the same time we develop a novel methodology, applicable across classification techniques, that makes it possible to open up black-box models, erasing the dependencies and, eventually, undesirable biases. We conclude that it is possible to develop an effective, individualized, unbiased, and traceable ML technique, not only to comply with regulations, but also to cope with transaction-level inquiries from clients and authorities.
(This article belongs to the Topic Machine and Deep Learning)
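The STE idea, perturbing one feature of one transaction at a time and ranking features by how strongly the model's score reacts, can be sketched generically; the logistic-regression model and data below are placeholders, since the approach is model-agnostic.

```python
# A minimal sketch of single-transaction-level explanation via perturbation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.random((500, 4))
y = (X[:, 2] > 0.5).astype(int)              # toy fraud label
model = LogisticRegression().fit(X, y)

def ste_ranking(model, x, eps=0.01):
    base = model.predict_proba(x[None])[0, 1]
    deltas = []
    for j in range(len(x)):
        x_pert = x.copy()
        x_pert[j] += eps                     # small fluctuation of feature j
        deltas.append(abs(model.predict_proba(x_pert[None])[0, 1] - base))
    return np.argsort(deltas)[::-1]          # most influential features first

print(ste_ranking(model, X[0]))              # individualized feature ranking
```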

Article
Enhance Domain-Invariant Transferability of Adversarial Examples via Distance Metric Attack
Mathematics 2022, 10(8), 1249; https://doi.org/10.3390/math10081249 - 11 Apr 2022
Abstract
A general foundation of fooling a neural network without knowing its details (i.e., a black-box attack) is the transferability of adversarial examples across different models. Many works have been devoted to enhancing the task-specific transferability of adversarial examples, whereas cross-task transferability has received far less attention. In this paper, to enhance both types of transferability, we are the first to regard the transferability issue as a heterogeneous domain generalisation problem, which can be addressed by a general pipeline based on a domain-invariant feature extractor pre-trained on ImageNet. Specifically, we propose a distance metric attack (DMA) method that increases the latent-layer distance between the adversarial example and the benign example along the opposite direction guided by the cross-entropy loss. With the help of a simple loss, DMA can effectively enhance the domain-invariant transferability (in both the task-specific and the cross-task case) of adversarial examples. Additionally, DMA can be used to measure the robustness of the latent layers in a deep model. We empirically find that models with similar structures have consistent robustness at depth-similar layers, which reveals that model robustness is closely related to model structure. Extensive experiments on image classification, object detection, and semantic segmentation demonstrate that DMA can improve the success rate of black-box attacks by more than 10% for task-specific attacks and by more than 5% for cross-task attacks.
(This article belongs to the Topic Machine and Deep Learning)
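One plausible reading of the attack loop can be sketched as below: iteratively push the input so that a latent layer of an ImageNet-style feature extractor moves away from the benign example's features, under an L-infinity budget. The layer choice, step size, and budget are assumptions, not the paper's settings.

```python
# A minimal sketch of a distance-metric attack step in latent space.
import torch
import torchvision.models as models

net = models.resnet18(weights=None)
extract = torch.nn.Sequential(*list(net.children())[:-2])  # latent layer

x = torch.rand(1, 3, 224, 224)                 # benign example
x_adv = x.clone().requires_grad_(True)
feat_benign = extract(x).detach()

for _ in range(10):                            # simple iterative sign-step loop
    dist = (extract(x_adv) - feat_benign).pow(2).mean()
    grad, = torch.autograd.grad(dist, x_adv)
    x_adv = (x_adv + 2 / 255 * grad.sign()).clamp(0, 1)        # grow distance
    x_adv = x + (x_adv - x).clamp(-8 / 255, 8 / 255)           # L-inf budget
    x_adv = x_adv.clamp(0, 1).detach().requires_grad_(True)
```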

Article
Research on Identification Technology of Field Pests with Protective Color Characteristics
Appl. Sci. 2022, 12(8), 3810; https://doi.org/10.3390/app12083810 - 10 Apr 2022
Cited by 2
Abstract
Accurate identification of field pests has crucial decision-making significance for integrated pest control. Most current research focuses on identifying pests on sticky cards or in cases where the target differs greatly from the background; there is little research on field pest identification when the pests have protective coloration. To address the difficulty of identifying such pests in complex field environments, a field pest identification method based on near-infrared imaging technology and YOLOv5 is proposed in this paper. First, an appropriate infrared filter and ring light source were selected to build an image acquisition system, operating at the wavelength where the spectral reflectance curves of the pest (Pieris rapae) and its host plant (cabbage) differ most. Then, field pest images were collected to construct a dataset, which was trained and tested using YOLOv5. Experimental results demonstrate that the average time required to detect one pest image is 0.56 s, and the mAP reaches 99.7%.
(This article belongs to the Topic Machine and Deep Learning)
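Running a custom-trained YOLOv5 model on a captured near-infrared frame follows the standard ultralytics/yolov5 torch.hub interface; the weights file and image name below are hypothetical placeholders, not artifacts released with the paper.

```python
# A minimal sketch of YOLOv5 inference on a near-infrared field image.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="pest_nir_best.pt")  # hypothetical trained weights
model.conf = 0.25                                # confidence threshold

results = model("field_frame_nir.jpg")           # hypothetical NIR frame
results.print()                                  # counts and classes found
boxes = results.xyxy[0]                          # (x1, y1, x2, y2, conf, cls)
```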

Article
MULDASA: Multifactor Lexical Sentiment Analysis of Social-Media Content in Nonstandard Arabic Social Media
Appl. Sci. 2022, 12(8), 3806; https://doi.org/10.3390/app12083806 - 09 Apr 2022
Cited by 1
Abstract
The semantically complicated Arabic natural vocabulary and the shortage of available techniques and skills to capture Arabic emotions from text hinder Arabic sentiment analysis (ASA). Evaluating Arabic idioms that do not follow a conventional linguistic framework, such as modern standard Arabic (MSA), further complicates an already difficult procedure. Here, we define a novel lexical sentiment analysis approach for studying Arabic-language tweets (TTs) from specialized digital media platforms. Many elements comprising emoji, intensifiers, negations, and other nonstandard expressions such as supplications, proverbs, and interjections are incorporated into the MULDASA algorithm to enhance the precision of opinion classification. Root words in a multidialectal sentiment lexicon are associated with emotions found in the content under study via a simple stemming procedure. Furthermore, a feature–sentiment correlation procedure is incorporated into the proposed technique to exclude expressed viewpoints that are irrelevant to the area of concern. As part of our research into Saudi Arabian employability, we compiled a large sample of TTs in six different Arabic dialects. This research shows that the sentiment categorization method is useful and that using all of the characteristics listed earlier improves the ability to accurately classify people's feelings. The classification accuracy of the proposed algorithm improved from 83.84% to 89.80%. Our approach also outperformed two existing research projects that employed a lexical approach to the sentiment analysis of Saudi dialects.
(This article belongs to the Topic Machine and Deep Learning)
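The multifactor lexical scoring logic (stemming, lexicon lookup, negation, and intensifier handling) can be sketched as below; the tiny English lexicon and naive stemmer are stand-ins for the Arabic MULDASA resources.

```python
# A minimal sketch of multifactor lexical sentiment scoring.
lexicon = {"excellent": 1.0, "bad": -1.0, "slow": -0.5}  # root -> polarity
intensifiers = {"very": 1.5}
negations = {"not"}

def stem(token):
    return token.rstrip("s")      # placeholder for a real Arabic stemmer

def score(tokens):
    total, weight, negate = 0.0, 1.0, False
    for t in tokens:
        t = stem(t.lower())
        if t in negations:
            negate = True                      # flip the next sentiment word
        elif t in intensifiers:
            weight = intensifiers[t]           # amplify the next sentiment word
        elif t in lexicon:
            polarity = lexicon[t] * weight
            total += -polarity if negate else polarity
            weight, negate = 1.0, False        # factors apply once, then reset
    return total

print(score("the service was not excellent but very slow".split()))  # -1.75
```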

Review
Predicting Stock Price Changes Based on the Limit Order Book: A Survey
Mathematics 2022, 10(8), 1234; https://doi.org/10.3390/math10081234 - 09 Apr 2022
Abstract
This survey starts with a general overview of strategies for predicting stock price changes from market data, in particular Limit Order Book (LOB) data. The main discussion is devoted to the systematic analysis, comparison, and critical evaluation of state-of-the-art studies on stock price movement prediction based on LOB data. LOB and Order Flow data are two of the most valuable information sources available to traders on the stock markets, and academic researchers are actively exploring the application of different quantitative methods and algorithms to this type of data. With the advancements in machine learning and subsequently in deep learning, the complexity and computational intensity of these models have grown, as has the claimed predictive power; some researchers claim accuracy of stock price movement prediction well in excess of 80%. These models are now commonly employed by automated market-making programs to set bid and ask quotes. If these results were also applicable to arbitrage trading strategies, such algorithms could make a fortune for their developers; the open question is whether these results could be used to generate buy and sell signals exploitable with active trading. This survey is intended to answer that question by reviewing these results and scrutinising their reliability. The ultimate conclusion is that, although considerable progress has been achieved, even state-of-the-art models cannot guarantee a consistent profit in active trading. Taking this into account, several suggestions for future research are formulated along three dimensions: input data, model architecture, and experimental setup. From the input-data perspective, it is critical that the dataset be properly processed and up-to-date, and that its size be sufficient for training the particular model. From the model-architecture perspective, even though deep learning models demonstrate stronger performance than classical models, they are also more prone to over-fitting; to avoid over-fitting, it is suggested to optimize the feature space as well as the number of layers and neurons, and to apply dropout. The over-fitting problem can also be addressed by optimising the experimental setup in several ways: introducing an early-stopping mechanism; saving the best model weights achieved during training; and testing the model on out-of-sample data, separated from the validation and training samples. Finally, it is suggested to always conduct trading simulations under realistic market conditions, considering transaction costs, bid–ask spreads, and market impact.
(This article belongs to the Topic Machine and Deep Learning)
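The anti-overfitting recipe recommended above (dropout, early stopping, saving the best weights, and a validation split kept apart from the out-of-sample test set) can be illustrated with a toy PyTorch training loop; all numbers and the three-class up/flat/down setup are illustrative.

```python
# A minimal sketch of early stopping with best-weight checkpointing.
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(40, 64), nn.ReLU(),
                      nn.Dropout(0.3), nn.Linear(64, 3))  # up/flat/down
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

X_tr, y_tr = torch.randn(512, 40), torch.randint(0, 3, (512,))  # training
X_va, y_va = torch.randn(128, 40), torch.randint(0, 3, (128,))  # validation

best_loss, best_state, patience, bad_epochs = float("inf"), None, 5, 0
for epoch in range(100):
    opt.zero_grad()
    loss_fn(model(X_tr), y_tr).backward()
    opt.step()
    with torch.no_grad():
        va_loss = loss_fn(model(X_va), y_va).item()
    if va_loss < best_loss:          # checkpoint the best weights so far
        best_loss, bad_epochs = va_loss, 0
        best_state = copy.deepcopy(model.state_dict())
    else:
        bad_epochs += 1
        if bad_epochs >= patience:   # early stopping
            break
model.load_state_dict(best_state)    # restore best weights before testing
```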

Article
MDA-Unet: A Multi-Scale Dilated Attention U-Net for Medical Image Segmentation
Appl. Sci. 2022, 12(7), 3676; https://doi.org/10.3390/app12073676 - 06 Apr 2022
Cited by 1
Abstract
The advanced development of deep learning methods has recently brought significant improvements to medical image segmentation. Encoder–decoder networks such as U-Net have addressed some of the challenges in medical image segmentation with outstanding performance, which has made them the dominant deep learning architecture in this domain. Despite this, we argue that they still fall short in two respects. First, U-Net's skip connections join incompatible features, owing to the semantic gap between lightly processed encoder features and highly processed decoder features, which adversely affects the final prediction. Second, the architecture fails to capture multi-scale context information and ignores the contribution of all semantic information through the segmentation process. Therefore, we propose MDA-Unet, a novel multi-scale deep learning segmentation model. MDA-Unet improves upon U-Net and enhances its performance in segmenting medical images with variability in the shape and size of the region of interest. The model integrates a multi-scale spatial attention module, where spatial attention maps are derived from a hybrid hierarchical dilated convolution module that captures multi-scale context information. To ease training and reduce the vanishing-gradient problem, residual blocks are deployed instead of the basic U-Net blocks. Through a channel attention mechanism, the high-level decoder features guide the low-level encoder features to promote the selection of meaningful context information, thus ensuring effective fusion. We evaluated our model on two different datasets: a lung dataset of 2628 axial CT images and an echocardiographic dataset of 2000 images, each with its own challenges. Our model achieves a significant gain in performance with only a slight increase in the number of trainable parameters compared with the basic U-Net model, providing a dice score of 98.3% on the lung dataset and 96.7% on the echocardiographic dataset, where the basic U-Net achieves 94.2% and 93.9%, respectively.
(This article belongs to the Topic Machine and Deep Learning)
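A minimal sketch of a channel-attention gate in which high-level decoder features reweight low-level encoder features before fusion, in the spirit of what the abstract describes, follows; channel sizes are illustrative and this is not the paper's exact block.

```python
# A minimal sketch of decoder-guided channel attention over encoder features.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)     # global channel statistics
        self.excite = nn.Sequential(
            nn.Conv2d(ch, ch // 4, 1), nn.ReLU(),
            nn.Conv2d(ch // 4, ch, 1), nn.Sigmoid(),
        )

    def forward(self, enc_feat, dec_feat):
        # decoder features decide which encoder channels to keep
        weights = self.excite(self.squeeze(dec_feat))
        return enc_feat * weights                  # gated features for fusion

enc = torch.randn(1, 64, 128, 128)  # low-level encoder features
dec = torch.randn(1, 64, 128, 128)  # high-level decoder features (upsampled)
fused_input = ChannelGate()(enc, dec)
```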

Article
Fully Automatic Segmentation, Identification and Preoperative Planning for Nasal Surgery of Sinuses Using Semi-Supervised Learning and Volumetric Reconstruction
Mathematics 2022, 10(7), 1189; https://doi.org/10.3390/math10071189 - 06 Apr 2022
Abstract
The aim of this study is to develop an automatic segmentation algorithm based on paranasal sinus CT images, which realizes automatic identification and segmentation of the sinus boundary and its inflamed proportions, as well as the reconstruction of normal sinus and inflamed site