
Algorithms, Volume 15, Issue 4 (April 2022) – 33 articles

Cover Story: We address the challenge of determining a valid set of parameters for a dynamic line scan thermography setup. Via Gaussian process emulation, we are able to predict the behavior of dynamic line scan thermography without performing a large number of simulations or experiments. Applications depending on many input parameters are hard to optimize and even harder to develop an intuition for. Using Gaussian process emulation, we are able to identify regions of parameter combinations that should be avoided. At the same time, the emulation algorithm provides insight into the most suitable regions of parameter sets for a desired outcome. We demonstrate that a trained emulator can be queried to find solutions for design questions relevant to a real-world engineering design challenge.
23 pages, 591 KiB  
Article
Research and Challenges of Reinforcement Learning in Cyber Defense Decision-Making for Intranet Security
by Wenhao Wang, Dingyuanhao Sun, Feng Jiang, Xingguo Chen and Cheng Zhu
Algorithms 2022, 15(4), 134; https://doi.org/10.3390/a15040134 - 18 Apr 2022
Cited by 7 | Viewed by 4701
Abstract
In recent years, cyber attacks have shown diversified, purposeful, and organized characteristics, which pose significant challenges to cyber defense decision-making on internal networks. Due to the continuous confrontation between attackers and defenders, data-based statistical or supervised learning methods alone cannot cope with increasingly severe security threats. It is urgent to rethink network defense from the perspective of decision-making, and prepare for every possible situation. Reinforcement learning has made great breakthroughs in addressing complicated decision-making problems. We propose a framework that defines four modules based on the life cycle of threats: pentest, design, response, recovery. Our aims are to clarify the problem boundary of network defense decision-making problems, to study the problem characteristics in different contexts, to compare the strengths and weaknesses of existing research, and to identify promising challenges for future work. Our work provides a systematic view for understanding and solving decision-making problems in the application of reinforcement learning to cyber defense.
(This article belongs to the Special Issue Algorithms for Games AI)

15 pages, 4078 KiB  
Article
Study of the Algorithm for Wind Shear Detection with Lidar Based on Shear Intensity Factor
by Shijun Zhao and Yulong Shan
Algorithms 2022, 15(4), 133; https://doi.org/10.3390/a15040133 - 18 Apr 2022
Viewed by 2184
Abstract
Low-level wind shear is a vital weather process affecting aircraft safety while taking off and landing and is known as the “aircraft killer” in the aviation industry. As a result, effective monitoring and warning are required. Several ramp detection algorithms for low-level wind shear based on glide path scanning of lidar have been developed, including double and simple ramp detection, with the ramp length extension and contraction strategies corresponding to the algorithm. However, current algorithms must be improved to determine the maximum shear value and location. In this paper, a new efficient algorithm based on the shear intensity factor value is presented, in which wind speed changes and distance are both considered when calculating wind shear. The effectiveness of the improved algorithm has been validated through numerical simulation experiments. Results reveal that the improved algorithm can determine the maximum intensity value and wind shear location more accurately than the traditional algorithm. In addition, the new algorithm improved the detection ability of lidar for weak wind shear.

15 pages, 4449 KiB  
Article
Optimizing Finite-Difference Operator in Seismic Wave Numerical Modeling
by Hui Li, Yuan Fang, Zhiguo Huang, Mengyao Zhang and Qing Wei
Algorithms 2022, 15(4), 132; https://doi.org/10.3390/a15040132 - 18 Apr 2022
Viewed by 2081
Abstract
The finite-difference method is widely used in seismic wave numerical simulation, imaging, and waveform inversion. In the finite-difference method, the differential operator is approximated by a finite-difference operator, which can be obtained by truncating the spatial convolution series. The properties of the truncated window function, such as the main and side lobes of the window function’s amplitude response, determine the accuracy of the finite-difference operator, which subsequently affects the seismic imaging and inversion results significantly. Although numerical dispersion is inevitable in this process, it can be suppressed more effectively by using higher precision finite-difference operators. In this paper, we use the krill herd algorithm, in contrast with the standard PSO and CDPSO (a variant of PSO), to optimize the finite-difference operator. Numerical simulation results verify that the krill herd algorithm has good performance in improving the precision of the differential operator.

35 pages, 6328 KiB  
Article
Multi-Fidelity Gradient-Based Optimization for High-Dimensional Aeroelastic Configurations
by Andrew S. Thelen, Dean E. Bryson, Bret K. Stanford and Philip S. Beran
Algorithms 2022, 15(4), 131; https://doi.org/10.3390/a15040131 - 16 Apr 2022
Cited by 8 | Viewed by 3262
Abstract
The simultaneous optimization of aircraft shape and internal structural size for transonic flight is excessively costly. The analysis of the governing physics is expensive, in particular for highly flexible aircraft, and the search for optima using analysis samples can scale poorly with design space size. This paper has a two-fold purpose targeting the scalable reduction of analysis sampling. First, a new algorithm is explored for computing design derivatives by analytically linking objective definition, geometry differentiation, mesh construction, and analysis. The analytic computation of design derivatives enables the accurate use of more efficient gradient-based optimization methods. Second, the scalability of a multi-fidelity algorithm is assessed for optimization in high dimensions. This method leverages a multi-fidelity model during the optimization line search for further reduction of sampling costs. The multi-fidelity optimization is demonstrated for cases of aerodynamic and aeroelastic design considering both shape and structural sizing separately and in combination with design spaces ranging from 17 to 321 variables, which would be infeasible using typical, surrogate-based methods. The multi-fidelity optimization consistently led to a reduction in high-fidelity evaluations compared to single-fidelity optimization for the aerodynamic shape problems, but frequently resulted in a cost penalty for cases involving structural sizing. While the multi-fidelity optimizer was successfully applied to problems with hundreds of variables, the results underscore the importance of accurately computing gradients and motivate the extension of the approach to constrained optimization methods.

24 pages, 6106 KiB  
Article
Machine Learning Algorithms: An Experimental Evaluation for Decision Support Systems
by Hugo Silva and Jorge Bernardino
Algorithms 2022, 15(4), 130; https://doi.org/10.3390/a15040130 - 15 Apr 2022
Cited by 4 | Viewed by 3206
Abstract
Decision support systems with machine learning can help organizations improve operations and lower costs with more precision and efficiency. This work presents a review of state-of-the-art machine learning algorithms for binary classification and compares their metrics when applied to public diabetes and human resource datasets. The two main categories that allow the learning process without requiring explicit programming are supervised and unsupervised learning. For that, we use Scikit-learn, the free machine learning library for the Python language. The best-performing algorithm was Random Forest for supervised learning, while in unsupervised clustering techniques, Balanced Iterative Reducing and Clustering Using Hierarchies and Spectral Clustering algorithms presented the best results. The experimental evaluation shows that the application of unsupervised clustering algorithms does not translate into better results than with supervised algorithms. However, the application of unsupervised clustering algorithms, as the preprocessing of the supervised techniques, can translate into a boost of performance.
(This article belongs to the Special Issue Algorithms for Machine Learning and Pattern Recognition Tasks)

25 pages, 4928 KiB  
Article
Convolutional-Neural-Network-Based Handwritten Character Recognition: An Approach with Massive Multisource Data
by Nazmus Saqib, Khandaker Foysal Haque, Venkata Prasanth Yanambaka and Ahmed Abdelgawad
Algorithms 2022, 15(4), 129; https://doi.org/10.3390/a15040129 - 14 Apr 2022
Cited by 21 | Viewed by 7912
Abstract
Neural networks have made big strides in image classification. Convolutional neural networks (CNNs) work successfully to run neural networks directly on images. Handwritten character recognition (HCR) is now a very powerful tool to detect traffic signals, translate language, extract information from documents, etc. Although handwritten character recognition technology is in use in the industry, present accuracy is not outstanding, which compromises both performance and usability. Thus, the character recognition technologies in use are still not very reliable and need further improvement to be extensively deployed for serious and reliable tasks. On this account, recognition of English alphabet characters and of digits is performed by proposing a custom-tailored CNN model with two different datasets of handwritten images, i.e., Kaggle and MNIST, respectively; the models are lightweight but achieve higher accuracies than state-of-the-art models. The best two models from the total of twelve designed are proposed by altering hyper-parameters to observe which models provide the best accuracy for which dataset. In addition, the classification reports (CRs) of these two proposed models are extensively investigated considering the performance metrics, such as precision, recall, specificity, and F1 score, which are obtained from the developed confusion matrix (CM). To simulate a practical scenario, the dataset is kept unbalanced and three more averages for the F measurement (micro, macro, and weighted) are calculated, which facilitates better understanding of the performances of the models. The highest accuracy of 99.642% is achieved for digit recognition, with the model using ‘RMSprop’, at a learning rate of 0.001, whereas the highest detection accuracy for alphabet recognition is 99.563%, which is obtained with the proposed model using ‘ADAM’ optimizer at a learning rate of 0.00001. The macro F1 and weighted F1 scores for the best two models are 0.998 and 0.997 for digit recognition and 0.992 and 0.996 for alphabet recognition, respectively.
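The three F1 averages mentioned in the abstract above can be derived from a confusion matrix as in the following minimal sketch. It is an illustration only, not the authors' code; the function name `f1_averages` and the toy matrix in the usage note are assumptions introduced here.

```python
def f1_averages(cm):
    """Return micro, macro, and weighted F1 from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (a hypothetical layout chosen for this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class_f1, supports = [], []
    tp_sum = fp_sum = fn_sum = 0
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true otherwise
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        supports.append(sum(cm[k]))
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    # Macro: unweighted mean of per-class F1, so rare classes count equally.
    macro = sum(per_class_f1) / n
    # Weighted: per-class F1 weighted by the number of true samples per class.
    weighted = sum(f * s for f, s in zip(per_class_f1, supports)) / total
    # Micro: F1 computed from counts pooled over all classes.
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0
    return {"micro": micro, "macro": macro, "weighted": weighted}
```

On an unbalanced toy matrix such as `[[90, 10], [5, 5]]`, the macro average is pulled down by the weak minority class, while the micro and weighted averages stay closer to the majority-class score, which is why reporting all three is informative for unbalanced data.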

17 pages, 1130 KiB  
Article
A Fuzzy Grouping Genetic Algorithm for Solving a Real-World Virtual Machine Placement Problem in a Healthcare-Cloud
by Nawaf Alharbe, Abeer Aljohani and Mohamed Ali Rakrouki
Algorithms 2022, 15(4), 128; https://doi.org/10.3390/a15040128 - 14 Apr 2022
Cited by 7 | Viewed by 2366
Abstract
Due to the large-scale development of cloud computing, data center electricity costs have increased rapidly. Energy saving has become a major research direction of virtual machine placement problems. At the same time, the multi-dimensional resources on the cloud should be used in a balanced manner in order to avoid resource waste. In this context, this paper addresses a real-world virtual machine placement problem arising in a Healthcare-Cloud (H-Cloud) of a hospital chain in Saudi Arabia, considering server power consumption and resource utilization. As a part of optimizing both objectives, user service quality has to be taken into account. User quality of service (QoS) is therefore also considered by measuring the Service-Level Agreement (SLA) violation rate. This problem is modeled as a multi-objective virtual machine placement problem with the objective of minimizing power consumption, resource utilization, and SLA violation rate. To solve this challenging problem, a fuzzy grouping genetic algorithm (FGGA) is proposed. Considering that multiple optimization objectives may have different degrees of influence on the problem, the fitness function of the proposed algorithm is calculated with a fuzzy logic-based function. The experimental results show the effectiveness of the proposed algorithm.

21 pages, 3240 KiB  
Article
A Statistical Approach to Discovering Process Regime Shifts and Their Determinants
by Atiq W. Siddiqui and Syed Arshad Raza
Algorithms 2022, 15(4), 127; https://doi.org/10.3390/a15040127 - 13 Apr 2022
Cited by 2 | Viewed by 2690
Abstract
Systematic behavioral regime shifts inevitably emerge in real-world processes in response to various determinants, thus resulting in temporally dynamic responses. These determinants can be technical, such as process handling, design, or policy elements; or environmental, socio-economic or socio-technical in nature. This work proposes a novel two-stage methodology in which the first stage involves statistically identifying and dating all regime shifts in the time series process event logs. The second stage entails identifying contender determinants, which are statistically and temporally evaluated for their role in forming new behavioral regimes. The methodology is general, allowing varying process evaluation bases while putting minimal restrictions on process output data distribution. We demonstrated the efficacy of our approach via three cases of technical, socio-economic and socio-technical nature. The results show the presence of regime shifts in the output logs of these cases. Various determinants were identified and analyzed for their role in their formation. We found that some of the determinants indeed caused specific regime shifts, whereas others had no impact on their formation.
(This article belongs to the Special Issue Process Mining and Its Applications)

17 pages, 360 KiB  
Article
A Truly Robust Signal Temporal Logic: Monitoring Safety Properties of Interacting Cyber-Physical Systems under Uncertain Observation
by Bernd Finkbeiner, Martin Fränzle, Florian Kohn and Paul Kröger
Algorithms 2022, 15(4), 126; https://doi.org/10.3390/a15040126 - 11 Apr 2022
Cited by 4 | Viewed by 2569
Abstract
Signal Temporal Logic is a linear-time temporal logic designed for classifying the time-dependent signals originating from continuous-state or hybrid-state dynamical systems according to formal specifications. It has been conceived as a tool for systematizing the monitoring of cyber-physical systems, supporting the automatic translation of complex safety specifications into monitoring algorithms, faithfully representing their semantics. Almost all algorithms hitherto suggested do, however, assume perfect identity between the sensor readings, informing the monitor about the system state and the actual ground truth. Only recently have Visconti et al. addressed the issue of inexact measurements, taking up the simple model of interval-bounded per-sample error that is unrelated, in the sense of chosen afresh, across samples. We expand their analysis by decomposing the error into an unknown yet fixed offset and an independent per-sample error and show that in this setting, monitoring of temporal properties no longer coincides with collecting Boolean combinations of state predicates evaluated in each time instant over best-possible per-sample state estimates, but can be genuinely more informative in that it infers determinate truth values for monitoring conditions that interval-based evaluation remains inconclusive about. For the model-free as well as for the linear model-based case, we provide optimal evaluation algorithms based on affine arithmetic and SAT modulo theory, solving over linear arithmetic. The resulting algorithms provide conclusive monitoring verdicts in many cases where state estimations inherently remain inconclusive. In their model-based variants, they can simultaneously address the issues of uncertain sensing and partial observation.
(This article belongs to the Special Issue Algorithms for Reliable Estimation, Identification and Control II)
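To illustrate why separating a fixed offset from per-sample noise can yield a determinate verdict where plain interval evaluation cannot, the sketch below checks the property "the true value at sample 1 exceeds the true value at sample 2". It is a simplified illustration under assumptions made here (measurement = true value + shared offset + per-sample noise, both bounded), not the affine-arithmetic or SMT-based algorithms of the paper; all function and parameter names are hypothetical.

```python
def _verdict(lo, hi):
    """Three-valued verdict for 'difference > 0' given bounds [lo, hi]."""
    if lo > 0:
        return True    # property certainly holds
    if hi <= 0:
        return False   # property certainly violated
    return None        # inconclusive

def interval_verdict(m1, m2, offset_bound, noise_bound):
    """Treat every sample as an independent interval of radius offset + noise."""
    r = offset_bound + noise_bound
    return _verdict((m1 - r) - (m2 + r), (m1 + r) - (m2 - r))

def shared_offset_verdict(m1, m2, offset_bound, noise_bound):
    """Exploit that the unknown offset is the same for both samples.

    x1 - x2 = (m1 - m2) - (e1 - e2): the offset cancels, so only the
    per-sample noise (each |e_i| <= noise_bound) widens the bounds.
    offset_bound is accepted for a uniform signature but is not needed.
    """
    diff = m1 - m2
    return _verdict(diff - 2 * noise_bound, diff + 2 * noise_bound)
```

With readings 10.0 and 9.0, an offset bound of 1.0, and a noise bound of 0.1, the per-sample interval check is inconclusive, whereas the shared-offset check concludes that the property holds.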

24 pages, 450 KiB  
Article
Computational Approaches for Grocery Home Delivery Services
by Christian Truden, Kerstin Maier, Anna Jellen and Philipp Hungerländer
Algorithms 2022, 15(4), 125; https://doi.org/10.3390/a15040125 - 9 Apr 2022
Cited by 4 | Viewed by 2590
Abstract
The steadily growing popularity of grocery home-delivery services is most likely based on the convenience experienced by its customers. However, the perishable nature of the products imposes certain requirements during the delivery process. The customer must be present when the delivery arrives so that the delivery process can be completed without interrupting the cold chain. Therefore, the grocery retailer and the customer must mutually agree on a time window during which the delivery can be guaranteed. This concept is referred to as the attended home delivery (AHD) problem in the scientific literature. The phase during which customers place orders, usually through a web service, constitutes the computationally most challenging part of the logistical processes behind such services. The system must determine potential delivery time windows that can be offered to incoming customers and incrementally build the delivery schedule as new orders are placed. Typically, the underlying optimization problem is a vehicle routing problem with time windows. This work is concerned with a case given by an international grocery retailer’s online shopping service. We present an analysis of several efficient solution methods that can be applied to AHD services. A framework for the operational planning tools required to tackle the order placement process is provided. The basic framework can also easily be adapted for many similar vehicle routing applications. We provide a comprehensive computational study comparing several algorithmic strategies, combining heuristics utilizing local search operations and mixed-integer linear programs, tackling the booking process. Finally, we analyze the scalability and suitability of the approaches.

17 pages, 1269 KiB  
Review
Point Cloud Upsampling Algorithm: A Systematic Review
by Yan Zhang, Wenhan Zhao, Bo Sun, Ying Zhang and Wen Wen
Algorithms 2022, 15(4), 124; https://doi.org/10.3390/a15040124 - 8 Apr 2022
Cited by 11 | Viewed by 5683
Abstract
Point cloud upsampling algorithms can improve the resolution of point clouds and generate dense and uniform point clouds, and are an important image processing technology. Significant progress has been made in point cloud upsampling research in recent years. This paper provides a comprehensive survey of point cloud upsampling algorithms. We classify existing point cloud upsampling algorithms into optimization-based methods and deep learning-based methods, and analyze the advantages and limitations of different algorithms from a modular perspective. In addition, we cover some other important issues such as public datasets and performance evaluation metrics. Finally, we conclude this survey by highlighting several future research directions and open issues that should be further addressed.

10 pages, 556 KiB  
Article
Cloud Computing in Free Route Airspace Research
by Peter Szabó, Miroslava Ferencová and Vladimír Železník
Algorithms 2022, 15(4), 123; https://doi.org/10.3390/a15040123 - 7 Apr 2022
Cited by 2 | Viewed by 2229
Abstract
We use technical documentation, data structures, data, and algorithms in our research. These objects support our work, but we cannot offer a unique citation for each object. This paper proposes a method (for citation and reference management) to cite such supportive resources using Cloud Computing. According to the method, the publication cites only one source in the Cloud, and this source contains the Cloud schema, which describes the Cloud infrastructure. When we make a citation using the Cloud schema, we can pinpoint a cited object exactly. The proposed method supports open research; all research (Cloud items) is freely available. To illustrate the method, we applied it in the case of free route airspace (FRA) modelling. FRA is a new concept of Air Traffic Management and it is also the subject of our research.
(This article belongs to the Special Issue Advances in Cloud and Edge Computing)

21 pages, 579 KiB  
Article
Neuroevolution for Parameter Adaptation in Differential Evolution
by Vladimir Stanovov, Shakhnaz Akhmedova and Eugene Semenkin
Algorithms 2022, 15(4), 122; https://doi.org/10.3390/a15040122 - 7 Apr 2022
Cited by 6 | Viewed by 2189
Abstract
Parameter adaptation is one of the key research fields in the area of evolutionary computation. In this study, the application of neuroevolution of augmented topologies to design efficient parameter adaptation techniques for differential evolution is considered. The artificial neural networks in this study are used for setting the scaling factor and crossover rate values based on the available information about the algorithm performance and previous successful values. The training is performed on a set of benchmark problems, and the testing and comparison are performed on several different benchmarks to evaluate the generalizing ability of the approach. The neuroevolution is enhanced with lexicase selection to handle the noisy fitness landscape of the benchmarking results. The experimental results show that it is possible to design efficient parameter adaptation techniques comparable to state-of-the-art methods, although such an automatic search for heuristics requires significant computational effort. The automatically designed solutions can be further analyzed to extract valuable knowledge about parameter adaptation.
(This article belongs to the Special Issue Metaheuristic Algorithms and Applications)
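The role of the scaling factor and crossover rate mentioned in the abstract above can be seen in a minimal differential evolution sketch (the classic rand/1/bin variant). The `controller` callable is a hypothetical stand-in for whatever supplies the two parameters; in the paper that role is played by an evolved neural network, whereas here a hand-written rule is assumed purely for illustration.

```python
import random

def rule_based_controller(success_rate):
    """Hypothetical stand-in for a learned controller: returns (F, CR)."""
    f = 0.5 + 0.3 * (1.0 - success_rate)   # explore more when few trials succeed
    cr = 0.9
    return f, cr

def differential_evolution(objective, dim, bounds, controller,
                           pop_size=20, generations=100, seed=0):
    """Minimise `objective` with DE/rand/1/bin; F and CR come from `controller`."""
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    fitness = [objective(x) for x in pop]
    success_rate = 0.0
    for _ in range(generations):
        new_pop, new_fitness, successes = [], [], 0
        for i in range(pop_size):
            f, cr = controller(success_rate)
            # Three distinct donors, all different from the target vector i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)     # at least one gene from the mutant
            trial = list(pop[i])
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    mutant = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial[d] = min(max(mutant, low), high)  # clip to bounds
            trial_fitness = objective(trial)
            if trial_fitness <= fitness[i]:  # greedy one-to-one selection
                successes += trial_fitness < fitness[i]
                new_pop.append(trial)
                new_fitness.append(trial_fitness)
            else:
                new_pop.append(pop[i])
                new_fitness.append(fitness[i])
        pop, fitness = new_pop, new_fitness
        success_rate = successes / pop_size  # feedback for the controller
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]

def sphere(x):
    """Simple convex test function with minimum 0 at the origin."""
    return sum(v * v for v in x)
```

A call such as `differential_evolution(sphere, 5, (-5.0, 5.0), rule_based_controller)` returns the best vector found and its objective value; swapping in a different `controller` changes only how F and CR are chosen, which is the part the paper automates.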

26 pages, 622 KiB  
Article
Combinatorial Integral Approximation Decompositions for Mixed-Integer Optimal Control
by Clemens Zeile, Tobias Weber and Sebastian Sager
Algorithms 2022, 15(4), 121; https://doi.org/10.3390/a15040121 - 31 Mar 2022
Cited by 3 | Viewed by 2334
Abstract
Solving mixed-integer nonlinear programs (MINLPs) is hard from both a theoretical and practical perspective. Decomposing the nonlinear and the integer part is promising from a computational point of view. In general, however, no bounds on the objective value gap can be established and iterative procedures with potentially many subproblems are necessary. The situation is different for mixed-integer optimal control problems with binary variables that switch over time. Here, a priori bounds were derived for a decomposition into one continuous nonlinear control problem and one mixed-integer linear program, the combinatorial integral approximation (CIA) problem. In this article, we generalize and extend the decomposition idea. First, we derive different decompositions and analyze the implied a priori bounds. Second, we propose several strategies to recombine promising candidate solutions for the binary control functions in the original problem. We present the extensions for ordinary differential equations-constrained problems. These extensions are transferable in a straightforward way, though, to recently suggested variants for certain partial differential equations, for algebraic equations, for additional combinatorial constraints, and for discrete time problems. We implemented all algorithms and subproblems in AMPL for a proof-of-concept study. Numerical results show the improvement compared to the standard CIA decomposition with respect to objective function value and compared to general-purpose MINLP solvers with respect to runtime.

22 pages, 52325 KiB  
Article
Multi-Level Fusion Model for Person Re-Identification by Attribute Awareness
by Shengyu Pei and Xiaoping Fan
Algorithms 2022, 15(4), 120; https://doi.org/10.3390/a15040120 - 30 Mar 2022
Viewed by 2150
Abstract
Existing person re-identification (Re-ID) methods usually suffer from poor generalization capability and over-fitting problems caused by insufficient training samples. We find that high-level attributes, semantic information, and part-based local information alignment are useful for person Re-ID networks. In this study, we propose a person re-identification network with part-based attribute-enhanced features. The model includes a multi-task learning module, local information alignment module, and global information learning module. The ResNet based on non-local and instance batch normalization (IBN) learns more discriminative feature representations. The multi-task module, local module, and global module are used in parallel for feature extraction. To better prevent over-fitting, the local information alignment module transforms pedestrian attitude alignment into local information alignment to assist in attribute recognition. Extensive experiments are carried out on the Market-1501 and DukeMTMC-reID datasets, and the results demonstrate that the method outperforms most current algorithms.

13 pages, 737 KiB  
Article
False Information Detection via Multimodal Feature Fusion and Multi-Classifier Hybrid Prediction
by Yi Liang, Turdi Tohti and Askar Hamdulla
Algorithms 2022, 15(4), 119; https://doi.org/10.3390/a15040119 - 29 Mar 2022
Cited by 5 | Viewed by 2545
Abstract
In the existing false information detection methods, the quality of the extracted single-modality features is low, the information between different modalities cannot be fully fused, and the original information will be lost when the information of different modalities is fused. This paper proposes [...] Read more.
In the existing false information detection methods, the quality of the extracted single-modality features is low, the information between different modalities cannot be fully fused, and the original information will be lost when the information of different modalities is fused. This paper proposes a false information detection method based on multimodal feature fusion and multi-classifier hybrid prediction. In this method, bidirectional encoder representations from transformers (BERT) are first used to extract the text features, and Swin Transformer is used to extract the picture features. Then, the trained deep autoencoder is used as an early fusion method of multimodal features to fuse text features and visual features, and the low-dimensional features are taken as the joint features of the modalities. The original features of each modality are concatenated into the joint features to reduce the loss of original information. Finally, the text features, image features and joint features are processed by three classifiers to obtain three probability distributions, and the three probability distributions are added proportionally to obtain the final prediction result. Compared with the attention-based multimodal factorized bilinear pooling, the model achieves 4.3% and 1.2% improvement in accuracy on the Weibo dataset and the Twitter dataset, respectively. The experimental results show that the proposed model can effectively integrate multimodal information and improve the accuracy of false information detection. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection)
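
The final step described in the abstract above, adding three probability distributions proportionally, can be illustrated with a minimal sketch. The function names and the weights are hypothetical; the abstract does not state the proportions or the implementation the authors used.

```python
def fuse_predictions(distributions, weights):
    """Combine per-classifier class probabilities by a weighted sum.

    distributions: one probability list per classifier (same class order).
    weights: one non-negative weight per classifier (hypothetical values;
             they are normalized here so the result stays a distribution).
    """
    if len(distributions) != len(weights):
        raise ValueError("need exactly one weight per distribution")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(distributions[0])
    fused = [0.0] * n_classes
    for dist, weight in zip(distributions, weights):
        if len(dist) != n_classes:
            raise ValueError("all distributions must cover the same classes")
        for i, p in enumerate(dist):
            fused[i] += (weight / total) * p
    return fused


def predict(distributions, weights):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_predictions(distributions, weights)
    return max(range(len(fused)), key=fused.__getitem__)
```

For example, calling `predict` with the text, image and joint-feature distributions and weights such as `[0.5, 0.25, 0.25]` returns the class index favoured by the weighted combination.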

13 pages, 826 KiB  
Article
Boosting Iris Recognition by Margin-Based Loss Functions
by Reihan Alinia Lat, Sebelan Danishvar, Hamed Heravi and Morad Danishvar
Algorithms 2022, 15(4), 118; https://doi.org/10.3390/a15040118 - 29 Mar 2022
Cited by 8 | Viewed by 3490
Abstract
In recent years, the topic of contactless biometric identification has gained considerable traction due to the COVID-19 pandemic. One of the most well-known identification technologies is iris recognition. Determining the classification threshold for large datasets of iris images remains challenging. To solve this issue, it is essential to extract more discriminatory features from iris images. Choosing the appropriate loss function to enhance discrimination power is one of the most significant factors in deep learning networks. This paper proposes a novel iris identification framework that integrates the light-weight MobileNet architecture with customized ArcFace and Triplet loss functions. By combining two loss functions, it is possible to improve the compactness within a class and the discrepancies between classes. To reduce the amount of preprocessing, the normalization step is omitted and segmented iris images are used directly. In contrast to the original SoftMax loss, the EER for the combined loss from ArcFace and Triplet is decreased from 1.11% to 0.45%, and the TPR is increased from 99.77% to 100%. In CASIA-Iris-Thousand, EER decreased from 4.8% to 1.87%, while TPR improved from 97.42% to 99.66%. Experiments have demonstrated that the proposed approach with customized loss using ArcFace and Triplet can significantly improve state-of-the-art and achieve outstanding results. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)

15 pages, 3712 KiB  
Article
Performance of Parallel K-Means Algorithms in Java
by Libero Nigro
Algorithms 2022, 15(4), 117; https://doi.org/10.3390/a15040117 - 29 Mar 2022
Cited by 11 | Viewed by 2915
Abstract
K-means is a well-known clustering algorithm often used for its simplicity and potential efficiency. Its properties and limitations have been investigated by many works reported in the literature. K-means, though, suffers from computational problems when dealing with large datasets with many dimensions and a great number of clusters. Therefore, many authors have proposed and experimented with different techniques for the parallel execution of K-means. This paper describes a novel approach to parallel K-means which, today, is based on commodity multicore machines with shared memory. Two reference implementations in Java are developed and their performances are compared. The first one is structured according to a map/reduce schema that leverages the built-in multi-threaded concurrency automatically provided by Java to parallel streams. The second one, allocated on the available cores, exploits the parallel programming model of the Theatre actor system, which is control-based, totally lock-free, and purposely relies on threads as coarse-grain “programming-in-the-large” units. The experimental results confirm that good execution performance can be achieved through the implicit and intuitive use of Java concurrency in parallel streams. However, better execution performance can be guaranteed by the modular Theatre implementation, which proves more adequate for exploiting the computational resources. Full article

17 pages, 414 KiB  
Article
A Multitask Learning Framework for Abuse Detection and Emotion Classification
by Yucheng Huang, Rui Song, Fausto Giunchiglia and Hao Xu
Algorithms 2022, 15(4), 116; https://doi.org/10.3390/a15040116 - 28 Mar 2022
Cited by 4 | Viewed by 2337
Abstract
The rapid development of online social media makes abuse detection a hot topic in the field of emotional computing. However, most natural language processing (NLP) methods only focus on linguistic features of posts and ignore the influence of users’ emotions. To tackle the problem, we propose a multitask framework combining abuse detection and emotion classification (MFAE) to expand the representation capability of the algorithm on the basis of the existing pretrained language model. Specifically, we use bidirectional encoder representation from transformers (BERT) as the encoder to generate sentence representation. Then, we used two different decoders for emotion classification and abuse detection, respectively. To further strengthen the influence of the emotion classification task on abuse detection, we propose a cross-attention (CA) component in the decoder, which further improves the learning effect of our multitask learning framework. Experimental results on five public datasets show that our method is superior to other state-of-the-art methods. Full article

0 pages, 1906 KiB  
Article
Deep Learning Study of an Electromagnetic Calorimeter
by Elihu Sela, Shan Huang and David Horn
Algorithms 2022, 15(4), 115; https://doi.org/10.3390/a15040115 - 28 Mar 2022
Cited by 3 | Viewed by 2081
Abstract
The accurate and precise extraction of information from a modern particle detector, such as an electromagnetic calorimeter, may be complicated and challenging. In order to overcome the difficulties, we process the simulated detector outputs using the deep-learning methodology. Our algorithmic approach makes use of a known network architecture, which has been modified to fit the problems at hand. The results are of high quality (biases of order 1 to 2%) and, moreover, indicate that most of the information may be derived from only a fraction of the detector. We conclude that such an analysis helps us understand the essential mechanism of the detector and should be performed as part of its design procedure. Full article

13 pages, 2486 KiB  
Article
EEG Pattern Classification of Picking and Coordination Using Anonymous Random Walks
by Inon Zuckerman, Dor Mizrahi and Ilan Laufer
Algorithms 2022, 15(4), 114; https://doi.org/10.3390/a15040114 - 26 Mar 2022
Cited by 5 | Viewed by 2084
Abstract
Tacit coordination games are games where players are trying to select the same solution without any communication between them. Various theories have attempted to predict behavior in tacit coordination games. Until now, research combining tacit coordination games with electrophysiological measures was mainly based on spectral analysis. In contrast, EEG coherence enables the examination of functional and morphological connections between brain regions. Hence, we aimed to differentiate between different cognitive conditions using coherence patterns. Specifically, we have designed a method that predicts the class label of coherence graph patterns extracted out of multi-channel EEG epochs taken from three conditions: a no-task condition and two cognitive tasks, picking and coordination. The classification process was based on a coherence graph extracted out of the EEG record. To assign each graph into its appropriate label, we have constructed a hierarchical classifier. First, we have distinguished between the resting-state condition and the other two cognitive tasks by using a bag of node degrees. Next, to distinguish between the two cognitive tasks, we have implemented an anonymous random walk. Our classification model achieved a total accuracy value of 96.55%. Full article
(This article belongs to the Special Issue Algorithms in Decision Support Systems Vol. 2)
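
The first stage of the hierarchical classifier above relies on a "bag of node degrees". A minimal sketch of such a feature is given below, assuming the coherence graph has already been reduced to an undirected simple graph given as an edge list (for instance by thresholding coherence values, which is an assumption here and not a detail stated in the abstract). The function name `degree_bag` is hypothetical.

```python
from collections import Counter


def degree_bag(edges, num_nodes):
    """Return a degree histogram for an undirected simple graph.

    edges: iterable of (u, v) pairs with 0 <= u, v < num_nodes, u != v.
    num_nodes: number of nodes (e.g. EEG channels).
    The result has one slot per possible degree 0 .. num_nodes - 1, so
    graphs over the same channel set yield vectors of equal length.
    """
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    histogram = Counter(degree)
    return [histogram.get(k, 0) for k in range(num_nodes)]
```

Because the histogram discards node identities, two graphs with the same degree distribution map to the same vector, which is what makes the representation a "bag".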

20 pages, 5171 KiB  
Article
An LSM-Tree Index for Spatial Data
by Junjun He and Huahui Chen
Algorithms 2022, 15(4), 113; https://doi.org/10.3390/a15040113 - 25 Mar 2022
Cited by 2 | Viewed by 5689
Abstract
An LSM-tree (log-structured merge-tree) is a hierarchical, orderly and disk-oriented data storage structure which makes full use of the characteristics of disk sequential writing, which are much better than those of random writing. However, an LSM-tree can only be queried by a key and cannot meet the needs of a spatial query. To improve the query efficiency of spatial data stored in LSM-trees, the traditional method is to introduce stand-alone tree-like secondary indexes, the problem with which is the read amplification brought about by dual index queries. Moreover, when more spatial data are stored, the index tree becomes increasingly large, bringing the problems of a lower query efficiency and a higher index update cost. To address the above problems, this paper proposes an ER-tree (embedded R-tree) index structure based on the orderliness of LSM-tree data. By building an SER-tree (embedded R-tree on an SSTable) index structure for each storage component, we optimised dual index queries into single index queries and organised SER-tree indexes into an ER-tree index with a binary linked list. The experiments showed that the query performance of the ER-tree index was effectively improved compared to that of stand-alone R-tree indexes. Full article
(This article belongs to the Section Databases and Data Structures)

2 pages, 146 KiB  
Editorial
Editorial Paper for the Special Issue “Algorithms in Hyperspectral Data Analysis”
by Raffaele Pizzolante
Algorithms 2022, 15(4), 112; https://doi.org/10.3390/a15040112 - 25 Mar 2022
Viewed by 1783
Abstract
This Special Issue contains four papers focused on hyperspectral data analysis [...] Full article
(This article belongs to the Special Issue Algorithms in Hyperspectral Data Analysis)
15 pages, 1758 KiB  
Communication
A Variable Step Size Normalized Least-Mean-Square Algorithm Based on Data Reuse
by Alexandru-George Rusu, Constantin Paleologu, Jacob Benesty and Silviu Ciochină
Algorithms 2022, 15(4), 111; https://doi.org/10.3390/a15040111 - 24 Mar 2022
Cited by 11 | Viewed by 3043
Abstract
The principal issue in acoustic echo cancellation (AEC) is to estimate the impulse response between the loudspeaker and microphone of a hands-free communication device. This application can be addressed as a system identification problem, which can be solved by using an adaptive filter. The most common one for AEC is the normalized least-mean-square (NLMS) algorithm. It is known that the overall performance of this algorithm is controlled by the value of its normalized step size parameter. In order to obtain a proper compromise between the main performance criteria (e.g., convergence rate/tracking versus accuracy/robustness), this specific term of the NLMS algorithm can be further controlled and designed as a variable parameter. This represents the main motivation behind the development of variable step size algorithms. In this paper, we propose a variable step size NLMS (VSS-NLMS) algorithm that exploits the data reuse mechanism, which aims to improve the convergence rate/tracking of the algorithm by reusing the same set of data (i.e., the input and reference signals) several times. Nevertheless, we involved an equivalent version of the data reuse NLMS, which provides the convergence modes of the algorithm. Based on this approach, a sequence of normalized step sizes can be a priori scheduled, which is advantageous in terms of the computational complexity. The simulation results in the context of AEC supported the good performance features of the proposed VSS-NLMS algorithm. Full article
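
The data reuse mechanism mentioned in the abstract above can be sketched as follows. This is a textbook-style illustration, not the authors' implementation: the function names are hypothetical, and the closed-form equivalent step size shown holds for the regularization-free case (`delta = 0`) with a nonzero input vector. The schedule of step sizes proposed in the paper may differ.

```python
def nlms_update(w, x, d, mu, delta=0.0):
    """One NLMS update; returns the new weights and the a priori error.

    w: current filter coefficients, x: input vector, d: reference sample,
    mu: normalized step size, delta: regularization (0 requires x != 0).
    """
    error = d - sum(wi * xi for wi, xi in zip(w, x))
    norm = sum(xi * xi for xi in x) + delta
    new_w = [wi + mu * error * xi / norm for wi, xi in zip(w, x)]
    return new_w, error


def data_reuse_nlms_update(w, x, d, mu, reuses, delta=0.0):
    """Apply the NLMS update `reuses` times on the same (x, d) pair."""
    for _ in range(reuses):
        w, _ = nlms_update(w, x, d, mu, delta)
    return w


def equivalent_step_size(mu, reuses):
    """Single-update step size matching `reuses` data-reuse updates.

    With delta = 0 each reuse scales the error by (1 - mu), so the
    accumulated correction equals one update with 1 - (1 - mu)**reuses.
    """
    return 1.0 - (1.0 - mu) ** reuses
```

Since the equivalent step size depends only on `mu` and the number of reuses, it can be computed ahead of time, which matches the abstract's remark that a sequence of normalized step sizes can be scheduled a priori.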

14 pages, 3066 KiB  
Article
Numerical Simulation of Micro-Bubbles Dispersion by Surface Waves
by Oleg A. Druzhinin and Wu-Ting Tsai
Algorithms 2022, 15(4), 110; https://doi.org/10.3390/a15040110 - 24 Mar 2022
Cited by 4 | Viewed by 2078
Abstract
This paper presents an algorithm for numerical modeling of bubble dispersion occurring in the near-surface layer of the upper ocean under the action of non-breaking two-dimensional (2D) surface waves. The algorithm is based on a Eulerian-Lagrangian approach where full, 3D Navier-Stokes equations for the carrier flow induced by a waved water surface are solved in a Eulerian frame, and the trajectories of individual bubbles are simultaneously tracked in a Lagrangian frame, taking into account the impact of the bubbles on the carrier flow. Bubble diameters are considered in the range from 200 to 400 microns (thus, micro-bubbles), and the effects related to bubble deformation and dissolution in water are neglected. The algorithm allows evaluation of the instantaneous as well as statistically stationary, phase-averaged profiles of the carrier-flow turbulence, bubble concentration (void fraction) and void-fraction fluxes for different flow regimes, both with and without wind-induced surface drift. The simulation results show that bubbles are capable of enhancing the carrier-flow turbulence, as compared to the bubble-free flow, and that the vertical water velocity fluctuations are mostly augmented, and increasingly so by larger bubbles. The results also show that the bubble dynamics are governed by buoyancy, the surrounding fluid acceleration force and the drag force, whereas the impact of the lift force remains negligible. Full article

22 pages, 977 KiB  
Article
Skeptical Learning—An Algorithm and a Platform for Dealing with Mislabeling in Personal Context Recognition
by Wanyi Zhang, Mattia Zeni, Andrea Passerini and Fausto Giunchiglia
Algorithms 2022, 15(4), 109; https://doi.org/10.3390/a15040109 - 24 Mar 2022
Viewed by 2265
Abstract
Mobile Crowd Sensing (MCS) is a novel IoT paradigm where sensor data, as collected by the user’s mobile devices, are integrated with user-generated content, e.g., annotations, self-reports, or images. While providing many advantages, the human involvement also brings big challenges, where the most critical is possibly the poor quality of human-provided content, most often due to the inaccurate input from non-expert users. In this paper, we propose Skeptical Learning, an interactive machine learning algorithm where the machine checks the quality of the user feedback and tries to fix it when a problem arises. In this context, the user feedback consists of answers to machine-generated questions, at times defined by the machine. The main idea is to integrate three core elements, which are (i) sensor data, (ii) user answers, and (iii) existing prior knowledge of the world, and to enable a second round of validation with the user any time these three types of information jointly generate an inconsistency. The proposed solution is evaluated in a project focusing on a university student life scenario. The main goal of the project is to recognize the locations and transportation modes of the students. The results highlight an unexpectedly high pervasiveness of user mistakes in the university student life project. The results also show the advantages provided by Skeptical Learning in dealing with the mislabeling issues in an interactive way and improving the prediction performance. Full article

18 pages, 602 KiB  
Article
Trinity: Neural Network Adaptive Distributed Parallel Training Method Based on Reinforcement Learning
by Yan Zeng, Jiyang Wu, Jilin Zhang, Yongjian Ren and Yunquan Zhang
Algorithms 2022, 15(4), 108; https://doi.org/10.3390/a15040108 - 24 Mar 2022
Cited by 1 | Viewed by 2588
Abstract
Deep learning, with increasingly large datasets and complex neural networks, is widely used in computer vision and natural language processing. A resulting trend is to split and train large-scale neural network models across multiple devices in parallel, known as parallel model training. Existing parallel methods are mainly based on expert design, which is inefficient and requires specialized knowledge. Although automatically implemented parallel methods have been proposed to solve these problems, these methods only consider a single optimization aspect of run time. In this paper, we present Trinity, an adaptive distributed parallel training method based on reinforcement learning, to automate the search and tuning of parallel strategies. We build a multidimensional performance evaluation model and use proximal policy optimization to co-optimize multiple optimization aspects. Our experiment used the CIFAR10 and PTB datasets based on InceptionV3, NMT, NASNet and PNASNet models. Compared with Google’s Hierarchical method, Trinity achieves up to 5% reductions in runtime, communication, and memory overhead, and up to a 40% increase in parallel strategy search speeds. Full article
(This article belongs to the Special Issue Performance Optimization and Performance Evaluation)

16 pages, 8774 KiB  
Article
KMC3 and CHTKC: Best Scenarios, Deficiencies, and Challenges in High-Throughput Sequencing Data Analysis
by Deyou Tang, Daqiang Tan, Weihao Xiao, Jiabin Lin and Juan Fu
Algorithms 2022, 15(4), 107; https://doi.org/10.3390/a15040107 - 24 Mar 2022
Viewed by 2301
Abstract
Background: K-mer frequency counting is an upstream process of many bioinformatics data analysis workflows. KMC3 and CHTKC are the representative partition-based k-mer counting and non-partition-based k-mer counting algorithms, respectively. This paper evaluates the two algorithms and presents their best applicable scenarios and potential improvements using multiple hardware contexts and datasets. Results: KMC3 uses less memory and runs faster than CHTKC on a regular configuration server. CHTKC is efficient on high-performance computing platforms with high available memory, multi-thread, and low IO bandwidth. When tested with various datasets, KMC3 is less sensitive to the number of distinct k-mers and is more efficient for tasks with relatively low sequencing quality and long k-mer. CHTKC performs better than KMC3 in counting assignments with large-scale datasets, high sequencing quality, and short k-mer. Both algorithms are affected by IO bandwidth, and decreasing the influence of the IO bottleneck is critical as our tests show improvement by filtering and compressing consecutive first-occurring k-mers in KMC3. Conclusions: KMC3 is more competitive for running counter on ordinary hardware resources, and CHTKC is more competitive for counting k-mers in super-scale datasets on higher-performance computing platforms. Reducing the influence of the IO bottleneck is essential for optimizing the k-mer counting algorithm, and filtering and compressing low-frequency k-mers is critical in relieving IO impact. Full article
(This article belongs to the Special Issue Performance Optimization and Performance Evaluation)
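
For readers unfamiliar with the task both tools address, k-mer frequency counting can be sketched in a few lines. This is a naive in-memory illustration, not how KMC3 or CHTKC work internally: it does no partitioning, no disk usage and no canonical (reverse-complement) merging, and the function name `count_kmers` is hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across a list of reads.

    reads: iterable of DNA strings; k: k-mer length (positive integer).
    K-mers containing the ambiguous base 'N' are skipped, which is an
    assumption of this sketch rather than a rule taken from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    for read in reads:
        read = read.upper()
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts
```

The number of distinct keys in the returned counter grows with dataset size and sequencing errors, which is why memory use and IO bandwidth dominate the comparison reported in the abstract.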

19 pages, 2867 KiB  
Article
Forecast of Medical Costs in Health Companies Using Models Based on Advanced Analytics
by Daniel Ricardo Sandoval Serrano, Juan Carlos Rincón, Julián Mejía-Restrepo, Edward Rolando Núñez-Valdez and Vicente García-Díaz
Algorithms 2022, 15(4), 106; https://doi.org/10.3390/a15040106 - 23 Mar 2022
Viewed by 2746
Abstract
Forecasting medical costs is crucial for planning, budgeting, and efficient decision making in the health industry. This paper introduces a proposal to forecast costs through techniques such as a standard model of long short-term memory (LSTM); and patient grouping through k-means clustering in the Keralty group, one of Colombia’s leading healthcare companies. It is important to highlight its implications for the prediction of cost time series in the health sector from a retrospective analysis of the information of services invoiced to health companies. It starts with the selection of sociodemographic variables related to the patient, such as age, gender and marital status, and it is complemented with health variables such as patient comorbidities (cohorts) and induced variables, such as service provision frequency and time elapsed since the last consultation (hereafter referred to as “recency”). Our results suggest that greater accuracy can be achieved by first clustering and then using LSTM networks. This implies that a correct segmentation of the population according to the usage of services represented in costs must be performed beforehand. Through the analysis, a cost projection from 1 to 3 months can be conducted, allowing a comparison with historical data. The reliability of the model is validated by different metrics such as RMSE and Adjusted R2. Overall, this study is intended to be useful for healthcare managers in developing a strategy for medical cost forecasting. We conclude that the use of analytical tools allows the organization to make informed decisions and to develop strategies for optimizing resources with the identified population. Full article
(This article belongs to the Special Issue Algorithms in Decision Support Systems Vol. 2)

13 pages, 323 KiB  
Article
Analyzing Markov Boundary Discovery Algorithms in Ideal Conditions Using the d-Separation Criterion
by Camil Băncioiu and Remus Brad
Algorithms 2022, 15(4), 105; https://doi.org/10.3390/a15040105 - 23 Mar 2022
Viewed by 2178
Abstract
This article proposes the usage of the d-separation criterion in Markov Boundary Discovery algorithms, instead of or alongside the statistical tests of conditional independence these algorithms usually rely on. This is a methodological improvement applicable when designing, studying or improving such algorithms, but it is not applicable for productive use, because computing the d-separation criterion requires complete knowledge of a Bayesian network. Yet Bayesian networks can be made available to the algorithms when studied in controlled conditions. This approach has the effect of removing sources of suboptimal behavior, allowing the algorithms to perform at their theoretical best and providing insights about their properties. The article also discusses an extension of this approach, namely to use d-separation as a complement to the usual statistical tests performed on synthetic datasets in order to ascertain the overall accuracy of the tests chosen by the algorithms, for further insights into their behavior. To exemplify these two approaches, two Markov Boundary Discovery algorithms were used, namely the Incremental Association Markov Blanket algorithm and the Iterative Parent–Child-Based Search of Markov Blanket algorithm. Firstly, these algorithms were configured to use d-separation alone as their conditional independence test, computed on known Bayesian networks. Subsequently, the algorithms were configured to use the statistical G-test complemented by d-separation to evaluate their behavior on synthetic data. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
