Special Issue: Regularization Techniques for Machine Learning and Their Applications

Abstract: Over the last decade, learning theory has made significant progress in the development of sophisticated algorithms and their theoretical foundations. The theory builds on concepts that exploit ideas and methodologies from mathematical areas such as optimization theory. Regularization is probably the key to addressing the challenging problem of overfitting, which usually occurs in high-dimensional learning. Its primary goal is to make the machine learning algorithm "learn" and not "memorize" by penalizing model complexity, thereby reducing the generalization error and the risk of overfitting. As a result, the variance of the model is significantly reduced, without a substantial increase in its bias and without losing any important properties in the data.


Introduction
This article is the editorial of the "Regularization Techniques for Machine Learning and Their Applications" Special Issue of the journal Electronics. Over the last decades, research on regularization methods has grown significantly, driven by the need to develop more accurate and reliable prediction models. The main objective of this Special Issue is to present recent advances related to all kinds of regularization techniques, methodologies, and algorithms, and to investigate the impact of their application on a variety of hard real-world benchmarks. The response of the scientific community has been considerable, as many original research papers were submitted for consideration. Among the ten (10) originally submitted papers, six (6) were finally accepted as full papers (acceptance ratio = 60%), after going through a careful blind peer-review process based on novelty and quality criteria. All accepted papers make a significant contribution, highlight interesting regularization-based approaches, and cover a variety of multidisciplinary application domains.

Regularization Techniques for Machine Learning and Their Applications
The first paper is authored by Qiang et al. [1], entitled sDeepFM: Multi-Scale Stacking Feature Interactions for Click-Through Rate Prediction. In this work, the authors studied the problem of forecasting the click-through rate of advertisements and highlighted the difficulties of this task. More specifically, based on their research, they stated that the features built for training a prediction model are relatively simple, that in some cases they cannot be constructed automatically, and that high-order combination features are considerably difficult to learn under sparse data. In order to address these difficulties, the authors proposed a novel structure, multi-scale stacking pooling (MSSP), for developing multi-scale features based on various receptive fields. For evaluation purposes, the authors combined the proposed MSSP with a classical deep neural network and formed a unified framework called sDeepFM. The reported numerical experiments on two real-world datasets demonstrated that the proposed sDeepFM framework outperformed state-of-the-art prediction models.
The second paper is entitled An Advanced Pruning Method in the Architecture of Extreme Learning Machines Using L1-Regularization and Bootstrapping, and it is authored by Campos Souza et al. [2]. The authors proposed a new regularization method, named Pruning ELM Using Bootstrapped Lasso (BR-ELM), for improving the performance of an Extreme Learning Machine (ELM) and reducing overfitting. The proposed method is based on regression and re-sampling techniques for pruning the architecture of an ELM and selecting the neurons most relevant to the output of the model. More specifically, it is based on the Bolasso method [3], an ensemble variant of Lasso, and focuses on shrinking the output weight parameters of the neurons to zero. Then, through the use of a subset of candidate regressors, a number of neurons in the hidden layer of the ELM are rejected. Finally, the authors presented a broad experimental analysis evaluating the proposed method against related state-of-the-art methods on four (4) synthetic and seventeen (17) real-world benchmarks. The reported results, as well as the conducted statistical analysis, provide empirical evidence of the efficiency of the proposed approach.
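The pruning idea summarized above can be illustrated with a minimal sketch. It is not the implementation of Campos Souza et al. [2]: the function names (`lasso_cd`, `bolasso_prune`), the penalty `lam`, the `consensus` threshold, the coordinate-descent solver and the synthetic data are all assumptions made for illustration. The sketch fits a Lasso on several bootstrap replicates of the hidden-layer outputs, keeps only the neurons whose output weight is non-zero in every replicate, and refits the output weights on the surviving neurons.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(H, y, lam, n_iter=200):
    """Minimise (1/2n)||y - H b||^2 + lam * ||b||_1 by coordinate descent."""
    H = np.asarray(H, dtype=float)
    n, p = H.shape
    beta = np.zeros(p)
    col_sq = (H ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()  # equals y - H @ beta for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # correlation of column j with the residual that ignores column j
            rho = H[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += H[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def bolasso_prune(H, y, lam, n_boot=16, consensus=1.0, seed=0):
    """Return indices of neurons selected in at least `consensus` of the replicates."""
    rng = np.random.default_rng(seed)
    n, p = H.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # bootstrap sample with replacement
        counts += lasso_cd(H[idx], y[idx], lam) != 0.0
    return np.flatnonzero(counts >= consensus * n_boot)

# Synthetic ELM (assumed setup): random, untrained hidden layer with tanh activation.
rng = np.random.default_rng(42)
n, d, p = 200, 20, 12
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, p)) / np.sqrt(d)
b = rng.normal(scale=0.5, size=p)
H = np.tanh(X @ W + b)

# By construction, only hidden neurons 1 and 4 influence the target.
y = 2.0 * H[:, 1] - 3.0 * H[:, 4] + 0.01 * rng.normal(size=n)

kept = bolasso_prune(H, y, lam=0.05)
# Refit the output weights of the pruned network by ordinary least squares.
beta_kept, *_ = np.linalg.lstsq(H[:, kept], y, rcond=None)
print("neurons kept:", kept)
```

Setting `consensus` below 1.0 would relax the strict intersection of supports to a majority-style vote, which is a design choice of this sketch rather than something stated in the editorial.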
The third paper is authored by Livieris et al. [4], entitled An advanced CNN-LSTM model for cryptocurrency forecasting. This paper presents an innovative multiple-input deep neural network model for forecasting cryptocurrency price and movement. The motivation behind the proposed work was to develop a prediction model that is able to independently exploit information from each cryptocurrency and then process the extracted information to achieve reliable and accurate predictions. Additionally, the suitability of the cryptocurrency data was ensured by a novel strategy presented in [5], which guarantees that the stationarity property is enforced. A comprehensive experimental evaluation was conducted utilizing data from the three cryptocurrencies with the highest market capitalization, i.e., Ripple (XRP), Ethereum (ETH) and Bitcoin (BTC). The presented experimental analysis highlighted that the proposed model is able to efficiently exploit mixed cryptocurrency data and reduces overfitting in comparison with traditional state-of-the-art deep neural networks.
The fourth paper is entitled Application of Deep Neural Network to the Reconstruction of Two-Phase Material Imaging by Capacitively Coupled Electrical Resistance Tomography, and it is authored by Chen et al. [6]. In this work, the authors proposed a new framework for image reconstruction in capacitively coupled electrical resistance tomography (CCERT) industrial applications. Initially, each 2D monochrome 2500-pixel sensor image was divided into 625 clusters, and then each cluster was fed to a convolutional neural network (CNN) to address the resulting 16-class problem. The authors stated that the CNN classifier was adopted because the assumption of binary materials provides an inherent regularization. The quality of the reconstructed images produced by the proposed CNN was evaluated utilizing simulated data and compared with the corresponding images produced by a traditional reconstruction algorithm. The presented numerical experiments and analysis provide empirical evidence of the efficiency of the proposed approach. Finally, the authors highlighted that the proposed approach possesses the same advantages as the traditional system, such as non-invasiveness, simplicity, rapid response, no radiation and low cost; however, the proposed system is able to achieve a higher image quality due to the extended frequency range.
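The figures quoted above are mutually consistent: 2500 pixels split into 625 clusters gives 4 pixels per cluster, and 4 binary pixels admit 2^4 = 16 patterns. The sketch below makes this arithmetic concrete. The editorial does not describe how the clusters are laid out, so the 50 × 50 image shape, the 2 × 2 cluster geometry, the bit ordering and the function names are all assumptions for illustration, not the encoding used by Chen et al. [6].

```python
import numpy as np

def image_to_cluster_labels(img, block=2):
    """Map a binary image to one integer class label per block x block cluster."""
    img = np.asarray(img, dtype=np.int64)
    h, w = img.shape
    if h % block or w % block:
        raise ValueError("image sides must be multiples of the block size")
    # Split into tiles: (rows of tiles, cols of tiles, pixels per tile).
    tiles = img.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    bits = tiles.reshape(h // block, w // block, block * block)
    # Read the pixels of a tile as a binary number (assumed ordering: row-major).
    weights = 2 ** np.arange(block * block - 1, -1, -1)
    return bits @ weights

def cluster_labels_to_image(labels, block=2):
    """Inverse mapping: expand each class label back into its pixel pattern."""
    labels = np.asarray(labels, dtype=np.int64)
    rows, cols = labels.shape
    shifts = np.arange(block * block - 1, -1, -1)
    bits = (labels[..., None] >> shifts) & 1
    tiles = bits.reshape(rows, cols, block, block).transpose(0, 2, 1, 3)
    return tiles.reshape(rows * block, cols * block)

# Synthetic two-phase image (assumed 50 x 50 = 2500 pixels, values 0 or 1).
rng = np.random.default_rng(0)
img = rng.integers(0, 2, size=(50, 50))

labels = image_to_cluster_labels(img)
print(labels.shape, labels.size, int(labels.min()), int(labels.max()))
```

Under these assumptions, a classifier that predicts one of 16 labels per cluster can only ever output a binary pixel pattern, which is one way to read the "inherent regularization" attributed to the binary-material assumption.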
The fifth paper is entitled A Regularization-Based Big Data Framework for Winter Precipitation Forecasting on Streaming Data, and it is authored by Kanavos et al. [7]. In this work, the authors proposed an approach for forecasting qualitative weather information on winter precipitation types, based on an Apache Spark Streaming distributed framework. Initially, the real-time data from different sensors are processed and then utilized as inputs to a weather prediction model, which focuses on forecasting the weather type given three precipitation classes: snow, freezing rain and rain. For the development of the prediction model, a large variety of machine learning algorithms (Bayesian, decision trees, and meta/ensemble methods) were fitted and evaluated. Additionally, in order to increase the prediction performance, a regularization technique for feature selection was utilized to reduce overfitting. The reported experimental results demonstrated that the OzaBag algorithm achieved the best performance, while the use of the regularization technique was able to boost the forecasting accuracy of all evaluated models.
The sixth paper is authored by Tucci et al. [8], entitled A Regularized Procedure to Generate a Deep Learning Model for Topology Optimization of Electromagnetic Devices. The authors developed a new regularization framework based on a variational autoencoder (VAE) for topology optimization of electromagnetic devices. Additionally, a deep neural network was utilized as a Surrogate Model (SM) for accelerating the resolution of single trial cases. Both the VAE and the SM were trained in a multi-model custom training loop in which the losses of both models are minimized simultaneously. An advantage of the proposed approach is the transformation of the constrained optimization problem into an unconstrained one. In their research, the authors considered the TEAM25 problem, which consists of the optimization of the geometry of an electromagnetic die press. Based on their preliminary numerical experiments, the authors stated that the VAE is able to efficiently regularize the resolution process, which resulted in a considerable improvement in the performance of the resolution process as well as in the quality of the final solution.

Conclusions and Future Approaches
The rationale and motivation of this Special Issue were to make a modest and timely contribution to the existing literature. It is hoped that the approaches presented in this Special Issue will be found constructive and interesting, and will be recognized by the international industrial and scientific community. Our objective and expectation is that researchers will be inspired by the presented innovative strategies, will advance research in various multidisciplinary domains, and will stimulate further research in the domain of artificial intelligence in general. Future approaches may involve exploiting regularization techniques and methodologies for further improving prediction accuracy and enhancing the reliability of prediction models.