Special Issue on Signal Processing and Machine Learning for Biomedical Data

Introduction
This Special Issue is focused on advanced techniques in signal processing, analysis, modelling, and classification, applied to a variety of medical diagnostic problems. Biomedical data play a fundamental role in many fields of research and clinical practice. Very often, the complexity and large volume of these data make it necessary to develop advanced analysis techniques and systems. Furthermore, the introduction of new techniques and methodologies for diagnostic purposes, especially in the field of medical imaging, requires new signal processing and machine learning methods. Recent progress in machine learning, and in particular deep learning, has revolutionized artificial vision, significantly pushing the state of the art across a wide range of high-level tasks. Such progress can help address problems in the analysis of biomedical data.
This Special Issue placed particular emphasis on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis. The works that make up this Special Issue show a remarkable variety of applications for the detection and classification of medical imaging problems. In particular, these works can be divided, on the basis of the techniques used, into three categories: signal processing (SP) methods, traditional machine learning (ML) methods, and deep learning (DL) methods.

Materials and Methods
Signal processing represents a powerful analytical framework for medical imaging analysis and for the development of efficient detection and classification algorithms. In this Special Issue, different signal processing methods were applied to model and process biomedical signals in different contexts.
De Pedro-Carracedo et al. [1] dealt with phase space reconstruction from a single scalar time series and applied the method to photoplethysmographic signals. Their work describes the most common methodologies for computing the phase space reconstruction, focusing mainly on how to correctly determine the reconstruction parameters.
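The core operation in [1], delay-coordinate (Takens) embedding, can be sketched in a few lines. The delay tau and embedding dimension m below are illustrative values, not the estimation procedures the paper discusses:

```python
# Minimal sketch of delay-coordinate (Takens) phase space reconstruction.
# tau (delay) and m (embedding dimension) are the reconstruction
# parameters that [1] focuses on determining correctly; the values used
# here are arbitrary illustrations.
def delay_embed(x, m, tau):
    """Return m-dimensional delay vectors built from the scalar series x."""
    n = len(x) - (m - 1) * tau
    return [tuple(x[i + j * tau] for j in range(m)) for i in range(n)]

signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 0.5]
points = delay_embed(signal, m=3, tau=2)
# Each point is (x[i], x[i+2], x[i+4]); 10 - (3-1)*2 = 6 points result.
```

Each tuple is one point of the reconstructed trajectory; plotting them reveals the attractor geometry that the scalar series alone hides.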
The authors of [2] explored the application of a new metric to assess functional synchronization mechanisms in the brain. The analysis was conducted on functional magnetic resonance imaging (fMRI) data, including ten resting-state baseline fMRI sessions of two subjects from the Hangzhou Normal University (HNU) cohort of the Consortium for Reliability and Reproducibility (CoRR). Three synchronization connectivity patterns were compared, derived from the SYNC metric, statistical Pearson correlation, and spectral coherence functional connectivity. The results suggest that the statistical correlation and coherence networks show more evenly distributed synchronization patterns of comparable size across the brain, while the SYNC networks exhibit more granular partitions, highlighting more varied synchronization patterns.
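One of the three connectivity patterns compared in [2], Pearson correlation connectivity, can be sketched as follows. The region count, time-series length, and the 0.5 threshold are assumptions for illustration only:

```python
import numpy as np

# Sketch of a functional-connectivity matrix from Pearson correlation,
# one of the three patterns compared in [2] (SYNC and spectral coherence
# are the others). Rows = regions, columns = time points; all sizes and
# the edge threshold are illustrative assumptions.
rng = np.random.default_rng(0)
ts = rng.standard_normal((4, 100))              # 4 regions, 100 time points
ts[1] = ts[0] + 0.1 * rng.standard_normal(100)  # regions 0 and 1 synchronize

conn = np.corrcoef(ts)                          # 4x4 Pearson connectivity
# Threshold to obtain a binary synchronization network (no self-loops):
network = (np.abs(conn) > 0.5) & ~np.eye(4, dtype=bool)
```

The thresholded `network` is the kind of graph whose partition structure the paper compares across the three metrics.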
Tu et al. [3] proposed a fast reconstruction algorithm called self-learning subspace matrix factorization (SLSMF), applied to non-uniformly sampled multidimensional nuclear magnetic resonance (NMR) spectroscopy signals. Results on synthetic and realistic NMR data show that, compared to SLS, the fast SLS approach remarkably reduced computation time without sacrificing spectrum quality, and enabled faster reconstruction with parallel computing.
In [4], the authors developed a system to classify Anomic and Wernicke's aphasia based on the acoustic frequencies of speech signals. The system consisted of three diagnosis components: confrontation naming with 30 pictures to identify, repetition of 15 single words, and comprehension components. The latter consisted of single-word comprehension and simple-command comprehension. The evaluation was conducted on a total of 60 participants: 18 patients with Anomic aphasia, 12 with Wernicke's aphasia, and 30 non-aphasic controls.
Babič et al. [5] described a novel method for discriminating between lung cancer and non-cancer DNA sequences. The method draws on network and graph theory, fractal geometry, and statistical pattern recognition to define the prognostic value of HIF-1 expression in surgically treated lung cancer patients. The DNA nucleotides are represented as a path in a graph, and fractal geometry is then applied to measure the complexity of the graph. Comparison of the statistical and topological features extracted from non-cancerous and cancerous DNA sequences showed that the fractal dimension decreased in the lung cancer network, while its topological properties increased.
In [6], Maestre-Rendon et al. developed a smartphone application to monitor patients with cardiovascular disease. Using the camera of the mobile device, and without direct contact with the patient, the pulse rate in pulses per minute (PPM) was obtained. In particular, the authors showed that analyzing variations in skin color that are imperceptible to the eye allows estimation of the heart rate.
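The remote-photoplethysmography idea in [6] can be sketched as follows: the per-frame mean of a color channel varies slightly with the cardiac pulse, so its dominant frequency gives the heart rate. The frame rate, the synthesized 1.2 Hz pulse, and the use of the green channel are assumptions, not details from the paper:

```python
import numpy as np

# Sketch of camera-based heart-rate estimation as in [6]. The "camera"
# signal is synthesized here; fps, the green channel, and the 1.2 Hz
# pulse are illustrative assumptions.
fps = 30.0
t = np.arange(10 * int(fps)) / fps                    # 10 s of frames
green_mean = 100 + 0.5 * np.sin(2 * np.pi * 1.2 * t)  # 1.2 Hz = 72 bpm pulse

x = green_mean - green_mean.mean()                    # remove the DC level
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1 / fps)
bpm = 60 * freqs[np.argmax(spectrum)]                 # dominant frequency in bpm
```

A real implementation would average the channel over a skin region of interest and band-pass the signal to plausible cardiac frequencies before the spectral peak search.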
The authors of [7] addressed the problem of automatic detection of myocardial infarction (MI) and, for this purpose, proposed a bidirectional long short-term memory (Bi-LSTM) network. A heartbeat-attention mechanism was proposed to automatically weight the differences between heartbeats based on 12-lead ECG records. The database used was the Physikalisch-Technische Bundesanstalt (PTB) diagnostic ECG database, with 549 ECG records from 290 patients. The results showed an accuracy of about 95%.
Shu et al. [8] proposed a non-iterative algorithm to simultaneously estimate the optimal rigid registrations for serial section images of biological tissues, computing all the transformations in a short time. The method was tested on 336 microscopic images of serial sections of a zebrafish acquired by scanning electron microscopy.
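As a generic illustration of closed-form (non-iterative) rigid alignment, the classic Kabsch/Procrustes solution for a single pair of 2-D point sets is sketched below. This is not the specific simultaneous serial-section algorithm of [8], only the kind of closed-form building block such methods rely on:

```python
import numpy as np

# Closed-form rigid registration of two 2-D point sets (Kabsch /
# Procrustes). A generic illustration of non-iterative rigid alignment,
# not the simultaneous multi-section algorithm of [8].
def rigid_register(P, Q):
    """Return rotation R and translation t such that Q ≈ P @ R.T + t."""
    cp, cq = P.mean(0), Q.mean(0)
    H = (P - cp).T @ (Q - cq)                  # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1, d]) @ U.T
    t = cq - R @ cp
    return R, t

theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
Q = P @ R_true.T + np.array([2.0, -1.0])       # rotated + translated copy
R, t = rigid_register(P, Q)                    # recovers R_true and (2, -1)
```

Because the rotation comes from a single SVD, the estimate needs no iteration and no initial guess, which is what makes such registrations fast.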

Methods Based on Traditional Machine Learning
Traditional machine learning makes many techniques available for the analysis of biomedical images, with many families of classifiers and training methods to choose from. The ML techniques presented in this Special Issue were therefore varied.
The authors of [9], in their paper on Automatic Segmentation and Classification of Heart Sounds Using Modified Empirical Wavelet Transform and Power Features, proposed a system that segments and classifies the systolic and diastolic intervals of phonocardiogram (PCG) signals of heart sounds. The classification is binary and discriminates between normal and abnormal heart sounds. The authors performed feature extraction based on the power values in the systolic and diastolic intervals and trained four classifiers: SVM, KNN, random forest, and MLP. The best accuracy, 99.26%, was obtained with the KNN classifier.
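The power features of [9] reduce each labelled interval to its average signal power. The toy PCG samples and the segment boundaries below are hand-made illustrations, not the paper's segmentation output:

```python
# Sketch of the power-based features in [9]: after segmentation, each
# labelled interval of the PCG signal is summarized by its mean power.
# The samples and segment boundaries are illustrative assumptions.
def interval_power(signal, start, end):
    """Mean power of signal[start:end]."""
    seg = signal[start:end]
    return sum(s * s for s in seg) / len(seg)

pcg = [0.0, 0.8, -0.9, 0.7, 0.0, 0.1, -0.1, 0.05, 0.0, 0.02]
# Hypothetical segmentation: samples 0-4 systolic, 5-9 diastolic.
features = {
    "systolic_power": interval_power(pcg, 0, 5),
    "diastolic_power": interval_power(pcg, 5, 10),
}
# The louder heart sounds make the systolic interval more energetic here.
```

Feature vectors of this kind, one pair per cardiac cycle, are what the SVM, KNN, random forest, and MLP classifiers are trained on.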
In [10], the authors proposed Net-Net AutoML models for the selection of ANNs for the prediction of Brain Connectome Networks (BCNs). To predict BCN node connectivity, the Net-Net AutoML approach evaluates other ANNs trained to predict BCN connectivity. The following twelve machine learning classifiers were tested: KNN, LDA, GBN, SVM, LogR, MLP, DT, RF, XGB, GB, AdaBoost, and Bagging. The performance was expressed in terms of AUC (area under the ROC curve), using a 10-fold cross-validation strategy, and the best classification was achieved with the random forest classifier.
Fanizzi et al. [11] addressed the identification of clustered microcalcifications in digital mammography. The public BCDR-DM database was used; in particular, 104 digital mammograms containing microcalcifications. The SURF and MinEigenAlg algorithms were used to extract 96 regions of interest (ROIs), 56 benign and 40 malignant, from the mammograms. Each ROI underwent a multiscale decomposition process based on the Haar wavelet transform and the gray-level co-occurrence matrix. The classification of microcalcification clusters into benign and malignant was carried out with the random forest classifier, obtaining an AUC of 94.19%.
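One level of the 2-D Haar decomposition used in [11] to build the multiscale ROI representation can be sketched with simple averaging and differencing. The 4x4 "ROI" is a toy example:

```python
import numpy as np

# One level of 2-D Haar wavelet decomposition, as used in [11] to build
# a multiscale representation of each ROI. The 4x4 image is a toy
# stand-in for a mammographic ROI.
def haar2d(img):
    """Return the approximation (LL) and detail (LH, HL, HH) sub-bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2   # row details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2
    LH = (a[:, 0::2] - a[:, 1::2]) / 2
    HL = (d[:, 0::2] + d[:, 1::2]) / 2
    HH = (d[:, 0::2] - d[:, 1::2]) / 2
    return LL, LH, HL, HH

roi = np.array([[4., 4., 0., 0.],
                [4., 4., 0., 0.],
                [0., 0., 4., 4.],
                [0., 0., 4., 4.]])
LL, LH, HL, HH = haar2d(roi)
# LL is a half-resolution version of roi; the detail bands are zero for
# this piecewise-constant pattern.
```

Applying the same step recursively to LL yields the multiscale pyramid from which texture features (e.g., the gray-level co-occurrence matrix) can be computed at each scale.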
In [12], the authors addressed the detection of breast cancer with a novel classification system based on the processes of biological immune systems. Their model, called AISAC (Artificial Immune System for Associative Classification), was compared with ten classification systems: three immune-based classification algorithms (AIRS1, Immunos1, and CLONALG) and seven general-purpose classifiers (support vector machines, multilayer perceptron, nearest neighbor, RIPPER, C4.5, naïve Bayes, and random forest).
The authors of [13] proposed the use of the multiscale filter responses of the Gaussian matched filter (GMF) and Gabor filters, coupled with a multilayer perceptron (MLP) network, for the automatic segmentation of coronary arteries in X-ray angiograms. Coronary artery detection was carried out with a four-layer perceptron network. For the final segmentation, each pixel was assigned to one of two classes: vessel or image background. The results were obtained on a public database of 130 X-ray coronary angiograms, with corresponding ground-truth images outlined by an expert cardiologist. The authors obtained a detection AUC of 98%.
Chao et al. [14] addressed discrimination between younger/older normal sinus rhythm (NSR) and congestive heart failure (CHF) by analyzing the electrocardiogram (ECG) signal. Using the multiscale entropy (MSE) algorithm, they extracted 20 features and conducted feature selection. The classification phase was addressed with the SVM, KNN, and LDA classifiers, using a leave-one-out cross-validation strategy.
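The MSE pipeline used in [14] coarse-grains the series at several scales and computes sample entropy at each. The sketch below follows the common formulation with m = 2 and a tolerance fixed from the original series; these parameter choices are assumptions, not the paper's settings:

```python
import numpy as np

# Sketch of the multiscale entropy (MSE) pipeline of [14]: coarse-grain
# the series, then compute sample entropy (SampEn) at each scale.
# m = 2 and r = 0.2 * std are conventional assumed parameters.
def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale`."""
    n = len(x) // scale
    return x[:n * scale].reshape(n, scale).mean(axis=1)

def sample_entropy(x, m=2, r=0.2):
    """SampEn: -log of the ratio of (m+1)- to m-length template matches."""
    def count(mm):
        tpl = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        dist = np.max(np.abs(tpl[:, None] - tpl[None, :]), axis=2)
        return (dist <= r).sum() - len(tpl)       # exclude self-matches
    return -np.log(count(m + 1) / count(m))

rng = np.random.default_rng(1)
noise = rng.standard_normal(600)
r = 0.2 * np.std(noise)            # tolerance fixed from the original series
mse = [sample_entropy(coarse_grain(noise, s), r=r) for s in (1, 2, 3)]
# With a fixed tolerance, white noise loses apparent complexity under
# coarse-graining, so the MSE curve decreases with scale.
```

The vector of per-scale entropies (here `mse`) is the kind of feature set from which the 20 MSE features of [14] are drawn before feature selection.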

Methods Based on Deep Learning
Over the past decade, the popularity of methods that exploit deep learning techniques has increased considerably, as deep learning has advanced the state of the art in research fields such as speech recognition and computer vision. In computer vision, deep learning has expressed its potential in image processing, thanks in large part to convolutional neural networks (CNNs).
The authors of [15], in their paper on the Performance of Fine-Tuning Convolutional Neural Networks for HEp-2 Image Classification, tackled the analysis of HEp-2 images for the diagnosis of autoimmune diseases. In particular, the classification of the fluorescence intensity was addressed. As recognized in the literature, this interpretation is particularly subjective; for this reason, the authors chose to classify these images by means of recent CNN networks. Four of the best-known pre-trained CNNs were used, namely AlexNet, SqueezeNet, ResNet18, and GoogleNet. The authors analyzed both the technique that exploits the layers of pretrained networks as feature extractors to train a linear SVM, and the fine-tuning technique for retraining. Retraining all layers, known as training from scratch, was also analyzed, and a rotation-based augmentation technique was compared. The performances obtained demonstrated the great classifying power of CNNs, reaching AUC values higher than 98%.
De Nunzio et al. [16] developed a computer-aided detection (CAD) system capable of analyzing breast MRI images and classifying breast cancer. The system consisted of two main processing levels: the segmentation of possibly tumoral ROIs, and the characterization of the selected ROIs as in situ or invasive tumors. To select suspicious regions likely to contain a tumor mass, the authors used a deep learning method; in particular, they used the pre-trained GoogleNet as a feature extractor from each ROI to train an ANN. The tumor characterization consisted of radiomics feature extraction (1820 features), a feature reduction phase, and classification with three different classifiers: naive Bayes, random forest, and XGBoost. The performances obtained showed a sensitivity of 75% in mass detection and an AUC of 70% for the binary classification.
In [17], the authors developed a machine learning algorithm for the multi-class detection of three common types of voice disorders. Two publicly available databases for voice disorders were used.
The problem of unbalanced data was addressed with a conditional generative adversarial network (CGAN), used to increase the number of training samples for classes with fewer examples by generating synthetic data. The feature extraction phase used four voice-quality-based parameters: harmonic-to-noise ratio (HNR), shimmer, jitter, and fundamental frequency. The classification method used an improved fuzzy c-means clustering (IFCM) algorithm that considers the relationship between adjacent data points in the fuzzy membership function.
The authors of [18] proposed an algorithm based on deep learning to improve the recognition rate of left- and right-hand motor imagery electroencephalogram (MI-EEG) signals. To this end, they used the public database "BCI Competition IV Dataset 2b", composed of nine subjects who each participated in five sessions. The method was based on the following chain: filtering with a 4-35 Hz band-pass filter; mapping into a time-frequency image by applying the continuous wavelet transform (CWT); and classification with a simplified convolutional neural network (SCNN), with the image as input and the binary classification as output. The performance obtained exceeded that of traditional methods, with an average classification accuracy of 83.2%.
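The time-frequency mapping step of [18] can be sketched by convolving the signal with complex Morlet-style wavelets over the 4-35 Hz band. The sampling rate, the frequency grid, the Hann envelope, and the 10 Hz test tone are illustrative assumptions, not the paper's exact CWT settings:

```python
import numpy as np

# Sketch of the CWT time-frequency image of [18]: convolve the EEG
# signal with windowed complex exponentials (a Morlet-like wavelet) at
# several frequencies. fs, the frequency grid, and the test tone are
# illustrative assumptions.
fs = 250.0
t = np.arange(int(2 * fs)) / fs
sig = np.sin(2 * np.pi * 10 * t)          # a 10 Hz "mu-rhythm" test tone

def morlet_cwt(x, freqs, fs, n_cycles=5):
    rows = []
    for f in freqs:
        dur = n_cycles / f                 # wavelet length: n_cycles periods
        wt = np.arange(-dur / 2, dur / 2, 1 / fs)
        wavelet = np.exp(2j * np.pi * f * wt) * np.hanning(len(wt))
        rows.append(np.abs(np.convolve(x, wavelet, mode="same")))
    return np.array(rows)                  # shape: (n_freqs, n_samples)

freqs = np.arange(4, 36, 2)                # the 4-35 Hz band of interest
tf_image = morlet_cwt(sig, freqs, fs)
peak_freq = freqs[np.argmax(tf_image[:, len(sig) // 2])]  # 10 Hz dominates
```

Images like `tf_image`, one per trial and channel, are what the SCNN classifier of [18] takes as input.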
Chouhan et al. [19] addressed the problem of detecting pneumonia by analyzing chest X-ray images. The method was based on the use of five pre-trained CNNs (AlexNet, DenseNet121, InceptionV3, ResNet18, and GoogleNet) subjected to fine-tuning. The final image classification was determined by majority voting among the five CNNs. The database from the Guangzhou Women and Children's Medical Center was used, which contains 5232 images, and the result obtained had an accuracy of 96.39% and an AUC of 99.34%.
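The decision-fusion step of [19] is simple majority voting over the per-model labels. The predictions below are made up for illustration:

```python
from collections import Counter

# Sketch of the decision fusion in [19]: each fine-tuned CNN votes for a
# label and the majority wins. The per-model predictions are invented
# placeholders, not model outputs.
def majority_vote(predictions):
    """Return the label predicted by the most models."""
    return Counter(predictions).most_common(1)[0][0]

votes = ["pneumonia", "normal", "pneumonia", "pneumonia", "normal"]
label = majority_vote(votes)   # → "pneumonia"
```

With an odd number of voters (five CNNs) a binary vote can never tie, which is one practical reason for ensembling five rather than four models.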
In [20], the authors proposed a super-resolution technique based on CNNs for improving the resolution of low-resolution magnetic resonance imaging (MRI). The proposed gradient-guided residual network (DGGRN) was compared to other methods using three public databases.
The authors of [21] presented a benchmark of several deep neural networks for MRI reconstruction, using two databases, "fastMRI" and "OASIS". Four different metrics were used to compare the networks: the peak signal-to-noise ratio (PSNR); the structural similarity index (SSIM); the number of trainable parameters in the network; and the runtime in seconds of the neural network on a single volume. The networks that the authors compared were Zero-filled, KIKI-net, U-net, Cascade net, and Primal-Dual net. It should be emphasized that the authors made the code and the weights of the analyzed networks available to researchers.
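The first of the four metrics, PSNR, is straightforward to compute. The [0, 1] data range used below is an assumption; MRI volumes are typically normalized before comparison:

```python
import numpy as np

# PSNR, one of the four comparison metrics in [21]. The [0, 1] data
# range is an assumed normalization, not a property of the benchmark
# databases.
def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((ref - test) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

ref = np.linspace(0.0, 1.0, 64).reshape(8, 8)
noisy = ref + 0.01                 # constant error of 0.01 per pixel
value = psnr(ref, noisy)           # MSE = 1e-4, so PSNR ≈ 40 dB
```

Unlike PSNR, which is purely pixelwise, SSIM compares local luminance, contrast, and structure, which is why the benchmark reports both.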
Acknowledgments: This Special Issue was possible thanks to the contributions of many talented authors; to the reviewers, who proved to be very professional, hardworking, and particularly attentive; and, last but not least, to the dedicated editorial team of Applied Sciences. The result, in terms of the quality and quantity of the works, was a success. So congratulations to all authors and a sincere thanks to all reviewers. Finally, we place on record our gratitude to the Applied Sciences editorial team, with special thanks to Wing Wang, Assistant Managing Editor, for giving us the opportunity to make this Special Issue.

Conflicts of Interest:
The authors declare no conflict of interest.