In the last decade, Machine Learning (ML) has indisputably been applied pervasively in healthcare and biomedicine [1]. The excellent performance of ML algorithms and the new scientific knowledge they have provided have supported, improved, and sped up disease diagnosis, precision medicine, biomarker discovery, and clinical decision-making in general. ML is a broad term that includes a plethora of approaches and methods, such as unsupervised learning with clustering and dimensionality reduction, supervised learning with regression and classification, and reinforcement learning. Each of these methodologies has been applied in the medical and biomedical fields with more or less success, but what they have in common is the concept of learning. Learning can take three forms: (i) learning something new when no prior knowledge about a domain or concept exists; (ii) learning something new about a topic that is already known, such as adding new information or refining previous expertise; and (iii) learning how to do something better, more efficiently, or with more accuracy. In all three forms, the final outcome is a new or updated representation of scientific knowledge, which can then support our understanding of medical and biomedical processes.
The present Special Issue collects six works that focus on the application of ML algorithms to healthcare and clinical challenges, all demonstrating one or more of the three forms that learning can take. Two of these studies [2,3] represent the case in which ML is used for learning something new, in particular concerning individual responses in clinical trials [2] and the treatment of the COVID-19 infectious population [3]. Beacher et al. [2] aimed to predict the outcome of phase III clinical trials for prostate cancer and to improve clinical-stage drug development. The strengths of the study are its large sample size and its use of three different data sources that are highly compatible in terms of demographic and clinical characteristics; these were used as training, testing, and validation sets [2]. The large cohort and the separate study datasets were useful for assessing the generalizability of the five applied ML approaches: logistic regression, k-nearest neighbors (KNN), CatBoost, XGBoost, and a voting classifier [2]. In De la Sen et al. [3], novel knowledge about the COVID-19 pandemic is obtained through the design of a so-called SE(Is)(Ih)(Icicu)AR epidemic model. The proposed epidemic model also represents a new way to solve old problems more efficiently with ML; in the past, these problems were tackled using integro-differential equations or difference equations [3]. It is interesting to note that De la Sen et al. [3] treated their epidemic model as a multi-class problem, in which the infectious population was split into sub-populations based on their symptoms. This represents a valuable attempt to translate the application of ML into a clinical and hospital setting for the management of the COVID-19 pandemic [3].
The diagnosis of a disease or the investigation of its characteristics represents a form of learning about a domain for which there is prior knowledge, as demonstrated by two other works collected in the present Special Issue, by Mahmood et al. [4] and Sarica et al. [5]. Deep learning was successfully applied to the automatic classification of schizophrenia in [4], where functional connectivity (FC) brain data were used for training the ML model. It is worth noting that Mahmood et al. [4] achieved high accuracy by proposing a sophisticated algorithm based on a graph neural network (GNN); this also led to good interpretability of the ML results. It is equally important to note that this work [4] highlights the well-known dependence of ML approaches on prior anatomical knowledge (i.e., anatomical or functional atlases) and on manual processing and decisions, which can introduce human bias into the automatic prediction of pathologies. The classification of neurological and neurodegenerative diseases can be supported by supervised learning as well as by unsupervised learning. A novel clustering approach, based on the affinity propagation algorithm, was proposed by Sarica et al. [5] with the purpose of investigating the natural structure of the cognitive profiles in Parkinson’s disease and Parkinsonisms. The authors demonstrated the reliability of cluster analysis in discovering intra- and inter-diagnostic heterogeneity in the cognitive profile of Parkinsonism patients and, more importantly, showed how to transform an ML approach into a decision support tool for use in a clinical setting [5].
The last two works [6,7] in this Special Issue are two interesting examples of the application of ML to learning how to do something better, faster, more efficiently, and with more precision. In the study by Chen et al. [6], a B-spline surface-fitting algorithm was employed to improve existing approaches to lung segmentation and to solve a peculiar issue occurring in the recognition of lung fissures. The proposed method [6] reduces processing time and computational cost compared with methods in the literature but, similarly to the work conducted by Mahmood et al. [4], raises the problem of dependence on prior knowledge of anatomical structure when building reliable ML models. Indeed, the study by Nakasi et al. [7] also highlights the high dependence on human expertise and manually annotated data for the automatic counting of malaria parasites in thick blood smears. The proposed method [7] overcomes the time-consuming nature of the conventional approaches used for the identification and quantification of malaria parasitemia thanks to transfer learning, which is applied to digital images with a Faster Region-based Convolutional Neural Network (Faster R-CNN) and a Single Shot MultiBox Detector (SSD).
The scientific contributions collected in this Special Issue represent valuable demonstrations of the application of ML in healthcare and biomedicine and, moreover, suggest several good practices for use in future works. For example, they confirm the importance of sufficiently large datasets [2,3,5,7], of their comparability in terms of demographic and clinical features [2], of the use of separate validation sets [2,4], and of the sharing of data [6]. They also demonstrate the utility of hyperparameter tuning of ML algorithms [2], as well as the usefulness of introducing new constraints into models [3]. Two works [4,6] show that a better representation of an object of interest through derived features could improve the performance of ML methods. This Special Issue also raises a fundamental point regarding the need to reduce the amount of manual processing required and to avoid human bias in building ML models [4,6,7]. Finally, a demand for greater interpretability of ML findings has emerged [2,4,5], as the recent growing interest of the scientific community in Explainable Artificial Intelligence (XAI) demonstrates [8].