Artificial Intelligence and the Medical Physicist: Welcome to the Machine

Artificial intelligence (AI) is a branch of computer science dedicated to giving machines or computers the ability to perform human-like cognitive functions, such as learning, problem-solving, and decision-making. Since it has shown performance superior to that of well-trained human beings in many areas, such as image classification, object detection, speech recognition, and decision-making, AI is expected to profoundly change every area of science, including healthcare and the clinical application of physics to healthcare, referred to as medical physics. As a result, the Italian Association of Medical Physics (AIFM) has created the “AI for Medical Physics” (AI4MP) group with the aims of coordinating efforts, facilitating communication, and sharing knowledge on AI among the medical physicists (MPs) in Italy. The purpose of this review is to summarize the main applications of AI in medical physics, describe the skills of the MPs in research and clinical applications of AI, and define the major challenges of AI in healthcare.


Introduction
Artificial intelligence (AI) is a branch of computer science dedicated to giving machines or computers the ability to perform human-like cognitive functions, such as learning, problem-solving, and decision-making [1,2]. AI-based systems have shown performance superior to that of experienced human beings in tasks such as image classification and analysis, speech recognition, and decision-making [3]. Consequently, AI is expected to profoundly change every area of science, including medical physics, the clinical application of the principles of physics to healthcare [4,5]. The knowledge and skills of medical physicists (MPs), which include aspects of mathematics, bioinformatics, statistics, safety, and ethics in the use of medical devices, are invaluable in the clinical and research applications of AI in medicine.
Moreover, analytical and computational techniques of physics, in particular those derived from statistical physics of disordered systems, can be extended to large-scale problems, including machine learning, e.g., to analyze the weight space of deep neural networks [6,7].
Given the exponential growth, witnessed over the past few years, of AI applications such as machine learning (ML) and deep learning (DL) in all areas of medicine that use ionizing radiation, ultrasound, and magnetic fields for diagnostic and treatment purposes, the MPs' workflow will be profoundly affected by the advent of AI. The areas affected will include quality controls of equipment, such as linear accelerators and im[...] multimodal clinical data available, with the aim of discovering important groupings or defining features in the data [28].
Once similar patients are identified, the diagnosis, treatment, and outcome extracted from EHRs and other digital content can be ranked to give recommendations [17], e.g., by computerized clinical decision support systems (CDSS), which aid in decision-making [30]. In this way, pipelines can be designed to continuously and automatically extract information and improve the accuracy of patient outcome prediction [31].

Imaging
The main purpose of AI and ML applications in imaging is to support the specialist in the diagnosis of diseases. Computer-aided diagnosis (CAD) is among the first applications of these new algorithms in the imaging area [32,33] and incorporates ML classifiers trained to distinguish lesions from normal tissue [34]. In lung computed tomography (CT), ML applied to combinations of CT textural features achieved high accuracy in distinguishing malignant lesions [35] or invasive from minimally invasive lesions [36].
In image processing, DL algorithms can learn the structure labeling of each image voxel directly (semantic segmentation) in order to contour lesions or organs [46]. U-net, one of the most popular DL architectures for image segmentation, has proven capable of automatically segmenting lung parenchyma [47] and lung tumors using PET-CT hybrid imaging [48].
A cornerstone of the optimization of clinical imaging protocols is patient dose estimation, which allows the dose to be balanced against image quality. Dose to the patient can be automatically calculated by DL in CT [49], single-photon emission computed tomography (SPECT) [50], and PET [51]. In interventional radiology, DL has been proposed for skin dose estimation [52]. In chest CT, ML could be used to predict the volumetric computed tomography dose index (CTDIvol) based on scan and patient metrics (scanner, study description, protocol, patient age, sex, and water-equivalent diameter (DW)) and to identify exams that hold potential for dose reduction by tuning the acquisition parameters [53].
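As a minimal sketch of how exams with potential for dose reduction might be flagged, the Python snippet below fits a straight line of CTDIvol against DW and marks exams whose CTDIvol lies well above the fitted value. This is a deliberate simplification and not the method of [53]: the single predictor, the linear fit, the 2-standard-deviation threshold, the function names, and the data are all illustrative assumptions.

```python
from statistics import mean, pstdev

def fit_line(x, y):
    """Ordinary least-squares fit y ~ a + b*x for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def flag_high_dose_exams(dw_cm, ctdi_mgy, n_sigma=2.0):
    """Indices of exams whose CTDIvol exceeds the fitted value
    by more than n_sigma standard deviations of the residuals."""
    a, b = fit_line(dw_cm, ctdi_mgy)
    residuals = [y - (a + b * x) for x, y in zip(dw_cm, ctdi_mgy)]
    sd = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if r > n_sigma * sd]

# Hypothetical exams: water-equivalent diameter (cm) and CTDIvol (mGy)
dw = [24, 26, 28, 30, 32, 34, 36, 38]
ctdi = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 18.0]
flagged = flag_high_dose_exams(dw, ctdi)
print(flagged)
```

In practice a multivariable model including scanner, protocol, and patient metrics would replace the single-predictor fit, and flagged exams would be reviewed rather than corrected automatically.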
Another pillar of patient dose optimization is image quality improvement, as it allows dose reduction for the same image quality. The integration of AI algorithms within the imaging technology allows image quality to be improved and, consequently, patient dose to be reduced. DL methods have been used for improving PET image quality, reducing noise [54], removing streak artifacts from CT [55], and developing novel techniques for tomographic image reconstruction based on a reduced amount of acquired data. Other promising applications are the generation of synthetic images, such as synthetic CT from MRI [56] and virtual contrast-enhanced images [57], rigid/deformable intramodal and multimodal image registration [58], and extraction of the respiratory signal [21], which could be used for breathing motion compensation of images [59].
In interventional radiology, AI can predict tumor response to transarterial chemoembolization based on image texture and patient characteristics [60,61]. In the future, real-time registration DL algorithms could be used to superimpose high-resolution preoperative MR imaging on intra-procedural fluoroscopy, guiding physicians during catheter manipulation [62], estimating ablation margins, and helping to minimize damage to structures close to the treated area.
AI can also be useful in longitudinal studies during the follow-up of treatments in order to detect subtle changes between images, thus identifying progression or recurrence at an earlier stage [63,64]. Ophthalmic imaging (e.g., fundus digital photography and optical coherence tomography) is another imaging field where AI can support the specialist in the diagnosis of ophthalmic disorders, such as diabetic retinopathy, age-related macular degeneration, and others [65]. Other areas include cardiology [66,67] and rheumatology, which have a long history of research in AI applications aimed at detecting and assessing rheumatological manifestations, bone erosions, and cartilage loss [68]. The development of digital pathology, due to the introduction of whole-slide scanners, and the progress of computer vision algorithms have significantly increased the use of AI to perform tumor diagnosis, subtyping, grading, staging, and prognostic prediction. In the big-data era, the pathological diagnosis of the future could merge proteomics and genomics [69]. Spatial metabolomics is a new field aiming at measuring the distribution of molecules, such as metabolites, lipids, and drugs, within body structures, using imaging techniques, such as mass spectrometry, where each pixel is represented by its mass spectrum [70]. Being characterized by a large amount of high-dimensional data, including overlapping and noisy molecular signals, this technique looks promising for the application of AI [71].
Other applications that could become a focus of AI in the near future are computer vision [72], dealing with object detection and feature recognition in digital images, and virtual assistants [73], employing speech recognition in neuroradiology [74], radiology, and beyond. Through augmented reality, the operator's perception of an operating room environment could be enhanced with AI-generated information [75].

Therapy
ML can be useful to carry out many of the activities during the whole workflow of radiotherapy, starting with the choice of the optimal radiation approach, e.g., choice of proton vs. photon [76]. A convolutional neural network (CNN) can automatically segment targets and organs at risk in radiotherapy [77]. ML-based auto-planning [78,79] mimics the iterative plan design, evaluation, and adjustments made by experienced operators with the goal of improving quality and efficiency and reducing inter-user variability [46]. Knowledge-based approaches leverage a large database of prior treatment plans (up to thousands) to develop associations between geometric and dosimetric parameters from a selection of previous plans in order to determine achievable dose constraints or dose distributions that can be used for benchmarking the quality of plans [9,80]. ML-based auto planning was also developed for brachytherapy [81].
The dose distribution from radiation therapy treatment can be predicted by DL in order to speed up the optimization [82] or determine the best achievable dose distribution from the patient image [83]. ML was applied to predict dose in brachytherapy [84] and in vivo measured dose in intraoperative radiotherapy [85].
Recently, dosomics, the application of radiomics or DL to the analysis of the dose distribution, possibly converted into biologically effective dose to account for different fractionation schemes, has been investigated for its ability to predict side effects of radiation therapy [86,87]. Radiomics can also be applied to cone-beam CT (CBCT) images acquired for image guidance of the radiotherapy treatment, making these images useful for data mining [88].
A major concern of radiotherapy is the change in the anatomy of the patient during therapy, which could result in unwanted dose changes. In this case, re-planning of the treatment is warranted. ML can identify significant changes in patient anatomy during radiotherapy [19] and predict which patients would benefit from adaptive radiotherapy (ART) [89]. Potentially, by using information extracted from voxel-based radiomics analyses, sensitive or resistant tumor sub-volumes requiring a lower or higher dose might be identified, thus enabling dose painting according to a "radiomic target volume" (RTV) [90].
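The conversion of physical dose into biologically effective dose (BED) mentioned above can be sketched with the standard linear-quadratic expression BED = n·d·(1 + d/(α/β)), where n is the number of fractions and d the dose per fraction. The snippet below is only an illustration: the function names and the α/β values are assumptions, and terms such as repopulation are ignored.

```python
def bed(n_fractions, dose_per_fraction_gy, alpha_beta_gy):
    """Linear-quadratic BED (Gy) for n fractions of d Gy each."""
    total_dose = n_fractions * dose_per_fraction_gy
    return total_dose * (1.0 + dose_per_fraction_gy / alpha_beta_gy)

def bed_map(physical_dose_gy, n_fractions, alpha_beta_gy):
    """Convert a list of voxel doses (total physical dose, Gy) to BED,
    assuming every voxel receives its dose in n equal fractions."""
    return [bed(n_fractions, d / n_fractions, alpha_beta_gy)
            for d in physical_dose_gy]

# Hypothetical schedules compared at alpha/beta = 10 Gy (illustrative value)
conventional = bed(30, 2.0, 10.0)    # 60 Gy in 30 fractions
hypofractionated = bed(20, 3.0, 10.0)  # 60 Gy in 20 fractions
print(conventional, hypofractionated)
```

Applying `bed_map` to each voxel before feature extraction is one way to make dose distributions delivered with different fractionation comparable.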
In nuclear medicine, radiometabolic therapy with unsealed (radiopharmaceuticals) or sealed sources (microspheres, etc.) is of growing importance. The application of AI in this area can improve dosimetry, by accounting for the patient's anatomy, activity distribution, and tissue density; planning, in order to administer the highest dose to the target while sparing critical organs; and the prediction of treatment response [91]. Methodological studies have been performed to investigate the robustness of dosomic approaches [92].

Quality Assurance (QA)
According to the International Organization for Standardization, QA is a system that ensures quality for a given product, service, or process. Quality is the degree to which the system fulfills requirements (needs or expectations that are stated, generally implied, or obligatory) [93], thus avoiding mistakes and defects. Quality controls (QCs) are the tests performed to describe, measure, analyze, improve, and control a certain product or process. In radiological sciences, QCs are applied to verify and monitor devices and procedures for diagnosis and therapy, as well as the support systems used by clinicians. AI can be used to automatically perform QCs that, if carried out manually, would not be feasible routinely because of the large amount of time required. AI QC systems could learn, improve their accuracy over time, and develop new tests without human intervention.
Quality assurance of radiotherapy (RT) is a significant part of the MP's work, and it is aimed at preventing radiological incidents and misadministration of radiation dose. A number of ML-based approaches have been explored to predict errors in treatment plans in order to automate the chart checking of plans. A K-means clustering algorithm was employed to learn from prior plans in order to detect errors in prostate plans [18].
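The idea of learning from prior plans by clustering can be sketched as follows: K-means groups the feature vectors of previously approved plans, and a new plan is flagged when it lies far from every cluster center. This is a minimal illustration and not the implementation of [18]; the two hypothetical standardized plan features, the deterministic initialization, and the threshold rule (three times the largest distance observed among prior plans) are assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, n_iter=50):
    """Plain K-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster members
        updated = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl
                   else centroids[i] for i, cl in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def fit_threshold(points, centroids, factor=3.0):
    """Distance threshold derived from the prior plans (assumed rule)."""
    return factor * max(min(dist(p, c) for c in centroids) for p in points)

def is_suspicious(plan, centroids, threshold):
    """True if the plan is far from every cluster of prior plans."""
    return min(dist(plan, c) for c in centroids) > threshold

# Hypothetical standardized features of prior, approved plans
prior_plans = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
               (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids = kmeans(prior_plans, k=2)
threshold = fit_threshold(prior_plans, centroids)
print(is_suspicious((1.1, 0.9), centroids, threshold))
print(is_suspicious((3.0, 3.0), centroids, threshold))
```

A flagged plan is only a candidate for manual review; an unusual but clinically justified plan would also fall outside the clusters.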
Automated quality control of LINACs is another promising application of ML, which can be used for predicting machine performance issues, such as deviations of dose output [94], multileaf collimator (MLC) positions [95], and beam symmetry [96]. A method for automated quality control of LINACs by ML applied to electronic portal imaging device (EPID) images was proposed, which could identify sag and deviations in the vertical direction and field shift [97]. Other AI applications aim at predicting the results of in-phantom patient-specific QA of intensity-modulated RT (IMRT) or volumetric modulated arc therapy (VMAT) [98,99].

Data Size and Quality
ML and DL algorithms require a large number of training samples, which grows rapidly with the dimensionality of the data (the curse of dimensionality). An inadequate data size will reduce the certainty of the prediction, considering that many ML applications will always deliver a result, regardless of the size and quality of the data set [100]. Unfortunately, a proper metric to evaluate sample size and power for ML and DL is missing.
Frequently, datasets used for training AI have a small number of samples with respect to the dimensionality of data and of the desired tasks [101], to the point that, frequently, there are more features per subject than subjects in the entire dataset [102]. Under these circumstances, overfitting, a condition where models are more sensitive to noise in the data than to their patterns, and instability occur, making the model poorly reproducible and generalizable, meaning that it will perform poorly on unseen datasets [103].
Feature selection algorithms, such as stepwise feature selection [104], the minimum redundancy maximum relevance (mRMR) [105], and RELIEF (relevance in estimating features) [106,107], can be applied to reduce overfitting by selecting a non-redundant subset of variables best suited to predict the outcome.
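To illustrate the principle behind mRMR, the sketch below greedily picks the feature with the highest relevance to the outcome minus its mean redundancy with the features already selected. Note the swapped-in simplification: the original mRMR formulation is based on mutual information, whereas absolute Pearson correlation is used here to keep the example short. Feature names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mrmr(features, outcome, n_select):
    """Greedy selection: maximize relevance minus mean redundancy.
    features maps a name to its list of values; n_select must not
    exceed the number of features."""
    relevance = {name: abs(pearson(values, outcome))
                 for name, values in features.items()}
    selected = []
    while len(selected) < n_select:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(abs(pearson(features[name], features[s]))
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected

# Hypothetical radiomic features for six patients and a binary outcome
features = {
    "volume":   [1, 2, 3, 4, 5, 6],
    "diameter": [1, 2, 4, 3, 5, 6],   # nearly redundant with volume
    "entropy":  [1, 3, 1, 3, 2, 3],   # less relevant but less redundant
}
outcome = [0, 0, 0, 1, 1, 1]
print(mrmr(features, outcome, 2))
```

Ranking by relevance alone would keep both `volume` and `diameter`; the redundancy penalty is what lets the less correlated `entropy` feature enter the subset instead.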
To reduce overfitting in DL, data augmentation (e.g., by affine transformation of the images) during training is commonly implemented [10], and some network layers, such as dropout layers, are specialized in reducing overfitting [108]. On the other hand, DL suffers from other sources of uncertainty (e.g., the presence of many local minima in the loss function and the stochastic nature of training algorithms), so that repeating model training multiple times does not necessarily produce the same model [2]. Besides, the class imbalance problem, in which some classes have a significantly higher number of samples than others, is detrimental to ML performance if not properly accounted for [109,110]. To overcome class imbalance, under-sampling or over-sampling can be applied; the latter has been shown to be more effective [110].
Other biases in the training datasets, e.g., age, gender, and race, or in the diagnostic or therapeutic approach, e.g., technologies used for imaging or radiotherapy, may result in biased models, which may lead to poor performance for minority groups who are poorly represented in the training dataset. This could potentially aggravate healthcare disparities [103].
Another source of unreliability stems from the constant evolution of the patterns of clinical practice over time due to the introduction of new treatment approaches, technologies, or gradual changes in the patient population (e.g., the percentage of patients with a given histological subtype). This may result in increased unreliability of the AI system's recommendations or predictions over time [30]. The "half-life" of the relevance of clinical data used for training is thought to be typically about 4 months [111].
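Random over-sampling, the simplest form of the over-sampling strategy mentioned above, can be sketched as follows: samples of the minority classes are duplicated at random until every class is as large as the majority class. The function name, the fixed seed, and the toy data are illustrative assumptions.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all
    classes have as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels

# Hypothetical dataset: 8 negative and 2 positive cases
samples = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Over-sampling should be applied to the training split only, after the validation and test sets have been set aside; otherwise duplicated cases leak across splits and inflate the measured performance.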

Interpretability
Interpretability is the level of understanding of the information that the model extracts from input data, why it is extracted, and how it arrives at its output [2]. ML models are usually perceived as black boxes by the users and clinicians, meaning that they have a low level of interpretability. This issue is exacerbated for deep neural networks, given the complicated multi-layer structures and numerous numerical operations performed by each layer, and hinders the application of AI in the clinic.
Graphical approaches can help to improve the interpretability of ML and DL methods. The activation maps extracted by the CNN, overlaid on the image analyzed, can show which image regions the CNN focuses on most strongly for prediction [112]. For ML classifiers, interpretation can be facilitated by identifying the most important variables or features for prediction and comparing their values in illustrative cases, e.g., patients with a poor and a good prognosis, as done in many radiomics studies, e.g., [86,113,114]. In unsupervised learning, some methods, like t-distributed stochastic neighbor embedding (t-SNE), allow visualization of high-dimensional data by giving each data point a location in a two- or three-dimensional map [20].

Legal and Ethical Issues
Key ethical issues associated with AI systems automatically mining large patient databases include informed consent, privacy and data protection, ownership, objectivity, transparency of the obtained clinical or research model, and quality of training and validation data [115]. Automating tasks and decisions with the use of AI-based machines on a large scale could bring increased systemic risks of harm and systematic errors. These errors are categorized into errors of omission, when humans do not notice the failure of an AI tool, and errors of commission, when an action is performed following the AI's decision despite evidence that the AI is wrong [115]. The responsibility to prevent these errors by anticipating incorrect performance or misuse of AI before incidents occur falls to humans.
A model should be transparent, meaning that its formulas and code should be available and comprehensible so that it is possible to trace why an algorithm has failed or contributed to adverse clinical events [115]. Data "truthfulness" consists of understanding the type of information contained in the data, their completeness and accuracy, their variance and bias, and whether they reflect the problem of interest. Because of the "black box" phenomenon, informing the patient clearly could become more difficult for the doctor when a decision is influenced by AI [116].
AI systems' decisions are based on the data used for training, the algorithms that are used, and what they have learned since their creation [117]. If some human biases, such as variability in healthcare because of ethnic, social, environmental, or economic factors, or clinically confounding factors, such as comorbidities, are present in the training data, they could result in biased decisions of the AI systems [28,117]. Since AI does not incorporate ethical concepts like equality, humans who use AI will hold the responsibility for preventing these errors [115]. Finally, before integrating AI into medical practice, it is important to prevent "deskilling", the loss of competence of humans who are no longer able to carry out a task they used to perform because it has been transferred to the AI [116].

Imaging
As already underlined in this paper, one of the major tasks in which the MP is deeply involved in the imaging field is the optimization process, i.e., finding the balance between dose and image quality.
The MP understands the components of the imaging device used and the basic physical mechanisms at the root of signal change and image contrast and comprehends the technical and/or physiological artifacts limiting the performance [4,118]. Moreover, the MP understands the limitations and potential pitfalls of dose measurement, calculation, and prediction [90]. Thus, the MP has knowledge and skills that are of value for the development, implementation, and use of AI in imaging.
AI-based systems have been developed to estimate patient dose. The MP shall validate and periodically check these systems to avoid possible errors in the estimation. For example, the dose to each voxel in the calculated distribution depends on the dose calculation algorithm used, on the calculation voxel spacing, and on the uncertainty in dose measurement in the dataset used for ML training. In-phantom dose measurements can be planned by the MP to test the algorithms' predictions.
The MP shall also assess image quality through routine testing [119]. Recently, image quality enhancers based on DL have been introduced in clinical practice in order to improve image quality. Consequently, image acquisition protocols could be updated to achieve dose reduction, and the MP will be involved in the optimization to ensure the minimum possible ionizing radiation dose to the patient [119,120].
It is also necessary to verify to what extent changes in the imaging parameters influence the quantitative image content and, consequently, the response of AI systems. For this purpose, various physical phantoms have been developed. The Credence Cartridge Radiomics (CCR) phantom for radiomics was created for CT [121] and CBCT [122] images. More recently, anthropomorphic phantoms with heterogeneous objects were designed in order to simulate the texture of lung nodules [123]. PET phantoms with 3D printed inserts simulating heterogeneities in FDG uptake have been proposed [124], as well as MR phantoms simulating relaxation times and texture of pelvic tissue and malignancies [125]. Using these kinds of phantoms, the sensitivity of radiomics-based ML classifications to image acquisition parameters has been investigated. In CT, the classification is affected by the device used [121], the method of image reconstruction [126], noise reduction algorithms, and slice thickness [127,128]. PET features depend on acquisition mode [129,130], reconstruction algorithm, image resolution, and discretization [131,132]. MRI features are sensitive to the field of view, field strength, pulse sequence, reconstruction algorithm, and slice thickness [133].
Physical and digital phantoms could also be used to periodically verify the performance of image-based ML algorithms. Digital phantoms are usually representative scans of patients with known acquisition parameters. A dataset of CT scans acquired twice on the same patient 15 min apart allows a "test-retest" analysis, i.e., an assessment of the reproducibility of the radiomics workflow under the same conditions [127].
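A test-retest analysis of this kind needs an agreement metric. The text above does not prescribe one; as one common choice, the sketch below computes Lin's concordance correlation coefficient (CCC) between the values of a feature extracted from the first and the second scan, and keeps only features above a cut-off. The feature names, values, and the 0.9 cut-off are illustrative assumptions.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient between paired
    measurements x (test) and y (retest)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * cov / (var_x + var_y + (mx - my) ** 2)

def reproducible_features(test, retest, cutoff=0.9):
    """Names of features whose test-retest CCC reaches the cut-off."""
    return [name for name in test
            if concordance_ccc(test[name], retest[name]) >= cutoff]

# Hypothetical feature values for four patients scanned twice
test = {"mean_hu": [10.0, 20.0, 30.0, 40.0],
        "texture": [1.0, 2.0, 3.0, 4.0]}
retest = {"mean_hu": [10.5, 19.5, 30.5, 39.5],
          "texture": [2.0, 3.0, 4.0, 5.0]}   # systematic shift of +1
print(reproducible_features(test, retest))
```

Unlike the Pearson correlation, the CCC penalizes a systematic offset between the two scans, which is why the shifted `texture` feature is rejected although it is perfectly correlated.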
The accuracy of AI-generated segmentation, image reconstruction, and synthetic images (e.g., MRI) can be assessed using a ground truth digital phantom, for example of brain glioma patients [133] and image simulators, capable of simulating MRI acquired with different pulse sequence or field strength and reconstructed with different methods [133]. Specific tests allow assessing the accuracy of AI-based image registration [134].
In addition, the MP can ensure correct extraction and quantitative analysis of imaging data. Thus, before performing quantitative analysis with AI algorithms, the accuracy and precision associated with the quantitative parameters within the images (e.g., tumors) should be assessed [29]. Moreover, the MP is responsible for the pre-processing of images necessary for correct AI application. This would include the conversion of PET and SPECT images into standardized uptake value (SUV), the standardization of the MR image intensity scale [135], as well as the assessment and correction of confounding factors of images, such as artifacts from metal implants in CT, magnetic field non-uniformity in MRI, and the partial volume effect (PVE) in nuclear medicine images. Multimodal images should be registered using a proper method for rigid or deformable registration [136], a critical step that may affect the accuracy of AI models analyzing hybrid image datasets voxel by voxel [137] in order to combine metabolic, functional, and morphologic information.
In interventional radiology, MPs are involved in monitoring patients' dose and managing patients' radiation risks by reviewing interventional procedures [138]. The involvement of MPs will also extend to the safe implementation and QA of other AI systems, such as robotic angiographs and/or neuro-navigators, robots, etc., and platforms (catheter navigation assistants, analyzing relationships between catheter positions, therapeutic effect, and patient outcomes, etc.) for interventional therapies.
In other fields of medical imaging where AI is rapidly emerging, such as pathology imaging, MPs can support the acceptance and validation of AI systems. Recently, the pathology Digital Imaging and Communications in Medicine (DICOM) file format has standardized the representation, storage, and communication of pathology images acquired with whole-slide scanners [139]. Common acquisition protocols could reduce the variability in slide preparation and digitization procedures and scanner models among different centers and improve the performance of AI detection systems.
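As an example of such pre-processing, the body-weight SUV is the activity concentration in the image divided by the injected activity (decay-corrected to the scan time) per unit body mass. The sketch below assumes that the image values are already in Bq/mL and decay-corrected to the scan start, that tissue density is 1 g/mL, and that the default half-life is that of fluorine-18; the function name and the patient values are hypothetical.

```python
def suv_bw(conc_bq_per_ml, injected_bq, weight_kg,
           minutes_since_injection, half_life_min=109.77):
    """Body-weight SUV (unitless under the 1 g/mL density assumption).

    conc_bq_per_ml: voxel activity concentration at scan time
    injected_bq: activity measured at injection time
    half_life_min: radionuclide half-life (default: F-18, in minutes)
    """
    # Decay-correct the injected activity to the scan time
    activity_at_scan = injected_bq * 2.0 ** (-minutes_since_injection
                                             / half_life_min)
    weight_g = weight_kg * 1000.0
    return conc_bq_per_ml * weight_g / activity_at_scan

# Hypothetical patient: 370 MBq injected, 70 kg, scanned 60 min later
voxel_suv = suv_bw(conc_bq_per_ml=5000.0, injected_bq=370e6,
                   weight_kg=70.0, minutes_since_injection=60.0)
print(round(voxel_suv, 2))
```

Errors in any of the inputs (weight, injection time, residual activity in the syringe) propagate directly into the SUV, which is why these fields are worth checking before images enter an AI pipeline.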

Data Collection and Curation
Given their skills in numerical analysis and clinical integration, MPs can significantly aid in the management of aggregate data [4], which will include clinical and image data from multiple modalities, such as PET, CT, radiography, MRI, ultrasound, and daily CBCT; hybrid imaging, such as PET/CT and PET/MRI; 3D/4D images and image time series; and 3D/4D dose distributions from RT. The MP will be involved in the development of metrics to assess the quality and completeness of data, methods to curate data, and QA programs of data archives [140].
CAD systems and other AI-based decision systems using images as input will need minimum quality specification and acquisition protocols in order to ensure output accuracy. The MP can ensure that the images are acquired according to the protocol required for correct AI use, free from relevant imaging artifacts, and correctly preprocessed [141] and harmonized [142] to reduce variability.
Moreover, the MP can ensure that image data, together with their acquisition parameters and the dosimetric data from imaging and therapy, are stored in commonly accepted standards, such as Digital Imaging and Communications in Medicine (DICOM), or a comparable format and can create new standards for raw acquisition data to be stored in the standard format [143]. The MP will necessarily oversee the storage, security, and integrity of the large, machine-readable data collections needed to build a model [103]. The QA of datasets is a guarantee for the clinician, patient, and patient associations of the ethical and unbiased use of patients' health data by AI systems.

Commissioning and Validation of AI
Commissioning of AI tools is a series of tests to assess if the system installed in the local site operates correctly and is ready for clinical use. The commissioning tasks, tests, schedule, and tolerances, with the required equipment and human resources, should be planned before installation [30]. The test plan could consist, for example, of applying AI to a set of well-known clinical cases, for which ground truth data are available. Comparison of different ML methods on the same dataset is useful and can show which ML algorithms have the best performance and which are more prone to overfitting data for the task at hand [85,144]. A technique called adversarial ML, where attempts to deceive models are carried out with a number of crafted configurations of data, e.g., by adding noise to images, can be used for quality assessment of many classes of ML and DL algorithms [145,146].
The lack of interpretability of AI systems (the "black-box" problem) constitutes an obstacle to their adoption in the clinic [10]. Monitoring AI performance by proper quality controls that test the models in well-known situations can improve the interpretability of models, as can assessing the architectures of DL models and their output using activation and feature maps.
An initiative led by the US FDA, the MicroArray/Sequencing Quality Control (MAQC/SEQC) project [147], invites researchers to submit their models, the features selected as important, and performance estimates to a specific data analysis plan (DAP), which includes ML and statistical cross-checks, before performing external validation [100].
Validation, e.g., using the criteria in the TRIPOD statement [148], is required because many of the available AI models are trained using small datasets and, although augmentation and resampling methods are frequently applied, these models are affected by overfitting and poor generalizability and reproducibility [112]. Large and possibly multi-institutional datasets, independent from the training datasets, with realistic variability and the lowest possible bias, are needed for validation. These can be obtained by increasing the level of collaboration among institutions [112], and the MP can play a role in checking compliance with the required standards.

AI in Radiotherapy
MPs contributed to making radiotherapy a frontier of personalized precision medicine by developing CT-based dose calculation, treatment planning, and image-guided radiation therapy (IGRT) [90]. Other traditional domains of MPs in radiotherapy include quality assurance and radiation protection [90]. MPs have also been at the forefront in using AI in RT, leading to the implementation of knowledge-based treatment planning, where ML algorithms are trained on datasets comprising patient images, contours, clinical information, and treatment plans produced by experienced MPs to automatically develop high-quality plans, thereby accelerating radiotherapy plan design [46].
As with any other ML-based procedure, auto-planning systems are only as good as their human-generated training data, and their outcome will need to be tested and finally approved. Oftentimes, the proposed plan will need to be customized and modified by clinical MPs because of the unique anatomy of every patient. More importantly, when potential issues are identified for a specific plan, MPs communicate with other team members, such as physicians, therapists, and dosimetrists, to reach a clinically acceptable solution [149].
MPs are involved in validation and quality assurance of dose predicted by DL [90], which can be tested by properly designed in-phantom film/ion chamber measurements according to dosimetry protocols and benchmarking against previously established dose calculation algorithms. Another critical aspect is also investigating how the uncertainties of dose affect prognostic or predictive dosomic models [90].
Given their familiarity with imaging devices and LINACs derived from managing QA programs, MPs will have a critical role in the analysis of AI applied to the quality control of LINACs. When an AI tool predicts a machine failure, MPs can help identify the cause of the issue and the corrective actions, such as calibrations [149].

Safety/Risk Management
One of the key activities of the MP is patient safety management, that is, the evaluation of medical devices and procedures to guarantee the safety of patients. MPs are trained to prevent and analyze accidents [149] by using risk assessment, which consists of the analysis of events potentially involving accidental medical exposures or injury to a patient [150], and failure modes and effects analysis (FMEA) [151].
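In FMEA, each failure mode is commonly scored for occurrence (O), severity (S), and detectability (D, where a higher score means the failure is harder to detect), and prioritized by the risk priority number RPN = O × S × D. The sketch below applies this ranking to an AI tool; the failure modes and their scores are hypothetical examples, not values taken from the literature.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    description: str
    occurrence: int      # 1 (rare) to 10 (frequent)
    severity: int        # 1 (negligible) to 10 (catastrophic)
    detectability: int   # 1 (always detected) to 10 (undetectable)

    @property
    def rpn(self):
        """Risk priority number: O x S x D."""
        return self.occurrence * self.severity * self.detectability

def rank(modes):
    """Failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)

# Hypothetical failure modes of an AI auto-segmentation tool
failure_modes = [
    FailureMode("Model applied to unsupported scan protocol", 3, 6, 7),
    FailureMode("Wrong contour accepted without review", 4, 8, 6),
    FailureMode("Software update silently changes output", 2, 9, 8),
]
for mode in rank(failure_modes):
    print(mode.rpn, mode.description)
```

The ranking indicates where mitigation (for example, additional checks that lower the detectability score) is likely to reduce risk the most.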
ML has the potential to reduce imaging radiation exposure, which is a hazard for patients and workers, without penalizing image quality [152].

Periodical Tests
QA should be applied to AI systems themselves, which, having an impact on patients' health, should be considered as medical devices [153]. Physicists are also responsible for ensuring that clinically used AI algorithms continue to perform with the desired level of accuracy by conducting an appropriate routine QA test program with clearly established frequency, metrics, tolerance levels, and actions to be performed in case of test failure [103]. The frequency and nature of the series of tests will need frequent updates, given the rapid pace of evolution of AI. This is especially important for those AI systems that, being constantly learning and updating, will be subject to change in terms of their response and accuracy [94,119]. At the same time, it is critical to assess the effect of the decay of the relevance of the training data due to changes in practices (e.g., changes in prescribed dose and dose per fraction) [94].
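A routine QA test of this kind can be reduced to comparing a performance metric, measured on a fixed set of reference cases, with its commissioning baseline. The sketch below uses two levels, a tolerance level and a wider action level, each mapped to an action; the metric (mean Dice score of an auto-segmentation tool), the numerical levels, and the action names are illustrative assumptions rather than recommended values.

```python
def evaluate_qa(measured, baseline, tolerance, action_level):
    """Classify a periodic QA result by its deviation from baseline."""
    deviation = abs(measured - baseline)
    if deviation <= tolerance:
        return "pass"
    if deviation <= action_level:
        return "investigate"
    return "suspend clinical use"

# Hypothetical monthly results: mean Dice score on reference cases
baseline_dice = 0.90
monthly_dice = {"2024-01": 0.89, "2024-02": 0.85, "2024-03": 0.80}
results = {month: evaluate_qa(value, baseline_dice,
                              tolerance=0.03, action_level=0.06)
           for month, value in monthly_dice.items()}
for month, outcome in results.items():
    print(month, outcome)
```

Logging each result with its date makes slow drifts visible, which matters for systems that keep learning after installation or whose training data gradually lose relevance.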

Training of AI Users
According to a white paper of the Canadian Association of Radiologists [154], practitioners should be provided with an understanding of the value, pitfalls, weaknesses, and potential errors that may occur in the use of AI products. The medical physics associations are launching initiatives to provide appropriate training and education programs in the field of AI applied to imaging and therapy [90]. Moreover, being skilled at communicating and disseminating science, MPs are critical to establishing a common language with other professionals and patients [155]; MPs can take part in the education and training of other healthcare professionals in the use of AI and be part of the interdisciplinary team working for the effective, efficient, and safe delivery of AI in the clinic [3].

Research in AI
MPs are often active researchers and, having expertise in statistics, mathematics, and informatics, are well suited to research in AI. Extensive research is needed to understand how to successfully introduce AI and to define the use and characteristics of AI in clinical practice [119].
Other active areas of research in which MPs will be primarily involved include assessing data veracity and validity, developing metrics for completeness, accuracy, correctness, and consistency, and performing data cleaning activities [140]. Physicists should promote the integration of digital information from diagnostic and therapeutic procedures with genotyping and phenotyping data into large datasets acquired across all areas (clinical, dosimetric, imaging, molecular, pathological, etc.), which requires multi-institutional and multinational collaboration [24,90]. Examples of this are The Cancer Imaging Archive (TCIA) [156] and the Platform for Imaging in Precision Medicine (PRISM) [157].
The specific tasks for MPs in AI research include defining the problem to be solved and determining its category (e.g., classification, regression, pattern recognition) in the lexicon of AI, choosing proper models to be trained, determining a strategy for collecting data from the appropriate dataset, and validating the model [103]. MPs also need to investigate and report the possible pitfalls of the AI-based methods developed and how to overcome them. A further challenge is personalizing therapy according to the AI output, e.g., dose painting in radiotherapy [90].
Privacy, security, secure access to health information, de-identification of sensitive data, and obtaining informed consent, which are also of concern in research areas, become more relevant in the era of big data. The MP involved in these research areas will be required to apply the statements and recommendations released by governmental agencies, scientists, healthcare providers, companies, and other interested parties and will have an active role in formulating these statements [140].
Moreover, if MPs work at developing AI models or fine-tuning them on their data, they have to carefully understand and address the limitations of the data used for training and of the trained models [94]. Exploring multiple approaches, such as different feature selection and ML methods and their combinations, can help in understanding these limitations.
The Findability, Accessibility, Interoperability, and Reusability (FAIR) principles are intended to guide researchers in data management and reporting [158]. The methodology of research studies should be detailed thoroughly, including deep learning architectures and optimization parameters, and the datasets used to train models should be clearly described in order to increase reproducibility and facilitate meta-analysis. Moreover, decision, automation, and prediction models relying on AI must be tested in independent and sufficiently large datasets to compare their validity against established methods, including conventional biomarkers (e.g., clinical, radiological, etc.). The code and data used for training and testing the models should be made publicly available, e.g., through The Cancer Imaging Archive. More guidelines for improving the transparency and reproducibility of models can be found in the TRIPOD statement [148].

Conclusions
AI can extend the area of expertise of MPs, extracting even more information to improve patient care, and the MP is ready to welcome the AI revolution. At the same time, the MPs' knowledge and skills will be required and beneficial for the safe and optimal implementation of AI, especially in the radiological sciences, and their involvement in the multidisciplinary AI team is crucial.

Conflicts of Interest:
The authors declare no conflict of interest.