Machine Learning Tools and Platforms in Clinical Trial Outputs to Support Evidence-Based Health Informatics: A Rapid Review of the Literature

Background: The application of machine learning (ML) tools (MLTs) to support clinical trial outputs in evidence-based health informatics can be an effective, useful, feasible, and acceptable way to advance medical research and provide precision medicine. Methods: In this study, the author used the rapid review approach and snowballing methods. The review was conducted in the following databases: PubMed, Scopus, COCHRANE LIBRARY, clinicaltrials.gov, Semantic Scholar, and the first six pages of Google Scholar for the period 10 July–15 August 2022. Results: Here, 49 articles met the required criteria and were included in this review. Accordingly, 32 MLTs and platforms were identified in this study that applied the automatic extraction of knowledge from clinical trial outputs. Specifically, the initial use of automated tools resulted in modest to satisfactory time savings compared with manual management. In addition, the evaluation of performance, functionality, usability, user interface, and system requirements also yielded positive results. Moreover, the evaluation of some tools in terms of acceptance, feasibility, precision, accuracy, efficiency, efficacy, and reliability was also positive. Conclusions: In summary, design based on the application of clinical trial results in ML is a promising approach to apply more reliable solutions. Future studies are needed to propose common standards for the assessment of MLTs and to clinically validate the performance in specific healthcare and technical domains.


Introduction
Evidence-based health informatics (EBHI) can be defined as the conscious, explicit, and judicious use of current best evidence to support a health care decision that employs information technologies (ITs) [1]. In this context, clinical trials are a well-established experimental tool suitable not only for evaluating the effectiveness of interventions, but also for supporting an adequately designed systematic review [2]. Furthermore, meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published clinical trials [3]. Meta-analysis applied to clinical trials is a central method for generating quality evidence, and it is rapidly gaining momentum as the volume of quantitative information grows [4]. Thus, both EBHI and clinical trials are currently at the forefront of supporting clinicians in clinical decision making.
Precedence Research announced that the global clinical trials market size was valued at USD 51.05 billion in 2021, and is forecast to hit USD 84.43 billion by 2030 with a registered compound annual growth rate (CAGR) of 5.7% during the forecast period of 2022 to 2030 (https://www.precedenceresearch.com/clinical-trials-market, accessed on 10 August 2022).
At the same time, the increasing volume of patient admissions due to the increase in various chronic diseases and the rapidly increasing aging population worldwide is fueling the growth of artificial intelligence (AI) in the healthcare market.
Related research by Precedence Research estimated the size of the global AI in healthcare market at USD 11.06 billion in 2021 and expects it to exceed approximately USD 187.95 billion by 2030, growing at a CAGR of 37% during the forecast period of 2022 to 2030 (https://www.precedenceresearch.com/artificial-intelligence-in-healthcare-market, accessed on 10 August 2022). The clinical trials segment accounted for over 24.2% of revenue in 2021 and dominated the global healthcare AI market.
However, the research literature itself is also vast. During this period, a Semantic Scholar search for the key "health" returned 6,060,000 articles, of which 749,050 were reviews and 5290 were clinical studies (Figure 1). The information available on the internet is growing dramatically every year. For example, searching for "clinical trials" with the Semantic Scholar search engine found 32,099 related articles in 2001, 82,250 related articles in 2011 (an increase of about 156%), and 139,008 related articles in 2021 (a further increase of about 69%) (Figure 2). However, because of the large and complex collection of datasets derived from clinical trials, it is often impossible to fully exploit and apply them in health care, and they are difficult to process with traditional data processing applications [5]. Furthermore, because this amount of information is growing rapidly, the ability to apply machine learning tools (MLTs) to automate knowledge extraction is now more critical than ever.
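The growth percentages can be checked directly from the raw counts. The following is a minimal sketch; the helper name `pct_increase` is hypothetical and introduced only for illustration. It computes the relative increase (new − old) / old, which is a different quantity from the ratio new / old expressed as a percentage.

```python
def pct_increase(old: int, new: int) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# Semantic Scholar counts for "clinical trials" quoted in the text.
counts = {2001: 32_099, 2011: 82_250, 2021: 139_008}

for start, end in [(2001, 2011), (2011, 2021)]:
    growth = pct_increase(counts[start], counts[end])
    ratio = counts[end] / counts[start] * 100.0
    # The ratio is always 100 percentage points above the increase.
    print(f"{start} -> {end}: increase {growth:.0f}%, ratio {ratio:.0f}%")
```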
Currently, the application of digital technology in clinical trials is studied, proposed, promoted, and implemented in some studies [6,7].
Clinical trials are a fundamental tool used to evaluate the effectiveness and safety of new drugs and medical devices and other health system interventions. The traditional clinical trial system acts as a reliable tool for the development and implementation of new drugs, devices, and interventions in the health system. However, digital tools can be used to analyze and optimize clinical trials, and finally, in the future, it will be possible to support them with digital tools completely by implementing virtual tests and experiments using even virtual human models [8].
At the present stage, however, managing clinical trial results to draw conclusions and select the appropriate treatment with the application of AI and machine learning (ML) is an extremely topical and critical issue. In this way, it will be possible to make the most of clinical studies, thus achieving a more rational and more economical application in daily clinical practice compared with classical methods.
ML was defined by Arthur Samuel, an ML pioneer, as "a field of study that gives computers the ability to learn without being explicitly programmed" [9]. ML is a subfield of the broader domain of AI, which refers, in general, to the simulation of human intelligence in machines that are programmed to think in the same way as humans and mimic their actions.
Although there is skepticism regarding the practical application and interpretation of results from ML-based approaches in healthcare settings, the adoption of these approaches is growing at a rapid pace [10].
In more detail, recent developments in AI and ML technology have brought substantial strides in areas such as the prediction and detection of health emergencies, the treatment of diseases and immune response problems [10], the diagnosis of diseases, living assistance, biomedical information processing, biomedical research [11], automated treatment, disease recommendation, automated robotic surgery, and drug discovery and development [12].
At the same time, ML and AI have been developing rapidly in recent years in terms of software algorithms, hardware implementation, and applications in a huge number of areas [11].
However, the authors of [13] found no unified information extraction framework tailored to the systematic review process, and published reports focused on a limited number (1–7) of data elements. Biomedical natural language processing techniques have not yet been fully utilized to automate, even partially, the data extraction step of systematic reviews.
Nevertheless, it is estimated that natural language processing (NLP) will emerge as the most effective tool for generating structured information from unstructured data, which is commonly found in clinical trial texts. In the research article [14], the bibliometric analysis of the annual publication trend showed that there has been a dramatic increase in research interests in NLP-enhanced clinical trial research.
Moving in this direction, this article deals with the application of ML, through appropriate tools, to extract the results of clinical trials so that they can be properly applied in daily clinical practice [15]. Thus, the author first searched for related work, which is described in Section 3.1.
More specifically, the author performed a rapid review exploring MLTs and approaches in the field of clinical trials to support EBHI.
The main research question was as follows:
• RQ1. What MLTs and platforms are reported in the literature to derive results through clinical trial implementations?
The secondary research questions were as follows:
• RQ2. What are the main categories of these MLTs?
• RQ3. What are the results, benefits, and experience gained from their implementation and what are the inherent difficulties in implementing them and the main observations for future work and challenges to be overcome?
The rest of this study is organized as follows: Section 2 discusses a group of related articles. Section 3 presents the Materials and Methods of this study. Section 4 summarizes the results. Section 5 discusses the key issues arising from this study. Section 6 concludes the study and presents future directions.

Related Work
There are some notable studies in the field, but only a limited number deal with the subject thoroughly. Automation has been proposed or used to expedite most steps of the systematic review process of clinical studies, including searching, screening, and data extraction.
Marshall and Wallace [16] provided an overview of the current machine learning methods that have been proposed to expedite evidence synthesis. They also offer guidance on which of these are ready for use, their strengths and weaknesses, and how a systematic review team might go about using them in practice.
In addition, Tsafnat et al. [17] detailed a survey designed to support or automate individual tasks in the systematic review, and in particular systematic reviews of randomized controlled clinical trials, which revealed the trends that see the convergence of several parallel research projects. This survey described each of the systematic review tasks in detail. Each task was described along with the potential benefits of its automation. In addition, the technology systems (up to 2014) that automate or support the tasks are listed in detail.
Many significant studies refer to algorithms [14,18,19] and strategies to automate data and knowledge extraction from reviews [20]. Finally, many studies focus on the evaluation of ML methods through a specific tool [21–23].

Study Design
In this study design, the author used the rapid review approach [24]. A rapid review can be defined as a form of knowledge synthesis that is produced within a short timeframe using limited resources by streamlining or omitting a number of methods for producing evidence [24].
Moreover, the forward and backwards snowball method was used [25]. It has been proposed that in systematic reviews of complex or heterogeneous evidence in the field of health services research, "snowball" methods of forward (citation) and backwards (reference) searching are especially powerful. This method allows researchers using the references and citations of an article to find specific literature on a topic quickly and relatively easily. Experimentations with this methodology yielded positive results and have also been presented in [26].
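One round of forward and backward snowballing can be sketched over a small citation graph. This is only an illustration under stated assumptions: the function name `snowball` and the dictionary layout (each paper mapped to the list of papers it references) are hypothetical and are not taken from [25] or from any tool used in this review.

```python
from typing import Dict, Iterable, List, Set


def snowball(seeds: Iterable[str], references: Dict[str, List[str]]) -> Set[str]:
    """Return the papers reached by one round of snowballing from the seeds.

    references maps each paper to the papers it cites (hypothetical layout).
    """
    seed_set = set(seeds)
    # Backward (reference) search: everything the seed papers cite.
    backward = {ref for paper in seed_set for ref in references.get(paper, [])}
    # Forward (citation) search: every paper that cites at least one seed.
    forward = {paper for paper, refs in references.items() if seed_set & set(refs)}
    # Exclude the seeds themselves; the rest are new candidates for screening.
    return (backward | forward) - seed_set


graph = {"A": ["B", "C"], "D": ["A"], "E": ["F"]}
print(sorted(snowball(["A"], graph)))
```

In practice the returned candidates would still be screened by title and abstract before being added to the pool of included articles.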
Finally, the SF/HIT model was used as a template to define specific keywords in order to identify the impacts and outcomes resulting from the use of digital tools in the healthcare domain [27].

Search Strategy and Eligibility Criteria
The review was conducted using the following databases: PubMed, Scopus, COCHRANE LIBRARY, clinicaltrials.gov, Semantic Scholar and the first six pages of Google Scholar for the 10 July-15 August 2022 period.
Reviews whose main objective was to describe MLTs that can extract information and knowledge from clinical trial data, and to assess those tools, were included. The search field was not restricted in order to collect as much information as possible. The only restriction was language (only English articles were included). Snowballing was undertaken, starting from the citations and references of each included article.

Data Screening
The reference manager Qiqqa (version 76s) and Excel were used to export and manage the results.
A two-stage review process (Figure 3) was performed by the author: (a) articles were first excluded based on their titles and abstracts, and (b) the remaining articles were then reviewed based on the full text. Specifically, the first stage included two phases.
In the first phase, the articles were extracted based on the following acceptance characteristics: ML methods OR ML approaches OR machine learning tool OR machine learning systems OR ML techniques AND ((RCT OR clinical trial) AND review).
In this phase, only reviews of clinical studies/trials were selected, as the aim was to search for tools in which ML was applied and compared in trials.
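The acceptance characteristics of the first phase can be read as a Boolean filter over titles and abstracts. The sketch below is an assumption-laden illustration, not the query syntax of any of the databases searched: it assumes the ML terms are grouped together with OR and then combined with AND, and the names `ML_TERMS`, `TRIAL_TERMS`, and `matches_first_phase` are hypothetical.

```python
import re

# Assumed grouping: (any ML term) AND (RCT OR clinical trial) AND review.
ML_TERMS = (
    "ml methods",
    "ml approaches",
    "machine learning tool",
    "machine learning systems",
    "ml techniques",
)
TRIAL_TERMS = ("rct", "clinical trial")


def _contains(term: str, text: str) -> bool:
    # The leading word boundary avoids matching "rct" inside "arctic",
    # while still allowing plural forms such as "rcts" or "tools".
    return re.search(r"\b" + re.escape(term), text) is not None


def matches_first_phase(title_and_abstract: str) -> bool:
    text = title_and_abstract.lower()
    has_ml = any(_contains(term, text) for term in ML_TERMS)
    has_trial = any(_contains(term, text) for term in TRIAL_TERMS)
    return has_ml and has_trial and "review" in text


print(matches_first_phase(
    "Machine learning tools for RCT data extraction: a systematic review"
))
```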
Thus, the collected articles were studied based on their titles and abstracts, and the selected ones were included in the second phase.
During the second phase, the MLTs were identified and selected. Next, relevant studies describing these tools in detail were searched using the Snowball citations method.
Finally, these studies were screened based on the title and abstract, and the appropriate ones were included in the pool of selected articles.
One researcher reviewed the articles.

Data Extraction and Analyses
The following data were extracted from the included studies: authors, type of article, and summary of the article.
The type of article was one of the following:
• Review (selected from the first phase);
• Tools assessment (selected from the first or second phase);
• Automated tool (article selected from the first phase);
• Book or book chapter (selected from the first or second phase).
The heterogeneity and the difficulty in finding analytical and similar descriptive data of the tools made it difficult to carry out a rigorous and standardized analytical record.
Specifically, the results of this review were classified into 12 tasks (categories) in accordance with their type of use and are presented in the results section. This classification relied heavily on the classification of tasks developed in the study by Tsafnat et al. [17].
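These tasks are the following:
• Design systematic search;
• Run systematic search;
• Deduplicate;
• Obtain full texts;
• Snowballing;
• Screen abstracts;
• Data extraction and text mining;
• Automated bias assessments;
• Automated meta-analysis;
• Summarize/synthesis of data (analysis);
• Write up;
• Data miner/analysis of data for general purpose.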

Results
Finally, 49 articles met the criteria and were included in this review; 17 of them were identified and gathered in the first phase of the study (Table 1) and 32 tools were identified and gathered in the second phase of the study (described in Section 4.2). These articles describe in detail the MLTs and platforms applied to the automatic extraction of clinical trial data and outputs.
Table 1, entry [20]: An overview of strategies researchers have developed to automate the Systematic Literature Review (SLR) process. We used a systematic search methodology to survey the literature about the strategies used to automate the SLR process in SE.

Review Articles on MLTs for Extracting Clinical Trial Results
Systematic reviews, the cornerstone of evidence-based medicine, are not produced quickly enough to support clinical practice. Production costs, availability of the required expertise, and timeliness are often cited as major factors for this delay. The following reviews and surveys (Table 1) were designed to support or automate individual tasks of reviews, and systematic reviews of randomized controlled clinical trials, and to reveal trends, applied algorithms, and tools while highlighting the convergence of many parallel research projects [17].

Articles Relative to MLTs for Extracting Clinical Trial Outputs
This section lists state-of-the-art tools that automate tasks that support knowledge extraction from clinical trial outputs.
Specifically, during the second phase of this research, 32 MLTs were identified and selected.
The most important of these MLTs are listed below with a brief description. Many of the tools have characteristics that place them in more than one category. Where necessary, they are recorded in each relevant category; otherwise, this is simply stated in their description.

Design Systematic Search (Includes Two Tools)
Tools in this category accelerate the design of a search, either by counting the number of times a word or phrase appears in a selected group of articles or by checking the recall and precision of each term in the search string and displaying the result visually.
Some of these tools are described below: • SRA-Word Frequency Analyzer [28], (http://sr-accelerator.com/#/help/wordfreq, accessed on 10 August 2022) Accelerates the design of a search by counting the number of times a word or phrase appears in a selected group of articles. Words that appear frequently should be used in the systematic search.

• The Search Refiner [28] Accelerates designing a search by checking the recall (the share of relevant studies that are found) and precision (the share of found studies that are relevant) for each term in the search string and then displays the result visually. Used to quickly determine which terms should be removed from the search string.
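The two ideas in this category, word frequency counting and per-term recall and precision, can be sketched in a few lines. This is a simplified illustration and not the implementation of the SRA tools; the function names and the record layout (a text paired with a relevance label) are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def word_frequencies(abstracts: Iterable[str]) -> Counter:
    """Count how often each word appears across a group of articles."""
    counts: Counter = Counter()
    for text in abstracts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def term_recall_precision(
    term: str, records: List[Tuple[str, bool]]
) -> Tuple[float, float]:
    """Recall and precision of one search term over labeled records.

    Each record is (text, is_relevant); this layout is an assumption.
    """
    retrieved = [relevant for text, relevant in records if term in text.lower()]
    relevant_total = sum(1 for _, relevant in records if relevant)
    hits = sum(retrieved)  # relevant records among those retrieved
    recall = hits / relevant_total if relevant_total else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


records = [
    ("machine learning in trials", True),
    ("trials of aspirin", False),
    ("learning curves", True),
]
print(word_frequencies(text for text, _ in records).most_common(2))
print(term_recall_precision("trials", records))
```

A term with low precision and little contribution to recall is a candidate for removal from the search string.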

Run Systematic Search (Includes Two Tools)
Allows for the search of specific concepts. One of these tools is described below: • Polyglot Search Translator (http://sr-accelerator.com/#/polyglot, accessed on 10 August 2022), [28,40] Accelerates running a search by converting a PubMed or Ovid Medline search to the correct syntax to be run in other databases.

Deduplicate (Includes One Tool)
Automates most of the deduplication process. One related tool is described below: • De-duplicator (http://sr-accelerator.com/#/help/dedupe, accessed on 8 August 2022) Automates most of the deduplication process by identifying and removing duplicate records of the same study from a group of uploaded records. It is designed to be cautious, so some duplicates will remain and must be removed manually.
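A cautious deduplication strategy can be sketched as exact matching on a normalized key. This is a minimal illustration under assumptions, not the algorithm of the De-duplicator: the record fields `title` and `year` and the function names are hypothetical. Because only records whose normalized title and year agree exactly are merged, near-duplicates with differing titles are deliberately left for manual review.

```python
import re
from typing import Dict, List


def normalize_title(title: str) -> str:
    """Lowercase the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: List[Dict]) -> List[Dict]:
    """Keep the first record for each (normalized title, year) key."""
    seen = set()
    unique = []
    for record in records:
        key = (normalize_title(record["title"]), record["year"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [
    {"title": "Machine Learning in RCTs.", "year": 2020},
    {"title": "machine learning in  rcts", "year": 2020},
    {"title": "Machine Learning in RCTs", "year": 2021},
]
print(len(deduplicate(records)))
```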

Obtain Full Texts (Includes Three Tools)
Tools in this category support screening abstracts and obtaining full texts. Some of these tools are described below: • SRA Helper (http://sr-accelerator.com/#/sra-helper, accessed on 8 August 2022) Accelerates screening and obtaining full texts by allowing references to be assigned to groups with a hotkey. Hotkeys are also assigned to search a list of prespecified locations to attempt to find the full text of articles.
• SARA (http://sr-accelerator.com/, accessed on 8 August 2022) Automates requesting full-text articles from the library by submitting all of the needed full texts in a single request; normally, these requests need to be processed and sent one at a time (available within SRA).
• ASH [41] The ASH tool allows users to download the full text of articles and perform a full-text search. The tool provides a meta-search interface that allows users to obtain much higher search completeness, unifies the search process across all digital libraries, and can overcome the limitations of individual search engines.

Snowballing (Includes One Tool)
These MLTs apply the method for automatic citation snowballing. One of these tools is described below: • ParsCit [42] The proposed tool for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles.

Screen Abstracts (Includes Six Tools)
Tools for screening abstracts automatically sort a search retrieval by relevance. Some of these tools are described below: • RobotSearch (https://robotsearch.vortext.systems/, accessed on 8 August 2022), [9] It is a front-end for an ML model that identifies reports of randomized controlled trials (RCTs). It automates citation screening by identifying the studies in a group of search results that are obviously not RCTs and removing them, leaving a pool of potential RCTs to be screened.

• Abstrackr (http://abstrackr.cebm.brown.edu, accessed on 8 August 2022), [16,43] The authors in [43] described the ongoing development of an end-to-end interactive ML system. More specifically, they developed abstrackr, an online tool for the task of citation screening for systematic reviews. This tool provides an interface to their ML methods. The main aim of that work is to provide a case study for deploying cutting-edge ML methods that will actually be used by experts in a clinical research setting.
• EPPI reviewer (https://eppi.ioe.ac.uk/cms/er4, accessed on 8 August 2022), [16,44] EPPI reviewer is an application plus a web-based software program for managing and analyzing data in literature reviews. It was developed for all types of systematic reviews (meta-analysis, framework synthesis, thematic synthesis, etc.), but also has features that would be useful in any literature review.
• SWIFT-Review (https://www.sciome.com/swift-review/, accessed on 8 August 2022), [16] SWIFT-Review (Sciome Workbench for Interactive computer-Facilitated Text-mining) provides several features that can be used to search, categorize, and prioritize large (or small) bodies of literature in an interactive manner. Moreover, it utilizes statistical text mining and ML methods that allow users to uncover over-represented topics within the literature corpus and to rank order documents for manual screening.

• Colandr (https://www.colandrapp.com, accessed on 8 August 2022), [16] Colandr is a web-based, open access platform for conducting evidence reviews. Colandr can be used by collaborative teams and provides an organizational structure to manage information throughout the entire evidence review process. Among others, it provides collaborative team working, citation upload in common bibliographic formats (e.g., BibTeX and RIS), de-duplication of citations, citation screening using the title and abstract powered by ML, data extraction from full texts powered by natural language processing, and the export of screening decisions and extracted data in comma-separated value format.
• Rayyan (https://rayyan.qcri.org, accessed on 8 August 2022), [16,45] Rayyan is a free web and mobile app that helps expedite the initial screening of abstracts and titles using a process of semi-automation while incorporating a high level of usability.

Data Extraction and Text Mining Tool (Includes Six Tools)
These systems automatically extract data elements (e.g., sample sizes, descriptions of PICO elements).
Some of these tools are described below: • ExaCT [16,21,22], (http://exactdemo.iit.nrc.ca, accessed on 8 August 2022) ExaCT is a prototype ML and text mining tool that helps to automatically extract study characteristics from the full texts of RCTs. It also aims to improve efficiency compared with manual data extraction.

• RobotAnalyst [46], (http://www.nactem.ac.uk/robotanalyst/, accessed on 8 August 2022), [16] RobotAnalyst is a web-based software system that combines text-mining and ML algorithms for organizing references by their content and actively prioritizing them based on a relevancy classification model that is trained and updated throughout the process.
RobotAnalyst and SWIFT-Review also allow for topic modeling, where abstracts related to similar topics are automatically grouped, allowing the user to explore the search retrieval.

• Dextr [47] Dextr provides a similar performance to manual extraction in terms of recall and precision and greatly reduces data extraction time. Unlike other tools, Dextr provides the ability to extract complex concepts (e.g., multiple experiments with various exposures and doses within a single study), properly connect the extracted elements within a study, and effectively limit the work required by researchers to generate machine-readable, annotated exports.

• RobotReviewer (https://robotreviewer.vortext.systems, accessed on 8 August 2022), [16] RobotReviewer is an open-source ML system that supports semi-automated bias assessments. It accelerates assessing risk of bias on four of the seven risk of bias domains by highlighting the supporting phrases in the PDF of the original paper. A check of the assessments is recommended, although the process is drastically sped up.

• Trialstreamer [48] Trialstreamer continuously monitors PubMed and the World Health Organization International Clinical Trials Registry Platform and looks for RCTs using a validated classifier. It combines ML and rule-based methods to extract information from the RCT abstracts.

Automated Bias Assessments (Includes One Tool)
These tools support automatic assessment of the biases in the reports of RCTs. The systems are recommended for semi-automatic use (i.e., with human reviewer checking and correcting the ML suggestions).

Automated Meta-Analysis (Includes Three Tools)
Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies. Unfortunately, meta-analysis involves great human effort, which renders the process extremely inefficient and vulnerable to human bias. To overcome these issues, researchers are working toward automating meta-analysis [3]. One of these tools is described below:

• PythonMeta [4] The PythonMeta package performs the meta-analysis on an open-access dataset from Cochrane.
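To make the pooling step of a meta-analysis concrete, the following is a minimal sketch of a fixed-effect, inverse-variance model in plain Python. It does not use or reproduce the PythonMeta API, the choice of a fixed-effect model is an assumption made for illustration, and the function name `fixed_effect_pool` and the example numbers are hypothetical.

```python
import math
from typing import List, Tuple


def fixed_effect_pool(
    effects: List[float], std_errors: List[float]
) -> Tuple[float, float]:
    """Pool study effects with inverse-variance weights (fixed-effect model).

    Returns the pooled effect and its standard error.
    """
    # Each study is weighted by the inverse of its variance (SE squared),
    # so more precise studies contribute more to the pooled estimate.
    weights = [1.0 / se**2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Illustrative (made-up) effect sizes and standard errors for three studies.
pooled, se = fixed_effect_pool([0.30, 0.10, 0.25], [0.10, 0.20, 0.15])
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled effect = {pooled:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

A random-effects model would additionally estimate between-study variance; it is omitted here to keep the sketch short.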

Summarize/Synthesis of Data (Analysis) (Includes One Tool)
Although software tools have long existed to support the data synthesis component of reviews (especially to perform meta-analyses), methods for automating them are beyond the capabilities of the available ML and NLP tools [16]. However, research in these areas continues apace. Thus, a related tool, recently developed, is described below: • Visae [52] Visae is an app developed in R that uses correspondence analysis to help summarize data on adverse events from clinical trials. It is built on the underlying approach of applying stacked correspondence analysis and contribution biplots to help explore differences in adverse events among interventions within clinical trials.

Write Up (Includes Two Tools)
These tools help with the auto-generation of the abstract, results, and discussion sections of a review.
Some of these tools are described below: • Endnote (https://endnote.com/, accessed on 8 August 2022) Endnote supports screening abstracts, obtaining full texts, and writing up a systematic review (SR). It accelerates multiple tasks and assists with reference management. It is useful for storing search results, finding full texts, sorting references into groups during screening, and inserting references into the manuscript.

• RevManHAL [53] RevManHAL is an add-on program that helps auto-generate the abstract, results, and discussion sections of RevMan-generated reviews in multiple languages.

Data Miner/Analysis of Data for General-Purpose (Includes Five Tools)
These are toolkits that supportML and data mining processes. Some of these tools are described below: • RapidMiner [5,37] RapidMiner supports predictive analysis with its user-friendly, rich library of data science and ML algorithms through its all-in-one programming environments such as RapidMiner Studio. Besides the standard data mining features such as data cleansing, filtering, clustering, etc., the software also features built-in templates, repeatable work flows, a professional visualization environment, and seamless integration with languages.
• WEKA [5,37,54–57] WEKA is a widely used toolkit for ML and data mining. It contains a large collection of state-of-the-art ML and data mining algorithms written in Java. WEKA contains tools for regression, classification, clustering, association rules, visualization, and data pre-processing.
• KNIME [5,58] KNIME is an open source data analysis platform. It allows the user to create workflows for processing and analyzing almost any kind of data. Written in Java and built upon Eclipse, it is accessed through a GUI that provides options to create the data flow and conduct data pre-processing, collection, analysis, modeling, and reporting.
• COKE [23,24] The COKE (COVID-19 Knowledge Extraction framework for next generation discovery science) project involves the use of machine reading and deep learning to design and implement a semi-automated system that supports and enhances the SLR and guideline drafting processes. Specifically, the authors propose a framework for aiding in the literature selection and navigation process that employs natural language processing and clustering techniques for selecting and organizing the literature for human consultation, according to PICO (Population/Problem, Intervention, Comparison, and Outcome) elements.
• KEEL (http://keel.es/, accessed on 8 August 2022) [59–63] KEEL (Knowledge Extraction for Evolutionary Learning) is a Java-based open source tool. It is powered by a well-organized GUI that lets you manage (import, export, edit, and visualize) data in different file formats and experiment with the data (through its data pre-processing, statistical libraries, and some standard data mining and evolutionary learning algorithms).
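The idea of organizing literature by textual similarity, as described for COKE above, can be illustrated with a minimal sketch. This is not the COKE implementation, which relies on machine reading and deep learning; the sketch swaps in plain TF-IDF weighting and a simple greedy threshold clustering. The function names, the threshold value, and the four trial titles are hypothetical and serve only as an illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    # Smoothed inverse document frequency, so that no term gets a zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_abstracts(docs, threshold=0.3):
    """Greedy clustering: a document joins the first cluster whose seed
    (its first member) is at least `threshold` similar, else starts a new one."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical trial titles, invented for illustration only.
abstracts = [
    "Metformin versus placebo for glycemic control in adults with type 2 diabetes",
    "Insulin versus metformin for glycemic control in adults with type 2 diabetes",
    "Statin therapy and cardiovascular mortality in elderly patients",
    "Cardiovascular mortality after statin therapy in elderly patients with hypertension",
]
clusters = cluster_abstracts(abstracts, threshold=0.3)
print(clusters)
```

In this toy setting the two diabetes titles and the two statin titles end up in separate groups. A real system would need richer text representations and an explicit mapping to PICO elements, which this sketch does not attempt.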
Summarizing the results obtained from this study, it is worth mentioning that advances in technology have revolutionized the healthcare sector. ML has helped create tools and methods for the effective management of data in healthcare [64].
Data mining, also known as knowledge discovery from databases, is a process of mining and analyzing enormous amounts of data and extracting information from it [33].
The growing interest in the extraction of useful knowledge from data with the aim of being beneficial for the data owner has given rise to multiple data mining tools [35].
More specifically, this review produced the following results:
• Using MLTs to assist with data extraction resulted in performance gains compared with using manual extraction.
• At the same time, MLTs are flexible enough to speed up and further improve the results of meta-analyses.
• In summary, there are a number of data mining tools available in the digital world that can help researchers with the evaluation of clinical trial outputs [34]. Evaluations from applying ML to datasets and clinical studies show that this approach could yield promising results.
Evaluations of these tools were found in a number of articles identified by this study. Specifically, the initial use of automated tools resulted in modest [21] to satisfactory [29] time savings compared with manual management.
In addition, the evaluation of the performance, functionality, usability, user interface, and system requirements also yielded positive results [35].

Discussion
The whole idea of developing ML is associated with achieving faster, more efficient, and more reliable results in the health sector. ML mainly comprises algorithms that, when put together, have the power to diagnose, display results, and feed data into databases faster than the traditional method of entering the data manually. Nowadays, as more clinically relevant datasets are available electronically, researchers have applied ML techniques to a wide range of clinical tasks [64].
As reported in the literature, many benefits arise from MLTs in the field of extracting clinical trial results.
More specifically, ML has been described as "the key technology" for the development of precision medicine [4]. ML uses computer algorithms to build predictive models based on complex patterns in data. ML can integrate the large amounts of data required to "learn" the complex patterns required for accurate medical predictions. ML has excelled in automated meta-analysis, the extraction of data from clinical trials, text mining, semi-automated bias assessment, and specific medical domains.
The aim of this article is to discover data mining tools used in EBHI and to provide the research community with an extensive study based on a wide set of features that any tool should satisfy. In this paper, the author addresses the growing interest in data mining and describes the most popular mining tools used in EBHI, especially those used to extract clinical trial results.
Although there is no tool that can automate the entire knowledge extraction process, the author identified a broad evidence base of publications describing (semi)automated approaches to extracting results from clinical trials.
However, the lack of publicly available gold-standard data for evaluation, and the lack of application thereof, makes it difficult to draw conclusions about which is the best-performing system for each data extraction target [66].
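To make concrete what a shared gold standard would allow, the following minimal sketch scores a tool's extracted items against manually annotated ones using precision, recall, and F1. These are standard metrics, not a procedure taken from any of the reviewed articles; the function name, the (trial, field, value) representation, and all values are hypothetical.

```python
def extraction_scores(gold, predicted):
    """Compare extracted items with gold-standard items by exact match.

    Both arguments are iterables of hashable items, here
    (trial_id, field, value) tuples.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical manual annotations (gold standard) for two trials.
gold_items = [
    ("trial_1", "sample_size", "120"),
    ("trial_1", "intervention", "metformin"),
    ("trial_1", "outcome", "hba1c"),
    ("trial_2", "sample_size", "80"),
]
# Hypothetical tool output: three correct items and two errors.
tool_items = [
    ("trial_1", "sample_size", "120"),
    ("trial_1", "intervention", "metformin"),
    ("trial_1", "outcome", "body weight"),
    ("trial_2", "sample_size", "80"),
    ("trial_2", "intervention", "placebo"),
]
scores = extraction_scores(gold_items, tool_items)
print(scores)
```

Exact matching is a deliberately strict design choice. Evaluations of real tools often allow partial or token-level matches, which is one reason why results reported under different criteria are hard to compare.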
This review aims to present the appropriate MLTs that will allow for faster and more reliable extraction of information and knowledge from clinical trials in relation to prognosis, diagnosis, treatment, and drug use, as related studies are limited.
There are a limited number of relevant studies. However, these either refer to algorithms and techniques or study the performance of an MLT with applications in clinical trials. The reviews that focus on the subject of data extraction from clinical trial data either present a small sample of MLTs or deal with a specialized task.
Thus, the contribution of this study is the renewal of existing knowledge by presenting a large number of older and more modern tools for extracting information and knowledge from the outputs of clinical studies. MLTs of more general use are also presented, i.e., tools that are not limited to the management of RCTs, but that can be used in them as well.
Nevertheless, this review aims first to explore the options available for automating information and knowledge extraction in this domain. A more detailed and in-depth review will follow in the future.
In addition, the present study has some methodological limitations. First, the author had some difficulty in identifying suitable articles. This limitation was partially addressed through the use of snowballing methods. Second, the author included only articles written in English.
In addition, it was not possible to present a consolidated list with a common rating. This happened because each author adopted different evaluation criteria for the tools they presented.

Conclusions and Future Directions
Evidence-based knowledge synthesis in medicine, i.e., clinical trials, is rapidly becoming unfeasible due to the extremely rapid increase in evidence production. At the same time, limited resources (human resources, time, and money) can be better used with computational assistance and automation to significantly improve the process of extracting knowledge from clinical trials. In addition, advances in the automation of systematic reviews of clinical trials will provide clinicians with more evidence-based answers and thus enable them to provide higher quality information [17].
Furthermore, ML is the fastest growing field in computer science, and in health informatics one of its biggest challenges is to improve medical diagnoses, disease analysis, and drug development in the future [67].
In summary, design based on the application of clinical trial outputs in ML is a promising approach to implement more effective solutions.
However, more studies are needed in the future for clinical and technical validation of the performance of ML tools in the health sector. Among other things, future research should focus on studying the assessment characteristics in order to propose common measurement standards and assessment mechanisms for these MLTs.
It is also important to conduct a systematic review that evaluates the MLTs analytically and precisely and applies strict evaluation criteria to them. In this way, it becomes possible to choose the right MLT for each case.