1. Introduction
Increasingly, the healthcare sector is looking for computer applications to support the daily practice of health professionals. Open-source software is particularly desirable in this sector because, besides being free, its source code is fully available to users for viewing, reading, modification, and redistribution, without the restrictions of product ownership (unlike free software, which only allows its use without charge).
Open-source software differs fundamentally from a proprietary model in terms of the development process and the product licenses. All open-source applications are distributed under an open-source license, which gives the user the right to use the software, access and modify the source code, and redistribute the software for free. This type of software is very popular due to its many advantages, since it promises to accelerate the diffusion of Information Technology (IT) solutions in healthcare. Thereby, it can contribute to reduced development costs [1, 2].
On the other hand, the choice of a free software license in health and medical informatics is important because it determines the user’s rights and can influence the developers’ willingness to participate in a project, the quality of the product, and the willingness of users to adopt certain applications [1, 2].
In terms of costs, organizations can save on licensing fees and reduce expenditures on specific computer hardware. However, organizations need to hire and train specialized staff when adopting open-source solutions. This brings hidden costs for highly skilled employees, implementation, maintenance, and support, which may lead organizations to adopt proprietary solutions instead [1, 2].
The acquisition and implementation of a Business Intelligence (BI) tool are quite advantageous for a health organization, and extending its use can have a positive global impact. However, the healthcare environment has some particularities that a BI solution should be prepared to address. For example, a BI system could lead to resource optimization in various departments, and it could improve the clinical condition of the patient through efficient diagnosis and the identification and application of best practice protocols for treatment, among others [3].
In order to help decision-makers make the best choices, a benchmarking study of BI tools focused on the healthcare environment was performed. After a thorough review of the literature, the tools to be used in this study were determined. These tools were selected based on their good performance in several areas, such as management, healthcare, or retail [4, 5, 6]. Thus, the tools selected for this study were QlikView [7], Palo BI Suite [8], Jaspersoft BI [9], Tableau Public [10], Spago BI [11], and Pentaho BI Suite [12].
The analysis of this type of software emerged during a project that included the development of a BI platform in Maternity Care to visualize clinical and management indicators, as well as to integrate Data Mining (DM) predictive models. All the tools were tested using real data provided by the Centro Hospitalar do Porto (CHP) and Centro Materno Infantil do Norte (CMIN). This study aims to select the BI tool that best suits the healthcare sector by using a practical case.
Besides the introduction, this article includes eight sections. The second section covers the background and related work, addressing the concept of Business Intelligence and introducing the case study. The third section is concerned with the application of BI tools in healthcare environments, followed by the requirements considered essential in these tools, in Section 4. Section 5 then addresses the selected tools in more detail, and Section 6 discusses our approach to the case study. Finally, the last sections correspond to the results, discussion, and conclusion.
5. Business Intelligence Tools—Benchmarking
The study was based only on the analysis of open-source solutions. To this end, published works retrieved from scientific databases were analyzed, and the most used open-source solutions were identified. Then the top six solutions (QlikView, Palo, Jaspersoft, Tableau, Spago, and Pentaho) were selected to be explored.
In this study, benchmarking was used in order to compare tools and define evaluation metrics, focusing on the healthcare industry. In the following subsections, an overview of the Business Intelligence (BI) tools tested during this study is presented, based on the experiments made, together with an extended analysis, i.e., a literature review, of each [2, 4, 20, 25, 26, 29, 30, 31, 32, 33, 34]. All the BI tools were selected based on web-based studies, i.e., a review of the literature determined which tools were to be used in this study.
5.1. QlikView
QlikView (QV) is a BI tool developed by the Swedish company QlikTech. Although this software is a proprietary product, the company provides a full development version of the software for free. The company also offers various licensing options, each limiting the use of the software in accordance with the license acquired by the user.
A key feature of this software is that it is fairly simple to extract data from different sources, since it allows connections through Open Database Connectivity (ODBC) and Object Linking and Embedding, Database (OLE DB), which are communication interfaces between applications and the various databases. It also supports several OLAP operations, facilitating user navigation between different dimensions through ad hoc queries. Moreover, it enables the creation of a friendly and flexible interface with charts, pivot tables, and statistical analyses.
Another important feature of the software is that it does not use a Database Management System (DBMS) as a storage tool. The system connects to the Online Transaction Processing (OLTP) database, which is used only until the data loading process is complete. The data are then submitted to ETL processes and compressed into a file with the extension .qvw, which QV understands. This QlikView file contains all the details required for data analysis: the data itself; the script needed to update the QV file with new data from the data source; the layout information (folders, lists, charts, etc.); alerts, bookmarks, documents, and reports; and information about access restrictions and the macro module.
Moreover, the distribution of information is facilitated and the analysis can be performed regardless of the original data location or network conditions. The end user can therefore view the application in many ways: once the .qvw file is generated, it may be viewed on any machine, together with the data and reports produced. In addition, it can be accessed from a browser, through an application server, according to the security rules of the customer. However, this feature is only available in the paid version of QV. In the free version, the final document can only be used on the machine where it was initially developed.
In conclusion, the advantages of QV are the ease of reporting for end users once the ETL processes have been carried out, and the fact that reports including graphics can be produced with only basic knowledge of aggregate functions (e.g., sums and counts).
6. Business Intelligence Tools—Assessment Process
This section relates the features identified as requirements of a Business Intelligence (BI) platform in a healthcare environment (Section 4) to the BI tools previously chosen and analyzed based on web-based studies, i.e., through a review of the literature (Section 5).
The assessment process was carried out in a group setting (10 people: five nurses and five IT specialists). A nurse was responsible for creating the team and providing us with the final observations. The group performed two rounds of assessment and then provided us with the final assessment of each feature (group result). In order to maintain the anonymity of the results, we did not interfere in the assessment process. At the end, a meeting was held with the responsible nurse in order to understand the group’s opinions.
Thus, in Table 1, a comparison of the tools and the requirements is given. For the comparative analysis of the selected tools, a classification criterion based on the following scale has been chosen:
Thus, each tool was rated on each of the characteristics. This rating reflects the tool's level of satisfaction of, and compliance with, the characteristic analyzed. The score was based on the experimentation and critical view of the users.
On the other hand, after analyzing each of the characteristics for each tool, the requirements were grouped by a degree of similarity among them. Six groups were considered, where all the requirements were distributed, and a percentage was attributed according to their importance in a BI platform in a healthcare context.
It must be noted that all the scores attributed and the weightings assigned to each group were defined based on the critical opinion of a multidisciplinary team of health professionals, i.e., physicians and nurses from Centro Materno Infantil do Norte (CMIN), and IT professionals from Centro Hospitalar do Porto (CHP), including the authors of this paper. Thus, the first part of the evaluation was based on collecting opinions from several professionals involved directly or indirectly with the use of BI tools and, then, the critical analysis and assessment of all the feedback collected from the professionals by the authors involved in this study.
Thus, initially, the group of indispensable features (must have), whose importance cannot be measured, i.e., they are strictly necessary, was identified.
Then, a group was defined in which the requirements were targeted for the benefit of the administrator responsible for the construction of the BI platform. This group was assigned a percentage of 5%, because these requirements do not have great importance for the health organization. They may have some value only to the programmer who manages the tool.
Another group comprised the characteristics that are advantageous for the end-user of the BI platform. This group was assigned a percentage of 25%. It has a higher percentage because it is important that the end-user is satisfied; otherwise, the platform’s implementation may fail.
On the other hand, the group with the highest percentage (30%) corresponds to the technologies that the tool incorporates, because the main focus of these tools is their study and analysis functionalities. Without features such as OLAP technologies or dashboards, a BI platform loses all its interest to IT professionals and health professionals.
A percentage of 25% was allocated to a group of other important features.
Finally, the last group created was associated with the processing of data and was assigned a percentage of 15%. All the features included in this group would be advantageous to incorporate in the tool; however, they do not have much influence on the success of the construction of the BI platform. For example, a tool that incorporates ETL procedures would bring many benefits, but the ETL process may also be performed using other tools.
The groups, associated features, and respective percentages are all shown in Table 2.
7. Business Intelligence Tools—Assessment Results
After the construction of Table 1 and Table 2, a final grade for each of the Business Intelligence (BI) tools was obtained. For each requirement, the score assigned to the tool was multiplied by the percentage of the group to which the requirement belongs; the scores of the must-have group were summed without weighting. The final grade is the sum of these values.
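The grading rule described above can be written compactly as follows. The notation is introduced here only for illustration and does not appear in the original tables: $s_{t,r}$ is the score of tool $t$ on requirement $r$, $M$ is the set of must-have requirements, $G$ is the set of weighted groups, $R_g$ is the set of requirements in group $g$, and $w_g$ is the percentage assigned to group $g$ (expressed as a fraction).

```latex
\[
  S_t \;=\; \sum_{r \in M} s_{t,r} \;+\; \sum_{g \in G} w_g \sum_{r \in R_g} s_{t,r},
  \qquad \sum_{g \in G} w_g = 0.05 + 0.25 + 0.30 + 0.25 + 0.15 = 1 .
\]
```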
For example, in the case of the Pentaho BI Suite, the ratings for each group are as follows:
Must Have: 5 + 5 + 5 + 4 = 19;
Administrator: (2 + 3 + 4 + 5) × 0.05 = 0.7;
End-user: (5 + 4 + 5 + 0 + 4) × 0.25 = 4.5;
Technologies: (5 + 4 + 3) × 0.3 = 3.6;
Other Important: (4 + 5 + 5 + 5 + 5) × 0.25 = 6;
Data Processing: (5 + 1) × 0.15 = 0.9.
Therefore, it can be verified that the final evaluation of the Pentaho BI Suite is 19 + 0.7 + 4.5 + 3.6 + 6 + 0.9 = 34.7. This procedure was repeated for each of the remaining tools.
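The arithmetic of this worked example can be reproduced with a short script. The following is a minimal sketch, not the procedure used by the authors: the function name `final_score` and the dictionary layout are assumptions made for illustration, while the ratings and group percentages are the ones quoted in the text. Decimal arithmetic is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Group percentages as stated in Section 6 (the "Must Have" group is unweighted).
WEIGHTS = {
    "Administrator": Decimal("0.05"),
    "End-user": Decimal("0.25"),
    "Technologies": Decimal("0.30"),
    "Other Important": Decimal("0.25"),
    "Data Processing": Decimal("0.15"),
}

# Pentaho BI Suite ratings per group, copied from the worked example.
PENTAHO = {
    "Must Have": [5, 5, 5, 4],
    "Administrator": [2, 3, 4, 5],
    "End-user": [5, 4, 5, 0, 4],
    "Technologies": [5, 4, 3],
    "Other Important": [4, 5, 5, 5, 5],
    "Data Processing": [5, 1],
}


def final_score(ratings, weights=WEIGHTS, must_have="Must Have"):
    """Sum the must-have ratings, then add each weighted group total."""
    total = Decimal(sum(ratings[must_have]))
    for group, weight in weights.items():
        total += weight * sum(ratings[group])
    return total


if __name__ == "__main__":
    print(final_score(PENTAHO))  # 34.70, matching the 34.7 reported above
```

Applying the same function to the ratings of another tool would only require a second dictionary with the same group names.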
In Table 3, the respective final ratings are shown.
Analyzing the values reported in Table 3, it could be concluded that, at first glance, the most appropriate tool would be Spago BI. Mathematically speaking, Spago BI is the better tool; however, the difference between it and Pentaho BI Suite is insignificant. Taking this point into consideration, along with the fact that the Spago BI installation is quite a complex process, that there is little supporting documentation, and that this software uses considerably more RAM, Pentaho BI Suite was the tool chosen to implement the case study.
To justify this choice, a final quick comparative analysis between Pentaho BI Suite and each of the other tools was made. Thus, comparing Pentaho BI Suite with Spago BI, it appears that both tools are very good, and have very close scores on different requirements. However, the main distinguishing feature between these two tools is that the Spago BI installation is more difficult.
Regarding Jaspersoft BI, it appears that the Pentaho BI Suite provides more capabilities and better interactivity in terms of dashboards. Moreover, Pentaho BI Suite has many more plug-ins available, such as CDE, Saiku, and OpenI, all offered in the marketplace, unlike Jaspersoft BI, which presents a very limited number of plug-ins to date.
Regarding Palo BI Suite, it does not allow the display of KPIs and, moreover, does not allow the integration of Data Mining (DM) processes, unlike Pentaho BI Suite.
On the other hand, the biggest disadvantage of QlikView compared with Pentaho BI Suite is that it does not allow access to dashboards through browsers. This is an important feature, since it is essential that the application can be accessed anywhere through a web page and by multiple users simultaneously. Besides that, QlikView is not completely open-source software, since this feature is only available in the server version. In terms of other features, QlikView and Pentaho BI Suite are broadly similar.
Finally, Tableau Public is considered the tool that has the lowest number of desired characteristics, and does not support some major features, including the connection to an Oracle database, which is where all the data used in this BI project are stored.