Article

Causal Discovery for Patient Classification Using Health-Related Quality of Life Questionnaires

by Maria Ganopoulou 1, Konstantinos Fokianos 2, Christos Bakirtzis 3, Lefteris Angelis 1 and Theodoros Moysiadis 4,*
1 School of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
2 Department of Mathematics & Statistics, University of Cyprus, Nicosia 1678, Cyprus
3 Multiple Sclerosis Center, Second Department of Neurology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
4 Department of Computer Science, School of Sciences and Engineering, University of Nicosia, Nicosia 2417, Cyprus
* Author to whom correspondence should be addressed.
BioMedInformatics 2025, 5(2), 28; https://doi.org/10.3390/biomedinformatics5020028
Submission received: 19 February 2025 / Revised: 15 May 2025 / Accepted: 16 May 2025 / Published: 23 May 2025

Abstract

Background: Health-related quality of life (HRQoL) questionnaires are essential for understanding the physical, psychological, lifestyle, and social factors that impact patients’ well-being. Causal discovery demonstrates significant potential in this direction; however, it has not yet been thoroughly assessed. This study aimed to explore the prospect of using causal discovery as a methodological tool for binary classification of patients based on HRQoL questionnaire data. Methods: The focus was on questionnaire structures similar to the EQ-5D-5L, which includes both ordinal and quantitative items. A customized classification algorithm is proposed, which utilizes the differences between the causal structures derived from the HRQoL questionnaire answers of patients who belong to two distinct groups. This algorithm was evaluated using the correct classification rate (CCR) and the misclassification rate (MR), based on simulated data under varying sample sizes and causal structure complexity, and within a real-world data application. Results: In both the simulation and the application, the CCR exhibited larger values than the MR; however, the percentage of cases in which the algorithm could not reach a decision was, in general, not negligible. The adjusted CCR (computed only over the cases in which the algorithm yields a decision) exhibited substantially improved values compared to the CCR in both analyses. Within the application, the algorithm showed mixed performance compared to a standard stepwise binary logistic regression approach. Conclusions: The proposed algorithm has the potential to correctly classify patients, but further investigation is needed to evaluate its performance under different scenarios in a large-scale real-world setting. Determining the necessary conditions for successful classification would allow causal discovery to be exploited effectively to further advance the role of HRQoL questionnaires in patient care and management.

1. Introduction

Health-related quality of life (HRQoL) questionnaires are used to evaluate the perception of individuals regarding their health and well-being. These questionnaires encompass multiple facets of life, such as physical, psychological, emotional, and social dimensions, and are widely employed in clinical research and healthcare to assess how health conditions and treatments affect people’s lives. Among the most commonly used are the MOS 36-Item Short-Form Health Survey (SF-36) [1], the World Health Organization Quality of Life-Brief Version (WHOQOL-BREF) [2], the Functional Assessment of Cancer Therapy (FACT) measurement system [3], and the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-C30 (EORTC QLQ-C30) [4]. The EuroQol questionnaire EQ-5D-5L was designed to assess five dimensions—mobility, self-care, usual activities, pain/discomfort, and anxiety/depression—each with five response levels (Likert scale) [5]. On top of these five ordinal variables/items, the EQ-5D-5L includes a quantitative variable, in which the patients are invited to assess their health on that specific day on a scale from 0 to 100 (100 representing the best health).
The HRQoL questionnaire-related literature typically employs standard descriptive, correlation, and inferential statistical analysis [6]. The standard correlation indices of Pearson and Spearman, for instance, were used to assess the relationship between the HRQoL score and demographic, clinical, biological, and other parameters [7,8,9,10,11,12,13,14]. Different subgroups of patients and controls have been compared based on their HRQoL scores [7,8,9,11,15,16,17,18,19,20,21,22]. Regression analysis has been employed to evaluate the association of various patient characteristics with the HRQoL score (e.g., [7,11,16,18,19,23]). The existence of a standard statistical correlation between two variables, however, does not necessarily signify a cause-and-effect relationship, namely, that one of the variables is the cause of the other variable, which in this case is called the effect. Causal discovery justifies the causal nature of an association between two variables on the basis of its persistence [24]. Persistence characterizes a causal relationship, and testing for a causal relationship involves all the remaining variables of a data set and considers all circumstances [25]. In other words, a causal association is expected to hold in all situations, without being affected by the values of other variables. Therefore, causal relationships tend to be less spurious or volatile than statistical associations, such as correlation [24]. These aspects strongly motivate the use of causal discovery, which holds significant potential for a wide range of applications. As stated by Li et al., “Causal discovery is a step forward in data exploration and prediction” [24]. Applications of causal discovery can be found in the literature in diverse fields (see, e.g., [26,27,28,29,30,31,32,33]).
The two most frequently used causal discovery methods are causal Bayesian Networks (BNs), which are graphical models that reflect the causal relationships between variables through a Directed Acyclic Graph (DAG), and structural equation modeling (SEM), which evaluates the relationships among constructs of variables [34].
Estimating and understanding the cause-and-effect relationships among the HRQoL questionnaire items may contribute to patient care and management by solidifying current knowledge and by revealing important insights, thus facilitating the design and implementation of more focused strategies. Despite this broad potential, the benefits of causal discovery have so far drawn limited interest in the HRQoL-related literature. There have been, however, a few attempts to explore causal relationships with HRQoL questionnaires. Krethong et al. [35] used SEM to investigate the causal structure involving the bio-physiological, functional, and symptom status, social support, general health perception, and HRQoL of Thai patients with heart failure. SEM was also used to evaluate the causal relationships between age, social support, symptom experience, self-care strategies, antiretroviral treatment, and HRQoL for HIV/AIDS patients in the northern region of Thailand [36]. Gąsior et al. [37] employed the Greedy Fast Causal Inference technique to develop a causal diagram of sport participation for children and adolescents with heart disease [38,39]. In a previous study of our group [40], we utilized BNs and DAGs with HRQoL questionnaire data, highlighted the most important aspects that should be taken into account when estimating the causal structure within HRQoL applications, and proposed tools to interpret the obtained results.
The aim of the current study is to propose a binary classification algorithm that employs the causal structures of the HRQoL questionnaire items corresponding to two groups of patients in order to classify patients into these two groups. To the best of our knowledge, this is the first time that a causal discovery-based algorithm is used for classification purposes in the literature. The two groups may refer to any grouping of interest that is potentially related to the HRQoL of the patients. For example, the grouping could reflect worsening health conditions/disease progression (Yes/No), how well patients respond to treatment/treatment outcome (Yes/No), increased risk of depression/anxiety (Yes/No), adherence to treatment (Yes/No), etc. The focus was on questionnaire structures similar to the EQ-5D-5L. The reason is that the EQ-5D-5L includes both ordinal and quantitative items, thus providing a more general example compared to questionnaires that involve only ordinal items. Therefore, conditional linear Gaussian (hybrid) BNs (mixed discrete and normal variables) were employed.
The algorithm was initially assessed based on simulated data under diverse conditions, related to the complexity of the structure (number of items and cause–effect relationships among them) and the sample size (number of patients). A real-world data application was also employed to showcase the usage of the algorithm in a real-problem setting and to evaluate its performance. The results showed that the classification algorithm exhibited potential in correctly classifying random patients in their group; however, many aspects should be further assessed in order to better understand its behavior under different circumstances, particularly in a large-scale real-world setting. Determining the relevant conditions for successful classification would result in effectively exploiting causal discovery for this purpose and could further advance the role of HRQoL questionnaires in patient care and management.
In Section 2 of this manuscript, the detailed methodology is provided, including the definition of DAGs, the proposed algorithm design, the simulation study design, and key aspects of the real-world data application. Section 3 includes the results of the analysis, Section 4 corresponds to the discussion, Section 5 refers to the limitations of the study and future perspectives, and Section 6 concludes the paper.

2. Materials and Methods

2.1. Directed Acyclic Graph

A DAG is a structure that includes nodes/variables and directed edges. If there exists a directed edge that connects node X to Y, then X is considered to be the parent (or cause) of Y, while Y is considered to be the child (or effect) of X [41]. The acyclic nature of the DAGs signifies that there are no sequences of edges that begin from and end at the same node. The skeleton of a DAG is formed by removing the direction of all the edges in the DAG, constituting an undirected graph. If a DAG contains the edges X → Y and Y ← Z and does not contain an edge between X and Z, then the ordered triplet of nodes (X, Y, Z) is defined as a v-structure. A Markov equivalence class contains all the DAGs that have the same skeleton and the same v-structures. It is represented by a completed partially-directed acyclic graph (CPDAG) [42].
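As an illustration of these definitions, the following minimal sketch extracts the skeleton and the v-structures of a small DAG and checks whether two DAGs fall in the same Markov equivalence class. It is written in plain Python rather than with the R tooling used later in the study; the edge-set representation, the function names, and the three-item example graphs are illustrative assumptions, not part of the original analysis.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG: every directed edge becomes an unordered pair."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Triplets (X, Y, Z) with X -> Y <- Z and no edge between X and Z."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for x, z in combinations(sorted(pars), 2):
            if frozenset((x, z)) not in skel:
                found.add((x, child, z))
    return found


def markov_equivalent(edges_a, edges_b):
    """Two DAGs are Markov equivalent if skeleton and v-structures coincide."""
    return (skeleton(edges_a) == skeleton(edges_b)
            and v_structures(edges_a) == v_structures(edges_b))


# Hypothetical three-item graphs; each edge is written as (parent, child).
chain = {("Mobility", "SelfCare"), ("SelfCare", "Score")}
reversed_chain = {("Score", "SelfCare"), ("SelfCare", "Mobility")}
collider = {("Mobility", "SelfCare"), ("Score", "SelfCare")}

print(v_structures(collider))
print(markov_equivalent(chain, reversed_chain))
print(markov_equivalent(chain, collider))
```

The chain and its reversal share a skeleton and have no v-structure, so they belong to the same equivalence class, whereas the collider has the same skeleton but a v-structure and therefore does not.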

2.2. Causal Binary Classification Algorithm

The binary classification algorithm proposed herein utilizes the causal structures deriving from the answers, to a specific HRQoL questionnaire, of patients that belong to two distinct groups, which are intuitively related to the HRQoL of the patients (e.g., disease progression (Yes/No), treatment outcome (Yes/No), increased risk of anxiety (Yes/No), adherence to treatment (Yes/No)). Namely, it is expected that the answers of a patient to the HRQoL questionnaire are related to the group the patient belongs to. However, these groups should not be deterministically obtained by the HRQoL questionnaire items, in order for the classification to be meaningful.
The reasoning underlying the proposed algorithm is based on the expectation that the two groups of patients will exhibit, in general, two distinct causal structures based on the HRQoL questionnaire data. Thus, when a new patient belonging to one of these groups is added to each of the two patient groups, it is expected that the causal structure of the group the patient actually belongs to will not change, or will change to a lesser extent than the causal structure of the group the patient does not belong to. Practically, the cause–effect relationships among HRQoL questionnaire items that are characteristic of each of the two patient groups can be utilized to decide to which group the new patient belongs. Utilizing the underlying causal structure constitutes a competitive advantage of the proposed algorithm over other classification approaches.
This reasoning is supported by many examples in the literature, where distinct groups of subjects exhibit distinct correlation structures. For example, in Tsanousa et al. [43], the differences in the genes’ correlation structures between two important subgroups of chronic lymphocytic leukemia (CLL) patients, namely, mutated and unmutated CLL, were employed to develop distinct gene networks with the use of SEM. These networks reflected the differences between these two subgroups (see also [44,45]). Bai et al. [46] studied the similarities and differences in the topological patterns of white matter structural networks between remitted geriatric depression patients, amnestic mild cognitive impairment patients, and healthy subjects, leading to evidence that could be used to define a population at risk of Alzheimer’s disease. Only recently, Muhetaer et al. [47] used a network approach to investigate the interconnection between depression and HRQoL domains in cancer patients. Among others, they estimated and compared the DAGs corresponding to three distinct groups, namely, patients with mild depression, patients with moderate-to-severe depression, and patients with no depression. They observed specific differences in the structures that indicated the potential impact of depression level on the overall HRQoL of the patients. These studies highlight the potential of our proposed approach, which aims to build on differences in HRQoL structures between two distinct groups of patients for classification purposes.
The steps of the proposed classification algorithm are described below.
  • Consider two distinct groups of patients (related to the patients’ HRQoL).
  • Use a causal structure learning algorithm to estimate the CPDAG for each of the two patient groups, based on their answers to a specific HRQoL questionnaire. These CPDAGs represent the causal structure corresponding to each patient group. Denote these two CPDAGs by CPDAG1 and CPDAG2.
  • Consider a new patient for whom we are interested in estimating the group they belong to. Add this new patient (answers to the HRQoL questionnaire) to both the first group of patients and the second group of patients. Estimate the causal structure (CPDAG) corresponding to the expanded first patient group (now additionally including the new patient) and to the expanded second patient group (similarly, additionally including the new patient). Denote these two CPDAGs by CPDAG1New and CPDAG2New.
  • Use the Structural Hamming Distance (SHD) [48] to compare the CPDAG1New to the CPDAG1 (SHD(CPDAG1New, CPDAG1)), and the CPDAG2New to the CPDAG2 (SHD(CPDAG2New, CPDAG2)). The SHD compares two CPDAGs and represents the number of operations required to make these CPDAGs match [48], in particular, to add or delete an undirected edge, and add, remove, or reverse the orientation of an edge.
  • If SHD(CPDAG1New, CPDAG1) < SHD(CPDAG2New, CPDAG2), the new patient is estimated to belong to the first group.
    If SHD(CPDAG1New, CPDAG1) > SHD(CPDAG2New, CPDAG2), the new patient is estimated to belong to the second group.
    If SHD(CPDAG1New, CPDAG1) = SHD(CPDAG2New, CPDAG2), the algorithm cannot conclude to which group the new patient belongs.
The design of the causal binary classification algorithm is displayed in Figure 1 as well.
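Steps 4 and 5 above can be sketched as follows, in plain Python rather than the R environment used in the study. Several points are assumptions made for illustration: a CPDAG is stored as a set of arcs, with an undirected edge represented by two opposite arcs; the SHD is computed by counting each pair of nodes whose edge status differs as one operation, which is one reading of the definition in step 4 and may not reproduce a specific library implementation in every case; and the four example CPDAGs are written by hand, not learned from data.

```python
def edge_state(arcs, a, b):
    """Edge status between a and b: (a -> b present, b -> a present)."""
    return ((a, b) in arcs, (b, a) in arcs)


def shd(cpdag_a, cpdag_b):
    """Number of node pairs on which the two graphs disagree.

    A missing, extra, reversed, or differently oriented edge each counts once.
    """
    pairs = {frozenset(arc) for arc in cpdag_a | cpdag_b}
    distance = 0
    for pair in pairs:
        a, b = sorted(pair)
        if edge_state(cpdag_a, a, b) != edge_state(cpdag_b, a, b):
            distance += 1
    return distance


def decide(shd_group1, shd_group2):
    """Step 5: the smaller structural change wins; equal changes give no decision."""
    if shd_group1 < shd_group2:
        return "group 1"
    if shd_group1 > shd_group2:
        return "group 2"
    return "no decision"


# Hand-written placeholder CPDAGs (hypothetical, for illustration only).
cpdag1 = {("A", "B"), ("C", "B")}                               # A -> B <- C
cpdag1_new = {("A", "B"), ("C", "B")}                           # unchanged
cpdag2 = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}       # A - B - C
cpdag2_new = {("A", "B"), ("C", "B")}                           # both edges oriented

d1 = shd(cpdag1_new, cpdag1)
d2 = shd(cpdag2_new, cpdag2)
print(d1, d2, decide(d1, d2))
```

In this toy case, adding the new patient leaves the first structure untouched and changes two edges of the second, so the patient would be assigned to the first group.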
Although the binary case (two groups of patients) is considered in this study, the proposed classification scheme can be straightforwardly generalized to classify patients into any number of groups. More specifically, in the case of k groups of patients (step 1 of the algorithm), the estimated causal structures (CPDAGs) corresponding to the k patient groups would be denoted by CPDAG1, CPDAG2, …, CPDAGk (step 2). Similarly, in step 3, the new patient would be added to each of the k patient groups, resulting in CPDAG1New, CPDAG2New, …, CPDAGkNew. Steps 4–5 would then determine which of the k resulting SHD values is the minimum, thus indicating the group to which the new patient is assigned.
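A minimal Python sketch of this generalization is given below. The text does not specify what happens when several groups share the minimum SHD; the sketch assumes, by analogy with the binary case, that such ties yield no decision. The function name and the group labels are hypothetical.

```python
def assign_group(shd_by_group):
    """Return the label with the unique minimum SHD, or None when the minimum is tied.

    shd_by_group maps a group label to SHD(CPDAGiNew, CPDAGi) for that group.
    Treating ties as "no decision" is an assumption mirroring the binary rule.
    """
    smallest = min(shd_by_group.values())
    winners = [group for group, dist in shd_by_group.items() if dist == smallest]
    return winners[0] if len(winners) == 1 else None


print(assign_group({"group 1": 3, "group 2": 1, "group 3": 4}))
print(assign_group({"group 1": 1, "group 2": 1, "group 3": 4}))
```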

2.2.1. Simulation Study Design

To assess the efficiency of the binary classification algorithm, a simulation study was performed. The focus in this manuscript was on conditional linear Gaussian BNs, namely, BNs that involve both discrete and normal variables. Four hypothetical HRQoL questionnaires were considered with different numbers of items (variables). Each questionnaire included one continuous item, titled Score, reflecting the synonymous item in the EQ-5D-5L questionnaire, while the remaining items were ordinal with 4 levels. The number of levels was arbitrarily selected and reflected the fact that the items found in HRQoL questionnaires usually involve 3 to 5 answers. The ordered nature reflected the increasing health burden typically encountered in HRQoL items.
It was assumed that there were two hypothetical groups of patients, which, as previously mentioned, were related to the HRQoL of the patients. In the simulation study, the binary grouping of the patients was reflected in differences in the corresponding causal structures of these two patient groups, as it would have been expected in a real-data scenario. Namely, a different causal structure among the HRQoL questionnaire items was assumed for patients belonging to the first group compared to the respective causal structure that was assumed for patients belonging to the second group. Thus, for each hypothetical questionnaire (a specific set of items), a pair of synthetic DAGs was specified, representing the causal structures corresponding to these two patient groups. These four pairs of DAGs are displayed in Figure 2. Diverse numbers and complexity regarding the relationships (represented by the DAG edges) between the items (represented by the DAG nodes) were involved, aiming at reflecting structures of causal relationships that could be encountered in real HRQoL data. The aforementioned DAGs varied in the number of nodes involved (ranging from 12 to 27) and the quantity of directed edges among them (ranging from 12 to 26).
The tailored parameter specifications for each DAG were determined using the R package bnlearn [49] (Version 5.0.2) and can be freely accessed at a dedicated GitHub repository (https://github.com/teomoi/HRQoL-Synthetic-DAGs, accessed on 1 April 2025), which was developed for this study. These four pairs of synthetic DAGs have not been used before in the literature in any causal discovery-related or other context.
The four pairs of synthetic DAGs were then used in order to assess the proposed customized classification algorithm, as described below. More specifically, for each hypothetical HRQoL questionnaire and the corresponding pair of DAGs, the following steps have been sequentially performed within the simulation study.
  • Generate k + 1 random samples (patients) based on each one of the two DAGs (denoted as DAG1 and DAG2) using the rbn function of the R package bnlearn [49], V. 5.0.2 (specific details are also provided at https://github.com/teomoi/HRQoL-Synthetic-DAGs, accessed on 1 April 2025). The number k represented the sample size in each case, namely the number of simulated patients (completed questionnaires), and took the values 100, 500, 1000, and 5000.
  • For each one of the two DAGs, and based, respectively, on the first k random samples (out of the k + 1), use the pc.stable function of the bnlearn R package, V. 5.0.2, to learn the CPDAG from the simulated data using the PC algorithm [50]. The PC algorithm is a constraint-based structure learning algorithm, with PC standing for Peter Spirtes and Clark Glymour [51,52]. These two CPDAGs will be denoted as CPDAG1 and CPDAG2.
  • Add the (k + 1)th random sample corresponding to DAG1 (denoted as Test1) to both the first k random samples (out of the k + 1) of DAG1 and, respectively, to the first k random samples (out of the k + 1) of DAG2. Similarly, add the (k + 1)th random sample corresponding to DAG2 (denoted as Test2) to both the first k random samples of DAG1 and, respectively, to the first k random samples of DAG2. This results in four datasets, denoted, respectively, as DAG1Test1, DAG2Test1, DAG1Test2, and DAG2Test2, each of which includes k + 1 random samples.
  • Use the pc.stable function to learn the CPDAGs from these four datasets, denoted, respectively, as CPDAG1Test1, CPDAG2Test1, CPDAG1Test2, and CPDAG2Test2.
  • Use the shd function [49] to assess the differences between the CPDAGs based on the SHD. More specifically, use the SHD to compare the following:
    • the CPDAG1Test1 to the CPDAG1 (SHD(CPDAG1Test1, CPDAG1));
    • the CPDAG2Test1 to the CPDAG2 (SHD(CPDAG2Test1, CPDAG2));
    • the CPDAG1Test2 to the CPDAG1 (SHD(CPDAG1Test2, CPDAG1));
    • the CPDAG2Test2 to the CPDAG2 (SHD(CPDAG2Test2, CPDAG2)).
  • Repeat steps 1–5 100,000 times.
  • Compute the percentage of the cases (out of the 100,000 repetitions) in which:
    • the SHD(CPDAG1Test1, CPDAG1) was, respectively, smaller than, larger than, and equal to the SHD(CPDAG2Test1, CPDAG2),
    • the SHD(CPDAG1Test2, CPDAG1) was, respectively, smaller than, larger than, and equal to the SHD(CPDAG2Test2, CPDAG2).
  • Compute the adjusted percentage of the cases in which:
    • the SHD(CPDAG1Test1, CPDAG1) was, respectively, smaller than and larger than the SHD(CPDAG2Test1, CPDAG2),
    • the SHD(CPDAG1Test2, CPDAG1) was, respectively, smaller than and larger than the SHD(CPDAG2Test2, CPDAG2).
In essence, these adjusted percentages were computed given that the algorithm did arrive at a conclusion, namely, after excluding the cases in which the SHD(CPDAG1Test1, CPDAG1) was equal to the SHD(CPDAG2Test1, CPDAG2) (respectively, when the SHD(CPDAG1Test2, CPDAG1) was equal to the SHD(CPDAG2Test2, CPDAG2)).
Test1 and Test2, which were defined in step 3 of the simulation study pipeline, correspond to a new hypothetical patient belonging to the first group and the second group, respectively, for whom we are interested in estimating the group they belong to (see also step 3 in the proposed classification algorithm). The design of the simulation study is displayed in Figure 3 as well.
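The data handling in step 3 of the simulation pipeline can be sketched as follows. This is a Python illustration of the bookkeeping only: sample generation and structure learning, which the study performs in R, are not reproduced, and the rows below are placeholders rather than simulated questionnaire answers. The function name and dictionary keys are assumptions chosen to match the dataset names used in the text.

```python
def build_augmented_datasets(sample1, sample2):
    """Split two samples of k + 1 rows into k training rows and one test row each,
    then build the four augmented datasets of k + 1 rows described in step 3."""
    if len(sample1) != len(sample2) or len(sample1) < 2:
        raise ValueError("both samples must contain the same number (k + 1 >= 2) of rows")
    train1, test1 = sample1[:-1], sample1[-1]   # first k rows and Test1 (from DAG1)
    train2, test2 = sample2[:-1], sample2[-1]   # first k rows and Test2 (from DAG2)
    return {
        "DAG1": train1,
        "DAG2": train2,
        "DAG1Test1": train1 + [test1],
        "DAG2Test1": train2 + [test1],
        "DAG1Test2": train1 + [test2],
        "DAG2Test2": train2 + [test2],
    }


# Placeholder rows with k = 3; real rows would hold the questionnaire items and Score.
sample_from_dag1 = [("dag1", i) for i in range(4)]
sample_from_dag2 = [("dag2", i) for i in range(4)]
datasets = build_augmented_datasets(sample_from_dag1, sample_from_dag2)
for name, rows in datasets.items():
    print(name, len(rows))
```

CPDAG1 and CPDAG2 would then be learned from the entries "DAG1" and "DAG2", and the four test CPDAGs from the four augmented entries.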
For the implementation of the PC algorithm with the function pc.stable, the default arguments have been employed. The analysis has been performed with the R programming language, v.4.2.2.
The reasoning underlying the assessment of the proposed algorithm design within the simulation study is based on the expectation that an observation generated by a specific DAG should be related more to observations generated by the same DAG, compared to observations generated by a different DAG. In the context of the simulations, Test1 was expected to be related more to the first k random samples generated based on the same underlying causal structure (DAG1), compared to the first k random samples generated based on a different underlying causal structure (DAG2). Consequently, the SHD between CPDAG1Test1 and CPDAG1 was expected to be smaller compared to the SHD between CPDAG2Test1 and CPDAG2, leading to the correct classification of the Test1 random sample. Similarly, the SHD between CPDAG1Test2 and CPDAG1 was expected to be larger compared to the SHD between CPDAG2Test2 and CPDAG2, leading to the correct classification of the Test2 random sample. The percentages corresponding to the correct classification rate (CCR) and the misclassification rate (MR) (out of 100,000 repetitions) were defined as:
$$\text{CCR-Test1} = \frac{\text{Number of cases with } \mathrm{SHD}(\text{CPDAG1Test1}, \text{CPDAG1}) < \mathrm{SHD}(\text{CPDAG2Test1}, \text{CPDAG2})}{100{,}000}$$
$$\text{MR-Test1} = \frac{\text{Number of cases with } \mathrm{SHD}(\text{CPDAG1Test1}, \text{CPDAG1}) > \mathrm{SHD}(\text{CPDAG2Test1}, \text{CPDAG2})}{100{,}000}$$
$$\text{CCR-Test2} = \frac{\text{Number of cases with } \mathrm{SHD}(\text{CPDAG1Test2}, \text{CPDAG1}) > \mathrm{SHD}(\text{CPDAG2Test2}, \text{CPDAG2})}{100{,}000}$$
$$\text{MR-Test2} = \frac{\text{Number of cases with } \mathrm{SHD}(\text{CPDAG1Test2}, \text{CPDAG1}) < \mathrm{SHD}(\text{CPDAG2Test2}, \text{CPDAG2})}{100{,}000}$$
Based on the design of the simulation study, there was a non-zero probability that the algorithm would not arrive at any conclusion. More specifically, this occurred when adding Test1 to the first k random samples generated based on DAG1 changed the estimated CPDAG, relative to the CPDAG estimated from these k random samples alone (SHD(CPDAG1Test1, CPDAG1)), to the same extent as adding Test1 to the first k random samples generated based on DAG2 changed the corresponding estimated CPDAG (SHD(CPDAG2Test1, CPDAG2)). Similarly, the algorithm did not arrive at any conclusion in the cases in which adding Test2 to the first k random samples generated based on either DAG1 or DAG2 had an equivalent impact on the estimated CPDAGs (SHD(CPDAG1Test2, CPDAG1) = SHD(CPDAG2Test2, CPDAG2)). Therefore, adjusted percentages were computed as well to provide an adjusted measure of the correct classification rate and the misclassification rate, given that the algorithm did arrive at a conclusion. More specifically, these adjusted percentages were defined as:
adjusted   CCR- Test 1 = N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 1 , C P D A G 1 ) < S H D ( C P D A G 2 T e s t 1 , C P D A G 2 ) N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 1 , C P D A G 1 ) S H D ( C P D A G 2 T e s t 1 , C P D A G 2 )
adjusted   MR- Test 1 = N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 1 , C P D A G 1 ) > S H D ( C P D A G 2 T e s t 1 , C P D A G 2 ) N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 1 , C P D A G 1 ) S H D ( C P D A G 2 T e s t 1 , C P D A G 2 )
adjusted   CCR- Test 2 = N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 2 , C P D A G 1 ) > S H D ( C P D A G 2 T e s t 2 , C P D A G 2 ) N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 2 , C P D A G 1 ) S H D ( C P D A G 2 T e s t 2 , C P D A G 2 )
adjusted   MR- Test 2 = N u m b e r   o f   c a s e s   S H D C P D A G 1 T e s t 2 , C P D A G 1 < S H D ( C P D A G 2 T e s t 2 , C P D A G 2 ) N u m b e r   o f   c a s e s   S H D ( C P D A G 1 T e s t 2 , C P D A G 1 ) S H D ( C P D A G 2 T e s t 2 , C P D A G 2 )
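To make the relation between the raw and the adjusted rates concrete, the following Python sketch computes all of them from a list of SHD pairs, one pair per repetition. The function name, the input format, and the ten SHD pairs are invented for illustration and are not results from the study.

```python
def classification_rates(shd_pairs, true_group):
    """Compute CCR, MR, the undecided share, and the adjusted rates.

    shd_pairs: one (SHD to group 1, SHD to group 2) tuple per repetition.
    true_group: 1 if the test samples come from DAG1, 2 if they come from DAG2.
    """
    n = len(shd_pairs)
    smaller = sum(1 for d1, d2 in shd_pairs if d1 < d2)
    larger = sum(1 for d1, d2 in shd_pairs if d1 > d2)
    # For group 1 a smaller first SHD is correct; for group 2 a larger one is.
    correct, wrong = (smaller, larger) if true_group == 1 else (larger, smaller)
    decided = correct + wrong  # repetitions in which the two SHDs differ
    return {
        "CCR": correct / n,
        "MR": wrong / n,
        "undecided": (n - decided) / n,
        "adjusted CCR": correct / decided if decided else None,
        "adjusted MR": wrong / decided if decided else None,
    }


# Ten invented repetitions for a Test1-type sample (true group 1).
example_pairs = [(0, 2), (1, 1), (3, 1), (0, 1), (2, 2),
                 (0, 0), (1, 4), (0, 0), (2, 2), (5, 1)]
rates = classification_rates(example_pairs, true_group=1)
print(rates)
```

Here three repetitions are correct, two are wrong, and five are ties, so the CCR is 0.3 while the adjusted CCR, computed over the five decided repetitions, is 0.6. The adjusted CCR and adjusted MR always sum to one whenever at least one decision is reached.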
Considering different numbers of simulated patients (k) enabled the assessment of the impact of the sample size in all cases.

2.2.2. Application Design

The efficiency of the causal binary classification algorithm was assessed as well within a real-world data application. The aim of this application was to showcase the usage and performance of the algorithm in a real-problem setting, employing data related to the EQ-5D-5L questionnaire. These data were obtained from a study sample of 196 patients with multiple sclerosis. These patients were recruited by the Multiple Sclerosis Center of the Aristotle University of Thessaloniki. More details regarding the multiple sclerosis patients, including the inclusion criteria, informed consent, study protocol, ethics approval, etc., are provided in [53]. The patients completed, among others, the EQ-5D-5L questionnaire, in order for their HRQoL to be assessed. Of the 196 multiple sclerosis patients, 193 patients successfully completed all EQ-5D-5L items (the remaining 3 patients had missing responses). Of these 193 patients, 170/193 (88.08%) had Relapsing–Remitting Multiple Sclerosis (RRMS), while 23/193 (11.92%) had Secondary Progressive Multiple Sclerosis (SPMS) or Primary Progressive Multiple Sclerosis (PPMS). Patients with SPMS or PPMS were merged into the progressive multiple sclerosis group (n = 23), while RRMS patients represented the non-progressive multiple sclerosis group (n = 170).
The application of the causal binary classification algorithm aimed to assess whether the differences in the corresponding causal structures between these two groups were essentially contributing predictive power to effectively classify patients into their correct group (progressive or non-progressive). Following the reasoning underlying the proposed algorithm, it is expected that these two patient groups will exhibit distinct causal structures. Hence, adding, for example, a progressive multiple sclerosis patient to each of these two patient groups, it is expected that the causal structure of the progressive group will not change, or it will change to a lesser extent compared to the respective causal structure of the non-progressive group. Similarly, by adding a non-progressive multiple sclerosis patient to each of these two patient groups, it is expected that the causal structure of the non-progressive group will not change, or it will change to a lesser extent compared to the respective causal structure of the progressive group. The causal structures of the two groups refer to the causal relationships between the five EQ-5D-5L items (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) and the Score (scale from 0 to 100).
To assess the performance of the causal binary classification algorithm, the dataset would ordinarily be split at random into training and test groups. However, the progressive group includes just 23 patients, so any further splitting of this group into training and test datasets might yield erroneous conclusions for the eventual classification. This issue (small sample sizes) generally affects proposed classification algorithms. Instead, we decided to apply the algorithm (Figure 1) separately for each of the 193 patients: each patient in turn was considered as the “new patient” and excluded from the full dataset (n = 193), while the remaining 192 patients were employed as “group 1” and “group 2”, together with their corresponding causal structures. In other words, the algorithm was applied 23 times for the progressive group patients (group 1: 22 remaining progressive patients, group 2: 170 non-progressive patients) and 170 times for the non-progressive group patients (group 1: 23 progressive patients, group 2: 169 remaining non-progressive patients). Then, the distribution of the algorithm’s decisions was recorded (correct, wrong, no decision). In order to estimate the CPDAG from the data (see step 2 in Section 2.2), the PC algorithm was used, employing the pc.stable function of the bnlearn R package [50] (V. 5.0.2), similarly to the case of the simulation study.
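The leave-one-out scheme described above can be sketched in Python as follows. The causal classifier itself is not reproduced here: `classify` is a placeholder for any function that takes the two remaining groups and the held-out patient and returns 1, 2, or None (no decision). The toy classifier and the five one-dimensional "patients" below are stand-ins used only to make the sketch runnable; they are not the method or the data of the study.

```python
from collections import Counter


def leave_one_out(patients, labels, classify):
    """Hold out each patient in turn and record correct / wrong / no decision."""
    outcomes = Counter()
    for i, (row, label) in enumerate(zip(patients, labels)):
        # Groups are formed from all remaining patients, by their known label.
        group1 = [p for j, p in enumerate(patients) if j != i and labels[j] == 1]
        group2 = [p for j, p in enumerate(patients) if j != i and labels[j] == 2]
        decision = classify(group1, group2, row)
        if decision is None:
            outcomes["no decision"] += 1
        elif decision == label:
            outcomes["correct"] += 1
        else:
            outcomes["wrong"] += 1
    return outcomes


def toy_classify(group1, group2, new_value):
    """Stand-in classifier (nearest group mean); NOT the causal algorithm."""
    d1 = abs(new_value - sum(group1) / len(group1))
    d2 = abs(new_value - sum(group2) / len(group2))
    if d1 == d2:
        return None
    return 1 if d1 < d2 else 2


toy_patients = [1.0, 1.2, 0.9, 5.0, 5.3]   # invented values
toy_labels = [1, 1, 1, 2, 2]
outcomes = leave_one_out(toy_patients, toy_labels, toy_classify)
print(dict(outcomes))
```

With 193 patients, the same loop would call the classifier 193 times, each time with 192 patients split across the two groups.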
To compare these results with a well-known binary classifier, stepwise binary logistic regression was used within the same framework. Namely, each of the 193 patients was considered as the “new patient” and was excluded from the full dataset (n = 193), while the remaining 192 patients (“group 1” and “group 2”) were used to train a stepwise binary logistic regression model assessing the five EQ-5D-5L items and the Score as potential predictive factors for the patient group. The obtained model was then used to predict the group of this “new patient”, and the percentage of correct predictions was recorded.
Since the algorithm proposed in this study does not always result in a decision, the adjusted percentages were computed as well for both approaches, taking into account only the cases in which the proposed algorithm reached a decision.
To address the fact that the progressive group of multiple sclerosis patients represented a much smaller percentage of the study group (11.92%) than the non-progressive group (88.08%), we additionally considered two more scenarios (“Synthetical 1” and “Synthetical 2”), in which the progressive group was doubled and tripled in size, respectively, by generating synthetic data from the 23 available progressive patients. In particular, in the “Synthetical 1” scenario, a perturbed copy of each of the 23 progressive patients was added to the progressive group: one of the values −1, 0, and 1 was randomly added to each answer to the five EQ-5D-5L items, with probabilities 0.3, 0.4, and 0.3, respectively. Moreover, the Score of these additional synthetic patients was computed by summing their original Score and a randomly generated error following a normal distribution with a mean of zero and a standard deviation equal to the empirical standard deviation of the original Score values multiplied by 0.05. The rationale behind assigning the probabilities 0.3, 0.4, and 0.3 to adding −1, 0, and 1, respectively, to each answer to the five EQ-5D-5L items was to introduce a small data perturbation while giving a slight preference to the 0 category (no change). The main goal was to generate synthetic data that are close to the real data, in order to evaluate the proposed method. A similar approach was employed for the Score variable. Thus, the “Synthetical 1” scenario consisted of 46 progressive patients (23 observed and 23 synthetically generated). In the “Synthetical 2” scenario, the above procedure was applied twice, resulting in 69 progressive patients (23 observed and 46 synthetically generated). These two additional scenarios enabled us to provide further insights for the case in which the two groups of interest were more balanced (percentage-wise).
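The perturbation described above can be sketched as follows, in Python rather than the R code used in the study. The function name `make_synthetic_patients` is hypothetical, the sample standard deviation is assumed for the "empirical standard deviation", and, since the text does not state how perturbed answers falling outside the item range were handled, the sketch leaves them as they are.

```python
import random
import statistics

ITEM_SHIFTS = (-1, 0, 1)
SHIFT_PROBS = (0.3, 0.4, 0.3)  # slight preference for "no change"


def make_synthetic_patients(items, scores, sd_fraction=0.05, rng=None):
    """Return one perturbed copy of every patient (item answers and Score).

    items  : list of lists with the five ordinal item answers per patient
    scores : list of the corresponding Score values
    """
    if rng is None:
        rng = random.Random()
    # Assumption: sample standard deviation of the original Score values.
    score_sd = statistics.stdev(scores)
    new_items, new_scores = [], []
    for answers, score in zip(items, scores):
        shifts = rng.choices(ITEM_SHIFTS, weights=SHIFT_PROBS, k=len(answers))
        # Out-of-range answers are not corrected here (not specified in the text).
        new_items.append([a + s for a, s in zip(answers, shifts)])
        new_scores.append(score + rng.gauss(0.0, sd_fraction * score_sd))
    return new_items, new_scores
```

Calling the function once on the 23 observed patients would correspond to “Synthetical 1”; calling it twice and pooling both outputs would correspond to “Synthetical 2”.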
The analysis has been performed with the R programming language, v.4.2.2.

2.2.3. Computational Time

The computational time required to run the proposed classification algorithm largely depends on the causal structure learning algorithm that is employed. Within the simulation study, the PC algorithm was employed via the function pc.stable (R package bnlearn [49], V. 5.02) with its default arguments. Computational complexity aspects related to this version of the PC algorithm were discussed by Colombo and Maathuis [50], and those of the PC algorithm in general in [54,55,56]. In our simulation study, for the fourth pair of synthetic DAGs (DAG4.1, DAG4.2), which exhibited the largest number of nodes (27) and edges (26), the time required to estimate the CPDAG with the largest sample size (k = 5000) was less than a second. Since the proposed classification algorithm requires estimating CPDAG1 and CPDAG2 (step 2) and CPDAG1New and CPDAG2New (step 3), and then comparing them using the SHD (step 4; namely, SHD(CPDAG1New, CPDAG1) and SHD(CPDAG2New, CPDAG2)), the time required to classify a new patient is expected to be a few seconds at most. Therefore, for HRQoL questionnaires with similar characteristics and underlying causal structures to the ones assessed (Figure 2), the computational time is not a limiting factor. This was confirmed within the real-world data application, in which the classification of all 193 patients required no more than a few seconds.
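The comparison in step 4 can be illustrated with a simplified sketch in Python. Graphs are represented here as sets of arcs, with an undirected edge stored as both directions; the SHD is counted as the number of node pairs whose edge status differs, which may not coincide exactly with the count returned by bnlearn. The decision rule in `decide` (assign the new patient to the group whose CPDAG changes less, with no decision on a tie) is an assumption for illustration, since the rule itself is given in Figure 1 and not restated in this section.

```python
from itertools import combinations
from typing import Iterable, Optional, Sequence, Set, Tuple

Arc = Tuple[str, str]


def edge_state(arcs: Set[Arc], a: str, b: str) -> Tuple[bool, bool]:
    """(a->b present, b->a present); both True encodes an undirected edge."""
    return ((a, b) in arcs, (b, a) in arcs)


def shd(nodes: Sequence[str], arcs_x: Iterable[Arc], arcs_y: Iterable[Arc]) -> int:
    """Simplified structural Hamming distance: node pairs whose edge differs."""
    x, y = set(arcs_x), set(arcs_y)
    return sum(edge_state(x, a, b) != edge_state(y, a, b)
               for a, b in combinations(nodes, 2))


def decide(shd1: int, shd2: int) -> Optional[int]:
    """Assumed rule: the smaller change wins; equal distances give no decision."""
    if shd1 < shd2:
        return 1
    if shd2 < shd1:
        return 2
    return None
```

Under this rule, two unchanged structures (both distances equal to zero) and two equally changed structures both lead to `None`, that is, to no decision.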

3. Results

The results of both the simulation study and the real-world data application are presented below.

3.1. Simulation Study

The simulation results are displayed in Table 1 and Figure 4. In the case of DAG1.1 and DAG1.2, when classifying random samples generated from DAG1.1, the percentage of correct classification (out of the 100,000 repetitions) was 27.64, 30.87, 35.40, and 45.10 for k = 100, 500, 1000, 5000, respectively (Table 1). On the other hand, the respective percentages of wrong classification were 14.48, 3.98, 3.60, and 0.15 (Table 1). In the remaining cases, the algorithm did not reach a decision (57.88, 65.16, 61.00, and 54.75). When classifying random samples generated from DAG1.2, the percentage of correct classification was smaller than for DAG1.1 (25.29, 10.37, 9.44, and 18.36 for k = 100, 500, 1000, 5000, respectively), while the percentages of wrong classification were similar to those for DAG1.1: 19.96, 5.76, 0.85, and 0.15 (Table 1).
Considering DAG2.1 and DAG2.2, it was found that the CCR for random samples that were generated from DAG2.1 was 35.06, 41.25, 7.42, and 40.31 for k = 100, 500, 1000, 5000, respectively (Table 1). The corresponding values of MR were 16.94, 2.85, 2.06, and 0.40. Rather similarly, the percentage of correctly classifying random samples that were generated from DAG2.2 was 27.44, 26.41, 11.61, and 43.72, while the percentage of wrongly classifying them was 21.82, 14.11, 3.29, and 1.33, respectively, for k = 100, 500, 1000, 5000 (Table 1).
In the case of DAG3.1 and DAG3.2, the CCR for random samples that were generated from DAG3.1 was 39.99, 30.04, 16.97, and 62.87 for k = 100, 500, 1000, 5000, respectively, while the corresponding values of MR were 12.10, 9.46, 3.85, and 0.72 (Table 1). The respective percentages for random samples that were generated from DAG3.2 were much smaller for correct classification (22.71, 24.07, 11.04, and 18.09) and rather similar for wrong classification (25.21, 12.24, 6.83, and 1.10).
Considering DAG4.1 and DAG4.2, random samples that were generated from DAG4.1 were correctly classified in 39.35%, 30.65%, 28.62%, and 3.17% of the 100,000 repetitions (respectively, for k = 100, 500, 1000, 5000), and were wrongly classified in 12.30%, 14.74%, 7.65%, and 1.59% of the cases (Table 1). For random samples that were generated from DAG4.2, the corresponding percentages were 17.27, 33.15, 29.24, and 5.38 (correct classification) and 28.03, 11.63, 4.53, and 0.63 (wrong classification).
Considering the results concerning the adjusted percentages (Figure 4), it was found that in the case of DAG1.1 and DAG1.2, the adjusted CCR for random samples that were generated from DAG1.1 was 65.62, 88.58, 90.77, and 99.67 for k = 100, 500, 1000, 5000, respectively (Figure 4A). On the other hand, the respective values of the adjusted MR were 34.38, 11.42, 9.23, and 0.33 (Figure 4A). Similar observations were obtained in the case of random samples that were generated from DAG1.2 for k = 1000, 5000. In the case of k = 100, 500, the adjusted CCR was 55.89 and 64.29, and the adjusted MR was 44.11 and 35.71, respectively.
In the case of DAG2.1 and DAG2.2, the adjusted CCR for random samples that were generated from DAG2.1 was 67.42, 93.54, 78.27, and 99.02 for k = 100, 500, 1000, 5000, respectively (Figure 4B). The respective values of adjusted CCR for random samples generated from DAG2.2 were 55.71, 65.18, 77.92, and 97.05. As in the case of DAG1.1 and DAG1.2, the adjusted CCR was similar for the two DAGs for k = 1000, 5000 and differed for k = 100, 500.
In the case of DAG3.1 and DAG3.2, the adjusted CCR for random samples that were generated from DAG3.1 was similar for k = 100, 500, obtaining the values of 76.77 and 76.05, and increased for larger values of k (1000, 5000), obtaining the values of 81.51 and 98.87, respectively (Figure 4C). When the random samples were generated from DAG3.2, it was found that the adjusted CCR received smaller values, namely 47.39, 66.29, 61.78, and 94.27 for k = 100, 500, 1000, 5000, respectively (Figure 4C).
Next, in the case of DAG4.1 and DAG4.2, the adjusted CCR for random samples that were generated from DAG4.1 was 76.19, 67.53, 78.91, and 66.60 for k = 100, 500, 1000, 5000, respectively (Figure 4D). In the case of DAG4.2, it was found that the adjusted CCR received, in general, larger values (with the exception of k = 100), namely 38.12, 74.03, 86.59, and 89.52 for k = 100, 500, 1000, 5000, respectively (Figure 4D).
The average adjusted CCR across the four hypothetical HRQoL questionnaires for Test1 was 71.50, 81.42, 82.36, and 91.04 for k = 100, 500, 1000, 5000, respectively. The corresponding average adjusted CCR values in the case of Test2 were 49.28 (mainly attributed to the value 38.12 received in the case of DAG4.1 vs. DAG4.2), 67.45, 79.51, and 95.01 for k = 100, 500, 1000, 5000, respectively.
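As a worked check of how the adjusted values relate to Table 1, the adjusted CCR can be computed as the CCR divided by the sum of CCR and MR, that is, restricted to the repetitions in which a decision was reached. The short Python sketch below (the function name `adjusted_ccr` is illustrative) applies this to the Test1 percentages reported above for k = 100.

```python
def adjusted_ccr(ccr: float, mr: float) -> float:
    """CCR restricted to the cases with a decision (all values in percent)."""
    decided = ccr + mr
    if decided == 0:
        raise ValueError("no decisions were reached")
    return 100.0 * ccr / decided


# (CCR, MR) for Test1 at k = 100, as reported in the text above.
test1_k100 = {
    "DAG1.1": (27.64, 14.48),
    "DAG2.1": (35.06, 16.94),
    "DAG3.1": (39.99, 12.10),
    "DAG4.1": (39.35, 12.30),
}

adjusted = {name: adjusted_ccr(c, m) for name, (c, m) in test1_k100.items()}
average_adjusted = sum(adjusted.values()) / len(adjusted)
# adjusted["DAG1.1"] is about 65.62 and average_adjusted is about 71.50,
# in agreement with the values reported for Figure 4 and for the average.
```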

3.2. Application

The results of the real-world data application on multiple sclerosis patients are displayed in Table 2 and Figure 5. Three different scenarios were considered. In the first scenario, only the observed data were assessed (group 1: 23 progressive patients, group 2: 170 non-progressive patients). The causal binary classification algorithm reached a correct decision for 26.09% of the patients in the progressive group and a wrong decision for 8.70%; for the remaining 65.22%, it did not reach a decision (Table 2). Regarding the non-progressive patients’ group, the CCR was 19.41%, the MR was 14.71%, and in 65.88% of the cases, the algorithm did not reach a decision. The stepwise binary logistic regression approach, on the other hand, achieved a CCR of 30.43% in the case of the progressive patients’ group and 94.71% in the case of the non-progressive patients’ group (Table 2). The corresponding percentages of MR were 69.57% and 5.29%. When only the cases in which the causal binary classification algorithm reached a decision were assessed (n = 8 and n = 58, respectively), the corresponding adjusted CCRs for the binary logistic regression model were found to be 62.50% and 89.66%, respectively (Table 2). In the case of the causal binary classification algorithm, the adjusted CCRs were 75.00% and 56.90%, respectively (Table 2).
The CCRs generally improved substantially for both approaches in the scenarios in which synthetic data were added to the observed ones, namely, in the scenarios “Synthetical 1” (group 1: 23 observed progressive patients + 23 simulated/synthetic progressive patients, group 2: 170 non-progressive patients) and “Synthetical 2” (group 1: 23 observed progressive patients + 46 simulated/synthetic progressive patients, group 2: 170 non-progressive patients). Focusing on the adjusted CCRs within the “Synthetical 1” scenario, the causal binary classification algorithm led to a correct decision in 100.00% and 78.50% of the cases (n = 21 and n = 107) for the progressive and the non-progressive group, respectively, while the corresponding adjusted CCR values for the binary logistic regression approach were 71.43% and 96.26%, respectively (Table 2). For the “Synthetical 2” scenario, the adjusted CCRs for the proposed binary classification algorithm were 77.78% and 83.59% (n = 36 and n = 128 cases), respectively, while the respective values for the binary logistic regression approach were 66.67% and 96.88% (Table 2).
In Figure 5, a visual representation of the adjusted CCRs is provided for all three scenarios for both the progressive and the non-progressive patients.

4. Discussion

This study proposes a new causal-based binary classification algorithm, which utilizes the HRQoL questionnaire items’ causal structures that correspond to two groups of patients, to classify patients into these two groups. The proposed approach capitalizes on the advantages of causation compared to standard statistical correlation. The reasoning underlying the proposed classification scheme is based on the premise that patients belonging to two distinct groups, such as patients with an increased risk of anxiety and patients with low risk of anxiety, are generally expected to exhibit distinct causal structures based on HRQoL questionnaire data (see also [47]). Despite the fact that the algorithm is based on the causal structures of the two patient groups, there is no requirement for prior related knowledge, since these causal structures are estimated (CPDAGs) using well-known causal structure learning algorithms during step 2 of the proposed binary classification algorithm. Hidden variables/confounders that are not taken into account by an HRQoL questionnaire and the fact that these questionnaires collect subjective patient information may distort the estimation of the causal structures, thus impacting the performance of the proposed algorithm, which is based on these estimated CPDAGs. The causal binary classification algorithm can be generalized for more than two groups and could be transferred to other scientific contexts as well.
The results of the simulation study in Table 1 have shown that in two of the cases considered (DAG2.1 vs. DAG2.2, and DAG4.1 vs. DAG4.2), the percentages of both correct and wrong classification were similar whether the classified random sample belonged to the first group (Test1) or to the second group (Test2). However, in the case of DAG4.1 vs. DAG4.2, the percentage of correct classification for k = 100 was much higher when Test1 was assessed compared to Test2 (39.35 vs. 17.27). On the other hand, in the two remaining cases (DAG1.1 vs. DAG1.2, and DAG3.1 vs. DAG3.2), the percentages exhibited substantial differences (Table 1). More specifically, in the case of DAG1.1 vs. DAG1.2, the percentage values of correct classification were higher when Test1 was assessed and much lower in the case of Test2 for k = 500, 1000, 5000. A similar result was observed in the case of DAG3.1 vs. DAG3.2, but for all values of k. Still, the results of wrong classification were similar in these two cases as well.
In general, by increasing the number of patients (k), no clear pattern has been observed in the percentage values regarding either correct or wrong classification (Table 1). For instance, in the case of DAG1.1 vs. DAG1.2 and Test1, increasing k resulted in a monotonic increase in the percentage of correct classification. At the same time, this was not observed when Test2 was assessed. When comparing DAG2.1 vs. DAG2.2, the percentage values of correct classification were high for k = 100, 500, 5000 in both cases (Test1 and Test2), but very low in the case of k = 1000 (7.42 and 11.61, respectively). A similar result was observed in the case of DAG3.1 vs. DAG3.2 (Table 1). In this case, the highest percentage of correct classification, 62.87, was obtained for Test1 and k = 5000. In the case of DAG4.1 vs. DAG4.2, it was observed that by increasing k the percentage values of correct classification exhibited a monotonic decrease when Test1 was assessed, but not in the case of Test2 (Table 1).
Although correct classification exhibited larger values than wrong classification, the percentages of cases in which the algorithm did not reach a decision were, in general, not negligible (Table 1). Although it is not clearly shown in the Results Section, in most of these cases the initial causal structures were not affected by the addition of Test1/Test2; thus, it was not possible for the algorithm to discern the group into which the corresponding random sample should be classified. Namely, in these cases, both SHD(CPDAG1Test1, CPDAG1) and SHD(CPDAG2Test1, CPDAG2) were equal to zero, and a similar result was observed in the case of Test2. This phenomenon was expected to some extent and can be attributed to the fact that adding one more patient to a sample of k simulated patients may result in the same estimated causal structure. In addition, there were cases in which the impact on the initially estimated causal structures (CPDAG1 and CPDAG2) was identical in terms of the SHD, for example, SHD(CPDAG1Test1, CPDAG1) = SHD(CPDAG2Test1, CPDAG2) = 2. This example implies that adding Test1 resulted in two differences in each of the estimated causal structures CPDAG1Test1 and CPDAG2Test1 compared to CPDAG1 and CPDAG2, respectively, although these differences were not necessarily the same. The lack of a decision was also expected to be more frequent for larger values of k, since a CPDAG estimated from a larger sample was expected to be less affected by the addition of one more observation. Indeed, the algorithm did not reach a decision in, on average, 51.80%, 62.30%, 77.20%, and 69.62% of the cases for k = 100, 500, 1000, 5000, respectively.
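The average no-decision percentage can be recovered from the Table 1 values quoted in the Results, since each no-decision rate equals 100 minus the sum of the CCR and the MR. The Python sketch below assumes that the reported averages pool Test1 and Test2 over the four DAG pairs; under that assumption, the k = 100 values reproduce the reported figure.

```python
def no_decision_rate(ccr: float, mr: float) -> float:
    """Percentage of repetitions without a decision (all values in percent)."""
    return 100.0 - ccr - mr


# (CCR, MR) at k = 100 for Test1 (DAG1.1 to DAG4.1) and Test2 (DAG1.2 to DAG4.2),
# as quoted in the Results section.
k100 = [
    (27.64, 14.48), (35.06, 16.94), (39.99, 12.10), (39.35, 12.30),
    (25.29, 19.96), (27.44, 21.82), (22.71, 25.21), (17.27, 28.03),
]

average_no_decision = sum(no_decision_rate(c, m) for c, m in k100) / len(k100)
# average_no_decision is about 51.80, the value reported for k = 100.
```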
However, when the adjusted percentages were assessed, namely, when only the cases in which the algorithm did arrive at a decision were evaluated in order to obtain a clearer picture of its performance, the results were substantially improved. In particular, the adjusted CCR exceeded 50% in almost all cases, with the exception of Test2 in the case of DAG3.1 vs. DAG3.2 and DAG4.1 vs. DAG4.2, but only for k = 100 (Figure 4C,D). This is clearly shown by assessing the average adjusted CCR across the four hypothetical HRQoL questionnaires. For Test1, the average adjusted CCR was found to be 71.50, 81.42, 82.36, and 91.04 for k = 100, 500, 1000, 5000, respectively. In the case of Test2, the average adjusted CCR was 49.28, 67.45, 79.51, and 95.01 for k = 100, 500, 1000, 5000, respectively. Although these values were smaller than in the Test1 case for k = 100, 500, 1000, they were still substantially higher than 50% (with the exception of k = 100). In both cases, it was observed that the average adjusted CCR exhibited a monotonic increase as k increased. This was observed as well for the adjusted CCR in many distinct cases, in particular, for both Test1 and Test2 in the case of DAG1.1 vs. DAG1.2, for Test2 in the case of DAG2.1 vs. DAG2.2, and for Test2 in the case of DAG4.1 vs. DAG4.2. These results were intuitively expected, since the simulation study was based on the expectation that Test1 (similarly for Test2) would be related more to the k random samples generated based on the underlying causal structure of the first group than to the respective k random samples generated based on the underlying causal structure of the second group. Thus, increasing k was expected to result in more robust causal structures, strengthening this expectation and improving the adjusted CCR.
On the other hand, the results in Figure 4 imply that, apart from the sample size, the selection of the HRQoL questionnaire and the corresponding underlying causal structures impacted the results as well, namely, the number of questionnaire items that were used in the simulation and the cause–effect relationships among them. More specifically, this was reflected in the lower values received, in general, by the adjusted CCR in the case of Test2 when compared to the respective ones in the case of Test1. It was reflected as well in the patterns of value evolution of the adjusted CCR, which exhibited clear similarities but also differences. For example, the patterns that were observed in the case of adjusted CCR for Test2 were similar in the first two cases (Figure 4A,B), but they were clearly different compared to the other two cases (Figure 4C,D). A similar conclusion emerges by comparing the pattern evolution of the adjusted CCR in the case of Test1. The impact of the characteristics of the underlying causal structures was observed as well in the case of the percentage values regarding either correct or wrong classification (Table 1). More specifically, the differences observed in the classification results between Test1 and Test2 on the one hand, and within Test1 across the four comparisons (similarly for Test2) on the other hand, indicated that, on top of the sample size, the accuracy was influenced by the characteristics of the underlying causal structures.
The results of the real-world data application on multiple sclerosis patients have shown that the proposed algorithm was unable to reach a decision in a high percentage of cases for both the progressive and the non-progressive patient group (65.22% and 65.88%, respectively, see Table 2). However, the no-decision rates were substantially reduced in the two scenarios where the two groups were more balanced: 54.35% and 37.06% in “Synthetical 1”, and 47.83% and 24.71% in “Synthetical 2”, for the progressive and the non-progressive group, respectively. The CCRs of the proposed algorithm were higher than the corresponding MRs in all cases, and especially when the two groups were more balanced (scenarios “Synthetical 1” and “Synthetical 2”). However, as naturally expected because of the no-decision rates, they were lower than the corresponding CCRs of the stepwise binary logistic regression approach (Table 2). On the other hand, the stepwise binary logistic regression approach also exhibited higher MRs, particularly in the case of the progressive group, which was in all scenarios the minority patients’ category with percentages 11.92% (23/193), 21.30% (46/216) and 28.87% (69/239). In the first scenario, the MR of the binary logistic regression model related to the progressive group was 69.57% compared to an MR of 8.70% of the proposed approach (Table 2). In the “Synthetical 1” scenario, the corresponding MRs for the progressive group were 41.30% and 0.00%, respectively. Finally, in the “Synthetical 2” scenario, the corresponding MRs were 34.78% and 11.59%, respectively (Table 2).
This is reflected as well in the comparison of the adjusted CCRs between the two classification approaches, namely, when only the cases in which the causal binary classification algorithm reached a decision were assessed. By visually inspecting Figure 5, it is clear that the causal binary classification algorithm exhibited higher adjusted CCRs in the case of the progressive group of patients (minority class), while the binary logistic regression model exhibited higher adjusted CCRs in the case of the non-progressive group (majority class). This possibly implies that the proposed approach is able to efficiently classify a patient in the minority class, even if there is a severe class imbalance in the data. This ability may be related to the expectation that the causal structure of the EQ-5D-5L items that corresponds to a small sample size of patients may be more sensitive to the addition of a patient that belongs to a different group, compared to the causal structure of the EQ-5D-5L items that corresponds to a much larger sample size of patients. On the other hand, the causal binary classification algorithm was inferior to binary logistic regression when the non-progressive group of patients was considered.
These observations are generally in agreement with the ones that emerged from the simulation study, namely, that the sample size and the characteristics of the underlying causal structures impact the accuracy of the proposed causal binary classification algorithm. Larger and more detailed real-world data applications are necessary in order to better understand both the performance of the proposed approach and how it compares with other well-known binary classifiers in a real-world data setting.

5. Limitations and Future Perspectives

The simulation and real-world data application results showcased the potential of employing causal discovery for classification purposes in the context of HRQoL. However, the study is limited by the fact that the proposed algorithm does not always yield a decision. To address this issue within the application, the adjusted CCRs were employed in order to provide a more comprehensive comparison of the proposed algorithm’s accuracy against binary logistic regression.
We plan to use the results of this study in a future communication to design a large-scale and more detailed real-world data application, incorporating into the algorithm design proposed herein a standard classifier that would be used only when the proposed scheme does not reach a decision. This modified classification scheme could then be fairly compared with other classical modeling methods, such as decision trees and random forests, within a real-world setting.
It is also critical to better understand how the characteristics of HRQoL questionnaires impact the estimation of the CPDAGs involved, and, thus, the classification results in a real-world setting. Towards this direction, apart from the PC algorithm that was used in this study, other causal structure learning algorithms should be employed and compared on different cases of HRQoL questionnaires. For instance, in a recent study of our group [40], we employed and compared five causal structure learning algorithms, including the PC, to examine different aspects of structure estimation. Scutari et al. [57] compared different classes of algorithms regarding speed and accuracy of network reconstruction for discrete and Gaussian BNs. Farnia et al. [34] assessed several algorithms in terms of their effectiveness and efficiency in detecting true causal relations among variables.

6. Conclusions

The purpose of this manuscript was to highlight the dynamics of employing causal discovery for classifying patients based on their HRQoL. A customized binary classification algorithm is proposed that exploits the patients’ causal structures, obtained based on their answers to an HRQoL questionnaire, aiming to classify the patients into two distinct groups. The potential of this algorithm for effective patient classification, as shown by the simulation and real-world data application results, could further elevate the role of HRQoL questionnaires in patient care and management, involving new usage aspects.

Author Contributions

Conceptualization, M.G., L.A. and T.M.; Data curation, M.G. and T.M.; Formal analysis, M.G. and T.M.; Funding acquisition, T.M.; Investigation, M.G.; Methodology, M.G., K.F., L.A. and T.M.; Project administration, T.M.; Resources, C.B. and T.M.; Software, M.G. and T.M.; Supervision, T.M.; Validation, K.F. and L.A.; Visualization, M.G. and T.M.; Writing—original draft, M.G. and T.M.; Writing—review and editing, K.F., L.A., C.B., and T.M. All authors have read and agreed to the published version of the manuscript.

Funding

The research project was partially supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the “2nd Call for H.F.R.I. Research Projects to support Post-Doctoral Researchers” (Project Number: 553).

Institutional Review Board Statement

The study was approved by the Ethics Committee of the Aristotle University of Thessaloniki, Faculty of Medicine (Protocol Code 4291; date of approval: 26 January 2021).

Informed Consent Statement

Written informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on reasonable request from the corresponding author. The data are not publicly available due to privacy reasons.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ware, J.E.; Sherbourne, C.D. The MOS 36-Item Short-Form Health Survey (SF-36): I. Conceptual Framework and Item Selection. Med. Care 1992, 30, 473–483.
  2. The Whoqol Group. Development of the World Health Organization WHOQOL-BREF Quality of Life Assessment. Psychol. Med. 1998, 28, 551–558.
  3. Cella, D.F.; Tulsky, D.S.; Gray, G.; Sarafian, B.; Linn, E.; Bonomi, A.; Silberman, M.; Yellen, S.B.; Winicour, P.; Brannon, J.; et al. The Functional Assessment of Cancer Therapy Scale: Development and Validation of the General Measure. J. Clin. Oncol. 1993, 11, 570–579.
  4. Aaronson, N.K.; Ahmedzai, S.; Bergman, B.; Bullinger, M.; Cull, A.; Duez, N.J.; Filiberti, A.; Flechtner, H.; Fleishman, S.B.; de Haes, J.C.; et al. The European Organization for Research and Treatment of Cancer QLQ-C30: A Quality-of-Life Instrument for Use in International Clinical Trials in Oncology. JNCI J. Natl. Cancer Inst. 1993, 85, 365–376.
  5. Herdman, M.; Gudex, C.; Lloyd, A.; Janssen, M.F.; Kind, P.; Parkin, D.; Bonsel, G.; Badia, X. Development and Preliminary Testing of the New Five-Level Version of EQ-5D (EQ-5D-5L). Qual. Life Res. 2011, 20, 1727–1736.
  6. Bakas, T.; McLennon, S.M.; Carpenter, J.S.; Buelow, J.M.; Otte, J.L.; Hanna, K.M.; Ellett, M.L.; Hadler, K.A.; Welch, J.L. Systematic Review of Health-Related Quality of Life Models. Health Qual. Life Outcomes 2012, 10, 134.
  7. Xiao, Y.; Zhang, L.; Wei, Q.; Ou, R.; Hou, Y.; Liu, K.; Lin, J.; Yang, T.; Shang, H. Health-related Quality of Life in Patients with Multiple System Atrophy Using the EQ-5D-5L. Brain Behav. 2022, 12, e2774.
  8. Cherchir, F.; Oueslati, I.; Yazidi, M.; Chaker, F.; Chihaoui, M. Assessment of Quality of Life in Patients with Permanent Hypoparathyroidism Receiving Conventional Treatment. J. Diabetes Metab. Disord. 2023, 22, 1617–1623.
  9. Brzoska, P. Assessment of Quality of Life in Individuals with Chronic Headache. Psychometric Properties of the WHOQOL-BREF. BMC Neurol. 2020, 20, 267.
  10. Bat-Erdene, E.; Hiramoto, T.; Tumurbaatar, E.; Tumur-Ochir, G.; Jamiyandorj, O.; Yamamoto, E.; Hamajima, N.; Oka, T.; Jadamba, T.; Lkhagvasuren, B. Quality of Life in the General Population of Mongolia: Normative Data on WHOQOL-BREF. PLoS ONE 2023, 18, e0291427.
  11. Floris, F.; Comitini, F.; Leoni, G.; Moi, P.; Morittu, M.; Orecchia, V.; Perra, M.; Pilia, M.P.; Zappu, A.; Casini, M.R.; et al. Quality of Life in Sardinian Patients with Transfusion-Dependent Thalassemia: A Cross-Sectional Study. Qual. Life Res. 2018, 27, 2533–2539.
  12. Pamuk, G.E.; Harmandar, F.; Ermantaş, N.; Harmandar, O.; Turgut, B.; Demir, M.; Vural, Ö. EORTC QLQ-C30 Assessment in Turkish Patients with Hematological Malignancies: Association with Anxiety and Depression. Ann. Hematol. 2008, 87, 305–310.
  13. Dean, G.E.; Redeker, N.S.; Wang, Y.-J.; Rogers, A.E.; Dickerson, S.S.; Steinbrenner, L.M.; Gooneratne, N.S. Sleep, Mood, and Quality of Life in Patients Receiving Treatment for Lung Cancer. In Oncology Nursing Forum; NIH Public Access: Bethesda, MD, USA, 2013; Volume 40, pp. 441–451.
  14. Dean, G.E.; Sabbah, E.A.; Yingrengreung, S.; Ziegler, P.; Chen, H.; Steinbrenner, L.M.; Dickerson, S.S. Sleeping with the Enemy: Sleep and Quality of Life in Patients with Lung Cancer. Cancer Nurs. 2015, 38, 60–70.
  15. Jackson, I.L.; Isah, A.; Arikpo, A.O. Assessing Health-Related Quality of Life of People with Diabetes in Nigeria Using the EQ-5D-5L: A Cross-Sectional Study. Sci. Rep. 2023, 13, 22536.
  16. Zhou, Z.; Yang, L.; Chen, Z.; Chen, X.; Guo, Y.; Wang, X.; Dong, X.; Wang, T.; Zhang, L.; Qiu, Z.; et al. Health-related Quality of Life Measured by the Short Form 36 in Immune Thrombocytopenic Purpura: A Cross-sectional Survey in China. Eur. J. Haematol. 2007, 78, 518–523.
  17. Yang, R.; Yao, H.; Lin, L.; Ji, J.; Shen, Q. Health-Related Quality of Life and Burden of Fatigue in Chinese Patients with Immune Thrombocytopenia: A Cross-Sectional Study. Indian J. Hematol. Blood Transfus. 2020, 36, 104–111.
  18. Hossain, M.J.; Islam, M.W.; Munni, U.R.; Gulshan, R.; Mukta, S.A.; Miah, M.S.; Sultana, S.; Karmakar, M.; Ferdous, J.; Islam, M.A. Health-Related Quality of Life among Thalassemia Patients in Bangladesh Using the SF-36 Questionnaire. Sci. Rep. 2023, 13, 7734.
  19. Efficace, F.; Platzbecker, U.; Breccia, M.; Cottone, F.; Carluccio, P.; Salutari, P.; Di Bona, E.; Borlenghi, E.; Autore, F.; Levato, L.; et al. Long-Term Quality of Life of Patients with Acute Promyelocytic Leukemia Treated with Arsenic Trioxide vs Chemotherapy. Blood Adv. 2021, 5, 4370–4379.
  20. Criscitiello, C.; Spurden, D.; Piercy, J.; Rider, A.; Williams, R.; Mitra, D.; Wild, R.; Corsaro, M.; Kurosky, S.K.; Law, E.H. Health-Related Quality of Life among Patients with HR+/HER2–Early Breast Cancer. Clin. Ther. 2021, 43, 1228–1244.
  21. Zeng, X.; Sui, M.; Liu, R.; Qian, X.; Li, W.; Zheng, E.; Yang, J.; Li, J.; Huang, W.; Yang, H.; et al. Assessment of the Health Utility of Patients with Leukemia in China. Health Qual. Life Outcomes 2021, 19, 65.
  22. Youron, P.; Singh, C.; Jindal, N.; Malhotra, P.; Khadwal, A.; Jain, A.; Prakash, G.; Varma, N.; Varma, S.; Lad, D.P. Quality of Life in Patients of Chronic Lymphocytic Leukemia Using the EORTC QLQ-C30 and QLQ-CLL17 Questionnaire. Eur. J. Haematol. 2020, 105, 755–762.
  23. Claflin, S.; Campbell, J.A.; Norman, R.; Mason, D.F.; Kalincik, T.; Simpson-Yap, S.; Butzkueven, H.; Carroll, W.M.; Palmer, A.J.; Blizzard, C.L.; et al. Using the EQ-5D-5L to Investigate Quality-of-Life Impacts of Disease-Modifying Therapy Policies for People with Multiple Sclerosis (MS) in New Zealand. Eur. J. Health Econ. 2023, 24, 939–950.
  24. Li, J.; Liu, L.; Le, T.D. Practical Approaches to Causal Relationship Exploration; Springer: Cham, Switzerland, 2015; ISBN 978-3-319-14432-0.
  25. Pearl, J. Causality; Cambridge University Press: Cambridge, UK, 2009; ISBN 052189560X.
  26. Boutsika, A.; Michailidis, M.; Ganopoulou, M.; Dalakouras, A.; Skodra, C.; Xanthopoulou, A.; Stamatakis, G.; Samiotaki, M.; Tanou, G.; Moysiadis, T.; et al. A Wide Foodomics Approach Coupled with Metagenomics Elucidates the Environmental Signature of Potatoes. iScience 2023, 26, 105917.
  27. Skodra, C.; Michailidis, M.; Moysiadis, T.; Stamatakis, G.; Ganopoulou, M.; Adamakis, I.-D.S.; Angelis, L.; Ganopoulos, I.; Tanou, G.; Samiotaki, M.; et al. Disclosing the Molecular Basis of Salinity Priming in Olive Trees Using Proteogenomic Model Discovery. Plant Physiol. 2023, 191, 1913–1933.
  28. Ganopoulou, M.; Michailidis, M.; Angelis, L.; Ganopoulos, I.; Molassiotis, A.; Xanthopoulou, A.; Moysiadis, T. Could Causal Discovery in Proteogenomics Assist in Understanding Gene–Protein Relations? A Perennial Fruit Tree Case Study Using Sweet Cherry as a Model. Cells 2021, 11, 92. [Google Scholar] [CrossRef]
  29. Ganopoulou, M.; Kangelidis, I.; Sianos, G.; Angelis, L. Causal Models for the Result of Percutaneous Coronary Intervention in Coronary Chronic Total Occlusions. Appl. Sci. 2021, 11, 9258. [Google Scholar] [CrossRef]
  30. Piccininni, M.; Konigorski, S.; Rohmann, J.L.; Kurth, T. Directed Acyclic Graphs and Causal Thinking in Clinical Risk Prediction Modeling. BMC Med. Res. Methodol. 2020, 20, 179. [Google Scholar] [CrossRef]
  31. Raghu, V.K.; Zhao, W.; Pu, J.; Leader, J.K.; Wang, R.; Herman, J.; Yuan, J.-M.; Benos, P.V.; Wilson, D.O. Feasibility of Lung Cancer Prediction from Low-Dose CT Scan and Smoking Factors Using Causal Models. Thorax 2019, 74, 643–649. [Google Scholar] [CrossRef]
  32. Sachs, K.; Perez, O.; Pe’er, D.; Lauffenburger, D.A.; Nolan, G.P. Causal Protein-Signaling Networks Derived from Multiparameter Single-Cell Data. Science (1979) 2005, 308, 523–529. [Google Scholar] [CrossRef]
  33. Liu, J.; Niyogi, D. Identification of Linkages between Urban Heat Island Magnitude and Urban Rainfall Modification by Use of Causal Discovery Algorithms. Urban Clim. 2020, 33, 100659. [Google Scholar] [CrossRef]
  34. Farnia, L.; Alibegovic, M.; Cruickshank, E. On Causal Structural Learning Algorithms Oracles’ Simulations and Considerations. Knowl. Based Syst. 2023, 276, 110694. [Google Scholar] [CrossRef]
  35. Krethong, P.; Jirapaet, V.; Jitpanya, C.; Sloan, R. A Causal Model of Health-related Quality of Life in Thai Patients with Heart-failure. J. Nurs. Scholarsh. 2008, 40, 254–260. [Google Scholar] [CrossRef]
  36. Tangkawanich, T.; Yunibhand, J.; Thanasilp, S.; Magilvy, K. Causal Model of Health: Health-related Quality of Life in People Living with HIV/AIDS in the Northern Region of Thailand. Nurs. Health Sci. 2008, 10, 216–221. [Google Scholar] [CrossRef] [PubMed]
  37. Gąsior, J.S.; Młyńczak, M.; Williams, C.A.; Popłonyk, A.; Kowalska, D.; Giezek, P.; Werner, B. The Discovery of a Data-Driven Causal Diagram of Sport Participation in Children and Adolescents with Heart Disease: A Pilot Study. Front. Cardiovasc. Med. 2023, 10, 1247122. [Google Scholar] [CrossRef]
  38. Varni, J.W.; Seid, M.; Kurtin, P.S. PedsQLTM 4.0: Reliability and Validity of the Pediatric Quality of Life InventoryTM Version 4.0 Generic Core Scales in Healthy and Patient Populations. Med. Care 2001, 39, 800–812. [Google Scholar] [CrossRef] [PubMed]
  39. Varni, J.W.; Burwinkle, T.M.; Seid, M.; Skarr, D. The PedsQLTM* 4.0 as a Pediatric Population Health Measure: Feasibility, Reliability, and Validity. Ambul. Pediatr. 2003, 3, 329–341. [Google Scholar] [CrossRef]
  40. Ganopoulou, M.; Kontopoulos, E.; Fokianos, K.; Koparanis, D.; Angelis, L.; Kotsianidis, I.; Moysiadis, T. Delving into Causal Discovery in Health-Related Quality of Life Questionnaires. Algorithms 2024, 17, 138. [Google Scholar] [CrossRef]
  41. Greenland, S.; Pearl, J. Causal Diagrams. In Encyclopedia of Epidemiology; Boslaugh, S., Ed.; Technical Report; Sage Publications: Thousand Oaks, CA, USA, 2022; pp. 149–156. [Google Scholar]
  42. Nagarajan, R.; Scutari, M.; Lèbre, S. Bayesian Networks in R with Applications in Systems Biology; Springer: Berlin/Heidelberg, Germany, 2013; ISBN 978-1-4614-6445-7. [Google Scholar]
  43. Tsanousa, A.; Ntoufa, S.; Papakonstantinou, N.; Stamatopoulos, K.; Angelis, L. Study of Gene Expressions’ Correlation Structures in Subgroups of Chronic Lymphocytic Leukemia Patients. J. Biomed. Inform. 2019, 95, 103211. [Google Scholar] [CrossRef] [PubMed]
  44. Tsanousa, A.; Angelis, L.; Ntoufa, S.; Papakonstantinou, N.; Stamatopoulos, K. A Structural Equation Modeling Approach of the Toll-like Receptor Signaling Pathway in Chronic Lymphocytic Leukemia. In Proceedings of the 2013 24th International Workshop on Database and Expert Systems Applications, Los Alamitos, CA, USA, 26–30 August 2013; pp. 71–75. [Google Scholar]
  45. Tsanousa, A.; Ntoufa, S.; Papakonstantinou, N.; Stamatopoulos, K.; Angelis, L. Discovering Causal Patterns with Structural Equation Modeling: Application to Toll-Like Receptor Signaling Pathway in Chronic Lymphocytic Leukemia. Pattern Recognit. Comput. Mol. Biol. Tech. Approaches 2015, 555–584. [Google Scholar] [CrossRef]
  46. Bai, F.; Shu, N.; Yuan, Y.; Shi, Y.; Yu, H.; Wu, D.; Wang, J.; Xia, M.; He, Y.; Zhang, Z. Topologically Convergent and Divergent Structural Connectivity Patterns between Patients with Remitted Geriatric Depression and Amnestic Mild Cognitive Impairment. J. Neurosci. 2012, 32, 4307–4318. [Google Scholar] [CrossRef]
  47. Muhetaer, S.; Mijiti, P.; Aierken, K.; Ziyin, H.; Talapuhan, W.; Tuoheti, K.; Lixia, Y.; Shuang, Q.; Jingjing, W. A Network Approach to Investigating the Inter-Relationship between Health-Related Quality of Life Dimensions and Depression in 1735 Chinese Patients with Heterogeneous Cancers. Front. Public Health 2024, 11, 1325986. [Google Scholar] [CrossRef] [PubMed]
  48. Tsamardinos, I.; Brown, L.E.; Aliferis, C.F. The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm. Mach. Learn. 2006, 65, 31–78. [Google Scholar] [CrossRef]
  49. Scutari, M. Learning Bayesian Networks with the Bnlearn R Package. J. Stat. Softw. 2010, 35, 1–22. [Google Scholar] [CrossRef]
  50. Colombo, D.; Maathuis, M.H. Order-Independent Constraint-Based Causal Structure Learning. J. Mach. Learn. Res. 2014, 15, 3741–3782. [Google Scholar]
  51. Spirtes, P.; Glymour, C.; Scheines, R. Discovery Algorithms for Causally Sufficient Structures. In Causation, Prediction, and Search; Lecture Notes in Statistics; Springer: New York, NY, USA, 1993; pp. 103–162. [Google Scholar]
  52. Spirtes, P.; Glymour, C.N.; Scheines, R. Causation, Prediction, and Search; MIT Press: Cambridge, MA, USA, 2000; ISBN 0262194406. [Google Scholar]
  53. Bakirtzis, C.; Artemiadis, A.; Nteli, E.; Boziki, M.K.; Karakasi, M.-V.; Honan, C.; Messinis, L.; Nasios, G.; Dardiotis, E.; Grigoriadis, N. A Greek Validation Study of the Multiple Sclerosis Work Difficulties Questionnaire-23. Healthcare 2021, 9, 897. [Google Scholar] [CrossRef]
  54. Kalisch, M.; Bühlmann, P. Robustification of the PC-Algorithm for Directed Acyclic Graphs. J. Comput. Graph. Stat. 2008, 17, 773–789. [Google Scholar] [CrossRef]
  55. Kalisch, M.; Bühlman, P. Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm. J. Mach. Learn. Res. 2007, 8, 613–636. [Google Scholar]
  56. Le, T.D.; Hoang, T.; Li, J.; Liu, L.; Liu, H.; Hu, S. A Fast PC Algorithm for High Dimensional Causal Discovery with Multi-Core PCs. IEEE/ACM Trans. Comput. Biol. Bioinform. 2016, 16, 1483–1495. [Google Scholar] [CrossRef]
  57. Scutari, M.; Graafland, C.E.; Gutiérrez, J.M. Who Learns Better Bayesian Network Structures: Accuracy and Speed of Structure Learning Algorithms. Int. J. Approx. Reason. 2019, 115, 235–253. [Google Scholar] [CrossRef]
Figure 1. Design of the causal binary classification algorithm.
Figure 2. The four pairs of synthetic directed acyclic graphs (DAG1.1 vs. DAG1.2, DAG2.1 vs. DAG2.2, DAG3.1 vs. DAG3.2, and DAG4.1 vs. DAG4.2) correspond to the four hypothetical health-related quality of life questionnaires. Each node represents a questionnaire item and is labeled either with a capital letter (A, B, C, …) or with “Score”.
Figure 3. Design of the simulation study.
Figure 4. The classification results are presented in adjusted percentages for all four cases: (A) DAG1.1 vs. DAG1.2, (B) DAG2.1 vs. DAG2.2, (C) DAG3.1 vs. DAG3.2, and (D) DAG4.1 vs. DAG4.2.
Figure 5. The classification results for the Stepwise Binary Logistic Regression (SBLR) model and the proposed Causal Binary Classification Algorithm (CBCA) are presented for the progressive and the non-progressive groups of patients, within three different scenarios (Observed, Synthetical 1, and Synthetical 2) in (A) CCRs, and (B) adjusted CCRs (given only the number of cases with decision from the CBCA).
Table 1. Classification results, in percentages out of 100,000 repetitions, for all four cases (DAG1.1 vs. DAG1.2, DAG2.1 vs. DAG2.2, DAG3.1 vs. DAG3.2, and DAG4.1 vs. DAG4.2). The correct classification rate (CCR) and the misclassification rate (MR) are reported separately for Test1 (CCR-Test1 and MR-Test1) and Test2 (CCR-Test2 and MR-Test2), along with the percentage of repetitions in which the algorithm did not yield a decision (No decision-Test1 and No decision-Test2).
Values in the k columns are percentages out of 100,000 repetitions.

| Pairs of CPDAGs Compared | Comparison Description | Measure | k = 100 | k = 500 | k = 1000 | k = 5000 |
| --- | --- | --- | --- | --- | --- | --- |
| DAG1.1 (nodes = 12, edges = 14) vs. DAG1.2 (nodes = 12, edges = 12) | SHD(CPDAG1Test1, CPDAG1) < SHD(CPDAG2Test1, CPDAG2) | CCR-Test1 | 27.64 | 30.87 | 35.40 | 45.10 |
| | SHD(CPDAG1Test1, CPDAG1) > SHD(CPDAG2Test1, CPDAG2) | MR-Test1 | 14.48 | 3.98 | 3.60 | 0.15 |
| | SHD(CPDAG1Test1, CPDAG1) = SHD(CPDAG2Test1, CPDAG2) | No decision-Test1 | 57.88 | 65.16 | 61.00 | 54.75 |
| | SHD(CPDAG1Test2, CPDAG1) < SHD(CPDAG2Test2, CPDAG2) | MR-Test2 | 19.96 | 5.76 | 0.85 | 0.15 |
| | SHD(CPDAG1Test2, CPDAG1) > SHD(CPDAG2Test2, CPDAG2) | CCR-Test2 | 25.29 | 10.37 | 9.44 | 18.36 |
| | SHD(CPDAG1Test2, CPDAG1) = SHD(CPDAG2Test2, CPDAG2) | No decision-Test2 | 54.74 | 83.87 | 89.71 | 81.49 |
| DAG2.1 (nodes = 16, edges = 16) vs. DAG2.2 (nodes = 16, edges = 19) | SHD(CPDAG1Test1, CPDAG1) < SHD(CPDAG2Test1, CPDAG2) | CCR-Test1 | 35.06 | 41.25 | 7.42 | 40.31 |
| | SHD(CPDAG1Test1, CPDAG1) > SHD(CPDAG2Test1, CPDAG2) | MR-Test1 | 16.94 | 2.85 | 2.06 | 0.40 |
| | SHD(CPDAG1Test1, CPDAG1) = SHD(CPDAG2Test1, CPDAG2) | No decision-Test1 | 48.00 | 55.90 | 90.52 | 59.29 |
| | SHD(CPDAG1Test2, CPDAG1) < SHD(CPDAG2Test2, CPDAG2) | MR-Test2 | 21.82 | 14.11 | 3.29 | 1.33 |
| | SHD(CPDAG1Test2, CPDAG1) > SHD(CPDAG2Test2, CPDAG2) | CCR-Test2 | 27.44 | 26.41 | 11.61 | 43.72 |
| | SHD(CPDAG1Test2, CPDAG1) = SHD(CPDAG2Test2, CPDAG2) | No decision-Test2 | 50.74 | 59.48 | 85.10 | 54.95 |
| DAG3.1 (nodes = 22, edges = 20) vs. DAG3.2 (nodes = 22, edges = 22) | SHD(CPDAG1Test1, CPDAG1) < SHD(CPDAG2Test1, CPDAG2) | CCR-Test1 | 39.99 | 30.04 | 16.97 | 62.87 |
| | SHD(CPDAG1Test1, CPDAG1) > SHD(CPDAG2Test1, CPDAG2) | MR-Test1 | 12.10 | 9.46 | 3.85 | 0.72 |
| | SHD(CPDAG1Test1, CPDAG1) = SHD(CPDAG2Test1, CPDAG2) | No decision-Test1 | 47.92 | 60.50 | 79.19 | 36.41 |
| | SHD(CPDAG1Test2, CPDAG1) < SHD(CPDAG2Test2, CPDAG2) | MR-Test2 | 25.21 | 12.24 | 6.83 | 1.10 |
| | SHD(CPDAG1Test2, CPDAG1) > SHD(CPDAG2Test2, CPDAG2) | CCR-Test2 | 22.71 | 24.07 | 11.04 | 18.09 |
| | SHD(CPDAG1Test2, CPDAG1) = SHD(CPDAG2Test2, CPDAG2) | No decision-Test2 | 52.08 | 63.69 | 82.13 | 80.81 |
| DAG4.1 (nodes = 27, edges = 24) vs. DAG4.2 (nodes = 27, edges = 26) | SHD(CPDAG1Test1, CPDAG1) < SHD(CPDAG2Test1, CPDAG2) | CCR-Test1 | 39.35 | 30.65 | 28.62 | 3.17 |
| | SHD(CPDAG1Test1, CPDAG1) > SHD(CPDAG2Test1, CPDAG2) | MR-Test1 | 12.30 | 14.74 | 7.65 | 1.59 |
| | SHD(CPDAG1Test1, CPDAG1) = SHD(CPDAG2Test1, CPDAG2) | No decision-Test1 | 48.35 | 54.61 | 63.73 | 95.24 |
| | SHD(CPDAG1Test2, CPDAG1) < SHD(CPDAG2Test2, CPDAG2) | MR-Test2 | 28.03 | 11.63 | 4.53 | 0.63 |
| | SHD(CPDAG1Test2, CPDAG1) > SHD(CPDAG2Test2, CPDAG2) | CCR-Test2 | 17.27 | 33.15 | 29.24 | 5.38 |
| | SHD(CPDAG1Test2, CPDAG1) = SHD(CPDAG2Test2, CPDAG2) | No decision-Test2 | 54.70 | 55.22 | 66.23 | 93.99 |
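The comparison rows of Table 1 amount to a three-way rule: the smaller of the two structural Hamming distances (SHD) selects the group, and a tie gives no decision. The following Python sketch illustrates that rule only. It assumes a CPDAG is stored as a set of ordered node pairs, with an undirected edge stored in both directions, and it counts each differing node pair once, which is one common SHD convention and not necessarily the one used in the study. The function names and the toy graphs are hypothetical.

```python
from itertools import combinations


def edge_type(graph, a, b):
    """Describe the edge between a and b; an undirected edge is stored as (a, b) and (b, a)."""
    forward, backward = (a, b) in graph, (b, a) in graph
    if forward and backward:
        return "undirected"
    if forward:
        return "a->b"
    if backward:
        return "b->a"
    return "absent"


def shd(graph_x, graph_y, nodes):
    """Structural Hamming distance: number of node pairs whose edge type differs."""
    return sum(
        edge_type(graph_x, a, b) != edge_type(graph_y, a, b)
        for a, b in combinations(sorted(nodes), 2)
    )


def decide(shd_group1, shd_group2):
    """Pick the group with the smaller SHD; a tie yields no decision."""
    if shd_group1 < shd_group2:
        return "group 1"
    if shd_group1 > shd_group2:
        return "group 2"
    return "no decision"


# Hypothetical toy graphs on three items (not taken from the paper).
nodes = {"A", "B", "Score"}
cpdag1 = {("A", "Score"), ("B", "Score")}          # A -> Score <- B
cpdag1_test = {("A", "Score"), ("B", "Score")}     # structure unchanged
cpdag2 = {("A", "B"), ("B", "A"), ("B", "Score"), ("Score", "B")}  # A - B - Score
cpdag2_test = {("A", "B"), ("B", "A")}             # edge B - Score lost

d1 = shd(cpdag1_test, cpdag1, nodes)
d2 = shd(cpdag2_test, cpdag2, nodes)
result = decide(d1, d2)
print(d1, d2, result)
```

Under this reading, a test case that truly belongs to group 1 counts toward CCR-Test1 when `decide` returns "group 1", toward MR-Test1 when it returns "group 2", and toward No decision-Test1 otherwise.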
Table 2. The classification results for the Stepwise Binary Logistic Regression (SBLR) model and the proposed Causal Binary Classification Algorithm (CBCA) are presented in percentages and adjusted percentages (given only the number of cases with decision from the CBCA), separately for the progressive and the non-progressive groups of patients, within three different scenarios (Observed, Synthetical 1, and Synthetical 2).
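| Scenario | 1st: Observed | 1st: Observed | 2nd: Synthetical 1 | 2nd: Synthetical 1 | 3rd: Synthetical 2 | 3rd: Synthetical 2 |
| --- | --- | --- | --- | --- | --- | --- |
| Patient Group | Progressive | Non-Progressive | Progressive | Non-Progressive | Progressive | Non-Progressive |
| Number of cases | 23 | 170 | 46 | 170 | 69 | 170 |
| SBLR: CCR | 30.43% | 94.71% | 58.70% | 93.53% | 65.22% | 93.53% |
| CBCA: CCR | 26.09% | 19.41% | 45.65% | 49.41% | 40.58% | 62.94% |
| CBCA: MR | 8.70% | 14.71% | 0.00% | 13.53% | 11.59% | 12.35% |
| CBCA: No Decision | 65.22% | 65.88% | 54.35% | 37.06% | 47.83% | 24.71% |
| Number of cases with decision from CBCA | 8 | 58 | 21 | 107 | 36 | 128 |
| SBLR: Adjusted CCR | 62.50% | 89.66% | 71.43% | 96.26% | 66.67% | 96.88% |
| CBCA: Adjusted CCR | 75.00% | 56.90% | 100.0% | 78.50% | 77.78% | 83.59% |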
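The adjusted CCR for the CBCA is the CCR restricted to the cases where the algorithm yielded a decision, so it can be approximately recovered from the CCR and MR rows of Table 2 as CCR / (CCR + MR). The Python sketch below does this with the published percentages. The function name is hypothetical, and small deviations from the reported values are expected because the inputs are already rounded to two decimals.

```python
def adjusted_ccr(ccr_pct, mr_pct):
    """CCR among decided cases, in percent; None when no case received a decision."""
    decided = ccr_pct + mr_pct
    if decided == 0:
        return None
    return 100.0 * ccr_pct / decided


# CBCA (CCR, MR) percentages copied from Table 2.
cbca = {
    ("Observed", "Progressive"): (26.09, 8.70),
    ("Observed", "Non-Progressive"): (19.41, 14.71),
    ("Synthetical 1", "Progressive"): (45.65, 0.00),
    ("Synthetical 1", "Non-Progressive"): (49.41, 13.53),
    ("Synthetical 2", "Progressive"): (40.58, 11.59),
    ("Synthetical 2", "Non-Progressive"): (62.94, 12.35),
}

adjusted = {key: adjusted_ccr(ccr, mr) for key, (ccr, mr) in cbca.items()}
for (scenario, group), value in adjusted.items():
    print(f"{scenario:14s} {group:16s} {value:6.2f}%")
```

The SBLR adjusted CCR cannot be recomputed this way, since it depends on which individual cases received a CBCA decision, and that information is not in the table.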