Predicting the Occurrence of Advanced Schistosomiasis Based on Fisher Discriminant Analysis of Hematological Biomarkers

We established a model that predicts the likelihood of chronic schistosomiasis (CS) patients progressing to advanced schistosomiasis (AS) using biomarkers detected in human peripheral blood. Blood biomarkers from two cohorts (132 CS cases and 139 AS cases) were examined, and the data were analyzed by univariate and multivariate logistic regression analysis. A Fisher discriminant analysis (FDA) model for advanced schistosomiasis was established based on specific predictive diagnostic indicators, and its accuracy was assessed using data from 109 CS patients. Seven indicators, HGB, MON, GLB, GGT, APTT, VIII, and Fbg, entered the model. In cross-validation, 86.7% of the participants were correctly classified into the AS and CS groups. Blood biomarker data from the 109 CS patients were then substituted into the discriminant function to predict the occurrence of AS. The coincidence rates for patients predicted as AS and as CS were 62.1% and 89.0%, respectively, and the overall accuracy of the established model was 81.4%. These results suggest that Fisher discriminant analysis is a reliable predictive model in the clinical field. It provides important guidance for effectively controlling the occurrence of AS and lays a solid foundation for achieving the goal of schistosomiasis elimination.


Introduction
Schistosomiasis is a zoonotic parasitic disease caused by trematode flukes of Schistosoma spp. It is mainly endemic in 78 countries and territories of Asia, Africa, and South America [1] and remains one of the most serious global public health problems [2]. Schistosomiasis japonica is widely distributed in 12 provinces (cities and autonomous regions) in China [3], posing a serious threat to human health and socio-economic development [4,5]. Although schistosomiasis has been in a state of low endemicity in China [6], new challenges have emerged. Some patients with chronic schistosomiasis who have not received timely and thorough treatment are gradually progressing to advanced schistosomiasis [7,8]. A total of 30,170 cases of advanced schistosomiasis still exist in China [9], and some of them have been discovered and reported in areas where the transmission of schistosomiasis has been interrupted [10].
Clinical presentation varies in chronic schistosomiasis, but many cases present with mild or no symptoms, which leads to misdiagnosis or a lack of treatment. Advanced schistosomiasis is mainly characterized by liver and spleen lesions, such as periportal hepatic fibrosis, portal hypertension, spleen enlargement, and congestion [11]. Tissue fibrosis caused by egg deposition is the most serious outcome of schistosome infection. The long course of the disease, high treatment costs, and poor prognosis impose huge psychological and economic burdens on patients and their families [12][13][14]. Schistosomiasis is an immune disease, and its pathological basis is the immune response of the host to schistosomes and their eggs [15]. The eggs produced by adult worms in the portal venous system enter the liver with the bloodstream. Because of their large diameter, the eggs lodge before the hepatic sinusoids, block the blood vessels, and release egg antigens that sensitize T-lymphocytes, resulting in delayed-type hypersensitivity. The sensitized T-lymphocytes produce a series of lymphokines that attract inflammatory cells (including macrophages, lymphocytes, and eosinophils) to gather around the eggs, causing inflammation that forms egg granulomas. In the egg granulomas, macrophages release pro-fibrotic cytokines that activate hepatic stellate cells (HSCs) to transform into myofibroblasts, which produce a large amount of extracellular matrix (ECM) and secrete pro-fibrotic factors [16][17][18], resulting in an imbalance between fiber generation and degradation. Without effective intervention, fibrogenesis progresses further and the disease eventually develops into advanced schistosomiasis.
Complete recovery from advanced schistosomiasis is difficult to achieve. With advanced cases accounting for a growing share of the disease burden across the schistosomiasis spectrum, effective preventive strategies for early detection, early diagnosis, and early treatment are needed. Such strategies will benefit the control and elimination of schistosomiasis and decrease the burden on the affected population. However, few related studies have been conducted so far.
Currently, the main diagnostic methods for schistosomiasis liver fibrosis are: (1) histological examination, which is a reliable method for diagnosing liver fibrosis but is highly traumatic to the tissue, and the sampling site cannot reflect the overall condition of the patient's liver; (2) imaging methods, of which ultrasonography is the most widely used for diagnosing schistosomiasis liver disease or assessing curative effectiveness based on the grading of liver fibrosis; (3) detection of serum or plasma biomarkers, including extracellular matrix components, degradation products, metabolic enzymes, and cytokines, which are readily assayed and non-invasive [19,20]. The specificity of these biomarkers still needs to be assessed: the liver is the main site for the synthesis of coagulation factors, anticoagulant substances (such as AT-III, PC, and PS), and fibrinolytic substances (such as PLG) in the blood, so liver parenchymal damage caused by schistosomiasis, chronic hepatitis B, or cirrhosis may all lead to abnormal synthesis of these substances [20]. In our previous study, we found that more than 50% of patients with advanced schistosomiasis had abnormal plasma D-dimer levels, and there was a weak positive correlation between the concentration of D-dimer and the grade of hepatic fibrosis [21]. This result indicated that some biomarkers detected in peripheral blood might be used for predicting the occurrence of advanced schistosomiasis.
This study was designed to analyze and compare peripheral blood-related metabolic indicators in patients with chronic and advanced schistosomiasis, to select specific predictive diagnostic indicators, and to establish a Fisher discriminant analysis model for advanced schistosomiasis japonica. The aim is to achieve early detection and early intervention and to effectively reduce the number of new cases of advanced schistosomiasis japonica.

General Information of Patients
A total of 139 AS patients and 132 CS patients were included in this study. The male-to-female ratios were 1:0.76 and 1:1.03 for AS and CS patients, respectively ($\chi^2$ = 1.567, p = 0.211). The average age of male AS and CS patients was 64.1 years and 61.8 years, and that of female AS and CS patients was 57.8 and 59.9 years, respectively, with no statistically significant difference detected between the two groups within the same gender ($t_{male}$ = 2.401, $t_{female}$ = 1.484, both p > 0.05). The AS and CS groups were matched for gender and age (Table 1).
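As a rough check of the sex-ratio comparison above, the following sketch computes an uncorrected Pearson chi-square test for a 2 × 2 table. The cell counts are an assumption: they are inferred from the reported group sizes and ratios (139 AS at 1:0.76, 132 CS at 1:1.03) and are not taken from Table 1. The function name `chi_square_2x2` is likewise only illustrative. With these assumed counts the result should land close to the reported $\chi^2$ = 1.567 and p = 0.211.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail p-value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumed counts (men, women), inferred from the reported ratios, not from Table 1:
# AS: 79 men, 60 women (about 1:0.76); CS: 65 men, 67 women (about 1:1.03).
chi2, p = chi_square_2x2(79, 60, 65, 67)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```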

Data Preprocessing
Taking AS and CS status as the state variable and the measured values of the blood biomarkers as test variables, the sensitivity and specificity at each coordinate of the ROC curve were used to calculate the maximum Youden Index, which gives the optimal diagnostic cut-off value of each biomarker for the two groups of patients (File S1). No optimal diagnostic cut-off value could be determined for 8 of the 30 blood biomarkers (WBC, PLT, EOS, ALT, TP, ALB, AFP, and IV-C), so they were discarded from further analysis (Table 2).
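The cut-off selection described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function name `youden_cutoff` and the marker values are made up, and the code assumes that higher values point towards AS.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    labels: 1 for AS, 0 for CS. A sample is called positive when value >= cutoff.
    Assumes higher values point towards AS; negate the values otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # keep the lowest cut-off when several tie
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Made-up marker values for illustration only (not study data).
values = [12, 15, 18, 22, 25, 30, 35, 40]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(values, labels)
print(cutoff, j)
```

Each observed value is tried as a candidate threshold, which mirrors reading the coordinates of an empirical ROC curve.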

Univariate Analysis
After the quantitative values of the 22 selected blood biomarkers were converted into qualitative values of 0 and 1 based on the optimal diagnostic cut-off value, the Chi-square test was performed on these together with six qualitative biomarkers (SJ and the five hepatitis B items). The results showed that seven biomarkers (SJ, RBC, HbsAg, HbsAb, HbeAg, HbeAb, and HbcAb) did not differ significantly, whereas the other biomarkers presented significant differences between the AS group and the CS group (Table 3). However, because the p-values of RBC and HbcAb were < 0.2, we still included them in the multivariate analysis according to the initial statistical constraints.

Multivariate Logistic Regression Analysis
The 23 blood biomarkers with p < 0.2 in the univariate analysis were subjected to multivariate analysis, and finally, nine blood biomarkers including HGB, LYM, MON, DBiL, GLB, GGT, APTT, Fbg, and VIII were entered into the following analysis (Table 4).

Establishment of Classification Model
The AS group and the CS group were distinguished according to the blood biomarkers, and the FDA model was constructed from the nine biomarkers that were screened by multivariate analysis. According to the results of the statistical analysis, seven statistically significant variables were retained: HGB ($X_1$), MON ($X_2$), GLB ($X_3$), GGT ($X_4$), APTT ($X_5$), VIII ($X_6$), and Fbg ($X_7$). The following discriminant functions were obtained (Wilks' lambda = 0.624, $\chi^2$ = 125.033, df = 7, p < 0.001):

$$C_{AS} = 0.923X_1 + 3.058X_2 + 2.672X_3 + 2.694X_4 + 4.364X_5 + 2.226X_6 + 7.744X_7 - 8.211$$

$$C_{CS} = 1.843X_1 + 1.930X_2 + 1.002X_3 + 1.586X_4 + 2.893X_5 + 2.863X_6 + 8.875X_7 - 7.621$$

The quantitative values of the seven selected blood biomarkers were then converted into qualitative values of 0 or 1 based on the optimal diagnostic cut-off value and substituted into the $C_{AS}$ and $C_{CS}$ equations, and the values of the two equations were calculated. Patients were classified by comparing the two values: if $C_{AS}$ > $C_{CS}$, the patient was classified as an AS case; otherwise, the patient was classified as a CS case. The accuracy of the FDA model was assessed by cross-validation. The results showed that 86.7% of the participants were correctly classified into the AS group and the CS group (Table 5).
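The classification rule above can be written as a short function. The coefficients and constants are those reported in the text. The function name `classify` and the example indicator vector are hypothetical and do not describe any study participant.

```python
# Coefficients in the order HGB, MON, GLB, GGT, APTT, VIII, Fbg, as reported above.
COEF_AS = [0.923, 3.058, 2.672, 2.694, 4.364, 2.226, 7.744]
COEF_CS = [1.843, 1.930, 1.002, 1.586, 2.893, 2.863, 8.875]
CONST_AS = -8.211
CONST_CS = -7.621

def classify(x):
    """x: seven 0/1 indicators (X1..X7). Returns (label, C_AS, C_CS)."""
    if len(x) != 7 or any(v not in (0, 1) for v in x):
        raise ValueError("expected seven indicators coded 0 or 1")
    c_as = sum(w * v for w, v in zip(COEF_AS, x)) + CONST_AS
    c_cs = sum(w * v for w, v in zip(COEF_CS, x)) + CONST_CS
    # Rule from the text: AS if C_AS > C_CS, otherwise CS.
    label = "AS" if c_as > c_cs else "CS"
    return label, c_as, c_cs

# Hypothetical patient: MON, GLB, GGT and APTT at or above their cut-offs.
label, c_as, c_cs = classify([0, 1, 1, 1, 1, 0, 0])
print(label, round(c_as, 3), round(c_cs, 3))
```

For this hypothetical vector, $C_{AS}$ = 4.577 and $C_{CS}$ = -0.210, so the rule returns AS.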

Prediction Accuracy of the FDA Model
We further substituted the blood biomarker data of 109 CS patients into the discriminant functions and calculated the values of $C_{AS}$ and $C_{CS}$. By comparing the two values, these CS patients, who were not included in the statistical analysis, were classified according to the rule set out above. A follow-up visit was conducted in 2020 to determine whether these patients had progressed to AS. Seven CS patients were lost to follow-up. Among the 29 patients who were classified as AS cases by the discriminant function, 18 eventually developed AS, a coincidence rate of 62.1%. Among the 73 patients who were classified as CS cases, eight eventually developed AS, so 65 remained CS cases, a coincidence rate of 89.0%. The overall coincidence rate was 81.4% (Table 6).
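The coincidence rates follow directly from the follow-up counts given above (29 predicted AS of whom 18 developed AS; 73 predicted CS of whom 8 developed AS). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def coincidence_rates(n_pred_as, n_pred_as_became_as, n_pred_cs, n_pred_cs_became_as):
    """Percent agreement between the predicted class and the follow-up outcome."""
    agree_as = n_pred_as_became_as              # predicted AS, became AS
    agree_cs = n_pred_cs - n_pred_cs_became_as  # predicted CS, stayed CS
    total = n_pred_as + n_pred_cs
    return {
        "AS": 100 * agree_as / n_pred_as,
        "CS": 100 * agree_cs / n_pred_cs,
        "overall": 100 * (agree_as + agree_cs) / total,
    }

rates = coincidence_rates(29, 18, 73, 8)
print({k: round(v, 1) for k, v in rates.items()})
# prints {'AS': 62.1, 'CS': 89.0, 'overall': 81.4}
```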

Discussion
Schistosomiasis is a serious parasitic disease that is characterized by immunopathological damage. The deposition of eggs in the liver induces an immune response leading to liver inflammation and fibrosis [22,23]. If liver fibrosis is not effectively controlled, the disease will progress to advanced schistosomiasis, manifesting as portal hypertension, splenomegaly, and complications including hypersplenism, esophageal variceal bleeding, and hepatic encephalopathy [4,13]. Patients may experience physical, psychological, and economic burdens due to prolonged and recurrent illness and high treatment costs [24,25]. Significant fibrosis is a hallmark of liver disease progression, and early diagnosis is important in guiding subsequent clinical treatment decisions. In addition, early and effective intervention in people who have had chronic schistosomiasis can prevent the occurrence of advanced schistosomiasis, which not only benefits public health services for residents in endemic areas but is also of great significance for improving prognosis and quality of life.
Liver biopsy is a specialized test for diagnosing liver fibrosis that provides auxiliary information for the accurate diagnosis and prognosis of liver disease, but its invasiveness limits its clinical application. In recent years, with the use of non-invasive liver fibrosis markers in clinical practice, the prediction and monitoring of chronic liver disease have become increasingly independent of liver biopsy. Non-invasive markers of liver fibrosis have been studied in other chronic liver diseases, whereas few studies have addressed the diagnosis of hepatic fibrosis in schistosomiasis. In our follow-up, we observed that patients with chronic schistosomiasis could have fibrosis progression after discontinuation of anthelmintic drugs.
In this study, we retrospectively analyzed 36 humoral (serum/plasma) markers and related clinical data of 271 schistosomiasis patients (132 CS cases, 139 AS cases) and found that HGB, MON, GLB, GGT, APTT, VIII, and Fbg were related to fibrosis. The new non-invasive diagnostic function model that we established can predict the likelihood of chronic schistosomiasis cases progressing to advanced schistosomiasis. Our results suggest that the model is reliable in diagnosing fibrosis and predicting the progression of liver fibrosis.
Previously, it was found that hepatitis B virus infection and schistosomiasis interacted in liver damage [26]. However, recent studies have shown no significant correlation between HBV infection status and the development of CS to AS (ascites type) [27]. The correlation analysis of this study further showed that HBV infection in CS cases was not an influencing factor in the development of AS.
Three of the seven markers in the FDA model constructed in this study reflect coagulation function. The parasitism and migration of S. japonicum in the venous system and the deposition of eggs in liver tissue cause specific pathological reactions in its definitive hosts [28]. Theoretically, vascular injury first induces a local inflammatory response, followed by an imbalance of the coagulation and fibrinolytic systems, and their interactions ultimately lead to a systemic pathological response in the host. Systemic coagulation, as a follow-up response to schistosome-induced inflammation, plays an important role in the compensation of parasitic immunity [29][30][31]. However, to maintain the homeostasis of the blood system, the excess fibrin produced during coagulation needs to be further degraded by fibrinolytic factors such as plasminogen and plasmin [32]. Previous studies have reported abnormal coagulation function in patients with schistosomiasis, especially in patients with advanced schistosomiasis [29][30][31][32]. If the coagulation and anticoagulation systems, or the fibrinolytic and antifibrinolytic systems, are out of balance, a hypercoagulable state or bleeding tendency may occur. Elevated VIII is mainly seen in hypercoagulable states and thrombotic diseases [33][34][35]. In schistosomiasis, this may be due to the deposition of a large number of eggs in the mesenteric blood vessels after infection, where the egg antigen stimulates the blood vessel wall and activates the coagulation system. Currently, there are no clinically useful serum biomarkers or assays to assess fibrosis in patients with advanced schistosomiasis. Activated partial thromboplastin time (APTT), fibrinogen (Fbg), and factor VIII activity may be good candidates.
In summary, we found that the discriminant function established from body fluid (serum/plasma) markers can provide effective early warning of the occurrence of AS. FDA is a reliable and practical prediction model. It plays an important guiding role in effectively controlling the occurrence of AS and lays a solid foundation for achieving the goal of schistosomiasis elimination. We will further carry out early treatment intervention for CS patients who may progress to AS and establish the best treatment strategy for patients with chronic schistosomiasis to effectively prevent the development of AS.
This was an exploratory study with a small sample size. The participants were selected from different schistosomiasis-endemic counties (cities, districts) of Jiangxi Province, which may provide a certain degree of representativeness. In addition, a series of independent comparisons were conducted in our study, which might increase the statistical error. Thus, further studies based on a larger sample size or a random sampling strategy should be conducted to verify the results of our study.

Patients
This case-control study included two groups of cases from eight counties (Nanchang, Xinjian, Jinxian, Xingzi, Duchang, Yongxiu, Poyang, Yugan) in the severely endemic schistosomiasis area of the Poyang Lake District, Jiangxi Province, recruited from February to March 2013. A total of 271 patients were recruited, including 139 AS patients and 132 CS patients who were diagnosed according to the "Diagnostic Criteria for Schistosomiasis" (WS261-2006) issued by the Ministry of Health of the People's Republic of China [36] (File S2). Patients with metabolic or hereditary liver disease, other parasitic infections, tumors, cardiovascular, kidney, respiratory, or digestive system diseases, diabetes, infection and tissue necrosis, bacteremia, or systemic lupus erythematosus were excluded, which minimized the confounding effects of other liver diseases (except hepatitis B). In addition, we retained 109 CS patients to observe whether they had progressed to AS by 2020 and thereby evaluate the accuracy of the discriminant function warning.

Blood-Based Biomarkers
For all study subjects, elbow venous blood was collected and sent to the First Affiliated Hospital of Nanchang University within 2 h. A total of 36 tests were conducted, covering routine blood examinations and biomarkers related to liver function, the fibrin degradation product D-dimer, coagulation indices, HBV, alpha-fetoprotein, and liver fibrosis (Table 7). Biomarker detection was carried out in strict accordance with the manufacturers' instructions.

Data Preprocessing
Among the 36 blood biomarkers, 30 were continuous variables and were tested for normality. Even after logarithmic transformation, some data were still not normally distributed. Therefore, to allow a unified analysis of the data, we converted these continuous variables into categorical variables. The receiver operating characteristic (ROC) curve was applied to each biomarker in AS and CS patients to find the optimal clinical diagnostic critical value and complete the qualitative classification. A biomarker was assigned the value "0" when it was less than the critical value and "1" otherwise. Biomarkers with an area under the ROC curve (AUC) ≤ 0.5 or p > 0.05 were excluded.
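The dichotomization and the AUC screen described above can be sketched as follows. This is an illustrative sketch rather than the SPSS routine used in the study: the function names and data are made up, the AUC is computed by its pairwise (Mann-Whitney) definition, and the p-value part of the exclusion rule is omitted.

```python
def auc(values, labels):
    """AUC as the probability that a random AS value exceeds a random CS value.

    labels: 1 for AS, 0 for CS. Ties between an AS and a CS value count as 0.5.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def dichotomize(values, cutoff):
    """Code a value as 0 when it is below the critical value, otherwise 1."""
    return [0 if v < cutoff else 1 for v in values]

# Made-up marker values for illustration only (not study data).
values = [12, 15, 18, 22, 25, 30, 35, 40]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
marker_auc = auc(values, labels)
keep_marker = marker_auc > 0.5  # the p <= 0.05 condition is not checked here
coded = dichotomize(values, cutoff=22)
print(marker_auc, keep_marker, coded)
```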

Statistical Analysis
Statistical analysis was performed using SPSS Statistics 22.0 software (SPSS Inc., Chicago, IL, USA) with a test level of α = 0.05. The data analysis comprised the following steps. First, we characterized the participants by gender and age to ensure consistency between the samples. Then, univariate analysis with the Chi-square test was performed on the differences in the categorical biomarker variables between the AS group and the CS group to select the variables for the next step.
We also considered that variables showing no significant difference in univariate analysis might be correlated with other confounding variables. To avoid the true effect of such a variable being masked by the effect of confounding indicators, all variables with p < 0.2 were included in the multivariate analysis [37], and variables with p < 0.10 were retained in the multivariate model after stepwise backward selection. The final results are presented as odds ratios (ORs) with 95% confidence intervals (95% CI) and significance levels (p-values).
Fisher discriminant analysis (FDA) is a commonly used multivariate statistical method [38]. It uses projection for dimensionality reduction to determine a linear function of the variables that maximizes the differences between samples of different classes and minimizes the differences between samples of the same class [39]. We therefore used the variables screened out by multivariate analysis to establish the discriminant function using the stepwise selection method of FDA, and conducted a self-validation of the established discriminant function.
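In a logistic regression model, the reported OR and its 95% CI are obtained from a fitted coefficient and its standard error. The sketch below shows that conversion together with the p < 0.10 retention rule stated above. It is an illustration under assumptions: the coefficient and standard error are hypothetical (not taken from Table 4), the function name is made up, and a Wald test is assumed for the p-value.

```python
import math

def odds_ratio_from_coef(beta, se, z_crit=1.96):
    """OR, 95% CI and two-sided Wald p-value from a logistic coefficient and its SE."""
    odds_ratio = math.exp(beta)
    ci_low = math.exp(beta - z_crit * se)
    ci_high = math.exp(beta + z_crit * se)
    # Two-sided p-value of the Wald z statistic under a standard normal distribution.
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return odds_ratio, ci_low, ci_high, p_value

# Hypothetical coefficient and standard error, not taken from Table 4.
or_, lo, hi, p = odds_ratio_from_coef(beta=math.log(2), se=0.25)
retained = p < 0.10  # retention threshold stated in the text
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f}), retained = {retained}")
```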
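For the two-class case considered here, the projection idea can be stated as the standard textbook Fisher criterion. The notation below ($\mathbf{w}$ for the projection direction, $\mathbf{m}_{AS}$ and $\mathbf{m}_{CS}$ for the class mean vectors, $S_B$ and $S_W$ for the between-class and within-class scatter matrices) is introduced for illustration and does not appear in the original analysis.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \, \mathbf{w}}{\mathbf{w}^{\top} S_W \, \mathbf{w}},
\qquad
S_B = (\mathbf{m}_{AS} - \mathbf{m}_{CS})(\mathbf{m}_{AS} - \mathbf{m}_{CS})^{\top},
\qquad
S_W = \sum_{k \in \{AS,\, CS\}} \sum_{i \in k}
      (\mathbf{x}_i - \mathbf{m}_k)(\mathbf{x}_i - \mathbf{m}_k)^{\top}

\mathbf{w}^{*} = \arg\max_{\mathbf{w}} J(\mathbf{w}) \;\propto\; S_W^{-1} (\mathbf{m}_{AS} - \mathbf{m}_{CS})
```

Maximizing $J(\mathbf{w})$ makes the projected class means as far apart as possible relative to the spread within each class, which is the trade-off described in the paragraph above.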
Author Contributions: F.H., A.N. and S.X. proposed the conception, designed this study, wrote and edited the manuscript, and jointly supervised the project. F.H. and F.Y. collected and analyzed the data; prepared the figures, tables, and manuscript; and contributed equally to this work. S.X., H.X., Z.G. and J.X. contributed to the collection of clinical samples, data interpretation, and manuscript editing. All authors have read and agreed to the published version of the manuscript.

This study conforms to the provisions of the Declaration of Helsinki (as revised in Seoul, Korea, October 2008).

Informed Consent Statement:
All of the participants were informed about the purpose of the study through an informed consent form and written informed consent was obtained from each participant. We also provided inpatient medical assistance for AS patients, and praziquantel (PZQ) for CS patients.

Data Availability Statement:
The data that support the figures within this paper and other findings of this study are available from the corresponding authors upon reasonable request.