Network Analysis for Uncovering the Relationship between Host Response and Clinical Factors to Virus Pathogen: Lessons from SARS-CoV-2

Analysing complex datasets while maintaining the interpretability and explainability of outcomes for clinicians and patients is challenging, not only in viral infections. These datasets often include a variety of heterogeneous clinical, demographic, laboratory, and personal data, and it is not a single factor but a combination of multiple factors that contribute to patient characterisation and host response. Therefore, multivariate approaches are needed to analyse these complex patient datasets, which are impossible to analyse with univariate comparisons (e.g., one immune cell subset versus one clinical factor). Using a SARS-CoV-2 infection as an example, we employed a patient similarity network (PSN) approach to assess the relationship between host immune factors and the clinical course of infection and performed visualisation and data interpretation. A PSN analysis of ~85 immunological (cellular and humoral) and ~70 clinical factors in 250 recruited patients with coronavirus disease (COVID-19) who were sampled four to eight weeks after a PCR-confirmed SARS-CoV-2 infection identified a minimal immune signature, as well as clinical and laboratory factors strongly associated with disease severity. Our study demonstrates the benefits of implementing multivariate network approaches to identify relevant factors and visualise their relationships in a SARS-CoV-2 infection, but the model is generally applicable to any complex dataset.


Introduction
The analysis of complex datasets is a major challenge in all branches of medicine, as these datasets often include diverse clinical, demographic, laboratory, and personal data. In addition, there is considerable heterogeneity between patients in terms of clinical manifestations, the presence or absence of multiple individual factors contributing to clinical symptoms, and longitudinal changes in multiple factors throughout the disease course. Univariate analysis (e.g., one laboratory factor versus one clinical factor) is still commonly used in this context, as shown, for example, by studies analysing the relationship between the host immune response and clinical factors in a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection [1][2][3][4][5]. However, univariate analysis cannot reveal information about the complex relationships among multiple factors. Moreover, when applied factor by factor, it treats correlated factors as if they were independent.
Therefore, multivariate analyses are a preferable approach when analysing complex datasets, providing a more realistic basis for robust and accurate clinical decisions [6]. Moreover, multivariate analysis enables the assessment of the contribution of multiple factors to one or more clinical factors, reveals relationships between the factors analysed, and reduces the bias of univariate patient characteristics across studies [6][7][8][9]. Several multivariate approaches are now available that consider complex, multidimensional relationships between factors. These approaches can be divided into four groups according to the objectives of the analysis [10]: (i) comparison of treatment groups influenced by experimental treatment structure using multivariate analysis of variance (MANOVA); (ii) dimensionality reduction techniques such as principal component analysis (PCA); (iii) discriminant techniques such as canonical discriminant analysis (CDA); and (iv) cluster analysis, for which many different algorithms exist that produce clusters of different sizes.
However, these traditional multivariate approaches have several limitations in analysing complex datasets [11], particularly in terms of interpreting the data and visualising the contribution of individual factors. The interpretation of the results of multivariate analysis is generally a difficult task. Examples include the interpretation of derived factors and the number of components obtained by PCA, or the interpretation of clusters and their number and size produced by one of the many cluster analysis techniques. There are also limitations in the visualisation options, as complex datasets usually contain many factors (dozens or more are common), whereas a visualisation can show two or three factors at most. The multidimensional data space must, therefore, be transformed into 2D or 3D so that the result is understandable to the observer. This problem is solved by dimension reduction, either by feature extraction, of which PCA is an example, or by feature selection, i.e., choosing two or three factors. This is followed by visualisation in 2D (e.g., scatter plots) or 3D. Although the result usually appears clear, it can be somewhat confusing. In both cases, the observer loses some information, the lack of which is most evident when studying the details of individual points in the data space, in our case, patient profiles.
One innovative multivariate approach to analysing complex biomedical datasets is a network approach, which is based on the realisation that the similarities between patient profiles are essential for a reasonable interpretation by us as observers [12][13][14][15][16]. This relationship is pairwise, and traditional visualisations lack an exact representation of it. In traditional visualisation, we perceive this relationship as a metric distance between points in the visual form. However, when the dimension is reduced, this distance may not represent reality or what we deduce from the visualisation [12,17,18]. As explained later, in a network, a sufficiently high similarity between a pair of patients (more precisely, their profiles) is expressed directly by a tie between them in the visualisation, and the strength of this tie represents the degree of similarity. In this respect, networks are a tool that allows visualisation of what is missing in traditional approaches. The main advantage of multivariate network analysis is its depth of insight due to the visualisation, allowing us to interpret the data and extract meaning from it, regardless of the type of data.
Here, we investigated the applicability of network analysis for uncovering the relationship between the host immune response and clinical and laboratory factors of virus infections using a SARS-CoV-2 infection as an example. Despite the growing number of studies on coronavirus disease (COVID-19), interpreting the data and drawing meaningful conclusions from them are challenging. We evaluated a comprehensive dataset from patients infected with the SARS-CoV-2 virus during the first and second COVID-19 waves of the pandemic in the Czech Republic (March to November 2020), enabling us to compare the obtained data with published studies. The main objective of this study was to compare the results obtained from univariate analysis and multivariate network analysis, which provides visualisation as an analytical approach to help interpret complex data and identify a minimal immune signature useful as a potential predictor of disease severity or persistence of complications.

Patients
The study cohort consisted of 250 patients (124 men and 126 women; mean age ± SD: 53.5 ± 14.0 years) infected with the SARS-CoV-2 virus between March and November 2020. The predominant SARS-CoV-2 variants detected during this period, corresponding to the first and second waves of COVID-19 in the Czech Republic, were B.1, B.1.1.266 and B.1.258 [18]. None of the patients enrolled in this study were vaccinated at the time of the SARS-CoV-2 infection and sampling. Of the patients, 83 (33%) had been admitted to the hospital, 80 (32%) had anosmia/ageusia, and 111 (44%) had pneumonia. Clinical evaluation, lung function testing, and peripheral blood sampling were performed four to eight weeks after a positive SARS-CoV-2 diagnostic PCR test. Fourteen patients were excluded from analyses because of missing specimens for flow cytometry. For more details on clinical characteristics, see Table 1.

Characteristics of Analysed Data
For enrolled COVID-19 patients, ~85 immunological (cellular and humoral) and ~70 clinical factors available at the date of sampling were analysed.
All patients underwent a chest X-ray and pulmonary function tests, including vital capacity, forced vital capacity, forced expiratory volume in 1 s, forced expiratory flow/vital capacity ratio, peak expiratory flow, total lung capacity, carbon monoxide diffusing capacity (DLCO), and carbon monoxide transfer coefficient. In the case of residual findings on chest X-rays indicating the persistence of lung interstitial changes, high-resolution chest computed tomography was subsequently performed. In addition, clinically relevant medical history data related to COVID-19, e.g., history of hospitalisation, pneumonia, persistent dyspnoea, symptoms, and anosmia/ageusia (partial/complete), were collected in all patients.
Among immunological factors, the main immune cell populations and subpopulations and their activation in peripheral blood were determined using flow cytometry. Whole blood samples were prepared for eight-colour flow cytometry as previously described [19]. Isotype-matched conjugated irrelevant antibodies and fluorescence minus one controls were used. All antibodies and isotype controls used were purchased from BioLegend. The serum levels of IgG and IgM were assessed by ELISA using the recombinant SARS-CoV-2 receptor-binding domain (RBD) of the Wuhan variant. For this, 96-well plates (Nunc) were coated with RBD (50 ng/well) overnight at 4 °C, washed, and blocked with 1% BSA/PBS/Tween-20 for three hours at room temperature (RT). Sera were diluted 1:1000 in blocking buffer (in triplicates), incubated overnight at 4 °C, washed, and incubated with secondary rabbit anti-human IgG and IgM antibodies conjugated with horseradish peroxidase (Sigma-Aldrich, Saint-Louis, MO, USA) diluted 1:10,000 in blocking buffer for three hours at RT. The signal developed with OPD-H2O2 was measured at 492 nm and expressed as optical density. The threshold of positivity was based on a comparison with a cohort of noninfected subjects sampled before the SARS-CoV-2 pandemic.

Patient Similarity Network (PSN)
To visually study and assess the relationships between patient profiles and groups of similar patients, we need to convert the patient data into a patient similarity network (PSN). This conversion involves several aspects. The first is the selection of small combinations of (immunological and clinical) factors that are relevant for similarity assessment. For example, if we have measured 150 factors, we can expect that most of them are noise for the problem we intend to solve. The second aspect is the method of measuring similarity, which must be chosen based on domain knowledge; as in approaches similar to ours, we use a Gaussian function that converts a distance metric to a similarity metric from the interval [0, 1]. Before applying the Gaussian function, the values of each factor were rescaled to the same interval. The third aspect is the application of a method that constructs a network from the selected factors in the patient profiles; here, we use the LRNet method [17, LRNet application: https://homel.vsb.cz/~kud007/lrnet_files (accessed on 30 October 2022)], which both preserves the essence of the relationships in the original data and reveals not-so-obvious features hidden in the data. The last two aspects are the assessment of the quality of the network in terms of its density (degree of connectivity) and the separation of its parts into clusters of different sizes; here, we use a method to detect clusters in the network (Louvain modularity [20]). In particular, we measure the quality of the separation between clusters using two characteristics: the modularity, from the interval [−0.5, 1], and the silhouette, from the interval [−1, 1]. Simply put, the higher the modularity, the better the clusters of similar patients are separated in the network. The higher the silhouette (positive values), the more reliably individual patients belong to the cluster to which they have been automatically assigned.
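The rescaling and the distance-to-similarity conversion described above can be illustrated with a minimal sketch. It assumes min-max rescaling to [0, 1], Euclidean distance, and a Gaussian kernel of the form exp(−d²/(2σ²)); the kernel width `sigma`, the function names, and the patient values are hypothetical and are not taken from the study. The LRNet network construction itself is not reproduced here.

```python
import math


def rescale(column):
    """Min-max rescale one factor to [0, 1]; a constant factor maps to 0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def gaussian_similarity(profiles, sigma=0.5):
    """Return an n x n matrix of pairwise similarities in [0, 1].

    sigma is an assumed kernel width, not a value from the study.
    """
    # Rescale each factor (column) separately, then rebuild the rows.
    columns = [rescale(list(col)) for col in zip(*profiles)]
    scaled = [list(row) for row in zip(*columns)]
    n = len(scaled)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between two rescaled profiles.
            d2 = sum((a - b) ** 2 for a, b in zip(scaled[i], scaled[j]))
            sim[i][j] = sim[j][i] = math.exp(-d2 / (2 * sigma ** 2))
    return sim


# Hypothetical profiles: three made-up factor values per patient.
patients = [
    [12.0, 3.1, 0.8],
    [11.5, 3.0, 0.9],
    [30.2, 0.9, 0.1],
]
similarity = gaussian_similarity(patients)
```

In this toy input the first two patients have nearly identical profiles, so their similarity is close to 1, whereas the third patient is far from both and receives a similarity close to 0.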
Other descriptive characteristics are based on the measurement of univariate statistics within clusters. Here, we assume that patients should have similar values in each factor in a particular cluster, and at least for some factors, their values should differ between clusters. Arithmetic means, standard deviations, and confidence intervals are computed for all factors in the clusters to assess the satisfaction of both assumptions.
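As a rough illustration of these within-cluster statistics, the sketch below computes the mean, standard deviation, and an approximate 95% confidence interval of one factor per cluster. The normal-approximation interval (z = 1.96), the function name, and the example values are assumptions made for illustration only.

```python
import math
import statistics


def cluster_summary(values, labels, z=1.96):
    """Mean, SD, and approximate 95% CI of one factor within each cluster."""
    by_cluster = {}
    for value, label in zip(values, labels):
        by_cluster.setdefault(label, []).append(value)
    summary = {}
    for label, vals in by_cluster.items():
        mean = statistics.mean(vals)
        # The sample SD is undefined for a single patient; report 0 in that case.
        sd = statistics.stdev(vals) if len(vals) > 1 else 0.0
        half_width = z * sd / math.sqrt(len(vals))
        summary[label] = {
            "n": len(vals),
            "mean": mean,
            "sd": sd,
            "ci": (mean - half_width, mean + half_width),
        }
    return summary


# Hypothetical values of one factor in two clusters.
factor = [2.0, 4.0, 6.0, 20.0, 22.0, 24.0]
clusters = ["C1", "C1", "C1", "C4", "C4", "C4"]
result = cluster_summary(factor, clusters)
```

A small SD within each cluster together with non-overlapping intervals between clusters, as in this toy input, corresponds to the two assumptions stated above.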
Using computers makes it possible to work with all of the above aspects at the same time by having a computer program automatically generate networks for small but varying combinations of factors. The quality of the generated networks is automatically measured based on the abovementioned characteristics. As a result, the combinations of factors and networks proposed by the computer program are of the highest quality with respect to all mentioned aspects. From these proposals, a network is then manually selected that captures the domain knowledge and, thus, also meets the requirements for clinical applicability. In our case, we considered and investigated combinations of three to five factors out of 85 immunological factors, and the best combination based on the network quality measures and clinical relevance was nominated. The modularity of the nominated network was 0.616, and 82.6% of patients had a positive silhouette expressing their unambiguous placement in one of the clusters. In addition, a selection from 70 clinical factors was used for visualisation within the network.
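To make the two quality measures concrete, the following sketch scores a toy network for a given partition: it computes the Newman modularity of a weighted network and the share of patients with a positive silhouette. It also counts how many combinations of three to five factors can be drawn from 85. The toy network, the use of 1 − similarity as a distance, and all names are assumptions; the sketch does not reproduce the authors' program or the Louvain cluster detection.

```python
import math


def modularity(weights, labels):
    """Newman modularity Q of a weighted, undirected network for a partition."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    two_m = sum(degree)  # twice the total tie weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += weights[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


def silhouettes(dist, labels):
    """Silhouette value per patient from a pairwise distance matrix."""
    n = len(dist)
    result = []
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if j != i:
                sums[labels[j]] = sums.get(labels[j], 0.0) + dist[i][j]
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        others = [sums[c] / counts[c] for c in sums if c != own]
        if own not in sums or not others:
            result.append(0.0)  # singleton cluster or only one cluster
            continue
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(others)              # mean distance to the nearest other cluster
        result.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return result


# Toy network: two triangles of mutually similar patients, no ties between them.
weights = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = [0, 0, 0, 1, 1, 1]
q = modularity(weights, labels)

# Assumed distance: 1 - similarity, with zero distance on the diagonal.
dist = [[0.0 if i == j else 1.0 - w for j, w in enumerate(row)]
        for i, row in enumerate(weights)]
positive_share = sum(s > 0 for s in silhouettes(dist, labels)) / len(labels)

# Number of candidate combinations of three to five factors out of 85.
n_candidates = sum(math.comb(85, k) for k in (3, 4, 5))
```

For this toy network the modularity is 0.5 and every patient has a positive silhouette. The count of roughly 35 million candidate combinations indicates why the generation and scoring of networks has to be automated.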

Statistics
The Kruskal-Wallis one-way ANOVA test for three or more groups and the Wilcoxon-Mann-Whitney test for two groups were used to compare the distribution of immune cells, their activation, IgG/IgM levels, and clinical factors. The p-values were calculated using the R language (3.6.1) in the R-Studio 1.2× programming environment (http://www.r-project.org/, accessed on 1 October 2022). p-values of <0.05 were considered significant. The data are presented as a mean and 95% confidence interval (CI).
For each condition/clinical factor of interest from the comprehensive dataset, statistical significance is calculated and reported as the difference in the value of a given factor (higher, lower, or no difference) between the subgroups of patients studied. For example, our patients with COVID-19 who had confirmed pneumonia had elevated percentages of activated CD8+ T-lymphocytes, activated CD4+ T-lymphocytes, NK cells, monocytes, and eosinophils but lower percentages of B-lymphocytes and immature B-lymphocytes as well as lower percentages of T-lymphocytes expressing checkpoint molecules PD-1 and CTLA-4 at post-COVID sampling compared to patients without pneumonia. No difference in CD8+ T-lymphocytes, CD4+ T-lymphocytes, neutrophils, and basophils was detected between these two studied groups (Table 2c, Figure 1c). As another example, in COVID-19 patients with anosmia/ageusia, elevated percentages of B-lymphocytes and immature B-lymphocytes but lower percentages of activated CD8+ T-lymphocytes, activated CD4+ T-lymphocytes, and monocytes were detected when compared to patients without anosmia/ageusia. No difference in the distribution of lymphocytes, CTLA-4+ T-lymphocytes, neutrophils, eosinophils, and basophils was detected between patients with and without anosmia/ageusia. In addition, a trend towards higher percentages of CD4+ T-lymphocytes and PD-1+ T-lymphocytes and lower percentages of NK cells was detected in patients reporting anosmia/ageusia; however, the difference did not reach significance. All other conditions and factors could be evaluated in the same manner.
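The two rank-based tests named in the Statistics section can be sketched as follows. The study itself used R; this Python version is only an illustration that assumes a normal approximation without tie or continuity correction for the Wilcoxon-Mann-Whitney p-value and returns only the H statistic for the Kruskal-Wallis test. The function names and the example values are hypothetical.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average
        pos = end + 1
    return ranks


def mann_whitney(x, y):
    """U statistic and two-sided p-value (normal approximation, no corrections)."""
    ranks = midranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for group in groups for v in group]
    ranks = midranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical percentages of one cell population in patient subgroups.
without_pneumonia = [1.0, 2.0, 3.0]
with_pneumonia = [4.0, 5.0, 6.0]
u_stat, p_value = mann_whitney(without_pneumonia, with_pneumonia)
h_stat = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```

For real analyses, an established implementation (such as the R functions used in the study) is preferable, since it handles ties and small samples properly.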

Multivariate Patient Similarity Network Analysis (PSN)
In the next step, we applied a multivariate unsupervised PSN approach, clustering patients based on the similarity in the distribution of circulating immune cells, their immunophenotypes, and clinical, functional, and laboratory factors, creating a network, and subsequently visualising other factors of interest (Figure 2). In Figure 2, the top left panel contains colour-coded clusters of patients who are similar to each other (connected by a tie) in the three factors that proved crucial when converting the dataset into the network. As additional information, the averages of the patient profiles in each cluster are shown in the bottom left panel as a bar chart. The right panel shows the same network in several variants, each coloured differently to provide different information. What is particularly evident in the visualisations is that the data are heterogeneous, and the trends captured in the data are more important than the statistical values.
Despite numerous deregulated immune populations and their activation markers across the cohort (Figure 1), clustering based on the percentages of activated (CD69+) CD4+ T-lymphocytes, CTLA-4+ T-lymphocytes, and immature B-lymphocytes (CD19+ CD27− CD38+(bright)) revealed the best subdivision of patients based on COVID-19 severity and lung function status four to eight weeks after initial infection (Figure 2).
As visualised in Figure 2, a majority of patients with a high percentage of activated CD4+ T-lymphocytes and low percentages of CTLA-4+ T-lymphocytes and immature B-lymphocytes had pneumonia, a history of hospitalisation, impaired lung function, and persistent dyspnoea at the post-COVID-19 check-up. In addition, PSNs allow multiple factors to be displayed simultaneously in the network, as demonstrated, for example, in a subanalysis of anosmia/ageusia in women and men who had moderate and severe manifestations of COVID-19 (Figure 2, lower part).
Figure 2. Cluster 1 (C1, dark green), cluster 2 (C2, green), and cluster 3 (C3, blue) were predominantly associated with mild disease; cluster 4 (C4, violet) and cluster 5 (C5, orange) with severe disease. Dark red/green network vertices represent individual patients with/without a history of pneumonia, hospitalisation, persistent dyspnoea, and anosmia/ageusia (red/green = women/men; light red/green were patients not suffering from anosmia/ageusia; dark red/green were patients suffering from anosmia/ageusia). Pulmonary function and serum IgM and IgG levels in individual patients four to eight weeks post-COVID-19 are indicated by the intensity of red (light for the lowest values and dark for the highest values). C: cluster; LYM: lymphocytes; DLCO: diffusing capacity for carbon monoxide in the lungs.

Discussion
The healthcare system generates a significant amount of data. In addition, a large amount of omics data has been collected in the literature and public databases. However, we often need to better understand these data, extract the meaning from them, and make the most rational use of their potential to generate the knowledge we need. This is also important for uncovering relationships between many laboratory, clinical, and personal factors in complex diseases/conditions, which can help in data interpretation and better management of patients. The situation is similar for the host response to individual pathogens, where not a single factor but a combination of multiple factors contributes.
This study focused on uncovering the host response to the SARS-CoV-2 virus as an example of a complex condition. We studied the relationships between more than a hundred clinical, demographic, laboratory, and personal factors available from 250 patients with COVID-19 by applying both univariate analyses and multivariate network analyses. All patients were sampled during the first and second COVID-19 waves of the SARS-CoV-2 pandemic. At that time, patients were monitored thoroughly, with very good communication between physicians and patients, and none of the patients had been vaccinated. A further advantage of this patient cohort is the significant amount of published data from different populations that is currently available for comparison.
The majority of published studies on patients with COVID-19 still rely on univariate analysis [1][2][3][4][5]. This type of analysis enables us to understand the distribution of values for a single condition, but it cannot capture the relationships between two or more variables. As shown by our study, the visualisation of univariate analysis is easy for the observer to understand; they can see the average ranking of the laboratory factors for each pair of clinical factors/conditions being compared, as well as which patient subgroup has the higher average value. Additional information is provided by the CIs and the significance of the differences. The first problem is that if we want to study a single patient and their similarity to another, this visualisation offers no solution. The second issue is the overall view of the patient dataset, which raises many questions. For example, how can we see whether one or more factors cluster patients in one place? How large must a subset of factors be to translate the patient dataset into an understandable visual form so that we can look for answers to multiple similar questions? Although the profiles of immune cells vary between comparisons, it is impossible to identify the most significant combination of factors (here, immune cell subpopulations) associated with disease manifestations. As shown in our dataset, the univariate analysis revealed factors associated with a particular condition, such as pneumonia, anosmia/ageusia, or hospitalisation, but such analysis does not answer questions about whether patients with pneumonia also had anosmia/ageusia or were hospitalised, or which combination of factors is associated with particular patient subgroups.
Generally, the results of univariate analysis lack deeper insight into the relationships between the spectrum of factors and their relevance to reality, and they are not suitable for patient groups that are very heterogeneous in terms of phenotypes, immune responses, and clinical manifestations.
Therefore, we introduced in this study an innovative multivariate network analysis to assess the relationship between host response and SARS-CoV-2 infection. Over the years, there has been increasing evidence of the benefits of multivariate analysis in complex diseases [14], as the outcome may be caused by multiple factors, and there may be several different disease phenotypes with different factors/mechanisms involved. We applied a network analysis called PSN that constructs a network from vector data and then clusters the patients based on the similarity in their factors [12,21]. It is suitable for any data (clinical, laboratory, demographic, and functional) of any type (binary and continuous) and cohort size. This makes it highly suitable for the analysis of biomedical datasets [12][13][14][15][16]22]. The PSN and its visualisation also help to study and identify the factors relevant to the given conditions, which allows for the subgrouping of patients. Indeed, this multivariate analysis of our real-world cohort identified a minimal immune signature consisting of a high percentage of activated CD4+ T-lymphocytes and low percentages of CTLA-4+ T-lymphocytes and immature B-lymphocytes that was strongly associated with disease severity and manifested even four to eight weeks after COVID-19. Other immune features showed high variability across the cohort, indicating significant heterogeneity in the immune response in individual patients.
Our data are in line with other studies in which a higher representation of activated (CD69+) interferon-γ-producing CD4+ T-lymphocytes was detected in hospitalised COVID-19 patients with pneumonia [23]. Dysregulation in circulating B-cell subpopulations, particularly a reduced number of immature B-cells and normalisation within three to six months of convalescence, has been recently reported in patients with severe COVID-19 [24,25]. Our data indicate that the B-cell disturbance observed in acute severe COVID-19 persists for at least four to eight weeks after acute infection. This appears to be a response similar to the forced maturation of B-cells towards plasmablasts, reaching up to 30% of the total peripheral blood B-cells, reported in a subgroup of COVID-19 patients [26]. A patient subset (8%) with mild disease had high levels of the checkpoint inhibitor CTLA-4 on T-cells, probably involved in the downregulation of immune activation during COVID-19-related cytokine storms. In the presented study, the patients with severe disease had lower CTLA-4 expression on T-cells, allowing us to hypothesise that in such patients, less controlled activation of T-cells could support the development of severe inflammation in the lungs. In contrast, other authors hypothesised the inhibition of CTLA-4 as a potential therapeutic approach supporting antiviral T-cell activity and inhibiting T-cell exhaustion [27].
Furthermore, the PSN multivariate analysis indicated that anosmia/ageusia was more frequent with mild disease, consistent with previous studies [28]. Moreover, in our cohort, more women than men suffered from olfactory and taste dysfunction. This may be because women are more sensitive to altered olfactory perception than men, as evidenced in a recent meta-analysis [29]. Nevertheless, our knowledge of sex-related differences in the expression of ACE2 and TMPRSS2, two key host factors required for SARS-CoV-2 entry, on the non-neural sustentacular cells in the olfactory epithelium that are responsible for SARS-CoV-2-related taste and smell impairments is very limited and is largely based on animal models [30].
In addition, visualisation of clinical and laboratory factors in patient clusters detected by PSN showed that a history of severe COVID-19 was associated with higher levels of IgM and IgG compared with mild disease. There is evidence that both IgM and IgG can be detected around the same time (~2 weeks) after a SARS-CoV-2 infection, with the development of the class-switched, high-affinity IgG response for long-term immunity and immunological memory [31]. Recent findings have demonstrated a significant correlation of memory B-cells with both IgG1 and IgM responses to the SARS-CoV-2 spike protein receptor-binding domain in most seropositive subjects [32]. IgM+ memory cells are detectable over three months together with switched memory cells with somatic hypermutations, which increase in frequency for several months after the onset of symptoms and persist stably for at least six to eight months [33]. This may explain the high levels of IgM and IgG in our patients with severe disease two months post-COVID-19. Our observation of low levels of immature B-cells in severe disease agrees with the report that patients with IgM memory B-cell depletion died [32] and further highlights the statement that B-cells, particularly memory IgM B-cells, are a critical indicator of disease severity and resolution.
Our study showed several advantages of PSN in a clinically well-characterised real-world cohort of Caucasian patients with the full spectrum of clinical variability of a SARS-CoV-2 infection. Moreover, our study shows that the PSN approach allows for the visualisation and interpretation of relevant information in complex datasets while maintaining the interpretability and explainability of outcomes for clinicians and patients, which is impossible in binary comparisons. In comparison to traditional multivariate approaches, the presented PSN has the major advantage of being independent of a pre-set number and size of clusters (=patients with similar profiles); it allows the integration of diverse data types and the handling of sparse data, the selection of the most relevant combinations of factors, the reanalysis of each patient cluster, and excellent model interpretability [12,22,34]. Moreover, this approach allows the comparison of datasets obtained for other pathogen strains/time points/conditions. Finally, the obtained models may be made more precise by adding novel patients and/or factors. However, we do not consider networks to be the only option; rather, we see them as an alternative or important complement to traditional visualisation methods.
This study also has limitations. First, we analysed only a group of patients from the first and second waves of the COVID-19 pandemic who were not vaccinated. Currently, immune profiles and other analysed factors may be affected by vaccination, which has been performed in approximately two-thirds of the population [35,36], by reinfections with SARS-CoV-2, and by the emergence of other virus strains [37]. Second, our study focused mainly on patients with severe disease manifestations, which does not allow us to compare them with patients infected with the current Omicron variant, because most of those patients have mild to moderate disease manifestations that do not require hospitalisation. Hospitalised patients infected with the Omicron variant are mainly those with comorbidities or who are at risk or those who refuse vaccination and participation in clinical and research studies [35,38,39]. However, the cohort used enabled us to compare the obtained data with previously published studies from the same period.

Conclusions
By exploring and visualising multiple variables from our SARS-CoV-2 real-world dataset using an unsupervised network analysis approach, PSN, we have shown that it is possible to obtain (i) a detailed model of the relationships among multiple factors and (ii) actionable and interpretable observations in real-world datasets. Specifically, we identified a minimal immune signature consisting of three parameters (a high percentage of activated CD4+ T-lymphocytes, a low percentage of CTLA-4+ T-lymphocytes, and a low percentage of immature B-lymphocytes) that was strongly associated with COVID-19 severity expressed as the need for hospitalisation, pneumonia, impaired lung function, and persistent dyspnoea four to eight weeks after the COVID-19 diagnostic PCR test. In addition, visualisation of clinical and laboratory factors in patient clusters detected by PSN showed that the minimal immune signature associated with a history of severe COVID-19 disease was also associated with higher levels of IgG and IgM and impaired lung function four to eight weeks after COVID-19. Furthermore, anosmia/ageusia in the first and second waves of COVID-19 was more frequently associated with a mild course of disease, and more women than men had olfactory and gustatory dysfunction. Based on results consistent with published findings from the first and second waves of COVID-19, we have shown that it is possible to find a model that can be translated into a visual and easily interpretable form.
Taken together, this study demonstrates the advantages of using multivariate analysis over univariate analysis when studying complex datasets. Although we have presented the advantages of network analysis to study the host response to a SARS-CoV-2 infection, the model is generally applicable to any complex dataset. Therefore, network analysis may be important for uncovering the relationship between many laboratory, clinical, and personal factors in complex diseases/conditions, as extracting meaning from complex datasets can help with data interpretation and better patient management.

Institutional Review Board Statement:
The study was performed in accordance with the ethical standards of the institutional and/or national research committee, respected the 1964 Helsinki Declaration and its later amendments or comparable relevant ethical standards, and was approved by the Institutional Ethics Committee of Palacký University Olomouc and University Hospital Olomouc (ID: NU22-A-105).
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data are available from the corresponding author upon reasonable request.