Application of Drug Efﬁciency Index Metric for Analysis of Post-Traumatic Stress Disorder and Treatment Resistant Depression Gene Expression Proﬁles

: Post-traumatic stress disorder (PTSD) is a severe mental illness with grave social, political, economic, and humanitarian implications. To apply the principles of personalized omics-based medicine to this psychiatric problem, we implemented our previously introduced drug efﬁciency index (DEI) to the PTSD gene expression datasets. Generally, omics-based personalized medicine evaluates individual drug action using two classes of data: (1) gene expression, mutation, and Big Data proﬁles, and (2) molecular pathway graphs that reﬂect the protein–protein interaction. In the particular case of the DEI metric, we evaluate the drug action according to the drug’s ability to restore healthy (control) activation levels of molecular pathways. We have curated ﬁve PTSD and one TRD (treatment-resistant depression) cohorts of next-generation sequencing (NGS) and microarray hybridization (MH) gene expression proﬁles, which, in total, comprise 791 samples, including 379 cases and 413 controls. To check the applicability of our DEI metrics, we have performed three differential studies with gene expression and pathway activation data: (1) case samples vs. control samples, (2) case samples after treatment or/and observation vs. before treatment, and (3) samples from patients positively responding to the treatment vs. those responding negatively or non-responding patients. We found that the DEI values that use the signaling pathway impact activation (SPIA) metric were better than those that used the Oncobox pathway activation level (Oncobox PAL) approach. However, SPIA, Oncobox PAL, and DEI evaluations were reliable only if there were differential genes between case and control, or treated and untreated, samples.


Introduction
Post-traumatic stress disorder (PTSD) is a pathological mental health condition triggered by a terrifying event-either experiencing it or witnessing it. Symptoms may include flashbacks, nightmares, severe anxiety, and uncontrollable thoughts about the event [1,2]. Given the drastic, humiliating, and oppressing conditions that provoke PTSD, it poses a grave and ever-actual social, political, and humanitarian threat worldwide [3,4]. Most people who go through traumatic events may have temporary difficulty adjusting and coping, but with time and good self-care, they usually get better. If the symptoms worsen, last for more than a month or longer, and interfere with day-to-day functioning, the patient may have PTSD [5,6].
In the current work, we have attempted to use and modify our gene expression analysis software for the DEI metric [51] to analyze the gene expression data in PTSD patients. We found that the DEI metric is meaningful if there are enough differentially expressed genes between the case and control samples and the pre-treatment and posttreatment samples. Otherwise, the DEI metric is not reliable, which correlates with high p-values for the differential Student's t-test.

PTSD Gene Expression Datasets
We processed six gene expression datasets for PTSD and treatment-resistant depression (TRD) patients before and after treatment or observation, as well as corresponding gene expression profiles for the control/healthy samples (See Table 1). For each dataset, except dataset GSE185855, there was no specific treatment for PTSD given. In other words, the main expectation was that time cures stress. The untreated cases were cases before the observation period, whereas the treated cases were those studied after treatment.
The microarray gene expression data were normalized using the quantile normalization (QN) method [78], and the next-generation sequencing (NGS) profiles were normalized using the Differentially Expression in Sequencing 2 (DESeq2) approach [79].

Pathway Activation Level Assessment
Previously, we introduced two computational methods for the activation levels of cell signaling pathways. One approach, called Oncobox PAL [67], calculates for each pathological case sample the following quantities:

1.
Log-fold-change, LFC n , i.e., log 2 ratio of gene n mRNA concentrations in the test sample and in the control pool (median value in the control group) 2.
Pathway p activation level, PAL p = ∑n NII np · ARR np · LFC n /∑ n |ARR n |. Here, the node involvement index, NII np , is the Boolean flag for gene product n concerning the pathway p. The discrete value ARR np (activator/repressor role) is determined for a gene n in the pathway p as follows: ARR np = −1; gene product n is a repressor of pathway p −0.5; gene product n is more a repressor than an activator of pathway p 0; unclear repressor or activator role in pathway p 0.5; gene product n is more an activator than a repressor of pathway p 1; gene product n is an activator of pathway p.
Another method we have proposed for pathway-based assessment of individual drug reactions is the DEI metric, as we have applied it to cannabis-based drugs. It uses the SPIA method [51,71] for pathway activation levels [51,71]. Among the other methods for gene expression-based assessment of signaling pathway activation, including TAPPA [68], TBScore [69], and PE [70], the SPIA [71] approach showed the best statistical performance during the comparison between pathway-based and gene-based values [51,89].
We calculate the SPIA values [51,71] for a pathway graph, G(V, E), where V = {g 1 , g 2 , . . . , g n } is the set of graph nodes (vertices), and E = (g 1 , g 2 ) genes g i and g j interact is the set of graph edges. We define the adjacency matrix a as follows: a ij = 1, if i = j or g i , g j ∈ E and a ij = 0, if g i , g j / ∈ E. We consider then perturbation factors (PF) for the all genes g of the pathway K.
Here, ∆E(g) is the signed log 2 -fold change of the gene g expression level in a given sample compared with the average value for the pool of control samples. The latter term expresses the summation over all the genes γ that belong to the set U g of the upstream genes for the gene g. The value of n down (γ) denotes the number of downstream genes for gene γ. The weight factor β γg indicates the interaction type between γ and g: β γg = 1 if γ activates g and β γg = −1 when γ inhibits g. We apply the depth-first search method to upstream/downstream genes [51,71].
We use the second term in the formula for the perturbation factor (PF) from the previous paragraph, resulting in the accuracy value, Acc(g) = PF(g) − ∆E(g). Tarca et al. [71] showed that accuracy vector as expressed as follows: I is the identity matrix, and We calculated the overall score for the pathway perturbation, calculated as SPIA = ∑ g Acc(g) [51,71].
According to the SPIA method [51,71], we calculated the activation levels for 241 signaling pathways. We borrowed the graph topology from the Qiagen pathway database [51,88].
We applied the SPIA calculation with and without the differential expression gene filter. If in a case sample the gene g had an expression level at least 1.5 times higher or lower than the median expression levels of the control samples, we considered this gene differentially expressed and used it for the SPIA calculations.

Evaluation of the Individual Drug Action
Having calculated the Oncobox PAL and SPIA values, we then performed the following statistical analysis [51],

1.
Obtain the SPIA (PAL) values for each dataset and biological pathway.

2.
Calculate the values of the pathway weight (w p ) factor as follows. For pathways with a positive mean SPIA (PAL) score of the case samples, w p = ((number of case samples with positive SPIA (PAL) score)/(total number of case samples)). For pathways with a negative mean SPIA (PAL) score of the case samples, w p = ((number of case samples with negative SPIA (PAL) score)/(total number of case samples)).

4.
Perform a Student's t-test if the values of SPIA µ (PAL µ ) for the pool of case samples are different from 0 (for the pool of control samples, the values of SPIA µ (PAL µ ) are clearly equal to 0). During the Student's t-test, the following case classes are taken into account: the untreated case (U), i.e., the pathological state before drug application, should be far from the control (C).

4.1
The treated case (T), i.e., the pathological state after drug application, should be close to the control; 4.2 The following values are the results for such calculations: |t U | = absolute t-value for the Student's t-test for U-vs-C profiles; 4.3 |t T | = absolute t-value for the Student's t-test for T-vs-C profiles.
In addition to the old DEI metric [51] for individual drug action, 2· we have now introduced an alternative metric, called mirrored DEI, calculated as follows, The DEI M metric equals 1 when t T = −t U ; this is the maximal possible value of this metric.
Similar to the old DEI metric, DEI M = 0, when in t T = t U , and when |t T | |t U |, DEI M = −1.
The third metric, balanced DEI: DEI B = CDEI+CDEI M 2 , is the mean value over DEI and mirrored DEI.

Robust Marker Gene and Pathway Analysis
We applied (see Figure 1) the leave-one-out (LOO) procedure [90,91] to complete the robust core marker gene/pathway set that distinguishes: (1) Case samples from control samples (PTSD or TRD), the disease-vs-healthy, D_vs_H test; (2) Case samples after treatment or/and observation from those before treatment (untreated), the post-vs-ante, P_vs_A test; (3) Patients positively responding to the treatment from negatively or non-responding patients, the responders-vs-non-responders, R_vs_N test.
The third metric, balanced DEI: , is the mean value over DEI and mirrored DEI.

Robust Marker Gene and Pathway Analysis
We applied (see Figure 1) the leave-one-out (LOO) procedure [90,91] to complete the robust core marker gene/pathway set that distinguishes: (1) Case samples from control samples (PTSD or TRD), the disease-vs-healthy, D_vs_H test; (2) Case samples after treatment or/and observation from those before treatment (untreated), the post-vs-ante, P_vs_A test; (3) Patients positively responding to the treatment from negatively or non-responding patients, the responders-vs-non-responders, R_vs_N test.
We called a gene or a pathway upregulated if it had an expression or activation level higher for the disease state than for the healthy state after the treatment versus before the treatment and for the responders versus non-responders. If the expression/activation levels were lower, they were downregulated. We selected the marker genes/pathways according to the area under the receiveroperating curve (AUC) metric. The ROC (receiver-operator curve) is a widely-used graphical plot that illustrates the diagnostic ability of a binary classifier system, as its discrimination threshold is varied. The ROC is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The area under the ROC curve, called ROC AUC, or simply AUC, is routinely employed for classifier quality [92].
In the resulting robust marker gene and pathway sets, we selected only those genes/pathways with AUC > 0.70 (the conventional threshold for a significant marker). The robust set of marker features (genes or pathways) should assure the reliable distinction between different classes of samples (cases vs. controls, untreated vs. treated cases, and positively responding patients vs. those with no positive response). From the theory of machine learning, we know that the number of robust marker features should not exceed the number of samples. Since the smallest cohort, GSE860, contained 33 samples, we selected 30 as the maximal number of possible marker features ( Figure 1).
For the genes in robust marker set, we retrieved the description for the corresponding protein and the relevant biological processes (Gene Ontology) using Metascape, an online resource for gene annotation and analysis [93].

Alternative Methods for Differential Gene Expression and Enrichment Analysis
To check the overall pattern for marker genes/pathways for all three tests (D_vs_N, P_vs_A, and R_vs_N), we also applied the conventional procedure for differential gene expression (DEG) analysis. According to conventional procedure [94], we considered a We called a gene or a pathway upregulated if it had an expression or activation level higher for the disease state than for the healthy state after the treatment versus before the treatment and for the responders versus non-responders. If the expression/activation levels were lower, they were downregulated.
We selected the marker genes/pathways according to the area under the receiveroperating curve (AUC) metric. The ROC (receiver-operator curve) is a widely-used graphical plot that illustrates the diagnostic ability of a binary classifier system, as its discrimination threshold is varied. The ROC is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The area under the ROC curve, called ROC AUC, or simply AUC, is routinely employed for classifier quality [92].
In the resulting robust marker gene and pathway sets, we selected only those genes/ pathways with AUC > 0.70 (the conventional threshold for a significant marker). The robust set of marker features (genes or pathways) should assure the reliable distinction between different classes of samples (cases vs. controls, untreated vs. treated cases, and positively responding patients vs. those with no positive response). From the theory of machine learning, we know that the number of robust marker features should not exceed the number of samples. Since the smallest cohort, GSE860, contained 33 samples, we selected 30 as the maximal number of possible marker features ( Figure 1).
For the genes in robust marker set, we retrieved the description for the corresponding protein and the relevant biological processes (Gene Ontology) using Metascape, an online resource for gene annotation and analysis [93].

Alternative Methods for Differential Gene Expression and Enrichment Analysis
To check the overall pattern for marker genes/pathways for all three tests (D_vs_N, P_vs_A, and R_vs_N), we also applied the conventional procedure for differential gene expression (DEG) analysis. According to conventional procedure [94], we considered a gene differentially expressed if it simultaneously satisfies the two criteria for DESeq2 test [79]: the adjusted p-value for the negative binomial model, pAdjusted < 0.05, and the log2-fold change, |LFC 2 | > 0.5. We plotted the results of the DESeq2 analysis using the EnhancedVolcano R package.

Cohorts Used in the Analysis
We aimed to analyze all available transcriptome data for PTSD/TRD patients using our previously published DEI approach. Our goal was to test whether the DEI approach would differentiate the responders vs. the non-responders, the case vs. control samples, and treated vs. non-treated based on their gene expression levels. We curated six cohorts of 791 gene expression profiles, including 379 case samples and 413 control samples ( Table 2; see more details in Table 1).

Comparison of DEI Options for Different Approaches to the Assessment of Pathway Activation Levels
We have attempted three methods for calculation of PAL: 1. SPIA with differential expression filter [51]; 2.
The DEI value we suggested [51] is a normalized indicator of treatment success. We measured the deviation of PAL values from the healthy reference using the Student's t-test, which produces the metrics t(U) and t(T) for the untreated and treated cases, respectively. It equals 1 when the cell's physiological conditions after treatment are undistinguishable from the healthy state. Higher DEI values may indicate that the treatment was efficient and that the post-treatment state is close to a healthy reference in terms of gene expression. The positive DEI value means that the post-treatment conditions are less dysregulated than the pre-treatment conditions, while a negative DEI value means that the post-treatment conditions are more dysregulated than the pre-treatment conditions. A strongly dysregulated post-treatment state may have a DEI value approaching −1.
To select the best PAL metric, we required that the corresponding DEI value is positive as frequently as possible. According to this criterion, we obtained the best results using the SPIA method with a differential expression filter; see Table 3.
However, for the TRD dataset GSE185855 [84], all the DEI scores for responders to ketamine treatment were unexpectedly negative, although they were insufficiently reliable in terms of the differential Student's t-test with pathway activation values, at least for the patients without positive response to the treatment.
Additionally, for the datasets GSE81761 [82] and GSE185855 [84], the DEI values for the patients with no improvement were higher than for the patients with positive changes.
These negative values may indicate the following disadvantage of the DEI metric: it does not consider the sign flip from positive to negative, which indicates a transition through the zero value.
Perhaps, to overcome the disease, the treatment should change the pathway perturbation and the most effective treatment, which inverses the sign of the SPIA deviation from the control level. Thus, the best treatment should result in t T = -t U rather than t T = 0. Therefore, in addition to the old DEI [51], we introduced an alternative metric, called mirrored DEI (DEIm), as well as the average of the DEI and DEIm, called balanced DEI (DEIb) (see Materials and Methods).
Note that the DEIm and DEIb metrics helped the GSE81761 cohort but not the GSE185855 cohort, where the results for non-responders were always statistically unconvincing according to the Student's t-test (Table 3).

Robust Marker Gene/Pathway Analysis, Differentially Expressed Gene (DEG) Analysis, and Gene Ontology (GO) Enrichment Analysis
To have a closer and more in-depth view of the pattern of genes and pathways that discriminate different types of samples from one another, we identified robust marker sets [90,91] of genes and pathways for three different comparison groups described above: D_vs_H, P_vs_A and R_vs_N For these three tests, we considered a gene/pathway upregulated when the expression/activation level was higher for the case samples than the control samples, for the post-treatment cases than the pre-treatment counterparts, and for the patients with clinical improvements than the persons with no improvement.
Using the online gene analytical portal Metascape [93], we specified the protein description and biological functions (Gene Ontology) for the identified robust marker genes.
Additionally, for these three comparison groups, we analyzed the differentially expressed genes using the DESeq2 tool [79].
We analyzed the intersection of robust marker genes and pathway sets between differential tests for all cohorts ( Figure 2) and between cohorts for all comparison groups ( Figure 3). Table 4 shows the number of robust marker genes (marked as Exp) and pathways. We calculated the pathway activation levels using the SPIA method with the differential expression filter (marked as SPIA). These genes and pathways are also listed in Supplementary Files S1-S6. Exp shows the number of robust marker genes, while SPIA shows the number of robust pathways.

Dataset Exp_D_vs_H Exp_P_vs_A Exp_R_vs_N SPIA_D_vs_H SPIA_P_vs_A SPIA_R_vs_N
As can be seen from the analysis above, the number of the robust marker genes for all tests and datasets was relatively small, with little to no overlap between cohorts. To confirm this data, we also performed a classical DEG analysis, where we selected the genes with an expression level higher or lower than log2 of 0.5 and an adjusted p = value lower than 0.05 (see methods section for details). Table 5 summarizes the data seen in Figures 3-5, and Supplementary File S8 has a list of all genes significantly (p < 0.05) altered by log2 > 0.5 or log2 < −0.5. Cohort GSE81761 had a large number of genes in two categories, A_vs_P and R_vs_N; in both, most of the genes were downregulated. Cohort GSE64814-GPL6244 also contains a large number of genes in A_vs_P category.  A detailed description of the function of genes in various comparisons is presented in Supplementary File S7.
As can be seen from the analysis above, the number of the robust marker genes for all tests and datasets was relatively small, with little to no overlap between cohorts. To confirm this data, we also performed a classical DEG analysis, where we selected the genes with an expression level higher or lower than log2 of 0.5 and an adjusted p = value lower than 0.05 (see methods section for details). Table 5 summarizes the data seen in Figures 3-5, and Supplementary File S8 has a list of all genes significantly (p < 0.05) altered by log2 > 0.5 or log2 < −0.5. Cohort GSE81761 had a large number of genes in two categories, A_vs_P and R_vs_N; in both, most of the genes were downregulated. Cohort GSE64814-GPL6244 also contains a large number of genes in A_vs_P category.   We then compared the number of robust genes (as per AUC criteria) and DESeq2 genes. It appears that only for cohort GSE860 was the number of genes higher per AUC criteria. For the GSE97356 and GSE64814-GPL11154 cohorts, the numbers were comparable, while for GSE64814-GPL6244, GSE81761 and GSE185855, the numbers were higher in the DESeq2 category. The most dramatic differences were observed for the GSE81761 cohort, especially in P_vs_A and R_vs_N categories.
We then analyzed whether there are any overlapping genes in both categories ( Table  6 and Supplementary File S9). For the GSE860 cohort, there was one gene overlap in the D_vs_H category: PAX3, encoding the protein involved in sensory perception of sound and mechanical stimuli, as well as muscle development. In the GSE64814-GPL6244 cohort, D_vs_H also had one gene overlap: EPSTI1-epithelial stromal interaction 1, while We then compared the number of robust genes (as per AUC criteria) and DESeq2 genes. It appears that only for cohort GSE860 was the number of genes higher per AUC criteria. For the GSE97356 and GSE64814-GPL11154 cohorts, the numbers were comparable, while for GSE64814-GPL6244, GSE81761 and GSE185855, the numbers were higher in the DESeq2 category. The most dramatic differences were observed for the GSE81761 cohort, especially in P_vs_A and R_vs_N categories.
We then analyzed whether there are any overlapping genes in both categories (Table 6 and Supplementary File S9). For the GSE860 cohort, there was one gene overlap in the D_vs_H category: PAX3, encoding the protein involved in sensory perception of sound and mechanical stimuli, as well as muscle development. In the GSE64814-GPL6244 cohort, D_vs_H also had one gene overlap: EPSTI1-epithelial stromal interaction 1, while P_vs_A had an overlap of 17 genes. Among these, there was a large number of genes encoding snoR-NAs, including SCARNA7, SCARNA17, SNORD6, SNORD24, SNORDD43, SNORD68, and SNORA65 as well as regulators of transcription, DRAP1, PHTF1, and NELFE. For the GSE81761 cohort, there were seven gene overlaps in the R_vs_N category, all encoding Zn finger proteins involved in spindle formation, regulation of non-coding RNA biogenesis, and regulation of transcription and protein ubiquitination. For the GSE185855 cohort, the was a complete overlap of 11 genes between AUC and DESeq2 analyses for the R_vs_N category; they included proteins involved in differentiation of natural killer cells, basophils, and erythrocytes; suppression by virus of apoptotic processes; regulation of autophagy in neurons; and the adiponectin-activated signaling pathway. The failure of DEI, DEIm, and DEIb metrics to reflect differences between responders and non-responders for the TDR cohort (GSE185855) may be due to the small number, or lack of, differentially expressed genes in the D_vs_H and P_vs_A tests for this cohort (Figures 4 and 5, respectively), which makes the PAL/SPIA and DEI calculations highly unreliable. High p-values for the differential Student's t-test with pathway values for untreated and treated non-responders confirm this unreliability. On the other hand, the differential genes for the R_vs_N test with the GSE185855 dataset were rather numerous ( Figure 6A)-most of these genes showed a more drastic dysregulation in the responder group, thus reflecting the negative value for DEI/DEIm/DEIb (Table 3).
Psychoactives 2023, 2,13 P_vs_A had an overlap of 17 genes. Among these, there was a large number of genes encoding snoRNAs, including SCARNA7, SCARNA17, SNORD6, SNORD24, SNORDD43, SNORD68, and SNORA65 as well as regulators of transcription, DRAP1, PHTF1, and NELFE. For the GSE81761 cohort, there were seven gene overlaps in the R_vs_N category, all encoding Zn finger proteins involved in spindle formation, regulation of non-coding RNA biogenesis, and regulation of transcription and protein ubiquitination. For the GSE185855 cohort, the was a complete overlap of 11 genes between AUC and DESeq2 analyses for the R_vs_N category; they included proteins involved in differentiation of natural killer cells, basophils, and erythrocytes; suppression by virus of apoptotic processes; regulation of autophagy in neurons; and the adiponectin-activated signaling pathway. The failure of DEI, DEIm, and DEIb metrics to reflect differences between responders and non-responders for the TDR cohort (GSE185855) may be due to the small number, or lack of, differentially expressed genes in the D_vs_H and P_vs_A tests for this cohort (Figures 4 and 5, respectively), which makes the PAL/SPIA and DEI calculations highly unreliable. High p-values for the differential Studentʹs t-test with pathway values for untreated and treated non-responders confirm this unreliability. On the other hand, the differential genes for the R_vs_N test with the GSE185855 dataset were rather numerous ( Figure 6A)most of these genes showed a more drastic dysregulation in the responder group, thus reflecting the negative value for DEI/DEIm/DEIb (Table 3). Figure 6. Differential gene expression analysis for the R_vs_N test, i.e., PTSD/TRD patients who were responsive or non-responsive to treatments in GSE185855 (A) and GSE81761 (B) cohorts. In green, genes with the expression values of log2 > 0.5 and log2 < −0.5; in blue, genes significantly (p < 0.05) differently expressed; in red, genes that are significantly differently expressed with the differences of log2 > 0.5 and log2 < −0.5. Figure 6. Differential gene expression analysis for the R_vs_N test, i.e., PTSD/TRD patients who were responsive or non-responsive to treatments in GSE185855 (A) and GSE81761 (B) cohorts. In green, genes with the expression values of log2 > 0.5 and log2 < −0.5; in blue, genes significantly (p < 0.05) differently expressed; in red, genes that are significantly differently expressed with the differences of log2 > 0.5 and log2 < −0.5.
The systems bioinformatics approach to personalized drug prescription uses different paradigms. First, we can focus on pathway activation data, which have higher reliability and robustness than single/handfuls of gene expression and mutation profiles [89]. We can use aggregated (cumulative) PAL and mutation burden values as personalized markers for the analysis of drug efficiency [67]. We refer to this paradigm for personalized drug prescription as an a posteriori, or non-hypothesis-driven, approach. It relies on direct mathematical association of molecular and clinical features in the training dataset, using machine learning (ML)/artificial intelligence (AI) applications [90,91].
Another paradigm, an a priori, or hypothesis-driven, approach links molecular pathway activation with specific targets of drugs and ranks these drugs according to their ability to restore the healthy state within the cell [52]. Despite its greater complexity, we can say that this approach may be more appropriate for clinical implementation.
On the other hand, we have already proposed and verified (for the action of cannabis drugs on the skin, oral, and intestinal epithelial cells) the method and software for omicsbased personalized scoring of treatment efficiency. Like many, if not most, other omicsbased personalized medicine projects, our DEI (drug efficiency index) metric [51] uses the following methodology:

•
The Big Data full expression profiles, obtained via ether microarray hybridization (MH) or next-generation sequencing (NGS) of mRNA are used as input data for PAL calculations; • For each gene, we calculate the case-to-control log-fold changes (LFCs) using full expression profiles; • We then unitize these LFCs for the assessment of PAL, including Oncobox PAL [67], or SPIA (signaling pathway impact analysis) metrics [71], according to the proteinprotein interactions graphs for each pathway.
However, unlike the Oncobox [52] approach to the assessment of personalized drug efficiency, our DEI [51] method does not exploit the purely a priori paradigm. Instead, it compares the SPIA values before and after treatment to check which state (pre-or posttreatment) differs more from healthy conditions. Therefore, our DEI method is more a posteriori by type than the Oncobox approach.
To check whether our DEI method is helpful in psychiatry, we have attempted it for five PTSD cohorts and one TRD cohort. These six cohorts enlisted 379 cases and 413 control samples (overall, 791 profiles). Except for the TRD cohort (GSE185855), which contained patients treated with ketamine, all patients from the five PTSD cohorts did not receive any pharmaceutical treatment; instead, they were under observation during psychotherapy. Thus, we could not apply the Oncobox drug efficiency scoring (DES) approach [52,107] since psychotherapy has no molecular targets.
Our analysis of the DEI metric for the psychotherapy of PTSD, as well as for the ketamine therapy of TRD, implied the following differential analysis of PAL values (both SPIA and Oncobox PAL), as well as of gene expression values: • Case samples (PTSD or TRD) from control samples; • Case samples after treatment or/and observation from before treatment (untreated); • Samples from patients positively responding to the treatment from responding negatively or non-responding.
To assess the adequacy of the DEI metric for treatment success, we checked whether DEI values are positive as frequently as possible. According to this criterion, we achieved the best results for the DEI with SPIA values with a differential expression filter. The SPIA values without differential expression filter and Oncobox PAL values produced worse DEI results.
Only one PTSD (GSE81761) and the only TRD cohort (GSE185855) contained information on the clinical success of the therapy. For these cohorts, we tested the quality of the DEI metric in identifying responders versus non-responders only in these two cohorts; our analysis could differentiate, that is, provide the higher values for the responders.
We introduced two DEI modifications, mirrored DEI (DEIm) and balanced DEI (DEIb, the mean value over our original DEI, and mirrored DEI). The DEIm and DEIb values provided the proper difference between groups for the PTSD cohort. However, they did not produce meaningful results for the TRD cohort, perhaps because there were no reliable differentially expressed genes (DEG) in the disease-vs-healthy and after-vs-before treatment tests. This lack of DEG made every PAL, SPIA, and DEI calculation questionable, despite the abundance of DEG in the responders-vs-non-responders comparison.
The analysis of dysregulated genes showed the genes falling into the following categories: gene expression regulators on the level of chromatin, transcription, splicing, mRNA stability, and histone ubiquitination; immune response to bacteria, viruses, and fungi; signal transduction; ion transport; neuron morphogenesis, autophagy, and oxidative stress; cell division and differentiation. On a molecular level, dysregulation of these genes may cause multiple symptoms of PTSD. Indeed, a number of reports documented the dysfunction of the immune system in PTSD patients, including an increase in pro-inflammatory cytokines, immunosenescence, and decreased circulation of T cells [108]. PTSD also shows changes in neuronal morphology, disrupted neural plasticity, and substantial remodeling of dendrites [109].
Another question for the further application of the DEI/DEIm/DEIb metrics concerns cohort merging. For example, the cohort GSE64814 contained two sub-cohorts, one with data obtained with microarray hybridization (MH) Affymetrix Human Gene 1.0 ST Array (GPL6244), and the other with data obtained by sequencing with the NGS device Illumina HiSeq 2000 (GPL11154). Some methods can merge the unlimited number of MH and NGS cohorts. One of the most general examples is the uniformly shaped harmonization algorithm; however, this algorithm may affect the LFC values [110]. Therefore, this may invalidate the SPIA calculations with the differential expression filter. Thus, applying DEI/DEIm/DEIb metrics after transcriptomic harmonization may require more detailed research.

Conclusions
There is no consensus on the best treatment strategy for PTSD; some practitioners argue that the proverb "time cures" suggests the most proper treatment. To assess the clinical aspects of PTSD at the molecular and cell physiology level, we have applied the multi-omics personalized medicine approach to PTSD treatment. This approach utilizes Big Data on multi-omics (genome, transcriptome, proteome, etc.) profiling to study the physiological properties of healthy and pathological samples at the level of different cellular pathways.
Previously, we introduced the DEI, a metric for a drug's ability to restore a healthy state in the gene expression pathways. Here, we have applied the DEI concept to the transcriptomics data from six cohorts of PTSD and TRD patients. We found that the best treatment should not only make the pathway activation levels similar to the healthy state (as we suggested previously) but should also encourage the pathological changes to disappear by reverting the sign of pathway activation levels.
In addition, we concluded that any type of DEI is adequate only when there are significant differences between the healthy vs. pathological and treated vs. untreated gene expression profiles. The differential gene expression analysis (based on the area under the receiver-operating curve (robust AUC) and negative binomial distribution (DESeq2)) confirms this finding.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/psychoactives2020007/s1, File S1: Robust marker gene (with Gene Ontology) and pathway set for the dataset GSE860; File S2: Robust marker gene (with Gene Ontology) and pathway set for the dataset GSE64814-GPL6244; File S3: Robust marker gene (with Gene Ontology) and pathway set for the dataset GSE64814-GPL11154; File S4: Robust marker gene (with Gene Ontology) and pathway set for the dataset GSE97356; File S5: Robust marker gene (with Gene Ontology) and pathway set for the dataset GSE81761; File S6: Robust marker gene (with Gene Ontology) path pathway set or the dataset GSE185855; File S7: Description of function of genes in various cohort comparisons. File S8: Differentially expressed genes from different cohorts and categories; File S9: Genes overlapping between the robust maker genes (AUC criteria) and the differentially expressed genes (DESeq2 criteria).

Conflicts of Interest:
The authors declare no conflict of interest.