Investigating Core Signaling Pathways of Hepatitis B Virus Pathogenesis for Biomarkers Identification and Drug Discovery via Systems Biology and Deep Learning Method

Hepatitis B Virus (HBV) infection is a major cause of morbidity and mortality worldwide. However, poor understanding of its pathogenesis often gives rise to intractable immune escape and prognosis recurrence. Thus, a valid systematic approach based on big data mining and genome-wide RNA-seq data is imperative to further investigate the pathogenetic mechanism and identify biomarkers for drug design. In this study, systems biology method was applied to trim false positives from the host/pathogen genetic and epigenetic interaction network (HPI-GEN) under HBV infection by two-side RNA-seq data. Then, via the principal network projection (PNP) approach and the annotation of KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, significant biomarkers related to cellular dysfunctions were identified from the core cross-talk signaling pathways as drug targets. Further, based on the pre-trained deep learning-based drug-target interaction (DTI) model and the validated pharmacological properties from databases, i.e., drug regulation ability, toxicity, and sensitivity, a combination of promising multi-target drugs was designed as a multiple-molecule drug to create more possibility for the treatment of HBV infection. Therefore, with the proposed systems medicine discovery and repositioning procedure, we not only shed light on the etiologic mechanism during HBV infection but also efficiently provided a potential drug combination for therapeutic treatment of Hepatitis B.


Introduction
Hepatitis B Virus (HBV) infection remains a prevalent health challenge worldwide, and is greatly associated with substantial morbidity and mortality due to serious liver diseases, such as hepatitis, fibrosis, cirrhosis, and even hepatocellular carcinoma [1]. According to the latest report of World Health Organization (WHO) [2], HBV affected an estimated 257 million people and causing 600,000 deaths per year. While current therapies seek to control the progression of the disease, life-long treatment and surveillance are still needed because resistance develops during treatment and reactivation often occurs after medication discontinuation [3,4]. In recent years, dedication to new drug design and immunotherapy development has been made by researchers to hopefully silence HBV in infected hepatocytes [5,6]. However, the host/pathogen cross-talk mechanism contributing p H i (t + 1) = p H i (t) + where p H i (t), p H n (t), g H i (t) and p P v (t) indicate the expression levels of the ith host protein, the nth host protein, the ith host gene, and the vth pathogen protein at time t, respectively; a H in and b H iv denote the interactive ability between the ith host protein and nth host protein and between the ith host protein and vth pathogen protein, respectively; N i and V i signify the number of host proteins and pathogen proteins interacting with the ith host protein; and I typifies the total number of the human proteins in candidate PPIN. α H i , −γ H i and β H i indicate the translation rate from the corresponding mRNA, the degradation rate, and the basal expression level of the ith host protein, respectively; the basal level denotes the interactions with unknown factors, such as acetylation, methylation, and ubiquitination; and v H i (t) represents the stochastic noise of the ith host protein at time t. Note that the intraspecies and interspecies biological mechanism of human proteins in candidate PPIN are represented in Equation (1) as the form of N i n=1 a H in p H i (t)p H n (t) and , respectively. Now that the interactions between the host and pathogen proteins are mutual and their biological characteristics involved in PPIs are similar, the dynamic PPIN of the qth pathogen protein in candidate HPI-GEN can be described by the analogous equation as follows: p P q (t + 1) = p P q (t) + Nq n=1 a P qn p P q (t)p H n (t) + Vq v=1 b P qv p P q (t)p P v (t) +α P q g P q (t) − γ P q p P q (t) + β P q + v P q (t), for q = 1, 2, . . . , Q, α P q ≥ 0 and − γ P q ≤ 0 (2) where p P q (t), p H n (t), g P q (t) and p P v (t) indicate the expression levels of the qth pathogen protein, the nth host protein, the qth pathogen gene, and the vth pathogen protein at time t, respectively; a P qn and b P qv denote the interactive ability between the qth pathogen protein and nth host protein and between the qth pathogen protein and vth pathogen protein, respectively; N q and V q signify the number of host proteins and pathogen proteins interacting with the qth pathogen protein; and Q typifies the total number of the pathogen proteins in candidate PPIN. α P q , −γ P q and β P q indicate the translation rate from the corresponding mRNA, the degradation rate, and the basal expression level of the qth pathogen protein, respectively; the basal level β P q denotes the interactions with unknown factors such as acetylation and ubiquitination; and v P q (t) represents the stochastic noise of the qth pathogen protein at time t. Similarly, the intraspecies and interspecies biological mechanism of pathogen proteins in candidate PPIN are represented in Equation (2) as the form of N q n=1 a P qn p P q (t)p P n (t) and V q v=1 b P qv p P q (t)p H v (t), respectively. For the GRN of host genes in the candidate HPI-GEN, the dynamic regulatory model of the jth host gene can be described as follows: where g H j (t), p H τ (t), m H µ (t) and l H (t) indicate the expression levels of the jth host gene, the τth host TF, the µth host microRNA, and the th host lncRNA at time t, respectively; c H jτ , −e H jµ and d H j represent the regulation ability of the τth host TF, the µth host microRNA, and the th host lncRNA on the jth host gene, respectively; T j , M j and L j denote the number of host TFs, host miRNAs, and host lncRNAs, respectively, which regulate the expression level of the jth host gene; J indicates the total amount of genes in GRN; −λ H j and δ H j indicate the degradation rate and the expression basal level of the jth host gene, respectively; and H j (t) denotes the stochastic noise due to modeling residue at time t. The basal level δ H j denotes the unknown regulations such as methylation. Note that since microRNAs silence gene expression by binding to target mRNAs [39], repression ability −e H jµ in candidate GRN model should be constrained negative. Likewise, −λ H j degradation rate should be restricted non-positive due to the depression regulation of mRNA. Furthermore, though supporting evidence comes from numerous research showing that HBV proteins play a vital role in the regulation of viral and host gene expression, different from other viruses such as HIV, whose viral protein directly binds to nucleic acids [40], HBV proteins are thought to affect host gene expression through transregulation, e.g., transactivation and transrepression [41][42][43][44]. Thus, it seems tenable to exclude the regulations between pathogen proteins and host gene in the Equation (3).
In the same way, for the GRN of pathogen genes in the candidate HPI-GEN, the dynamic regulatory model of the uth pathogen gene can be described as follows: e P uµ g P u (t)m H µ (t) −λ P u g P u (t) + δ P u + P u (t), for u = 1, 2, . . . , U, −e P uµ ≤ 0 and − λ P u ≤ 0 (4) where g P u (t), p H τ (t), m H µ (t) and l H (t) indicate the expression levels of the uth pathogen gene, the τth host TF, the µth host microRNA, and the th host lncRNA at time t, respectively; c P uτ , −e P uµ and d P u represent the regulation ability of the τth host TF, the µth host microRNA, and the th host lncRNA on the uth pathogen, respectively; T u , M u and L u denote the number of host TFs, host miRNAs, and host lncRNAs, respectively, which regulate the expression level of the uth pathogen; U indicates the total amount of pathogen genes in GRN; −λ P u and δ P u indicate the degradation rate and the expression basal level of the uth pathogen, respectively; and P u (t) denotes the stochastic noise due to modeling residue at time t. It is worth noting that there is a consensus on the opinion about the indispensability of host protein synthesis machinery for pathogen protein production and gene transcription [45,46]. Walsh et al. [47] even suggested that viruses are fully reliant on the translation machinery of their host cells to produce the polypeptides that are essential for viral replication. As a result, only gene regulation conducted by host transcription factors p H τ (t) instead of pathogen proteins p P (t) is included in Equation (4).
Since the expression of the µth host miRNA m H µ (t) and the th host l H (t) in Equation (3) at time t might also be regulated by other regulators, the µth miRNA can be described by the following stochastic dynamic regulatory equation: , for µ = 1, 2, . . . , M, −k H µr ≤ 0 and − ϕ H µ ≤ 0 (5) where m H µ (t), p H τ (t), m H r (t) and l H (t) indicate the expression levels of the µth host gene, the τth host TF, the rth host miRNA and the th host lncRNA at time t, respectively; f H µτ , −k H µr and h H µ represent the regulation ability of the τth host TF, the rth host microRNA and the th host lncRNA on the µth host miRNA, respectively; T µ , M µ and L µ denote the number of host TFs, host miRNAs and host lncRNAs, respectively, which regulate the expression level of the µth host miRNA; M indicates the total number of miRNAs in GRN; −ϕ H µ and η H µ indicate the degradation rate and the expression basal level of the µth host miRNA, respectively; and ε H µ (t) denotes the stochastic noise due to modeling residue at time t. Along the same line, the dynamic regulatory model of the host lncRNA in candidate GEIN can be described by the following dynamic equation: −σ H l H (t) + φ H + ψ H (t), for = 1, 2, . . . , L, −z H µ ≤ 0 and − σ H ≤ 0 (6) where l H (t), p H τ (t), m H µ (t) and l H o (t) indicate the expression levels of the th host lncRNA, the τth host TF, the µth host microRNA and the th host lncRNA at time t, respectively; x H τ , −z H µ and y H o represent the regulation ability of the τth host TF, the µth host microRNA and the oth host lncRNA on the th host Biomedicines 2020, 8, 320 7 of 48 lncRNA, respectively; T , M and L denote the number of host TFs, host miRNAs and host lncRNAs, respectively, which regulate the expression level of the th host lncRNA; L indicates the total number of lncRNA in GRN; −σ H and φ H indicate the degradation rate and the expression basal level of the th host lncRNA, respectively; and ψ H (t) denotes the stochastic noise due to modeling residue at time t.

Parameter Estimation of the Dynamic Models of Candidate HPI-GEN by System Identification Approach
In order to identify real HPI-GEN, the accurate model parameters training process by executing a system identification approach to the candidate HPI-GEN after the construction of dynamic model Equations (1)-(6) is requisite. Consequently, the stochastic Equations (1) and (2), which depict the relation of protein interactions of the ith host and qth pathogen protein, can be converted into the linear regression forms as below, respectively: The ith host protein: The qth host protein: where φ HP i (t) and φ PP q (t) represent the regression vectors that can be obtained from the expression data and θ HP i and θ PP q are the unknown parameter vectors to be estimated for the ith host protein, qth pathogen protein in host PPIN, respectively.
Thence, the Equations (7) and (8) of the ith host and qth pathogen protein can further be augmented for Y i and Y q time points as the following forms, respectively: The ith host protein: the Equation (9) can also be simply represented as The qth host protein: the Equation (11) can also be simply represented as After a succession of equation transformation procedure in advance, the parameters in vectors θ HP i and θ PP q of Equations (10) and (12) can be estimated by individually employing the following constrained least-squares estimation problems, θ HP i parameters estimation: θ PP q parameters estimation: Subsequently, we can acquire the interaction parameters in PPIN Equations (1) and (2) individually by resolving the least-squares problems in (13) and (14) with the help of the function lsqlin in MATLAB optimization toolbox and simultaneously ensure the protein translation rate α H i and α P q to be a non-negative value and the protein degradation rate −γ H i and −γ P q to be a non-positive value; that is to say α H i , α P q ≥ 0 and −γ H i , −γ P q ≤ 0. In the same manner, the dynamic Equations (3)-(6) which depict the relationship of gene regulations for the jth host gene, uth pathogen gene, µth host miRNA, and th host lncRNA can be rewritten into the linear regression forms below, respectively: The jth host gene: The uth pathogen gene: The µth host miRNA: Biomedicines 2020, 8, 320 10 of 48 The th host lncRNA: where φ HG j (t), φ PG u (t), φ HM µ (t) and φ HL (t) represent the regression vectors that can be obtained from the expression data and θ HG j , θ PG u , θ HM µ and θ HL are the unknown parameter vectors to be estimated for the jth host gene, uth pathogen gene, µth host miRNA, and th host lncRNA in host GRN, respectively.
So far, the structure of dynamic regression model construction is complete, yet if we substitute the expression data into the linear model right away, infinite solutions problems will emerge due to the lack of enough data points, in other words, information deficiency in comparison with parameters for evaluation, that is, overfitting. Thus, the cubic spline for extra numbers of time-profile data points interpolation is applied to prevent this trouble (For more details, readers can refer to Appendix A). Then, with the processed RNA-seq data, the accurate solutions of the constrained least-squares estimation problems in (13)- (14) and (27)- (30) can be attained.
Moreover, since the measurement technology of genome-wide protein expression in HepaRG cells and Hepatitis B Virus during infection hasn't been realized yet, and supporting evidence comes from research showing that the cellular concentrations of proteins correlate with the abundances of their corresponding mRNAs, which implied that the variance in protein abundance can be explained by that of mRNA [48][49][50][51] and a correlation coefficient of 48% was also obtained between the mRNA and protein abundances in human liver [49], i.e., the RNA-seq data of gene expressions can substitute protein expressions and contribute to sufficient information for resolving above constrained least-squares parameter estimation problems in (13) and (14) and (27)- (30). One could retain more information in our previous research [52].

Determination of Significant Interaction Pairs
On account of approaching the best combination to constitute the robust HPI-GEN in the parameter identification procedures, we exploited the cost function to obtain the optimal fit for the host/pathogen RNA-seq data, to be more precisely, estimation of the expected relative distance between the fitted model and the unrecognized authentic molecular pathogenic mechanism that actually generated the observed data. On the whole, as the cost function of the parameter estimation for linear regression model, mean squared error (MSE) is enough to calculate the residual variance. However, the cost from the model complexity that might also influences the performance should be taken into consideration as well. Akaike Information Criterion (AIC), taking the place of MSE, was thereby selected to assess the residual variance and model complexity at once. As the expected residual variance declines with rising parameter numbers for inadequate model complexities, there should be a minimum around the correct parameter number after repeated coordination [52]. Meanwhile, owing to computational efficiency, computing the AIC statistics for all possible combinations is impracticable. Stepwise methods are developed to decreasing the complexity of exhaustive searching [52]. Finally, while reaching the minimum of AIC value, we could acquire the real number of interactions or regulations for each protein (gene) one by one in the candidate HPI-GEN. Those insignificant regulations or interactions out of the real number determined by AIC should be pruned to obtain the real HPI-GEN.
For each model composing PPIN, the AIC values of the ith host and the qth pathogen protein can be defined individually as the following equations: The ith host protein: Biomedicines 2020, 8, 320

of 48
The qth pathogen protein: for i = 1, . . . , I and q = 1, . . . , Q, where AIC HP i (N i , V i ) with the model complexity N i + V i and AIC PP q N q , V q with the model complexity N q + V q denote the ith host and the qth pathogen protein in PPI model, respectively; (σ HP and between P P q and Φ PP qθ PP q , respectively; andθ HP i andθ PP q represent the estimated parameters acquired from the solutions of the parameter estimation problems in (13) and (14), respectively. Suppose N * i , V * i and N * q , V * q could minimize AIC HP i (N i , V i ) and AIC PP q N q , V q in (31) and (32), respectively. Then, N * i , V * i and N * q , V * q denote the real number of PPIs in the ith host protein and the qth pathogen protein. The insignificant PPIs out of these numbers should be pruned as false positives from candidate PPIs to obtain real PPIs in HPI-GEN.
For each model comprising GRN, the AIC values of the jth host gene, the uth pathogen gene, the µth host miRNA, and the th host lncRNA can be defined individually as following equations: The jth host gene: The uth pathogen gene: The µth host miRNA: The th host lncRNA: for j = 1, . . . , J, q = 1, . . . , Q, µ = 1, . . . , M, and = 1, . . . , L where AIC HG j T j + M j + L j , AIC PG u (T u + M u + L u ), AIC HM µ T µ + M µ + L µ and AIC HL (T + M + L ) with the model complexity T j + M j + L j , T u + M u + L u , T µ + M µ + L µ and T + M + L denote the AIC values of the jth host gene, the uth pathogen gene, the µth host miRNA, and the th host lncRNA in GRN  andθ HL represent the estimated parameters acquired from the solutions of the parameter estimation problems in (27)- (30), respectively. Similarly, the real numbers of regulation by TF, miRNA, lncRNA are obtained by minimizing the corresponding AICs in (33)- (36).
The insignificant regulations out of these real numbers should be pruned as false positives to obtain the real regulations by TF, miRNA, lncRNA in HPI-GEN.
So far, through a sequence of systematic identification processes, the construction for real HPI-GENs is broadly accomplished by pruning those insignificant interactions and regulations out of the corresponding AIC. Unfortunately, due to the excessively enormous network architecture, we could hardly extract the core pathways to investigate the pathogenesis during HBV infection, let alone screening of candidate proteins as new targets for drug development. We thereby extracted the core network from the real HPI-GEN via applying the principal network projection (PNP) method.

Extracting Core Network Structure from the Real HPI-GEN by Using PNP Approach
Prior to core network extraction from the real HPI-GEN through applying PNP method, it is essential to integrate the network estimated parameters from PPIN and GRN into a system matrix A as follows, where HP, PP, HG, HM, HL and PG denote the host protein, pathogen protein, host gene, host miRNA, host lncRNA and pathogen gene, respectively;â H in ,b H iv andâ P qn ,b P qv mentioned in (1) and (2) could be acquired inθ HP i andθ PP q by resolving the parameter estimation problems in (13) and (14) and pruning false positives by AIC method in (31) and (32) (36) respectively. Note that since the regulations from pathogen proteins weren't taken into consideration, the corresponding parameters in matrix A are padded with zeros. Thereafter, we extract the core components of HPI-GEN by PNP approach, a principal network analysis method for dimensionality reduction based on network structure projection technique, that is, projecting matrix A to its principal singular vectors space. Accordingly, the combined network matrix A can be denoted by singular value decomposition form below, Also, we define the expression fraction E w of the interaction ability of the proteins and the transcriptional regulatory ability of the genes, lncRNAs, miRNAs as the normalization of singular value form, According to the diminishing property of the singular values, under the basic premise of sustaining system energy from the whole network structure, we pick out the minimum X such that Σ X w=1 E w ≥ 0.85, i.e., the vector space spanned by the orthonormal bases composed of the principal singular vectors corresponding to the top X singular values contains 85% energy of HPI-GEN structure.
Afterwards, we define and apply the projection value of each node, including gene, miRNA, lncRNA, and protein in the real HPI-GEN to the vector space spanned by the orthonormal basis composed of the principal singular vectors corresponding to the top X singular values as below, where A row,i denotes the ith row vector of A for i = 1, . . . , 2I + 2Q + L + M; A col,j signifies jth column vector of A for j = 1, . . . , I + Q + L + M; V T * ; and U * represent the vector spaces spanned by the orthonormal basis {v 1 , v 2 , . . . , v k } and {u 1 , u 2 , . . . , u k } for k = 1, . . . , X, respectively; Pro j R (A row,i , V T * ) and Pro j C (A col,j , U * ) indicate the projection value from A row,i and A col,j to V T * and U * vector spaces, respectively. While the projection value Pro j R (A row,i , V T * ) or Pro j C (A col,j , U * ) in (39) approaches zero, it intimates that the corresponding node i or j isn't the pivotal factor or independent to the core network extracted via the PNP procedure. Conversely, the larger projection value it gets, the greater the node contributes to the core network.
Eventually, we select the first 3000 components on the top of the list with higher energy in (39) as the core host-pathogen network from real HPI-GEN of HBV infection by ranking the projection value of each node and subsequently uploading those nodes into DAVID to obtain the KEGG pathways as shown in Table 3. Taking KEGG pathways as reference, we systematically identify and investigate the molecular pathogenic mechanism of HBV infection from the core HPI-GEN as shown in Figure 4 and select the significant biomarkers for the further novel drug discovery procedure.

Data Preparation
Following the identification of promising pathogenic biomarkers, a potent drug-target interaction (DTI) model to predict the interactions between the identified biomarkers and their corresponding molecule drugs is crucial for discovering favorable molecular compounds for the therapeutic treatment. Thus, we constructed a deep learning-based DTI model to collect viable interactions to identify available drugs for the selected biomarkers. Meanwhile, since most patients in need of attention and treatment are infected with chronic hepatitis, the highest priority was assigned to efficacious drugs with lower toxicity that would not cause irreparable harm to health.
First, we integrated databases from UniProt [53], DrugBank [54], ChEMBL [55], Pubchem [56], and BindingDB [57] to assemble applicable drug-target interactions. As feature descriptors are simple numerical vectors designed to delineate the complicated information of objects, they have been widely utilized to describe the genomic sequences and chemical properties of molecules such as the characteristics from 2D, 3D spectrum of structure, molecular weight, predictive values of LogP, etc. of late [58,59]. In view of this property, we employed both the off-the-shelf build-in molecular and protein descriptor functions of python package pyBioMed to transform the features from each drug and its target among the previously collected drug-target interactions into descriptor under python 2.7 environment, respectively. These features include molecular dockings such as amino acid composition and dipeptide composition. For more details about the descriptor transformation, readers could access the documents of pyBioMed. Thereafter, we united the descriptors of both drugs and their relevant targets into a matrix as the training dataset to train the DTI model. The vector v drug−target corresponding to the descriptor of each drug-target pair is given below: where v drug−target denotes the vector of the descriptor for each the drug-target pair comprising two parts. The former part D is the descriptor of the drug and d i represents the ith drug feature; the latter portion T is the descriptor of the target and t j indicates the jth target characteristic; I is the total number of drug features; J is the total number of target features. Before performing drug-target interaction prediction, to keep our DTI model from inferior performance arising from between-class imbalance and distracting variation, some data preprocessing procedures are requisite. The entire preprocessing procedure in our work encompasses down sampling, data partitioning, feature scaling, and dimensionality reduction.

•
Down Sampling: There are two categories of drug-target interaction in our samples: 16,000,000 samples for the unknown interactions (negative instances) and 60,000 samples for the known interactions (positive instances). Though deep learning methods scale well with the quantity of data and can often leverage extremely large datasets for good performance, imbalanced class distribution often causes the model to be overwhelmed by the large class and ignore the minority one. So as to mitigate the effect arising from class imbalance, we reduced the number of majority samples (negative instances) to 60,000 which will allow the model to learn from both classes equally.

•
Data Partitioning: After the down sampling procedure, we partition the data (the reconstructed matrix of drug-target pairs) into two sets, three-fourth for training and one-fourth for testing.

•
Feature Scaling: Since the variables of the features in each drug-target pair are measured at different scales, they do not contribute equally to the model fitting and might end up with creating a bias. To deal with this potential problem, a feature-wise scaling is usually implemented prior to model fitting. As powerful techniques of feature scaling, Min-max scaling (normalization) and Standardization are both widely used in data analysis. However, despite possessing the capability to shift and rescale the data into a limited range of values, normalization is sensitive to the outliers [60]; in contrast, standardization maintains useful information about outliers and makes the model less sensitive to them [61]. Thus, we apply Standardization on each feature and the corresponding mathematical formulation is shown as follows: where d i and d * i represent the ith feature of the drug before and after Standardization, respectively; µ i and σ i signify the mean and standard deviation of the ith drug feature; t j and t * j indicate the jth feature of the target before and after Standardization, respectively; ω j and δ j denote the mean and standard deviation of the ith target feature; I is the total number of features in the drug while J is that of the target.

•
Dimensionality Reduction: Since the high-dimensional patterns in samples might augment the number of neurons in the network and elevate the computational complexity for training, we adopted principal components analysis (PCA) to distill 692 features from original 1014 features of the samples. One could retain as much information of interested [62,63].
Note that it is illegal and useless to carry out data preprocessing to the testing set solely relying on the characteristics in testing data in that we would never get any information from the features of testing data until the training phase is over. Hence, aside from the aforementioned transformation performed on the training data sets, we also provide the identical transformation on the testing data during the process of feature scaling and dimensionality reduction. More exactly, we extracted the mean and standard deviation variables of each feature in the training data set to standardize the corresponding feature in the testing data. Likewise, the transformation matrix for executing PCA of the features in training data was concurrently used to reduce the dimension of the features in testing data.
So far, the training data for tuning the network parameters of the deep learning-based DTI model and the testing data for evaluating the model performance are ready. Nonetheless, if learned merely on the features from the training set, the hyperparameters would always choose the maximum possible model capacity, leading to overfitting [63]. As a result, we randomly split out one-tenth of the training data to estimate the generalization error during training, allowing for the hyperparameters to be updated in every epoch. In addition, Early Stopping method was also employed to specify an arbitrarily large number of training epochs and stop training once improving the model's fit to the training data comes at the expense of increased generalization error. Eventually, through setting the optimizer as Adam [64] and learning rate = 0.001, we then trained our DTI model for 40 epochs with 100 samples in each mini-batch.

Parameters Tuning Process Based on Deep Learning Algorithm
In the network architecture, each layer can be simplified into a function below: where x n andĥ n indicate the input and output vectors corresponding to the descriptor of the nth drug-target pair, respectively; δ denotes the activation function (ReLU for each hidden layer and Sigmoid for the output layer); and the vectors w and b are given as follows: where the weights parameters w 1 , w 2 , . . . w e in w and bias parameters b 1 , b 2 , . . . b e in b are free variables that capture the model's representation of the data in each layer and are learned from each sample. To evaluate the model performance, the predicted output is compared with the true label to compute a loss for the current set of model weights. Since drug-target interaction prediction issue is a binary classification problem, binary cross-entropy is chosen to calculate the loss of each iteration. The summation of the loss L(y,ŷ) for totally N samples is given by the expression as below: where C n (y n ,ŷ n ) is loss of the nth sample calculated by binary cross-entropy, in which y n is the nth actual label (1 or 0) andŷ n is the nth predicted probability distribution. During the learning process, the parameter set θ (46) in each layer should achieve the minimization of the objective function (47) to obtain the optimal network parameters θ * . Accordingly, backward propagation learning algorithm (48) is designed to update the network parameters of both weights (w 1 , w 2 , . . . , w e ) and bias (b 1 where i is the ith iteration of the learning process; η is the learning rate (set as 0.001); and ∇L(θ i−1 ) is the gradient of L(θ i−1 ) given as follows: Consequently, by back-propagating a corrective error signal through the network, weighted connections between neurons in the DTI model are iteratively adjusted and assessed on the basis of the drug-target pairs in the training and the validation sets.

Measurement of Prediction Quality
Assessing a model is quite tricky. It isn't reliable to evaluate model performance (training accuracy of the final epoch) merely based on one specific validation set, in that the accuracy obtained from one validation set can be very different to that from another. Thus, we took advantage of a general technique named 10-fold cross validation to avoid the possible bias. That is, the original samples for training is randomly partitioned into 10 equal size subsamples; one is retained as the validation data for testing the model, and the remaining 9 subsamples are used as training data. Through repeatedly training and validating the models 10 times with different partitions, we then compute an average score over the rounds to give an estimate of the model's predictive performance. Afterwards, the model parameters with different validation errors were applied to the testing data for generating the final test accuracy and loss.
Aside from that, to compare the effectiveness of our DTI model with that of other state-of-the-art ML-based approaches, we adopted AUC-ROC curve [65] to tell how good a specific model can distinguish two classes, that is, whether or not a drug interacts with a target. ROC represents the relationship between benefit (true positive rate) and cost (false positive rate), while AUC is a value in the range of 0 to 1 that indicates the degree of separability between classes. The higher AUC we obtain, the more accurate outcome we will get from the model. The formulas for AUC-ROC curve are shown as below: where True Positive (TP) means that the actual value is positive and is judged correctly; False Positive (FP) shows that the actual value is positive but is judged by mistake; True Negative (TN) indicates the actual value is negative and is judged accurately; False Negative (FN) represents the actual value is negative but is judged in error.

Overview of Systems Medicine Discovery Procedure
Though a great advance has been made to design and evaluate medicine based on the application of diversified drug repositioning approaches, it still takes a large amount of time and vigor if without a potent avenue to select the promising biomarkers and drugs for the clinical trials. In this study, we proposed a systematic biology method containing two sections including potential biomarkers identification and novel drugs discovery.
The overall flowchart of the proposed systems biology procedure is shown in Figure 1. On one hand, we applied a systematic identification analysis procedure to the constructed candidate HPI-GEN, considering both accuracy and model complexity with the help of the information from the two-side real-time profile RNA-seq. Note that the total nodes and edges respectively given in Tables 1 and 2 of the real HPI-GEN in Figure 2 abstracted from the candidate HPI-GEN has considerably shrunk in comparison with that of the candidate HPI-GEN, which indicates that the false-positives were eliminated during the identification process. Supported with the PNP approach, we then identified the core HPI-GEN in Figure 3 by selecting top-ranked 3000 nodes with significant projection values that could reflect 85% of the real HPI-GEN. The higher the projection value is, the stronger regulatory capacity it possesses in the host/pathogen cross-talk mechanism during HBV infection. In the meantime, given identified KEGG pathways derived through uploading nodes in core HPI-GEN into DAVID in Table 3, we selected the core signaling pathways so as to explore the underlying etiologic mechanism in HBV infection as shown in Figure 4, involving microenvironmental factors by surveying relevant papers and research.  Flowchart of applying systems biology method to construct the candidate HPI-GEN, real HPI-GEN, core HPI-GEN, and core signaling pathways during the infection of HBV to identify the potential biomarker for later drug discovery repurposing. The grey blocks in the shape of a cloud indicate the online databases for constructing candidate HPI-GEN and later drug repositioning; the white diamond blocks are the corresponding data types of each database; the orange diamond blocks signify candidate protein-protein interaction network (PPIN) and gene regelation network (GRN); the blue blocks represent the input information including the raw RNA-seq data for system identification and the collected articles for BioBERT to augment the interspecies interactions and regulations in PPIN and GRN, respectively; the red blocks denote the models comprising BioBERT and stochastic regression models utilized in the systems biology approach; the white rectangle blocks indicate the systematic methods applied to constructing the candidate HPI-GEN and extracting core HPI-GEN; the purple blocks imply the drug discovery and repositioning procedure proposed afterwards; and the yellow rounded rectangular blocks are the real HPI-GEN and core HPI-GEN during the infection of HBV.
to augment the interspecies interactions and regulations in PPIN and GRN, respectively; the red blocks denote the models comprising BioBERT and stochastic regression models utilized in the systems biology approach; the white rectangle blocks indicate the systematic methods applied to constructing the candidate HPI-GEN and extracting core HPI-GEN; the purple blocks imply the drug discovery and repositioning procedure proposed afterwards; and the yellow rounded rectangular blocks are the real HPI-GEN and core HPI-GEN during the infection of HBV.     On the other hand, after determining the biomarkers from signaling pathways of the core HPI-GEN as shown in Table 4, we proposed a deep learning-based drug-target interaction (DTI) method to predict the potential interactions (dockings) between the identified biomarkers and their corresponding drugs for further candidate drugs discovery. The outcome of the DTI model is presented in a form of probability. The higher probability analyzed by the DTI model, the connection between the selected biomarker and the predicted drug is more likely to exist. However, given the enormous number of candidate drugs, the general drug specifications such as regulation ability, toxicity, and sensitivity from multiple databases were taken into consideration to reduce the search space and ensure the safety and efficacy in clinical trials. The pertinent flow-process diagram of the systems molecular drug design procedure is shown in Figure 5.

Extracting Core Signaling Pathways from Identified HPI-GEN and Core HPI-GEN in HBV Infection
For the purpose of analyzing the molecular mechanism under HBV infection, extracting core signaling pathways from candidate HPI-GEN is critical. The process to construct candidate HPI-GEN and identify real HPI-GEN as well as core HPI-GEN has been shown in the flowchart of Figure 1.
Thereafter, we identified the real HPI-GEN via parameter identification concerning the cost of model complexity under the auspices of corresponding RNA-seq data to prune false positives from candidate HPI-GEN (refer to Section 2.4-Section 2.6 for more details). The total numbers of the nodes of proteins, Transcription factors (TFs), miRNAs, lncRNAs, and Virus proteins as well as the edges of their interactions and regulations for the candidate HPI-GEN and identified real HPI-GEN are given in Tables 1 and 2, respectively. In addition, through employing the network visualizing software Cytoscape, the network of real HPI-GEN under HBV infection is shown in Figure 2.
Since the immense complexity of real HPI-GEN limits the possibility to efficiently investigate the pathogenic mechanism under HBV infection, applying principal network projection (PNP) and subsequently abstracting the top 3000 nodes with significant projection values on the 85% pivotal network structures to distill core HPI-GEN are essential. Meanwhile, the results of core HPI-GEN under the infection of HBV is shown in Figure 3.
Moreover, we employed DAVID Bioinformatics Resources version 6.8 to analyze the enrichments of relative pathways and specific cellular functions of core HPI-GEN as shown in Table 3. Through the multifaceted assessment of the core HPI-GEN based on existing studies and the annotation of KEGG signaling pathways, we could attain the core signaling pathways associated with microenvironmental factors including cytokines, chemokines, etc., for the pathogenesis of HBV infection as represented in Figure 4. Then, on the basis of the investigating the consequence of the systematic pathogenic molecular mechanism, we could identify the potential biomarkers holding great promise for the development of therapeutic targets.

Analysis of Core Interspecies Cross-Talk Pathways to Investigate Host/Pathogen Offensive/Defensive Mechanism during HBV Infection
Based on the core host/pathogen cross-talk signaling pathways in Figure 4, several identified processes leading to dysfunction of antiviral machinery are investigated to afford a stepwise understanding of the host/pathogen offensive and defensive mechanism under HBV infection.

Inception of HBV Infection
Hepatitis B Virus (HBV) has a 3.2-kb circular and partially double-stranded DNA genome that contains four genes named S, C, P, and X genes coding for the envelope proteins, the core protein and a related protein termed precore protein, the viral DNA polymerase, and the multifunctional regulatory protein, respectively [66,67]. The journey of HBV virions starts from attaching to host hepatocytes through heparan sulfate proteoglycans (SDC2 and HSPG2) and promptly transfers to sodium acid cotransporter (SLC10A1), initiating the HBV life cycle. Once entering the cells, the viral nucleocapsid containing the partially double-stranded DNA, known as the relaxed circular DNA (rcDNA), will be released into the cytoplasm and transported into the nucleus. The plus strand of the rcDNA is then repaired and completed to generate the covalently closed circular (cccDNA), which later forms the HBV minichromosome and is transcribed into RNAs for viral replication [68].

Liver Microenvironment and Immune Pathogenesis under HBV Infection
Inflammatory and innate immune responses impact the pathogenesis of numerous diseases and participate in the activation of common inflammatory mediators and regulatory pathways. On top of the Toll pathways that could be directly stimulated by the viral particles, in response to the foreign invasions, leukocytes, e.g., neutrophils and macrophages that phagocytose and kill pathogens, might simultaneously coordinate additional host responses by synthesizing a wide range of inflammatory mediators, such as TNF-α, IL-6, IFN-α, and IFN-γ [69]. These cytokines and chemokines provoke the secretion of antiviral effectors and a series of immunoregulatory mechanisms against HBV infection. In general, cellular and molecular events and interactions would effectively mitigate impending injury or infection during inflammatory responses [70]. However, along with the frequent recurrence of inflammation, once imbalances of the microenvironment occur, it might lead to abnormal epigenetic variations and even immune dysregulation [71], which consequently incurs viral relapse.

Toll Pathway as the First Line of Defense against Infection
Toll-like receptors (TLRs) have long been recognized as the first line of antiviral immunity in that they initiate intracellular signaling pathways to induce antiviral mediators, such as interferons (IFNs) and other cytokines [72]. Upon the recognition of foreign invaders (HBV-associated molecular patterns), TLRs elicited phosphorylation of IKKβ (IKBKB) through several signaling transduction proteins, including MyD88, IRAK1, TRAF6, TAB1, and TAK1. IκBα (NFKBIA) is a downstream protein of IKKβ keeping NF-κB sequestered in an inactive state in cytoplasm [73]. When phosphorylated by IKKβ, IκBα dissociated from NF-κB, resulting in the nuclear translocation of NF-κB and the induction of its several targets, in which the Interleukin 6 (IL6), the interferon-alpha receptor 1 (IFNAR1), the interferon-gamma receptor 2 (IFNGR2), and BCL3 were recognized as shown in Figure 4. Among them, IL-6 is the pleiotropic cytokine encoded by IL6 gene that catalyzes immune reaction and inflammatory responses [74,75]; IFNAR1 is the receptor that binds type I interferons and activates the JAK-STAT signaling pathway, along with MAPK, PI3K, and Akt signaling pathways associated with multiple cellular functions, such as inflammation and immune response regulation [76]; IFNGR2 is a receptor that binds to interferon-γ (IFN-γ), which is the immune interferon produced predominantly by natural killer (NK) and natural killer T (NKT) cells in response to viral infection [77]; and BCL3 is a protein that inhibits DNA damage-induced apoptosis and acts as a transcriptional coregulator of NF-κB [78,79].
In addition to stimulation of the downstream NF-κB pathway, the phosphorylated IKKβ also activated the pathway of mammalian target of rapamycin (mTOR) by triggering the disruption of TSC1/TSC2 complex. In general, TSC1 stabilizes TSC2 through direct binding, thereby preventing TSC2 from ubiquitination and degradation. Once IKKβ phosphorylates TSC1 incurring collapse of its connection with TSC2, it obliquely leads to the stabilization of RhebGTP, which mediates signaling to mTOR [80,81]. As for the substrates of mTOR, among all the identified downstream proteins in the core host/pathogen signaling pathways as shown in Figure 4, the translational repressor 4EBP1 (EIF4EBP1) stood out to be a pivotal constituent. It is well acknowledged that when the 4EBP1 protein is hyperphosphorylated by mTOR kinase, it can't bind and hijack the eIF4E factor, i.e., a general initiator for translational regulation [82,83]. Concurrently, it is reported that the released eIF4E might form a complex with the 5 stem-loop structure of the HBV genome, thus ensuring viral replication [84].

TNF-α-Stimulated Signaling Pathways
TNF-α is a multifunctional cytokine that plays a principal role in cell proliferation, apoptosis, inflammation, and immune response. From the core host/pathogen cross-talk signaling pathways in Figure 4, as soon as TNF-α bound to TNF receptors (TNFRs), it contributed to the phosphorylation of Akt via both TNFR2/TRAF2/PIK3CB and TNFR1/TRADD/TRAF2/PIK3CB signaling transductions. Akt is an effector of Class I PI3K triggering multiple downstream pathways and has been shown to be a critical mediator of cell proliferation and survival in a variety of cell types [85]. In Figure 4, a few pathways of Akt were incorporated into the core host/pathogen cross-talk network. First of all, the activated Akt inhibited the induction of autophagy through transactivating mTOR, i.e., impeding GTPase-activating protein complex TSC1-TSC2 as previously described. It is worth noting that TSC2 rather than TSC1 was directly phosphorylated by Akt, which destabilized TSC2 and disrupted its affinity with TSC1 [86]. Secondly, Akt also enhanced MDM2-mediated ubiquitination leading to p53 (TP53) degradation [87] and catalyzed the nuclear localization and stabilization of BCL3 by phosphorylation [79], where BCL3 is the aforementioned transcription coregulator having been proven to interact with NF-κB so as to facilitate its transcriptional ability. Moreover, it even caused nuclear exclusion and cellular dysfunctions of FoxO proteins through phosphorylation and repression [88]. Therefore, as the identified target genes of FoxO proteins in the core host/pathogen cross-talk pathways, BNIP3, BMI1, SOD2, and cyclin-dependent kinase inhibitor 1 (CDKN1A) encoding p21, might probably be down-regulated. Within, BNIP3 is a pro-apoptotic factor that induces cell death through interacting with anti-apoptotic proteins [89]; BMI1 is a critical epigenetic repressor that functions through chromatin remodeling and plays a central role in DNA damage repair [90,91]; and SOD2 is the antioxidant defense enzyme that participates in apoptotic signaling and oxidative stress regulation [92]. However, it is well documented that FoxO1/FoxO3 could rapidly relocalize into the nucleus in response to STAT3 activation by IL-6, and the accumulation of FoxO1/FoxO3 in nuclei coincided with elevated expression of CDKN1A and other genes [93]. This phenomenon could interpret the high expression of the target genes BNIP3, BMI1, SOD2, and CDKN1A, as well as the promotion of viral replication, since activator FOXO1 has also been found to bind to HBV DNA and activate its transcription [94].
Aside from the activation of PI3K/Akt pathway, in response to cytokine signals, TRAF2 also recruited TAK1/TAB3 complex and IKK complex, which is composed of IKKα, IKKβ, and IKKγ (IKBKG), bringing about the autophosphorylation-dependent activation of TAK1 [95] and the activation of its substrates, such as IKKβ and MKK7. MKK7 then upregulated the transcriptional activity of c-JUN and ATF2 via activation of the JNK1. c-Jun is an AP-1 family transcription factor involved in the regulation of cell death and survival [96] as well as inflammation and innate immune response [97]. In Figure 4, the induction of c-Jun is discovered to be associated with the overexpression of the proinflammatory cytokine receptor IL18R1 essential for IL18 mediated signal transduction [98], the ligand PD-L1 (CD274) regulating the cellular immune responses to prevent inflammation [99], and the major cyclins CCND1 controlling the progression of a cell through the G1 phase by conjugating CDK4/6 enzymes required for the synthesis of cell cycle [100]. Besides, c-Jun not only modulated the expression of BCL3 by serving as either the coactivator of NF-κB or the activator of BCL3 gene to reinforce cell survival [101,102], but also induced miR-221 to interfere with the regulation ability of the histone deacetylase 6 (HDAC6), which has been reported to both suppress NF-κB activation and control autophagosome maturation essential for ubiquitin-selective quality-control autophagy [103]. Apart from that, c-Jun could even impact the modulation of Fas expression, which sparked off the loss of Fas-ligand (FasL)-mediated apoptosis by means of down-regulating the activity of p53.

TAK-STAT Signaling Pathways
The Janus kinase (JAK)-signal transducer and activator of transcription (STAT) signaling pathway is a chain of interactions between proteins involved in various physiological processes, including immune function, cell growth, and cell death [104]. The activation of STAT3 was mainly provoked by JAK1 while JAK1 was brought in the proximity of receptors including the Glycoprotein 130 (IL6ST) and the gamma interferon receptor 1 (IFNGR1), which receive the signal from IL-6 and IFN-γ, respectively. However, the activation of STAT1 primarily stemmed from the phosphorylation of TYK2, a member of JAK family that was previously described by other research groups [105], by the alpha interferon receptor 1 (IFNAR1) residing in IFN-α stimulation. In spite of forming homodimers or heterodimers and translocating to the nucleus where they regulated the transcription of target genes, STAT3 and STAT1 played different even in part opposite roles in the process of pathogen infection. STAT3 constitutive activity is essential for cell survival and the extent of inflammation. Among the identified STAT3 targets in the host/pathogen cross-talk signaling pathway in Figure 4, an anti-inflammatory and anti-apoptosis factor referred to as PD-L1 and the major cyclin CCND1 involved in proliferation and cell cycle progression as mentioned previously were included. Additionally, STAT3 could also inhibit apoptosis by suppressing Fas transcription and p53 expression.
In contrast, STAT1 promoted apoptosis by inducing the expression of CXCL10 and strengthened the antiviral ability through modulating the expression of GBP1. Amongst these two downstream targets, CXCL10 is a ligand for the receptor CXCR3 involved in chemotaxis, induction of apoptosis, and regulation of cell growth associated with a variety of human diseases including Hepatitis C [106] and Endotheliitis [107]. GBP1 is a binding protein which has been shown to provide broad host protection against different pathogen classes through expediting oxidative killing and delivering antimicrobial peptides to autophagolysosomes [108]. Furthermore, STAT1 also negatively regulated the cell cycle though inducing the expression of the CDK inhibitors CDKN1A, a target of transcription factors FoxOs as stated previously.
Albeit STAT3 was directly activated by JAK1, it could also be indirectly triggered through the cytoplasmic signaling cascade of SHC/ERK pathway composed of signaling proteins, i.e., SHC1, GRB2, and SOS1, and the classical MAPK/ERK pathway comprised of signaling proteins, i.e., HRAS, RAF1, MEK1, and ERK1. Except inducing the activation of STAT3, MAPK/ERK pathway also contributed to the MSK1-mediated phosphorylation of c-Jun and ATF2, the inhibition of FoxO3 nuclear translocation, and most importantly, the regulation of eIF4E activity having been elucidated chiefly via phosphorylation on serine 209 of eIF4E by other research groups [83,109], which ultimately proliferated viral replication. More exactly, this was carried out by mitogen-activated kinase2 (MNK2), the downstream kinase of ERK1.

TNFs-Induced Apoptotic Pathways
As a component of the cell response to viral infection, apoptosis mediated by the sequential activation of caspases is indispensable for aborting the production and release of progeny virus. While diverse signaling pathways inducing apoptosis have been found, caspase-mediated proteolysis of downstream substrates is still a critical element of the execution pathway common to all forms of apoptosis [110]. And among all the avenues conducive to apoptosis progression, the pathways contributing to the modulation of caspase-3 (CASP3), a frequently activated death protease that catalyzes the specific cleavage of many integral cellular proteins, occupied a key position and a majority of them began with the activation of FADD in Figure 4.
Induction of FADD, the upstream activating protein of caspase-10 (CASP10), was mediated by not only the bifurcated TNFR1-TRADD signaling cascades but also the interaction with the FasL receptor (FAS) and the TNF-related apoptosis-inducing ligand receptor (DR5). Upon stimulation, FADD in turn recruited the initiator caspase caspase-10 to form the death-inducing signaling complex, which leads to activation of a caspase cascade and eventual cell-death. Having been shown to activate the executioner caspase-3 through direct processing, caspase-10 is described to be functional-independent to caspase-8 (CASP8) in initiating Fas-and tumor necrosis factor-related ligand-mediated apoptosis [111,112]. Based on the cleavage potential of caspase-3 associated with the dismantling of the cell and the formation of apoptotic bodies, the identified critical targets including Beclin-1 (BECN1), DNA-PK (PRKDC), and the DNA fragmentation factor subunit alpha (DFFA) were specifically cleaved by caspase-3 with subsequent loss of its original capability. Among them, BECN1 is the component of the phosphatidylinositol-3-kinase (PI3K) complex that induces the appearance of autophagosome [113]; PRKDC is the nuclear serine/threonine protein kinase that takes part in degraded DNA repair [114]; and DFFA is a subunit of DFF. Further, according to previous study, whilst DFFA is cleaved, the DNA fragmentation factor subunit beta (DFFB) would dissociate from the cleaved fragments of DFFA and trigger both DNA fragmentation and chromatin condensation during apoptosis [115].
Additionally, it is worthwhile to note that Akt, including AKT1 and AKT2, has been reported to interact with X-linked inhibitor of apoptosis protein (XIAP) and protect it from ubiquitination and degradation [116]. Once phosphorylated by Akt, the XIAP would exploit its second baculovirus IAP repeat domain (BIR2) to inhibit the apoptotic executioner caspase-3 [117]. We believe that this identified episode carries a foreshadowing for the succeeding virus-induced malfunctions of programmed cell death.

Virus Proteins-Induced Pathogenic Mechanism
Proteins encoded by the viral genomes, especially HBx, a 154 amino acid gene product of the X gene, have emerged as promiscuous transactivators activating the transcription of host genes by interacting directly with nuclear transcription factors or by obliquely activating various signal transduction pathways in the cytoplasm and have been proven to be potent epigenetic modifying factors in hepatocytes.
HBx regulated the expression of BCL3 to subvert apoptotic signaling pathways via both synergistically cooperating with p300 (EP300) to enhance c-Jun transcriptional activity and activating NF-κB by either directly interacting with NF-κB for augmenting its binding ability to its target DNA sequence or promoting phosphorylation of IκB as shown in Figure 4. In addition, apoptosis inhibition could also be induced by dysfunction of caspase-3 through enhancing the ability of Akt, an upstream kinase of caspase-3 as stated earlier, by HBx. Once caspase-3 lost its function, in response to signal from damaged DNA, DNA-PKs, the substrate of caspase-3, would act in concert with XRCC4 and a number of tightly coupled proteins to repair DNA. Further, it was suggested that the intense activity or high expression of XRCC4 might be beneficial to the promotion of HBV cccRNA formation and thereby facilitate the viral replication [118]. Since Hepatitis B Virus supremely requires components from the host cell to complete its replication cycles, merely one developed avenue to hijack host signaling pathways can't meet its demand. In addition to the significant translation initiation factor eIF4E induced by HBx through Akt/mTOR pathway as just mentioned, eIF4E was activated via the MAPK/ERK pathway as HBx could also trigger Ras and its downstream signaling pathways. Hence, forming a node of convergence of the PI3K/Akt/mTOR and Ras/Raf/MAPK signaling pathways, the regulation of eIF4E is clearly consequential for HBx to promote viral replication. Additionally, overexpression of cyclin G1 (CCNG1) resulted from the loss of microRNA-122 (a liver-specific miRNA) induced by HBx could also augment virus proliferation. It was reported that CCNG1 specifically interacts with p53 and blocks its specific binding to HBV enhancer elements, which simultaneously abrogates p53-mediated inhibition of HBV transcription and releases cell proliferation from G1/S arrest [119].
p53 acts as a sequence-specific DNA binding protein that activates the transcription of substantial target genes associating with the regulation of cell cycle progression, apoptosis, DNA repair, and senescence. In Figure 4, HBx interacted with p53 both directly and indirectly to inhibit its DNA consensus sequence binding and transcriptional activity, which has been reported to efficiently block p53-mediated cellular functions [120,121]. Apart from that, HBx also inhibited p53 by conjugating c-Jun and STAT3 as well as keeping MDM2, the principal cellular antagonist of p53 acting to limit the p53 growth-suppressive function [122], from ubiquitin-directed self-degradation.
As for the targets of p53, apoptosis-associated proteins, such as the proapoptotic protein Bax, which could permeabilize mitochondria and engage the apoptotic program giving rise to activation of caspase-3 [123,124], the apoptosis antigen Fas triggering the Fas signaling pathway as mentioned earlier, and the DDIT4 negatively regulating cell growth, proliferation, and survival via inhibition of the activity of mTOR [125] were identified in our core host/pathogen cross-talk signaling pathways in Figure 4. Further, having been reported to participate in the generation of ROS and p53-dependent DNA damage [126], DDIT4 was negative-regulated by the HBx-induced upregulation of microRNA-221 to promote aberrant proliferation. Such significant inverse correlation between DDIT4 mRNA levels and miR-221 expression has been observed in a mouse model of liver cancer as well [127]. Besides, the direct suppression of histone deacetylases 6 (HDAC6) by HBx-induced miR-221 would result in autophagosome maturation failure; nevertheless, HBx yet also bound to Vps34 (PI3KC3) to enhance its activity, inducing the autophagic response crucial for the formation of autophagosomes as well as the intensification of viral replication. STAT1 and STAT3 are both critical elements in viral replication, despite acting as distinct roles from each other. Functional characterization of miR-338-3p indicated that miR-338-3p inhibited cell proliferation by inducing cell cycle arrest at the G1/S phase. The identified downregulation of miR-338-3p and tyrosine phosphorylation of STAT3 triggered by HBx might cause CCND1 overexpression beneficial to virus cell cycle progression [128]. In addition, forced low expression of miR-338 by HBx also directly led to the increased level of autophagy. This process was mainly through the up-regulation of LC3 by the released transcription factor ATF2, which has been identified as a direct target of miR-338 [129]. Furthermore, instead of FoxO3, HBx appeared to assist FoxO1 in counteracting the reduction of its nuclear relocalization arising from Akt phosphorylation by inducing STAT3 activation, which is conducive to the transcription of cccDNA as previously described. In contrast, since interferons (IFNs) activated STATs via phosphorylation [130], once recruited by HBx, histone acetyltransferase (HAT) CBP (CEBBP) might dynamically regulate STAT1 acetylation to counteract IFN-catalyzed STAT1 phosphorylation, nuclear translocation, DNA binding, and target gene regulation. On the other hand, STAT1 nuclear translocation induced by IFN-α might also be deprived by HBV polymerase [131]. Previous studies demonstrated that miR-122 regulates type I IFN expression through both down-regulating SOCS1 and SOCS3 to enhance interferon-mediated suppression of HBV [132,133]. Nonetheless, it appears that merely the regulation of SOCS1 by miR-122 was identified in Figure 4. Finally, as a target gene of FoxOs and STAT1, CDKN1A overexpression was mediated by HBx-induced upregulation of miR-29a. Serving as a direct upstream of HDAC4, miR-29a could thereby alleviate the deacetylation of CDKN1A through HDAC4 inhibition.

Drugs Discovery and Repositioning Based on Selected Biomarkers for HBV
Through reviewing the cellular dysfunctions triggered by the core host/pathogen cross-talk interspecies signaling pathways under HBV infection shown in Figure 4, we selected the significant biomarkers including AKT1, NFKBIA, EIF4EBP1, HDAC6, STAT1, STAT3, and TP53 as drug targets shown in Table 4. However, there is still a long way to the clinical trials on account of the intrinsic difficulties for drug design and development. To this end, we proposed a systematic drug discovery and repositioning approach to uncover the multiple-molecule drug for therapeutic treatment based on the deep learning and data mining technique as shown in Figure 5. Table 4. The drug targets identified for HBV infection.

Disease Drug Targets
HBV infection AKT1, NFKBIA, EIF4EBP1, HDAC6, STAT1, STAT3, TP53 Figure 5. The flowchart of the proposed systems drug discovery and repositioning procedure. According to the potential biomarkers identified in Figure 4, we propose a systems drug discovery and repositioning method for figuring out promising multiple-molecule drug to alleviate HBV infection. First, we collected data in respect to drug-target interaction from various online databases. Then, we turned each drug and target into descriptors and assembled them into a matrix of drug-target pairs. To keep our model from poor performance, data preprocessing prior to training is indispensable. Accordingly, we adopted data preprocessing approaches including down sampling, data partitioning, feature scaling, and dimensionality reduction (For more details, readers can refer to Section 2.8.1 in Materials and Methods) based on the property of the data. After training the Deep learning-based drug-target interaction (DTI) model shown in Figure 6 with the constructed training data, we evaluated the model performance (as shown in Table 5 and Figure 5. The flowchart of the proposed systems drug discovery and repositioning procedure. According to the potential biomarkers identified in Figure 4, we propose a systems drug discovery and repositioning method for figuring out promising multiple-molecule drug to alleviate HBV infection. First, we collected data in respect to drug-target interaction from various online databases. Then, we turned each drug and target into descriptors and assembled them into a matrix of drug-target pairs. To keep our model from poor performance, data preprocessing prior to training is indispensable. Accordingly, we adopted data preprocessing approaches including down sampling, data partitioning, feature scaling, and dimensionality reduction (For more details, readers can refer to Section 2.8.1 in Materials and Methods) based on the property of the data. After training the Deep learning-based drug-target interaction (DTI) model shown in Figure 6 with the constructed training data, we evaluated the model performance (as shown in Table 5 and Figure 7) and compared the model efficacy of DTI with that of other traditional machine learning (ML)-based methods (as shown in Figure 8). Since there is only a neuron in the output layer which delivers the probability of whether the relation (docking) between a drug and a target exists, the DTI model enabled us to stress the importance on specific interactions (with output score approaching 1) and to identify candidate drugs. Next, concerning the availability of the predicted drugs (candidate drugs) via general criteria for drug design specifications such as regulation ability, toxicity, and sensitivity obtained from databases, we further filtered promising drugs that modulate multiple molecular targets and possess the efficacy with lower dosages. After all, low-dose drugs are more attainable and cause less harm to patients. In this regard, a set of multi-target drugs with appropriate toxicity for clinical investigations were selected as a multiple-molecule drug to overcome HBV infection.  Figure 7) and compared the model efficacy of DTI with that of other traditional machine learning (ML)-based methods (as shown in Figure 8). Since there is only a neuron in the output layer which delivers the probability of whether the relation (docking) between a drug and a target exists, the DTI model enabled us to stress the importance on specific interactions (with output score approaching 1) and to identify candidate drugs. Next, concerning the availability of the predicted drugs (candidate drugs) via general criteria for drug design specifications such as regulation ability, toxicity, and sensitivity obtained from databases, we further filtered promising drugs that modulate multiple molecular targets and possess the efficacy with lower dosages. After all, low-dose drugs are more attainable and cause less harm to patients. In this regard, a set of multi-target drugs with appropriate toxicity for clinical investigations were selected as a multiple-molecule drug to overcome HBV infection.

Deep Learning-Based Drug-Target Interaction Prediction Model
To effectively predict whether the identified targets interact with some existing drugs, we provide a deep learning-based DTI model as shown in Figure 6. It is a fully-connected neural network containing an input layer and five layers with weights; four hidden layers (512, 256, 128, and 64 neurons in each layer) and an output layer (1 neuron). The intermediate hidden layers are applied with rectified linear unit (ReLU) activation function that thresholds negative signals to 0 and passes through positive signal. This type of activation function allows faster learning in comparison with alternatives, e.g., sigmoid or tanh unit [134]. In contrast, the output of the last layer is fed to a sigmoid function which produces a likelihood probability (0,1) score of the corresponding predicted docking where a higher value symbolizes a more reliable drugtarget interaction. In addition, to further reducing overfitting, we incorporated a dropout layer [135] with a probability of 0.5 following the output of each hidden layer.
For avoiding a poor performance due to assessing the model with a data set unfairly sampled (outliers or other anomalies are presented in one set), we introduced 10-fold cross validation measure to authenticate the learning effectiveness as shown in Table 5 and the corresponding learning curves that plot the learning performance of DTI model over experience and time are also presented in Figure 7. Note that since the model has been diagnosed well-fit and continuous training will likely lead to an overfit, it automatically

Deep Learning-Based Drug-Target Interaction Prediction Model
To effectively predict whether the identified targets interact with some existing drugs, we provide a deep learning-based DTI model as shown in Figure 6. It is a fully-connected neural network containing an input layer and five layers with weights; four hidden layers (512, 256, 128, and 64 neurons in each layer) and an output layer (1 neuron). The intermediate hidden layers are applied with rectified linear unit (ReLU) activation function that thresholds negative signals to 0 and passes through positive signal. This type of activation function allows faster learning in comparison with alternatives, e.g., sigmoid or tanh unit [134]. In contrast, the output of the last layer is fed to a sigmoid function which produces a likelihood probability (0,1) score of the corresponding predicted docking where a higher value symbolizes a more reliable drug-target interaction. In addition, to further reducing overfitting, we incorporated a dropout layer [135] with a probability of 0.5 following the output of each hidden layer. For avoiding a poor performance due to assessing the model with a data set unfairly sampled (outliers or other anomalies are presented in one set), we introduced 10-fold cross validation measure to authenticate the learning effectiveness as shown in Table 5 and the corresponding learning curves that plot the learning performance of DTI model over experience and time are also presented in Figure 7. Note that since the model has been diagnosed well-fit and continuous training will likely lead to an overfit, it automatically stops learning at the epoch of 70. Subsequently, via calculating the loss and accuracy of each DTI model yields on the test set, we received an average accuracy of 92.6 (%) with the standard deviation of 0.294 (%). Moreover, with the increasement of experimental data, numerous machine learning approaches such as Random Forest (RF) [136], Support Vector Machine (SVM) [137], Nearest Neighbor [138], etc. have been applied to predict drug-target interactions. Therefore, to evaluate the model performance, we contrast our DTI model with a few of them, which are RF, Nearest Neighbor, and SVM. By adopting the visualization tool, Receiver Operating characteristic (ROC), we can clearly find that DTI outperforms other traditional binary classifiers as presented in Figure 8.

Systems Discovery and Design of the Multiple-Molecule Drug for HBV Infection
On the basis of the pre-trained DTI model, we could evaluate the docking ability between the identified biomarkers and the available drugs. The larger predicted result we obtain, the higher probability that the specific drug-target pair will have interaction. Namely, the candidate drugs would be sifted out owing to holding higher probability (with output score approaching 1) to interact with the identified biomarkers. However, the exploration of promising drugs from a large quantity of the candidates (predicted drugs) is still arduous. Accordingly, in order to narrow down the scope of the possible options and improve the reliability of the ultimate selected drug combination, general criteria for drug design specifications including regulation ability, toxicity, and sensitivity were applied to further filter out unavailable or inappropriate medications. Under the criteria of the pharmacological properties, the regulation ability value for each drug was extracted from the LINCS L1000 level 5 dataset consisting of results from 978 genes of 75 cell lines treated with 19,811 small molecular compounds. By referring to LINCS L1000, we can examine whether a specific gene was up-regulated (positive values) or down-regulated (negative values) after being treated with a small molecular compound. Screening through a positive-regulation of NFKBIA, EIF4EBP1, HDAC6, STAT1, and TP53 and the negative-regulation of AKT1 and STAT3, the remaining drugs with agreeable regulation ability for the treatment of HBV-infected patients are shown in Table 6.  Subsequently, to further narrow down the possibilities for drug combinations, the drug toxicity specification (LD50) recorded in DrugBank as well as the drug sensitivity specification (EC50) obtained from the PRISM dataset, which is a dataset comprising sensitivity values of 4518 drugs tested across 578 human cell lines, were also adopted. Being the numeric index of lethality, median lethal dose (LD50) plays a pivotal role in drug safety evaluation [139]. The drug with lower LD50 possesses higher toxicity, and thus it is more detrimental. On the other hand, half maximal effective concentration (EC50) is used as a measurement of concentration, which is often adopted to evaluate the suitability and the efficacy of drugs, where lower EC50 indicates the stronger inducibility, i.e., higher sensitivity. Based on these concepts, the drug combination comprising Sorafenib, Nutlin-3, and Tenofovir as shown in Table 7 was then designed as the multiple-molecule drug to rejuvenate the dysfunctions in the pathogenic mechanism under HBV infection with minor collateral tissue damage.  Among them, Sorafenib is a multikinase inhibitor drug having been approved for several tumor types, including Hepatocellular Carcinoma (HCC) [140]. Although prominent effects of Sorafenib for apoptosis induction were recognized, several limitations have been indicated by numerous medical studies upon applying Sorafenib to various cancers as a single agent. Therefore, the use of Sorafenib in combination with other targeted agents appears to be the main strategy for combatting drug resistance and has been widely explored with promising results [141]. Falling under the category of small molecule inhibitor, Nutlin-3 is conducive to the restoration of p53 pro-apoptotic ability, leading to the activation of cell cycle arrest and apoptosis pathways [142]. Though it is a novel antitumor compound, which hasn't been clinically approved, the synergistic effect of Nutlin-3 combined with other agents contributes to diminishing of the dose required in monotherapy and decreasing the occurrence of adverse drug events [143,144]. Being a potent nucleotide analogue reverse-transcriptase inhibitor, Tenofovir is highly active against human immunodeficiency virus (HIV). In spite of its clinical use associated with the risk of kidney injury [145], it has been shown highly effective to patients who have never undergone an antiretroviral therapy [146]. Owing to its low toxicity, trifle side effects, and subtle drug resistance compared to other antivirals, Tenofovir is thereby frequently used as a member of combined drugs against HBV to ensure stronger antiviral activity and a more favorable safety profile [147,148]. It is worth noting that while therapy with Tenofovir is commonly used to treat Hepatitis B, life-long therapy is needed, which often increases the risk of relapse. In this respect, the combination of other viable drugs with Tenofovir may have a chance to improve the therapeutic effect and shorten the course of therapy.
Taken together, by administering the proposed multiple-molecule drug, an up-regulation of NFKBIA, EIF4EBP1, HDAC6, STAT1, and TP53 accompanied by the down-regulation of AKT1 and STAT3 can plausibly be attained, yielding encouraging results for the treatment of HBV-infected patients.

Discussion
Traditional drugs commonly used for treatment in patients afflicted with HBV are interferon and nucleoside analogues. However, the low therapeutic response of patients to interferon and the inevitable development of drug resistance to nucleoside analogues render the clinical treatment challenging [149]. In  Among them, Sorafenib is a multikinase inhibitor drug having been approved for several tumor types, including Hepatocellular Carcinoma (HCC) [140]. Although prominent effects of Sorafenib for apoptosis induction were recognized, several limitations have been indicated by numerous medical studies upon applying Sorafenib to various cancers as a single agent. Therefore, the use of Sorafenib in combination with other targeted agents appears to be the main strategy for combatting drug resistance and has been widely explored with promising results [141]. Falling under the category of small molecule inhibitor, Nutlin-3 is conducive to the restoration of p53 pro-apoptotic ability, leading to the activation of cell cycle arrest and apoptosis pathways [142]. Though it is a novel antitumor compound, which hasn't been clinically approved, the synergistic effect of Nutlin-3 combined with other agents contributes to diminishing of the dose required in monotherapy and decreasing the occurrence of adverse drug events [143,144]. Being a potent nucleotide analogue reverse-transcriptase inhibitor, Tenofovir is highly active against human immunodeficiency virus (HIV). In spite of its clinical use associated with the risk of kidney injury [145], it has been shown highly effective to patients who have never undergone an antiretroviral therapy [146]. Owing to its low toxicity, trifle side effects, and subtle drug resistance compared to other antivirals, Tenofovir is thereby frequently used as a member of combined drugs against HBV to ensure stronger antiviral activity and a more favorable safety profile [147,148]. It is worth noting that while therapy with Tenofovir is commonly used to treat Hepatitis B, life-long therapy is needed, which often increases the risk of relapse. In this respect, the combination of other viable drugs with Tenofovir may have a chance to improve the therapeutic effect and shorten the course of therapy.
Taken together, by administering the proposed multiple-molecule drug, an up-regulation of NFKBIA, EIF4EBP1, HDAC6, STAT1, and TP53 accompanied by the down-regulation of AKT1 and STAT3 can plausibly be attained, yielding encouraging results for the treatment of HBV-infected patients.

Discussion
Traditional drugs commonly used for treatment in patients afflicted with HBV are interferon and nucleoside analogues. However, the low therapeutic response of patients to interferon and the inevitable development of drug resistance to nucleoside analogues render the clinical treatment challenging [149]. In   [140]. Although prominent effects of Sorafenib for apoptosis induction were recognized, several limitations have been indicated by numerous medical studies upon applying Sorafenib to various cancers as a single agent. Therefore, the use of Sorafenib in combination with other targeted agents appears to be the main strategy for combatting drug resistance and has been widely explored with promising results [141]. Falling under the category of small molecule inhibitor, Nutlin-3 is conducive to the restoration of p53 pro-apoptotic ability, leading to the activation of cell cycle arrest and apoptosis pathways [142]. Though it is a novel antitumor compound, which hasn't been clinically approved, the synergistic effect of Nutlin-3 combined with other agents contributes to diminishing of the dose required in monotherapy and decreasing the occurrence of adverse drug events [143,144]. Being a potent nucleotide analogue reverse-transcriptase inhibitor, Tenofovir is highly active against human immunodeficiency virus (HIV). In spite of its clinical use associated with the risk of kidney injury [145], it has been shown highly effective to patients who have never undergone an antiretroviral therapy [146]. Owing to its low toxicity, trifle side effects, and subtle drug resistance compared to other antivirals, Tenofovir is thereby frequently used as a member of combined drugs against HBV to ensure stronger antiviral activity and a more favorable safety profile [147,148]. It is worth noting that while therapy with Tenofovir is commonly used to treat Hepatitis B, life-long therapy is needed, which often increases the risk of relapse. In this respect, the combination of other viable drugs with Tenofovir may have a chance to improve the therapeutic effect and shorten the course of therapy.
Taken together, by administering the proposed multiple-molecule drug, an up-regulation of NFKBIA, EIF4EBP1, HDAC6, STAT1, and TP53 accompanied by the down-regulation of AKT1 and STAT3 can plausibly be attained, yielding encouraging results for the treatment of HBV-infected patients.

Discussion
Traditional drugs commonly used for treatment in patients afflicted with HBV are interferon and nucleoside analogues. However, the low therapeutic response of patients to interferon and the inevitable development of drug resistance to nucleoside analogues render the clinical treatment challenging [149]. In O indicates the docking between each multi-target drug composing the multiple-molecule drug and their corresponding drug target.
Among them, Sorafenib is a multikinase inhibitor drug having been approved for several tumor types, including Hepatocellular Carcinoma (HCC) [140]. Although prominent effects of Sorafenib for apoptosis induction were recognized, several limitations have been indicated by numerous medical studies upon applying Sorafenib to various cancers as a single agent. Therefore, the use of Sorafenib in combination with other targeted agents appears to be the main strategy for combatting drug resistance and has been widely explored with promising results [141]. Falling under the category of small molecule inhibitor, Nutlin-3 is conducive to the restoration of p53 pro-apoptotic ability, leading to the activation of cell cycle arrest and apoptosis pathways [142]. Though it is a novel antitumor compound, which hasn't been clinically approved, the synergistic effect of Nutlin-3 combined with other agents contributes to diminishing of the dose required in monotherapy and decreasing the occurrence of adverse drug events [143,144]. Being a potent nucleotide analogue reverse-transcriptase inhibitor, Tenofovir is highly active against human immunodeficiency virus (HIV). In spite of its clinical use associated with the risk of kidney injury [145], it has been shown highly effective to patients who have never undergone an antiretroviral therapy [146]. Owing to its low toxicity, trifle side effects, and subtle drug resistance compared to other antivirals, Tenofovir is thereby frequently used as a member of combined drugs against HBV to ensure stronger antiviral activity and a more favorable safety profile [147,148]. It is worth noting that while therapy with Tenofovir is commonly used to treat Hepatitis B, life-long therapy is needed, which often increases the risk of relapse. In this respect, the combination of other viable drugs with Tenofovir may have a chance to improve the therapeutic effect and shorten the course of therapy. Taken together, by administering the proposed multiple-molecule drug, an up-regulation of NFKBIA, EIF4EBP1, HDAC6, STAT1, and TP53 accompanied by the down-regulation of AKT1 and STAT3 can plausibly be attained, yielding encouraging results for the treatment of HBV-infected patients.

Discussion
Traditional drugs commonly used for treatment in patients afflicted with HBV are interferon and nucleoside analogues. However, the low therapeutic response of patients to interferon and the inevitable development of drug resistance to nucleoside analogues render the clinical treatment challenging [149]. In addition, serious toxic accumulation of the nucleoside analogues during long-term therapy for HBV infection is nonnegligible as well. Therefore, great efforts have been placed into seeking highly specific ligands affecting single target to alleviate HBV replication and to treat HBV-related complications. Yet, in view of the prohibitive development costs and timeline of a newly designed drug as well as the frequent failure of monotherapy due to drug resistance, growing interest has centered on discovering effective combinations of drugs for HBV treatment. In this study, we develop a medicine discovery and repositioning procedure in terms of system identification approaches via two-side RNA-seq data and deep learning-based framework considering drug design specifications to find a possible drug combination for the treatment of HBV infection. Through investigation of core HPI-GEN annotated with KEGG pathways, the core signaling pathways responsible for the abnormally regulated cellular functions were extracted and unified in Figure 4. However, distinct cooperation between these pathways sparks diversified implication. Consequently, further discussion of the integrated effect is conducive to the selection of potential biomarkers in relation to the remedy of HBV infection.

Apoptosis in HBV Infection
As a fundamental and highly regulated biological process, apoptosis is a form of programmed cell death that participates in cell resistance to viral infection. Although mounting evidence shows that HBx is uniquely endowed with the capability to regulate apoptosis, its dual role as an apoptotic mediator raises the difficulty to elucidate the pathological mechanism [150]. Caspase-3 is the typical hallmark of apoptosis [110]. Numerous studies have shown that HBx abrogates the activity of caspase-3 [151][152][153], resulting in resistance to assorted apoptotic stimuli. In Figure 4, HBx-mediated inhibition activity of caspase-3 was chiefly through both activating XIAP, the inhibitor of caspase-3, and directly or obliquely regulating transcription factors such as NF-κB and p53 to affect the expression of their downstream anti-apoptotic target, e.g., BCL3 and Bax. Further, even if previous researches did not specifically indicate the paramount downstream anti-apoptotic genes of NF-κB [154,155], BCL3, which hampers DNA damage-induced apoptosis and serves as a transcriptional coregulator of NF-κB [78,79], counted for NF-κB -mediated anti-apoptosis a lot in the pathogenic mechanism. Furthermore, inhibition of caspase-3 not only abrogated apoptotic killing triggered by the pathways in response to TNF-α, FasL, and TRAIL stimulation, but also contributed to the activation of autophagy, leading towards alleviated apoptosis coming from caspase-3 loss of its function to cleave Beclin-1. Asides from that, Fas, indispensable for the FasL-induced pathway that gives rise to the activation of caspase-3, was downregulated during HBV infection. Negative-expression of Fas might impede the sustainable anti-viral response of death ligand stimulus-specific apoptotic pathway. In contrast, as a target gene relevant to execution and completion of apoptosis, the overexpressed CDKN1A triggered by HBx-induced HDAC4 inhibition was concordant with the earlier research that HBx can relieve a block on CDKN1A expression and prolong G1→S transition in human hepatoma cells [156,157]. Even though p21, the protein encoded by CDKN1A, is treated as a cyclin-dependent kinase inhibitor capable of inhibiting all cyclin/CDK complexes, it often promotes the assembly of type-D cyclins with CDK4 and CDK6 and arrests cell cycle progression in the G1 phase [156,158], which not only protects cells against apoptosis but also enhances the chance of tumorigenesis.

Autophagy in HBV Infection
Autophagy is a catabolic mechanism known to engulf long-lived proteins and damaged organelles via a lysosomal degradative pathway. Previous studies have suggested that HBV can hijack components in the autophagic regulatory pathways to promote its survival through augmentation of autophagosomes formation and inhibition of autolysosome maturation [159,160], which indicates the possibility of targeting the autophagic pathway for the treatment of Hepatitis B. HBx-catalyzed autophagic response, which is crucial for the formation of autophagosomes as well as intensification of viral replication, resided mainly in the increased activity of the Beclin-1/Vps34 complex accompanied by the unchanged mTOR activity and positive regulation of LC3 during HBV infection, which has also been reported in several research [159,160]. The phagophores undergo nucleation and elongation with Beclin-1/Vps34 complex and LC3 lipidation to form double-membraned autophagosomes [161], and the autophagic membranes generated during infection are used for viral envelopment [162]. However, rather than Beclin-1, only PI3KC3 was activated by the direct interconnection with HBx, which is in agreement with the study that the need for HBx to also induce the expression of Beclin-1 is redundant [163]. Moreover, although emerging study revealed that HBx efficiently impedes autophagic degradation by disturbing lysosomal acidification relevant to its degradative capacity without influencing on the fusion of autophagosomes and lysosomes [160], HBx-induced reduction of HDAC6 regulation ability might lead to autophagosome maturation failure [103]. Despite the necessity of further validation through biological experiments, this discovery provides possible mechanistic insight into the overall observation of HBV infection. Firstly, the intrinsic ability of HDAC6 to deacetylate NF-κB, having been detected to hamper the invasion of cancer [164,165], was destructed by HBx through transrepression, which might gradually create a potent microenvironment favorable to the malignant transformation of cells. Meanwhile, the negative-regulation of HDAC6 in human HCCs and consequential loss of its tumor suppression ability supported by previous research [166] apparently advocate the hypothesis.

Inflammation and Innate Immune Response in HBV Infection
Inflammation and innate immune response represent a highly coordinated process aimed at fighting infection or tissue injury. When cells are subject to foreign invasion, the inflammatory response can ensure successful resolution of the condition and the restoration of tissue homeostasis. However, inappropriately controlled natural defense mechanisms will eventually lead to the progression of chronic diseases [167]. In general, innate immunity counts for a great deal in governing infection right after contact with the pathogen to limit the spread of the virus [168]. However, the exploitation of immune and inflammatory signaling pathways enables viruses to subvert antiviral immunity and replicate in the hostile environment [169]. Of all the proteins involved in the activation and execution of inflammation response, NF-κB stands out as being crucial for this process in diverse metazoan organisms. During HBV infection, NF-κB-dependent transcription accelerated by HBx not only triggered the anti-apoptosis mechanism in favor of viral persistence as mentioned earlier but also augmented the inflammatory response leading to hepatitis and cell transformation in accordance with the indication of previous study [170]. Concurrently, emerging research has also uncovered that the dysregulated continual synthesis of IL-6, an identified target of NF-κB, plays a pathological effect on chronic inflammation and autoimmunity [167], which obliquely emphasizes the magnitude of the correlation between abnormally regulated NF-κB and excessive inflammation. Besides, regarding viral immunoregulation, even though it is contradictory that HBx on one hand sensitized cells to inflammatory stimuli, but on the other positive-regulated PD-L1 through transactivation of c-Jun and activation of STAT3, the findings are consistent with the idea that production of PD-L1 transcripts pertained positively to the intensity of liver inflammation to prohibit and thereby evade the host immune response [171,172]. Moreover, unlike other proteins in their family which typically share common characteristics and functions, STAT1 and STAT3 poles apart and the mutual functional antagonism of them in T-cell-induced inflammation has also been elucidated [173]. Despite both serving as the target of miR-122, SOCS1 was regulated by miR-122 and thus transactivated by HBx in the identified mechanism during HBV infection, while SOCS3 did not. Therefore, we suggest that the abrogation of STATs activity by HBx to attenuate the interferon-mediated suppression of infection depends on STAT1 instead of STAT3. Meanwhile, blockades of IL-6-and IFNs-stimulated immune signaling pathways by viral proteins were through STAT1 rather than STAT3, which also supports the idea from another perspective. In addition, activation of STAT3 mediated by HBx also bolstered its unshakeable status on maintaining viral persistence. Likewise, inactivation of STAT1 mediated by HBV proteins counteracted its effects on the provocation of the innate antiviral system. Such phenomena have also been revealed in the evidence of numerous studies [174]. It is worth noting that the reported capacity of HBx to block STAT1 nuclear import by means of affecting its methylation status was partially through the communication with CBP in the identified host/pathogen cross-talk. In such a way, that STAT1 lost its ability to block cell cycle progression and hinder hepatoma cell transformation in a course of prolonged inflammatory injury. This led to continuous destruction and regeneration of hepatocytes, increasing the chances of genetic alterations.

Discovery of Potential Drug Combination
Based on the significant biomarkers including AKT1, NFKBIA, EIF4EBP1, HDAC6, STAT1, STAT3, and TP53, the combination of agreeable multi-target drugs, i.e., Sorafenib, Nutlin-3, and Tenofovir, was eventually recommended as a potential treatment for later clinical researches of the multi-drug therapy treating HBV-infected patients. Meanwhile, to gain deeper insight into the discovery of promising drugs, in silico profiling using deep learning-based DTI model was performed to predict interactions (dockings) between available drugs and the identified biomarkers. Additionally, deciding whether a drug is ready for clinical trials greatly pertains to preclinical studies that yield preliminary efficacy, toxicity, safety, etc., information. Therefore, to further narrow down the scope and improve the reliability of the predicted drugs, the general pharmacological specifications for drug design including regulation ability, toxicity, and sensitivity were additionally adopted to evaluate the efficacy. However, the entire process of moving a drug to clinical trials is actually a long way to go. Better preclinical preparation and assessment can be beneficial to the approval of a new medicine or drug combination. Considering the common reasons for withdrawal of an approved drug, e.g., hepatotoxicity and adverse events as well as the safety and tolerability assessment of a drug such as HLA (human leukocyte antigen) test, is conducive to better finding an agreeable option for drug combination. Repositioning drugs to treat with both common and rare disease is gradually becoming an attractive proposition. Taking advantage of drug repositioning avenues provides a faster discovery and translation of clinically relevant drug combinations. Furthermore, aided with the expediting development of computational technology and low-cost sequencing approaches, mounting publicly accessible data will certainly open up a broad spectrum of drug discovery and repositioning. Consequently, by systematically integrating more extensive data and information for both drugs and targets, a more appropriate drug combination out of the scope of original medical indication can be unveiled to bring new hope to HBV therapeutics.

Conclusions
The advent of the genomic eras has presented researchers with a myriad of high throughput biological data, which spark the interests in diverse biology applications. In this study, to investigate the pathogenetic mechanism under HBV infection, we constructed candidate host/pathogen interspecies genetic and epigenetic interaction network (HPI-GEN) by big data mining. Then, with the help of the two-side RNA-seq data, system identification strategies were applied to trim the false positives from the candidate HPI-GEN to obtain the real HPI-GEN. Thereafter, based on the extraction of PNP and the annotation of KEGG pathways, interspecies cross-talk signaling pathways are investigated from the real HPI-GEN for pathogenic mechanism under HBV infection to identify significant biomarkers. Moreover, in order to discover promising drugs for the identified drug targets, we trained a deep learning-based DTI model to predict possible drug-target interactions. Eventually, with the consideration of general drug design specifications including regulation ability, toxicity, and sensitivity, a combination of multi-target drugs as a potential multiple-molecule drug was selected to abate HBV infection. It is worth pointing out that although only pathogenesis for HBV infection was investigated in this work, our unique workflow can also be utilized on a wide variety of disease in view of systems biology, holding utility to aid in diversified drug discovery and repurposing process. Meanwhile, along with the development of NGS (Next-Generation Sequencing) technology, additionally integrating analysis from available genomics data into our pipeline can reinforce better downstream analysis and perform more exact biological interpretation. As more extensive genomic and pharmacological data being considered, the proposed pipeline could help reveal a more exhaustive host/pathogen offensive and defensive mechanism under HBV infection and facilitate the development of optimal therapy ensuring greater therapeutic outcome.

Funding:
This research was funded by Ministry of Science and Technology grant number MOST 107-2221-E-007-112-MY3.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In order to avoid overfitting in the parameter estimation process, the cubic spline interpolation method was employed to interpolate 4 time points into 1440 points. Even if the cubic interpolation couldn't increase the information for the parameter estimation, it could indeed prevent the parameters estimation in Equations (13) and (14) and (27)-(30) from overfitting during the identification process.