Use of a Deep Learning Approach for the Sensitive Prediction of Hepatitis B Surface Antigen Levels in Inactive Carrier Patients

Deep learning is a subset of machine learning that can be employed to accurately predict biological transitions. Eliminating hepatitis B surface antigens (HBsAgs) is the final therapeutic endpoint for chronic hepatitis B. Reliable predictors of the disappearance or reduction in HBsAg levels have not been established. Accurate predictions are vital to successful treatment, and corresponding efforts are ongoing worldwide. Therefore, this study aimed to identify an optimal deep learning model to predict the changes in HBsAg levels in daily clinical practice for inactive carrier patients. We identified patients whose HBsAg levels were evaluated over 10 years. The results of routine liver biochemical function tests, including serum HBsAg levels for 1, 2, 5, and 10 years, and biometric information were obtained. Data of 90 patients were included for adaptive training. The predictive models were built based on algorithms set up by SONY Neural Network Console, and their accuracy was compared using statistical analysis. Multiple regression analysis revealed a mean absolute percentage error of 58%, and deep learning revealed a mean absolute percentage error of 15%; thus, deep learning is an accurate predictive discriminant tool. This study demonstrated the potential of deep learning algorithms to predict clinical outcomes.


Introduction
Risk prediction models are routinely used in healthcare practice to identify high-risk patients, guide treatment decisions, and effectively allocate healthcare resources according to patients' needs. These risk prediction models are typically built using statistical regression techniques. Similar trends have also been reported for the clinical course of hepatitis B virus (HBV) [1,2].
Machine learning systems are used to identify objects in images, transcribe speech into text, match news items, match posts or products with user interests, and select relevant results in response to search requests. These applications increasingly make use of a class of techniques called deep learning (DL) [3]. DL is a branch of machine learning that uses multiple layers of algorithms to process data, imitate the thinking process, and develop abstractions. The application of DL models in the healthcare system has garnered great interest owing to the increasing complexity and volume of healthcare data [4].
Infection with HBV leads to acute or chronic liver disease. According to the World Health Organization, 296 million individuals had chronic hepatitis B (CHB) in 2019; among them, 820,000 deaths were recorded, primarily owing to cirrhosis and hepatocellular carcinoma (HCC), which is a form of primary liver cancer. The clinical course of HBV infection is variable and includes acute (self-limiting) infections, fulminant hepatic failure, an inactive carrier state, and chronic hepatitis with potential progression to cirrhosis and HCC [5].
Chronic inflammation and subsequent hepatocarcinogenesis can be substantially affected by various factors associated with HBV. A study in Taiwan has revealed that hepatitis B surface antigen (HBsAg) or hepatitis B e antigen (HBeAg) is associated with hepatocarcinogenesis [6]. Hepatocarcinogenesis is reportedly high in those with high HBV DNA titer, and it is relatively higher when the amount of HBsAg is high in patients with low HBV DNA titer, indicating that the suppression of HBV replication and negative HBsAg are vital in deterring hepatocarcinogenesis. HBV replication has recently been well-controlled by nucleos(t)ide analog (NA) therapy. In such patients, the main antiviral therapy goals are HBV DNA suppression and alanine aminotransferase (ALT) level normalization [7]. However, in most patients, treatment does not cure HBV infection but only suppresses the replication of the virus. Therefore, most patients who start HBV treatment must continue it throughout their life. For patients with chronic hepatitis due to HBV infection, long-term suppressive NA therapy in HBeAg-positive patients has shown minimal effects in decreasing the HBsAg titer [8].
Notably, it is difficult to judge the efficacy of NA treatment based on the HBV DNA level, as this metric typically becomes less sensitive after antiviral treatment is initiated. Consequently, some reports have suggested that it is useful to monitor HBsAg levels over time instead. In the case of HBeAg-positive chronic hepatitis B, the measurement of HBsAg levels after 24 weeks of treatment with Peg-IFNα-2a alone or in combination with lamivudine can be used to determine HBeAg seroconversion, HBV DNA levels, and HBsAg levels 24 weeks after the cessation of treatment [9]. Additionally, HBeAg seroconversion, HBV DNA content, and HBsAg reduction rate after 24 weeks of treatment can be predicted based on the measured HBsAg level at 24 weeks of treatment. In addition to predicting the efficacy of antiviral therapy, the need to measure HBsAg over time in the natural course of HBV infection has been proposed.
A prospective study of HBV cases without prior antiviral therapy conducted in Taiwan revealed that the incidence of hepatocellular carcinoma is higher when the baseline HBV DNA levels are high (≥2000 IU/mL), and the incidence of hepatocellular carcinoma is lower in HBeAg-negative patients and those with low viral loads (<2000 IU/mL) [7]. By contrast, in HBeAg-negative and low viral load cases (<2000 IU/mL), the incidence of hepatocellular carcinoma was reported to be related to HBsAg levels [10]. Briefly, even if the HBV DNA level is <2000 IU/mL, the risk of carcinogenesis is high when the HBsAg level is ≥1000 IU/mL, and the risk is even higher in the group for whom an HBsAg level of ≥1000 IU/mL persists for more than three years [7].
The disappearance and reduction of HBsAg levels are primary treatment challenges worldwide [11]. In the past, many reports have highlighted the risks associated with high HBsAg levels; however, none have used DL to assess HBsAg levels. The reason for this is that HBsAg levels over a long time have not been uniformly obtained at the same institution [12][13][14].
DL can help predict the reduction of HBsAg levels with high sensitivity during outpatient care. Therefore, the aim of this study was to identify an optimal DL model to predict changes in HBsAg levels in daily clinical practice for inactive carrier patients.

Ethics Statement
The study protocol conformed to the guidelines of the Declaration of Helsinki and was approved by the Ethics Review Board of Niigata University (approval number 2020-0114).
This retrospective study enrolled patients who were not required to provide informed consent; only anonymous clinical data obtained from patients who consented to treatment were used in this study. Furthermore, we provided the option for patients to opt out of the study via a poster. The poster was approved by the Ethics Review Board of Niigata University.

Study Design, Participants, and Settings
This retrospective, observational, single-center study was conducted at the Niigata University Hospital and only included patients who were HBsAg-positive.
Patients who had HBsAg examinations between April 2003 and December 2019 were screened. The inclusion criteria were as follows: a confirmed record of HBsAg quantitative levels and ALT, aspartate transaminase (AST), and HBV DNA levels measured in the first, second, fifth, and tenth years; not receiving NA therapy in the observation period; and no development of hepatocellular carcinoma.

Sample Analysis
Serum HBsAg titers were measured using Lumipulse Ii HBsAg (Fujirebio Inc., Tokyo, Japan) from 2003 to 2007 and Architect HBsAg QT (Abbott Laboratories, Rungis, France) from 2008 onwards. The correlation of HBsAg assay kits has been reported previously and does not introduce errors in the generation of teacher data [15]. HBV DNA levels were quantified using real-time PCR with the TaqMan PCR system (Roche Diagnostics, Tokyo, Japan).

Statistical Analysis
Continuous variables are expressed as the median and interquartile range, and differences were assessed using the chi-square test, Fisher's exact test, or Mann-Whitney U test, as appropriate. The correlation among continuous variables was assessed using linear regression analysis. All analyses were performed using IBM SPSS Statistics for Windows, version 22.0 (IBM Corp., Armonk, NY, USA). Statistical significance was defined as p-value < 0.05.

Deep Neural Network (DNN)
To construct and modify the machine learning model, a Windows PC with a 2 GHz Intel Core i5 processor and 16 GB of RAM was used, with Neural Network Console (NNC) version 1.10 (Sony Corp., Tokyo, Japan) as the DL-integrated development environment.

Dataset Creation
To create the dataset, an "input folder" and an "output folder" were first created, and the data were categorized and stored in the "input folder." Subsequently, data regarding the age, weight, height, sex, ALT level, AST level, HBV DNA level, and HBsAg level after 1, 2, 5, and 10 years for each patient were set on the DATASET creation screen of the NNC. The ratio of training data to test data was specified, and the dataset was written to a comma-separated values (CSV) file.
The data contained in a dataset are generally divided into two subsets. One subset is used for training, and the other is used to evaluate the performance of the DL model after training. Approximately 20% of the data were allocated for evaluation. At that time, we also ensured that the combination of training data and evaluation data was normally distributed. In addition, as explained in the inclusion criteria section, only cases with no missing values for each item were included.
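The complete-case filter and the roughly 80/20 division described above can be sketched in a few lines of Python. This is an illustrative sketch only: the study performed the split inside the NNC, and the field names, the random seed, and the placeholder records below are assumptions, not the study's data or code.

```python
import random

# Hypothetical column names for illustration only.
REQUIRED_FIELDS = ["alt_y1", "ast_y1", "hbv_dna_y1",
                   "hbsag_y1", "hbsag_y2", "hbsag_y5", "hbsag_y10"]


def complete_cases(records, required=REQUIRED_FIELDS):
    """Keep only records that have a value for every required field."""
    return [r for r in records
            if all(r.get(field) is not None for field in required)]


def split_train_eval(records, eval_fraction=0.2, seed=0):
    """Shuffle reproducibly, then hold out eval_fraction for evaluation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


# Synthetic placeholder records (not patient data).
records = [dict({f: 1.0 for f in REQUIRED_FIELDS}, id=i) for i in range(90)]
records.append({"id": 90, "hbsag_y1": 250.0})  # incomplete, will be dropped

usable = complete_cases(records)
train, evaluation = split_train_eval(usable)
print(len(usable), len(train), len(evaluation))  # 90 72 18
```

With 90 complete cases and a 20% hold-out, the sketch yields 72 training and 18 evaluation records.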

Network Configuration
The DL program allows for easy editing of the configuration using drag and drop options; it also allows users to design neural networks using multiple layers. New ideas can be implemented in a matter of seconds, allowing better performance and automatic identification of lighter neural networks. For this analysis, we built a three-layer network, comprising an input layer, a hidden layer, and an output layer. The network was constructed by dragging and dropping components from the component list on the left side of the network graph (Figure 1). In the project screen of the NNC, a neural network with one hidden layer (Input, Affine, Rectified Linear Unit (ReLU), and Affine2) was created by adding functions to the network graph, starting with Input in components. Thereafter, we checked if the CSV file of the dataset could be read.
Affine was used as a fully connected layer that connects all input values to all outputs, the number of which is specified by the out-shape property. ReLU was used as a function in which the output value was always zero when the input value of the function was ≤0, and the output value was the same as the input value when the input value was >0. After the second layer of Affine, the loss function HuberLoss_2 was set at the end. Huber Loss was employed as a loss function that applies the squared error to small errors and the absolute error to large errors; thus, large irregular errors were weighted less heavily than they would be under a purely squared-error loss.
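The layer sequence described above (Affine, ReLU, Affine, followed by a Huber loss) can be written out as a minimal forward pass in plain Python. This is a sketch under stated assumptions, not the NNC implementation: the hidden-layer width, the random weights, the threshold `delta = 1.0`, and the input values are all hypothetical, and the textbook form of the Huber loss is used, which may differ from the NNC's definition by a constant factor.

```python
import random


def affine(x, weights, bias):
    """Fully connected layer: every output is a weighted sum of all inputs."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Zero for inputs <= 0, identity for inputs > 0."""
    return [max(0.0, v) for v in x]


def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small errors, linear for large ones (textbook form)."""
    err = abs(y_true - y_pred)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)


def forward(x, params):
    """Input -> Affine -> ReLU -> Affine, returning one predicted value."""
    hidden = relu(affine(x, params["w1"], params["b1"]))
    return affine(hidden, params["w2"], params["b2"])[0]


# Hypothetical sizes: 8 input features, 4 hidden units, 1 output.
rng = random.Random(0)
n_in, n_hidden = 8, 4
params = {
    "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
    "b1": [0.0] * n_hidden,
    "w2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]],
    "b2": [0.0],
}
x = [0.1 * i for i in range(n_in)]  # placeholder feature vector
prediction = forward(x, params)
print(prediction, huber_loss(1.0, prediction))
```

Training would adjust `params` to reduce the average Huber loss over the training records; that optimization step is handled by the NNC and is omitted here.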

Evaluation
After the learning curve button was clicked to commence learning, the progress was illustrated in the Learning Curve on the graph monitor (the learning result screen). The training ended after the set number of epochs (Figure 2). The evaluation result could be checked in the Evaluation tab, and the evaluation operation commenced when the Confusion Matrix tab was clicked and "estimated y" was displayed (Figure 3).
One of the most common metrics used to measure the forecast accuracy of a model is the mean absolute percentage error (MAPE), defined as follows:
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{\mathrm{actual}_i - \mathrm{forecast}_i}{\mathrm{actual}_i} \right|$$
where n is the sample size; actual, the actual data value; and forecast, the forecasted data value. MAPE is commonly used because its results are easy to interpret and explain. MAPE was used to assess both the statistical analysis and the NNC model.
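As a small worked illustration of the MAPE, the following Python sketch computes it for three made-up HBsAg values. The numbers are placeholders chosen for easy arithmetic, not study data, and the function name `mape` is an assumption of this sketch.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The ratio is undefined when an actual value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Placeholder values in IU/mL (not patient data).
actual = [1000.0, 500.0, 200.0]
forecast = [900.0, 550.0, 260.0]
print(round(mape(actual, forecast), 2))  # 16.67
```

The individual percentage errors here are 10%, 10%, and 30%, which average to about 16.67%. Because each error is divided by the actual value, the metric cannot be computed for a record whose actual value is exactly zero.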

J. Clin. Med. 2022, 10, x FOR PEER REVIEW
Figure 3. Evaluation table. The evaluation result can be checked in the Evaluation tab, and the evaluation is initiated when the Confusion Matrix tab is clicked and "estimated y" is displayed. HBs: hepatitis B surface.

Patient Characteristics
There were 1820 HBsAg-positive patients identified in the initial screening (Figure 4). A total of 1583 patients without HBsAg levels for years 1, 2, 5, and 10 were excluded. A further 147 patients who underwent NA therapy during the investigation period were excluded from the analysis so that inactive carriers would not be mixed with NA-treated patients, as Boglione et al. reported that the calculated expected time to HBsAg loss is shorter for tenofovir than for telbivudine [16]. Of the remaining 90 patients, the training data comprised data from 72 patients, and the evaluation data included 18 patients, all with no missing values for HBsAg, ALT, and AST levels after 1, 2, 5, and 10 years and HBV DNA titer after 1 year. The characteristics of the patient training data are presented in Table 1. There were no significant differences between the training data and evaluation data.

Evaluation of Statistical Analysis Multivariate Logistic Regression Model
In the initial analysis, 17 items were examined, and 17 continuous variables were used.
In the multivariable analysis model, nine variables, including HBeAg-rate, HBsAg levels after 5 years, and HBsAg levels after 2 years, were included. The multiple correlation coefficient was 0.989, and the coefficient of determination (R²) was 0.97. Furthermore, the logistic regression model generated the coefficients of a formula to predict a logit transformation of the probability of the presence of the characteristic of interest as follows: HBsAg at […]
Multiple regression analysis was performed to assess serum HBsAg levels. We compared these values to the predicted value of serum HBsAg levels after 10 years, determined using DL. Multiple regression analysis revealed a MAPE of 58%.
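For readers unfamiliar with the coefficient of determination, the short Python sketch below shows how R² is computed from observed and fitted values. It is illustrative only: the study used SPSS, the function name `r_squared` is an assumption, and the numbers are placeholders rather than study data.

```python
def r_squared(actual, fitted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Placeholder observed and fitted values (not patient data).
actual = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(actual, fitted), 2))  # 0.98
```

A high R² describes how closely the fitted values track the data the model was fitted to; it is a different quantity from the MAPE, which measures relative error on each prediction.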

Evaluation of DL Implementation and Evaluation of Learning
For the 18 patients included in the validation dataset for which antigen levels after 10 years were known, the CSV file of the learning results was referred to as the evaluation item after executing supervised data learning, and the predicted values were calculated as y'.
Using the DNN, the new validation for 18 patients showed an MAPE of 15%. When verified using the MAPE, the model created using the DNN was accurate, partly because it had a positive value.
The accuracy of the predicted values from the DL and multivariate logistic regression models is shown in Figure 5. The median of the correct values in the evaluation data and the medians of the values predicted by the DL and multivariate logistic regression models were also calculated.

Discussion
In recent years, the evolution of artificial intelligence (AI) has been remarkable, and this is due to the emergence of DL, a type of machine learning that allows computers to learn in a similar manner to the human brain. To extract features, neural networks that mimic the mechanism of human neurons are used, and the hierarchy of calculations for deriving results is deeper than that used for previous AI models [17].
There are two reasons why DL has become common: the first is the spread of the internet and the miniaturization of various sensors, which have made it possible to obtain large amounts of data that can be used for AI learning; the second is the emergence of graphics processing units (semiconductors for image processing), which have a considerably higher capacity for parallel operations than central processing units. The neural networks that form the basis of DL had already been devised, but they showed insufficient processing performance. With the aid of a large amount of data and high parallel computing power, the practical application of DL progressed rapidly [18].
Here, we applied a DNN technique to evaluate future decreased HBsAg levels and then compared its performance with a traditional statistical model. We observed that the efficacy of the DNN method was comparable to that of the traditional statistical model. DNN techniques have been used to predict factors, including physicochemical properties.
HBsAg loss is considered the ideal therapeutic goal in HBV-infected patients; however, it is rarely achieved via treatment with the currently available antiviral agents [19]. HBsAg seroclearance is widely considered to be an important indicator of CHB prognosis [20]. The spontaneous seroclearance of HBV DNA and HBsAg is an important predictor of reduced HCC risk [21]. Thi Vo et al. analyzed data from 1840 patients who presented with HCC; among them, 75.5% (1390) had high HBsAg levels (over 1000 IU/mL), and 24.5% (450) had low HBsAg levels (under 1000 IU/mL). The participants with high HBsAg levels had a 2.46-fold higher risk of HCC development than those with lower HBsAg levels [22]. An understanding of HBsAg titer changes throughout HBV infection may provide some potentially valuable insights into hepatitis B pathogenesis and the viral life cycle [23]. Although multifactor analysis has been previously performed, reliable predictors of the natural history of the disappearance or decrease of HBsAg levels have not been established. An additional point to consider is that the HBsAg level is also affected by factors such as age, HBV DNA level, and HBV genotype [24].
HBeAg and HBsAg titers have been proposed as biomarkers for infected liver cell mass or HBV covalently closed circular DNA (cccDNA), the hepatocyte nuclear reservoir that is responsible for viral persistence [25]. This concept supports their use as biomarkers. If the hepatocyte is considered in isolation, HBsAg, HBeAg, and serum HBV DNA levels would be expected to directly correlate with each other and with liver cccDNA levels, as all are translated from separate transcripts (Pre-S1, Pre-S2/S, precore/core, and pre-genomic mRNA, respectively) directly derived from cccDNA. However, the published data that describe these relationships are limited and conflicting [26,27]. ALT levels were included in the logistic regression model, and the result matches that reported previously [28].
In patients treated with NA, the serum HBsAg level also declines very slowly, even though serum HBV DNA levels decrease significantly. Low serum HBsAg may predict either spontaneous or treatment-induced HBsAg seroclearance and potentially selects HBeAg-negative patients who can safely stop NA. High serum HBsAg is associated with a high risk of hepatocellular carcinoma in an untreated population and predicts treatment failure in patients receiving pegylated interferon. Therefore, the potential roles of HBsAg quantification are applicable to selected populations only. There is also a need for novel markers to study the effect of emerging antiviral therapies targeting various parts of the HBV cycle to reflect their distinct mechanistic effects. Several agents have shown a rapid and significant decline in HBsAg levels. Ongoing studies are required to demonstrate the sustainability of HBsAg suppression by these novel agents [29].
In the multiple regression equation generated from the statistics, if a factor is not significant, it is deleted as an item; however, the DL reflects it even if it is not significant, which may indicate why the MAPE showed improved accuracy in the DL model compared to the logistic regression model. In the future, when additional cases are added, or when new markers are discovered, the error can be continually improved by adding new items to the DL, without needing to perform complicated multiple regression equations.
The use of DNN algorithms to predict disease status or outcomes in clinical datasets is consistently gaining more attention in medicine and healthcare, as reported by a study that reviewed relevant topics [30]. In this retrospective cohort study, we compared HBsAg levels predicted using DL and those predicted using statistical analysis.
In terms of the prediction of viral spread using AI, there are several leading reports involving COVID-19. The prediction of coronavirus infections using DNN has been reported [31], and DNN is likely to be the best model for predicting trends in viral diseases that continually show diversity and are influenced by multiple factors. There are several advantages of using AI to fight against pathogens such as viruses and bacteria that cause a complex reaction in the host [32]. DNN can accelerate the discovery of valuable vaccines or drugs to prevent pandemics and facilitate the diagnosis of diseases [33,34].
There are some limitations to this study, including the small number of participants available for the training data. Entecavir, tenofovir disoproxil fumarate, and tenofovir alafenamide were approved for the treatment of CHB by the Food and Drug Administration in 2005, 2008, and 2016, respectively [35]. However, we did not include patients who were receiving NA therapy at the time of analysis to remove uncertainty in the training data.
DNN requires a specific area of expertise; however, this study demonstrates that clinicians can make accurate predictions using simple DL tools, rather than producing complex statistical formulas, using data managed by their institutions.

Informed Consent Statement:
Patient consent was waived due to the retrospective nature of this study and because only anonymous clinical data that were obtained after each patient consented to treatment, in writing, were used. Furthermore, we also provided the option for patients to opt out of the study, via a poster. The poster was approved by the Ethics Review Board of Niigata University.

Data Availability Statement:
The data underlying this article are available in this article.