A Deep Recurrent Neural Network-Based Explainable Prediction Model for Progression from Atrophic Gastritis to Gastric Cancer

Abstract: Gastric cancer is the fifth most common cancer type worldwide and one of the most frequently diagnosed cancers in South Korea. In this study, we propose DeepPrevention to identify patients with atrophic gastritis who are at high risk of gastric cancer. DeepPrevention comprises a prediction module, which predicts the possibility of progression from atrophic gastritis to gastric cancer, and an explanation module, which identifies risk factors for that progression. The data set used in this study was South Korea National Health Insurance Service (NHIS) medical checkup data for atrophic gastritis patients from 2002 to 2013. Our experimental results showed that the most influential predictors of gastric cancer development were sex, smoking duration, and current smoking status. In addition, we found that the average age of gastric cancer diagnosis in a group of high-risk patients was 57, and income, BMI, regular exercise, and the number of endoscopic screenings did not show any significant difference between groups. At the individual level, we identified relatively strong associations between gastric cancer and both smoking duration and smoking status.


Introduction
Gastric cancer is the fifth most common cancer in the world and one of the most frequently diagnosed cancers in South Korea [1,2]. The national cancer screening program in Korea provides adults over the age of 40 with endoscopic screening once every two years, and the mortality rate is decreasing due to early gastric cancer detection and prompt treatment. However, endoscopic screening has adverse effects and can generate false-positive results that lead to overdiagnosis [3,4]. Therefore, it is important to identify the groups that are at high risk of gastric cancer and to recommend regular endoscopic screening for these patients [5,6].
Known risk factors for gastric cancer include infection with Helicobacter pylori (H. pylori), salty and smoked food intake, smoking, alcohol, family history of gastric cancer, low socioeconomic status, and obesity [2,7,8]. Notably, H. pylori infection was identified as a major cause of gastric cancer [9], and atrophic gastritis is a precursor that progresses to gastric cancer, although in less than 10% of cases [10]. Therefore, it is also necessary to identify the risk factors and predict the cases of atrophic gastritis that are at high risk of developing into gastric cancer.
In this study, we proposed a deep recurrent neural network-based prediction model to identify risk factors for progression from atrophic gastritis to gastric cancer and to identify patients with atrophic gastritis who are at high risk of gastric cancer. Deep learning is widely used in health care applications such as medical imaging, screening, biomarker selection, and electronic health record (EHR) analysis [11][12][13]. In particular, predicting patient status using EHRs with various machine learning techniques such as support vector machine (SVM), random forests, and deep learning has attracted much attention [14][15][16]. Researchers have used EHRs and deep learning to predict pneumonia risk, hospital readmission, clinical events, and patients' future [17][18][19].
The proposed predictive model has explainability, which is the ability to explain how a prediction has been obtained using important features for each patient. Typically, deep learning-based predictive models show high accuracy, but they have disadvantages in their explainability [20,21]. However, for medical applications, explanations are essential for both doctors and patients to understand and trust the predictive model's prediction results. Thus, explainability is essential for predictive models in medical applications [22].
In this paper, we present DeepPrevention, which consists of a prediction module and an explanation module. The prediction module performs a deep recurrent neural network (RNN)-based prediction, which predicts the probability that atrophic gastritis will develop into gastric cancer. The explanation module explains the prediction results, which identify risk factors for progression to gastric cancer from atrophic gastritis using the Chi-squared test. Furthermore, it detects high-risk patients who have a high probability of progressing to gastric cancer using K-means clustering [23]. Finally, the explanation module provides a personal explanation of the prediction using a local surrogate called LIME [24].
The data set used in this study consisted of South Korea National Health Insurance Service (NHIS) medical checkup data for patients with atrophic gastritis from 2002 to 2013. The NHIS data were converted to the Observational Medical Outcomes Partnership-Common Data Model (OMOP-CDM) by Observational Health Data Sciences and Informatics [25]. A total of 29,557 patients with atrophic gastritis were identified, and among them, 771 had progressed to gastric cancer. As gastroscopy is recommended for adults aged 40 and older, we restricted the patient age range to 40-75 years; this resulted in 18,846 atrophic gastritis patients and 610 gastric cancer patients undergoing our analyses.
For our analysis, we considered the patients' demographic and environmental risk factors, where the demographic factors were sex, age, and income, and the environmental factors were smoking habits, alcohol, regular exercise habits, and body mass index. We also captured each patient's number of endoscopic screenings. To maintain the accurate incidence of gastric cancer in gastritis patients, we did not under- or over-sample, even though the data imbalance was severe. Instead, we proposed a deep RNN model with eight hidden layers to learn the features of the minority class, that is, gastric cancer patients. We also applied L2 regularization and dropout, and the proposed model achieved an area under the receiver operating characteristic curve (AUC) of 0.84.
In our experimental results, the most influential predictors of developing gastric cancer were sex, smoking duration, and current smoking status; most of the patients in the high-risk group were current smokers. We also found that the average age at gastric cancer diagnosis was 57 years. Eight times more male than female patients were diagnosed with gastric cancer, but at the individual level there were relatively strong associations between gastric cancer and smoking duration and smoking status; a family history of cancer was also closely related.
The remainder of this paper is as follows. In Section 2, we discuss related work, and in Section 3, we give an overview of DeepPrevention. In Section 4, we present the deep RNN-based prediction model, and in Section 5, we describe the influential risk factors, characteristics of a group of patients at high risk, and personal explanation in detail. Finally, we give concluding remarks in Section 6.

Related Works
EHRs are widely used for predicting clinical risk [26]. Weng et al. [27] improved cardiovascular risk prediction using the clinical data of 378,256 patients from UK family practices with 22 risk factor variables. The authors applied four machine learning algorithms, logistic regression, random forests, gradient boosting, and neural networks, to UK primary care EHRs, and their machine learning-based prediction models outperformed a conventional algorithm against the American College of Cardiology guidelines.
Similar to our study, Taninaga et al. [6] predicted future gastric cancer risk based on XGBoost and logistic regression using medical checkup data from 25,942 participants who underwent multiple endoscopies in Japan. They found no accurate connections between long-term H. pylori infection or the presence of chronic atrophic gastritis and future gastric cancer. Biological background and blood test results led to increased prediction performance.
More recently, deep learning has attracted much attention in clinical applications based on EHRs [28]. The most important characteristic of EHRs is that they are records of time-series data. Each patient's medical and visit history is represented in sequence, and researchers have used many different representations of EHRs for prediction models based on deep learning. Deep Patient [18] uses unsupervised deep feature learning to represent EHR data. These researchers used a stack of denoising autoencoders to train the input data. Other researchers proposed Deepr (deep record) [29] to analyze the sequences of patients' diagnoses and treatment to predict their medical outcome.
Recurrent neural networks [30] and their variations [31,32] can successfully predict patients' future medical outcomes. DeepCare [16] introduced convolutional LSTM (C-LSTM), which is an end-to-end deep dynamic memory neural network built on long short-term memory (LSTM), and researchers use it to infer illness states and predict future medical outcomes based on illness history in medical records. The proposed C-LSTM in DeepCare predicts next disease stages, recommends interventions, and estimates unplanned readmission among diabetic and mental health patients.
A common criticism of deep learning models is that they cannot explain the features that influence the predictions. In medical applications, in particular, it is essential to describe the factors that contribute to the results. Towards this aim, researchers have applied explainable artificial intelligence (XAI) to a variety of medical applications. For instance, researchers introduced an XAI model to predict acute critical illness from EHRs with visual explanations for the given prediction [33]. In this study, we adopted both traditional statistical methods, namely the Chi-squared test and K-means clustering, and an explainable AI technique called LIME. As a result, we can describe risk factors for the whole patient population, characteristics of a group of patients at high risk, and personal explanations.

Data Description
We extracted the medical data for this study from South Korea's NHIS database for 2002-2013, converting the data to OMOP-CDM format. A total of 29,557 atrophic gastritis patients were identified, of whom 771 had progressed to gastric cancer. For our study purposes, we restricted the sample to patients aged 40 to 75, which resulted in a data set of 18,846 patients with atrophic gastritis, among whom 610 had progressed to gastric cancer. As our main consideration was the effect of lifestyle habits on the development of gastric cancer in patients with atrophic gastritis, we inputted the following 13 features into the prediction model:

• Demographics: age, sex, and the test age at which gastric cancer was diagnosed
• Smoking habits: frequency of smoking, smoking duration, current smoking status
  - Frequency of smoking: ranges included less than half a pack of cigarettes a day, less than one pack, more than one pack but fewer than two, more than two packs
  - Smoking duration: ranges included fewer than 5 years, more than 5 years but fewer than 9, more than 10 years but fewer than 19, more than 20 years but fewer than 29 years, more than 30 years
  - Current smoking status: yes for smokers and no for non-smokers
• Drinking habits: categorical values included ranges such as two or three times in a month, one or two times in a week, three or four times in a week, and every day
• Regular exercise: binary yes or no
• Income: indexed into 10 levels by 10% according to total household income
• Family history of cancer: binary yes or no
• Body mass index: weight in kilograms divided by the square of height in meters
• Number of endoscopic screening tests
• Current status: 0 for before gastric cancer diagnosis and 1 for after diagnosis

To explore and understand the characteristics of the data set used in this study, two-dimensional visualization was performed using t-distributed Stochastic Neighbor Embedding (t-SNE) [34], as shown in Figure 1. t-SNE reduces high dimensional data to a two-dimensional space, i.e., component 1 and component 2, while preserving the distance between points in the high dimensional space. The red spots in Figure 1 represent gastric cancer patients, and the blue spots represent atrophic gastritis patients. From the visualization of our data set, we found that the complexity of boundaries separating classes was extremely high, and the problem of class imbalance was serious.

In our data set, only 3% of patients with atrophic gastritis developed gastric cancer. It is known that over-sampling methods such as the Synthetic Minority Oversampling Technique (SMOTE) [35] and Adaptive Synthetic Sampling (ADASYN) [36] in extremely class-imbalanced data cause overfitting and an increased training time due to the increased size of the training set [37]. Recently, research on deep learning with class imbalance has attracted much attention [38]. It was noted that very deep neural networks are effective in highly imbalanced class distribution [39]. Therefore, to solve the class imbalance problem, we proposed a deep recurrent neural network with eight hidden layers to capture the features of the minority class at the algorithm level.

Data Preprocessing
The dataset was composed of categorical variables and numerical variables. For the categorical variables, missing values were replaced with the most frequent value. For the numerical variables, missing values for BMI were replaced with the average value of the patient's data, and missing values for the number of endoscopic screenings were replaced with 0. Furthermore, normalization was performed for the numerical variables. Since adults undergo medical checkups every one or two years in Korea, we converted the study data to 3D tensor data on the basis of the checkup date. The preprocessing pseudo code is given in Algorithm 1.

Algorithm 1 Pseudo-Code for Data Preprocessing
Input: X: NHIS medical checkup data of atrophic gastritis patients
Output: Y: 3D tensor data of X based on medical checkup time
Split X into categorical variables and numerical variables
if a feature is a categorical variable then
    replace missing values with the most frequent value
elseif a feature is BMI then
    replace missing values with the average value of the patient's data
elseif a feature is the number of endoscopic screenings then
    replace missing values with 0
endif
Normalize the numerical variables
Combine the numerical variables and categorical variables
Sort by PersonID, Year
max_length ← the number of the most frequent medical checkup
for id in personID:
    Group by id and pad to max_length
endfor
Create 3D array
Return Y
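The steps of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the min-max normalization, the zero padding, and the truncation of histories longer than max_length are all assumptions made for illustration, since Algorithm 1 does not specify them. The per-patient BMI averaging step is omitted for brevity.

```python
import numpy as np
from collections import Counter, defaultdict

def impute_most_frequent(values):
    """Replace None in a categorical column with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def min_max(values):
    """Scale a numeric column to [0, 1] (assumed normalization scheme)."""
    arr = np.asarray(values, dtype=float)
    span = arr.max() - arr.min()
    return (arr - arr.min()) / span if span > 0 else np.zeros_like(arr)

def most_frequent_visit_count(rows):
    """max_length: the most common number of checkups per patient."""
    visits_per_person = Counter(person_id for person_id, _, _ in rows)
    return Counter(visits_per_person.values()).most_common(1)[0][0]

def to_tensor(rows, max_length):
    """rows: (person_id, year, feature_vector) tuples.

    Returns an array of shape (patients, max_length, features), sorted by
    person and year, zero-padded (assumption) for short histories.
    """
    by_person = defaultdict(list)
    for person_id, _, features in sorted(rows, key=lambda r: (r[0], r[1])):
        by_person[person_id].append(features)
    n_features = len(rows[0][2])
    tensor = np.zeros((len(by_person), max_length, n_features))
    for i, person_id in enumerate(sorted(by_person)):
        visits = by_person[person_id][:max_length]  # truncate long histories
        tensor[i, :len(visits), :] = visits         # pad short ones with zeros
    return tensor

# Hypothetical toy records: two patients, two features each.
rows = [(1, 2004, [0.2, 1.0]), (1, 2002, [0.1, 1.0]), (2, 2003, [0.5, 0.0])]
tensor = to_tensor(rows, max_length=2)
```

Padding to a fixed length is what lets variable-length checkup histories be stacked into a single 3D array for a recurrent model.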

DeepPrevention Architecture
DeepPrevention is composed of a prediction module and an explanation module to develop an explainable prediction model, as shown in Figure 2. The DeepPrevention prediction module, built on a deep RNN [40], captures the features of the time-series NHIS medical checkup data that formed this study's data set. At each time step, the model read each chronic gastritis patient's annual checkup data and returned a prediction of progression to gastric cancer. We treated a predicted value of 80% or more as indicating that a patient was likely to develop gastric cancer (Figure 2, left panel). Then, to explain the prediction results, we performed a statistical analysis of patients with high potential to develop gastric cancer based on the 80% cutoff (Figure 2, center panel); we first used the Chi-squared statistic [41] to detect the features that influenced the classification of gastric cancer and atrophic gastritis and then used K-means clustering [42] to identify patients at higher risk. Finally, we generated a personal explanation of a prediction (Figure 2, right panel).
The proposed prediction model is a deep neural network model with eight hidden layers. We arrived at the appropriate number of layers by increasing the number in each experiment. Table 1 shows the experimental results. For the evaluations, we selected sensitivity given by Equation (1), specificity given by Equation (2), and AUC [43][44][45], which is the probability that a classifier would rank a randomly chosen positive instance higher than a randomly chosen negative one. Algorithm 2 describes DeepPrevention in detail.

Algorithm 2 Pseudo code of DeepPrevention
Input: X: 3D tensor patient data
Output:
    A1: a set of gastritis patients predicted to develop into gastric cancer patients
    A2: a set of patients predicted to remain as gastritis patients
    B: features affecting prediction
    C: a group of patients at high risk
    D: visualization of personal explanation
// STEP 1 predicting patients as gastritis patients and gastric cancer patients
Split X into training data trainX and test data testX
Learn trainX by deep recurrent neural network
Create a prediction model M with optimized parameters
Predict the probability p of progression with testX using M
for id in personID:

// STEP 3 K-means clustering to identify a group of patients at high risk
Read A1
Decide K, which is the optimal number of groups, using the elbow method
Apply K-means clustering algorithm to A1 using K
C ← id in personID belonging to high-risk group
Return C
// STEP 4 LIME to show the personal explanation of a prediction
Read prediction model M, A1
Perform personal explanation of a patient using M
D ← Visualization of the personal explanation
Return D
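The body of the loop in STEP 1 is not shown in Algorithm 2. One plausible reading, based only on the 80% cutoff described in the architecture text, is that each patient is assigned to A1 or A2 by comparing the predicted probability with the cutoff. The sketch below illustrates that reading; the function name and the dictionary input are hypothetical.

```python
def split_by_cutoff(probabilities, cutoff=0.8):
    """Partition patient ids by predicted probability of progression.

    probabilities: dict mapping a patient id to the model output p.
    Returns (A1, A2): ids at or above the cutoff, and the remaining ids.
    The >= comparison is an assumption ("80% or more" in the text).
    """
    a1 = {pid for pid, p in probabilities.items() if p >= cutoff}
    a2 = set(probabilities) - a1
    return a1, a2

# Hypothetical model outputs for four patients.
a1, a2 = split_by_cutoff({"p1": 0.97, "p2": 0.12, "p3": 0.80, "p4": 0.55})
```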

Prediction Model and Evaluation
The prediction model is a deep neural network model with eight hidden layers, as shown in Figure 3.

For the evaluations, we selected sensitivity given by Equation (1), specificity given by Equation (2), and the receiver operating characteristic-area under the curve (ROC-AUC) [43][44][45], which is the probability that a classifier would rank a randomly chosen positive instance higher than a randomly chosen negative one. The main consideration in predicting gastric cancer progression was sensitivity because less than 3% of all patients had gastric cancer. As Table 1 shows, the sensitivity increased as the number of hidden layers increased. On the basis of our results, we selected a deep RNN model with eight hidden layers. Then we applied dropout and L2 regularization to overcome overfitting to the minority class (i.e., gastric cancer diagnosis). As a result, we achieved an AUC of 0.84, a sensitivity of 0.50, and a specificity of 0.98 with the optimal prediction model, as shown in Table 1.
Figure 4 shows the ROC curve, which is suitable for evaluating the performance of a prediction model on imbalanced data. The blue solid line represents the diagnostic ability of a binary classifier and is created by plotting the true positive rate against the false positive rate at various threshold settings. The orange dashed line illustrates a random classifier's ROC points.
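The ranking interpretation of AUC given above can be checked directly on a small example. The following sketch counts, over all positive-negative pairs, how often the positive instance receives the higher score (ties count as one half). It is a brute-force illustration with hypothetical scores, not the evaluation code used in the study.

```python
def auc_by_ranking(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical scores: one positive (0.3) is ranked below one negative (0.8),
# so 3 of the 4 positive-negative pairs are ordered correctly.
auc = auc_by_ranking([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])  # 0.75
```

Because it depends only on the ordering of scores, this quantity is unaffected by the class ratio, which is why it is a common choice for imbalanced data.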
The prediction model was evaluated on a personal computer with a 7500U CPU @ 2.70 GHz and 32 GB RAM. For the implementation, Python 3.7.3 was employed along with the machine learning libraries keras 2.2.4, sklearn 0.24.2, and matplotlib [46]. We performed all experiments using Windows 10.
Among 18,846 atrophic gastritis patients and 610 gastric cancer patients, we used 70% of patient cases as the training data and 30% as the test data, which comprised 5650 patients with atrophic gastritis and 187 with gastric cancer (total n = 5837). Table 2 shows the confusion matrix of the test data, reflecting a prediction of 5650 cases of atrophic gastritis.

Table 2. Test data confusion matrix.

                             Predicted Gastric Cancer    Predicted Atrophic Gastritis
Actual gastric cancer        93                          94
Actual atrophic gastritis    25                          5625

Table 3 presents the precision, recall, and F1-score for the study data. As previously mentioned, the atrophic gastritis precision, recall, and F1-score were extremely high because the data set was highly imbalanced. We found that the precision, recall, and F1-score values were lower for gastric cancer than for atrophic gastritis.
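As a worked example, the standard definitions of sensitivity (recall), specificity, precision, and F1-score can be applied to the counts printed in Table 2, treating gastric cancer as the positive class. The sketch below uses only those four counts; the function name is hypothetical, and the values are computed from the table as printed rather than taken from Table 1 or Table 3.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard confusion-matrix metrics for the positive class."""
    sensitivity = tp / (tp + fn)  # recall: share of actual positives detected
    specificity = tn / (tn + fp)  # share of actual negatives correctly rejected
    precision = tp / (tp + fp)    # share of predicted positives that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Counts from Table 2 (positive class: gastric cancer).
metrics = binary_metrics(tp=93, fn=94, fp=25, tn=5625)
# sensitivity ~ 0.497, specificity ~ 0.996, precision ~ 0.788, f1 ~ 0.610
```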

Explanation of the Prediction Results
The DeepPrevention model proposed for this study correctly predicted 93 of the actual gastric cancer patients as gastric cancer patients. In this section, we interpret these prediction results.

Analysis of Risk Factors
We calculated the Chi-squared statistic to determine significant distinguishing features between patients with gastric cancer and patients with atrophic gastritis. Among 13 features, categorical variables such as sex, alcohol habit, smoking status, smoking duration, frequency of smoking, family cancer history, income, and exercise were selected. Additionally, the number of endoscopic screenings and BMI were transformed into categorical values and added. For the number of endoscopic screening tests, if it was more than one, then the outcome was transformed to yes; otherwise it was transformed to no. For BMI, the values were transformed to underweight, normal weight, and overweight. As Table 4 indicates, sex, smoking status, and smoking duration were significantly distinct. Contrary to our expectation, regular exercise and number of endoscopic screenings were not related to gastric cancer incidence.
This interpretation has some limitations. First, although H. pylori infection is a well-known risk factor of gastric cancer, we did not consider the patients' infection status. As the purpose of our study was to identify risk factors of the patients' lifestyle, we considered demographic and environmental factors. Second, our findings that sex, smoking status, and smoking duration are influential factors were based on the 93 actual gastric cancer patients predicted as gastric cancer patients. Therefore, the dataset is insufficient to generalize the findings. Despite these limitations, our study provides an improved understanding of the risk factors of progression of atrophic gastritis to gastric cancer.
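To make the Chi-squared step concrete, the sketch below computes Pearson's Chi-squared statistic for a 2x2 contingency table, for example a binary feature such as current smoking status cross-tabulated against outcome. The counts are hypothetical and are not taken from Table 4, and no continuity correction is applied. For one degree of freedom the p-value can be written with the complementary error function, which keeps the example free of external libraries.

```python
import math

def chi_squared_2x2(table):
    """Pearson Chi-squared test of independence for a 2x2 table.

    table: [[a, b], [c, d]] of observed counts.
    Returns (statistic, p_value) with 1 degree of freedom and
    no Yates continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Survival function of the Chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts: rows = smoker / non-smoker, columns = cancer / no cancer.
statistic, p_value = chi_squared_2x2([[20, 10], [10, 20]])
```

Features with more than two categories, such as smoking duration, need the general r x c form of the same statistic with (r - 1)(c - 1) degrees of freedom.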

Analysis of a Group of Patients at High Risk
To identify the group of high-risk patients, we applied K-means clustering [46] to the data set of 93 gastric cancer patients. Using the elbow method [47], the optimal K was 3, and the number of patients in groups 1, 2, and 3 was 31, 36, and 26, respectively. Figure 5a shows the distribution of each group's probability of progression to gastric cancer. Group 1 showed a markedly high probability for progression to gastric cancer; therefore, we defined these patients as the high-risk group. Figure 5b shows 3D scatterplots for each group.
Figure 6 shows boxplots of age at diagnosis for each group. The average age at diagnosis was 57.0, 69.0, and 45.9 years for groups 1, 2, and 3, respectively. In addition, we identified that income, BMI, regular exercise, and the number of endoscopic screenings did not show any significant difference between groups.
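The clustering step can be sketched as plain K-means (Lloyd's algorithm) together with the within-cluster sum of squares that the elbow method inspects for each K. This is an illustrative sketch, not the study's code: the feature vectors are hypothetical, and the deterministic farthest-point initialization is an assumption chosen so the example is reproducible.

```python
import numpy as np

def farthest_point_init(points, k):
    """Deterministic initialization: start at points[0], then repeatedly add
    the point farthest from all centers chosen so far (assumed scheme)."""
    centers = [points[0]]
    for _ in range(1, k):
        diffs = points[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        centers.append(points[int(nearest.argmax())])
    return np.array(centers, dtype=float)

def assign(points, centers):
    """Index of the nearest center for every point."""
    diffs = points[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centers, inertia)."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = float(((points - centers[labels]) ** 2).sum())
    return labels, centers, inertia

def elbow_curve(points, k_values):
    """Within-cluster sum of squares for each candidate K."""
    return {k: kmeans(points, k)[2] for k in k_values}

# Hypothetical 2D data: three tight groups around (0, 0), (10, 10), (20, 0).
offsets = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]])
points = np.vstack([offsets + c for c in ([0, 0], [10, 10], [20, 0])])
labels, centers, inertia = kmeans(points, 3)
curve = elbow_curve(points, range(1, 5))
```

On data like this the curve drops sharply up to K = 3 and flattens afterwards; that bend is the "elbow" used to choose K.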

Personal Explanation
To give an individual perspective on the prediction model, the explanation module provides a patient-specific analysis. Visualization of the personal explanation was implemented using LIME in the interpretML library [48]. Figure 7a,b present two examples of the visualization of the personal explanation of high risk using a local surrogate analysis to explain the impact of the predictors. In both cases, sex, frequency of smoking, and smoking duration affected the prediction of developing gastric cancer, consistent with the risk factors identified in Section 5.1 through Chi-squared tests. The important features affecting the prediction of gastritis patients differed from patient to patient. As shown in Figure 7a, in addition to smoking habits, sex and income were identified as risk factors for progression from atrophic gastritis to gastric cancer, whereas in the case of Figure 7b, family history of cancer, alcohol habit, and exercise were identified as risk factors.
Personal explanation was performed on 59 patients with a probability for progression to gastric cancer of 0.97, which was the highest probability. The risk factor that appeared in the most patients was age at diagnosis, which was identified as an important predictor in 36 patients. The next most frequent risk factor was smoking duration, which was found in 20 patients. The experimental results showed that to prevent development of gastric cancer, individual analysis and treatment are essential.
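The idea behind a local surrogate can be sketched in a few lines: perturb one patient's feature vector, query the black-box model on the perturbed samples, weight the samples by their proximity to the original, and fit a weighted linear model whose coefficients serve as local feature effects. The sketch below is a simplified LIME-style illustration written with NumPy; it is not the LIME or interpretML API, and the Gaussian perturbation, kernel, parameter values, and stand-in model are all assumptions.

```python
import numpy as np

def local_surrogate(predict, x, n_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear model around x.

    predict: black-box function mapping a 1D feature vector to a score.
    Returns one coefficient per feature (the local effect of that feature).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # 1. Perturb the instance with Gaussian noise (assumed sampling scheme).
    samples = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Query the black-box model on the perturbed samples.
    preds = np.array([predict(s) for s in samples])
    # 3. Weight samples by proximity to x with an exponential kernel.
    dist = np.linalg.norm(samples - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares on [intercept, centered features].
    design = np.hstack([np.ones((n_samples, 1)), samples - x])
    sqrt_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], preds * sqrt_w, rcond=None)
    return coef[1:]

# Hypothetical stand-in for the prediction model M: a known linear score,
# so the recovered local effects can be compared with the true coefficients.
black_box = lambda s: 2.0 * s[0] - 3.0 * s[1] + 0.0 * s[2]
effects = local_surrogate(black_box, [1.0, 0.5, 2.0])
```

For a nonlinear model the coefficients depend on the chosen patient, which matches the observation that the important features differ from patient to patient.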

Discussion
In recent years, AI-based diagnoses and prognosis prediction have emerged in the field of gastric cancer [49]. While DeepPrevention was developed to predict gastric cancer progression from atrophic gastritis using medical check-up data, Jiang et al. predicted gastric cancer survival using SVMs [50]. A deep neural network was also applied to predict early recurrence in advanced gastric cancer [51] and computed tomography diagnosis of metastatic lymph nodes from gastric cancer [52]. Table 5 shows AI-based prediction and diagnosis in the gastric cancer field. The AUC of DeepPrevention outperformed two other studies [50,51]. Gao et al. achieved a high AUC of 0.9541 because they used CT images rather than electronic health records.
While other AI-based applications in the gastric cancer field focus on prognosis prediction and diagnosis, our study focused on prevention of gastric cancer. That is, we predicted a high-risk group of patients and risk factors among atrophic gastritis patients. These prediction results could be useful to prevent gastric cancer in atrophic gastritis patients. To prevent gastric cancer progression, we used medical checkup data, unlike other research that used EHRs. Our main consideration was identifying risk factors from lifestyle characteristics in atrophic gastritis patients. Smoking status and smoking duration were determined as important lifestyle factors influencing gastric cancer progression. Although we attempted to achieve higher sensitivity, because of the extremely imbalanced data we achieved up to 50% sensitivity. We applied the SMOTE algorithm, a well-known over-sampling method, to our data set but an overfitting problem occurred. Personal explanation was performed on 59 patients with a probability for progression to gastric cancer of 0.97 which is the highest probability. The risk factor that appeared in the most patients was age at diagnosis, and was identified as an important predictor in 36 patients. The next most frequent risk factor was smoking duration, which was found in 20 patients. The experimental results showed that to prevent development of gastric cancer, individual analysis and treatment are essential.

Discussion
In recent years, AI-based diagnoses and prognosis prediction have emerged in the field of gastric cancer [49]. While DeepPrevention was developed to predict gastric cancer progression from atrophic gastritis using medical check-up data, Jiang et al. predicted gastric cancer survival using SVMs [50]. A deep neural network was also applied to predict early recurrence in advanced gastric cancer [51] and computed tomography diagnosis of metastatic lymph nodes from gastric cancer [52]. Table 5 shows AI-based prediction and diagnosis in the gastric cancer field. The AUC of DeepPrevention outperformed two other studies [50,51]. Gao et al. achieved a high AUC of 0.9541 because they used CT images rather than electronic health records. While other AI-based applications in the gastric cancer field focus on prognosis prediction and diagnosis, our study focused on prevention of gastric cancer. That is, we predicted a high-risk group of patients and risk factors among atrophic gastritis patients. These prediction results could be useful to prevent gastric cancer in atrophic gastritis patients. To prevent gastric cancer progression, we used medical checkup data, unlike other research that used EHRs. Our main consideration was identifying risk factors from lifestyle characteristics in atrophic gastritis patients. Smoking status and smoking duration were determined as important lifestyle factors influencing gastric cancer progression.
Although we attempted to achieve higher sensitivity, the extremely imbalanced data limited the sensitivity to at most 50%. We applied the SMOTE algorithm, a well-known oversampling method, to our data set, but this led to overfitting. We concluded that algorithm-level methods are effective for extremely imbalanced data with high complexity. Therefore, we used hidden layers to capture the characteristics of the minority class, together with dropout and L2 regularization to avoid overfitting. To improve the prediction performance, we plan to adopt a stacked ensemble method [53] that combines SVMs, random forests, logistic regression, and deep learning.
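To make the oversampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is not the implementation used in this study: the function name `smote_like_oversample`, the two features, and the sample values are hypothetical and chosen only for illustration. Each synthetic case lies on the line segment between an existing minority sample and one of its nearest minority neighbours, so when the minority class is very small the synthetic cases stay close to the few observed ones, which is one plausible way oversampling can contribute to overfitting.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Illustrative sketch of the SMOTE idea only; requires at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        partner = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + u * (q - b) for b, q in zip(base, partner)))
    return synthetic


# Hypothetical minority class with two scaled features
# (for example, age at diagnosis and smoking duration).
minority_samples = [(0.62, 0.80), (0.58, 0.75), (0.70, 0.90), (0.65, 0.60)]
new_points = smote_like_oversample(minority_samples, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two observed minority samples, no new point can fall outside the range already covered by the minority class.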

Conclusions
In this study, we identified patients with atrophic gastritis who were at high risk of developing gastric cancer and analyzed some of their characteristics. For this purpose, we used DeepPrevention, which is composed of a prediction module and an explanation module. The prediction module is based on a deep recurrent neural network with eight layers, to which dropout and L2 regularization were applied. The prediction model achieved an AUC of 0.84, a sensitivity of 0.5, and a specificity of 0.98. The explanation module used Chi-squared tests to identify the features that significantly distinguish patients who remained with atrophic gastritis from those who developed gastric cancer. Furthermore, to identify a group of patients at high risk, K-means clustering was applied to the patients predicted to develop gastric cancer. Finally, to give a personal explanation of the prediction, LIME was applied to a specified patient.
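The reported metrics can be read more easily with a small worked example. The sketch below, in plain Python, computes sensitivity and specificity from confusion-matrix counts and a pairwise (rank-based) AUC from predicted scores. The counts and scores are hypothetical, chosen only so that the sensitivity and specificity match the reported 0.5 and 0.98; they are not the study's data, and the function names are our own.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true progressors that are detected
    specificity = tn / (tn + fp)  # share of non-progressors correctly cleared
    return sensitivity, specificity


def pairwise_auc(positive_scores, negative_scores):
    """AUC as the probability that a positive case outscores a negative one."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical imbalanced cohort: 100 progressors among 10,000 patients.
tp, fn, tn, fp = 50, 50, 9702, 198
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
print(sens, spec, accuracy)

# Hypothetical predicted probabilities for a handful of patients.
auc = pairwise_auc([0.9, 0.4], [0.1, 0.3, 0.5, 0.2])
print(auc)
```

In this hypothetical cohort the accuracy exceeds 0.97 even though half of the progressors are missed, which is why sensitivity and specificity are reported separately for imbalanced data.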
Explainable AI has attracted much attention recently [33,54,55]. In medical applications in particular, explainability is essential for both doctors and patients to understand the prediction results. In this study, we provided an explanation module that explains the predictions at the population, group, and individual levels. At the population level, sex, smoking status, and smoking duration were identified as influential factors. At the group level, the average age at diagnosis was the distinguishing factor of the high-risk group, whose members were diagnosed with gastric cancer at an average age of 57 years. In addition, income, BMI, regular exercise, and the number of endoscopic screenings did not show any significant difference between groups. Finally, at the individual level, two lifestyle habits among the analyzed patient characteristics were influential in the progression from atrophic gastritis to gastric cancer: current smoking status and smoking duration.
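As an illustration of the population-level analysis, the sketch below computes a Pearson Chi-squared test for a 2x2 contingency table in plain Python. It is a minimal example under stated assumptions, not the study's analysis code: the table counts are hypothetical, the function name `chi_squared_2x2` is our own, and no continuity correction is applied. For one degree of freedom, the p-value equals `erfc(sqrt(statistic / 2))`.

```python
import math


def chi_squared_2x2(table):
    """Pearson Chi-squared statistic and p-value (1 degree of freedom).

    `table` is a 2x2 list of observed counts; no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the feature and the outcome were independent
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts. Rows: current smoker, non-smoker.
# Columns: progressed to gastric cancer, did not progress.
smoking_table = [[30, 970], [20, 1980]]
statistic, p_value = chi_squared_2x2(smoking_table)
print(statistic, p_value)
```

A feature would be flagged as significant when the p-value falls below the chosen threshold (0.05 is a common choice, assumed here rather than taken from the study).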
Real-world medical applications often confront the problem of imbalanced data, as encountered in this study. With the extremely imbalanced data in this study, oversampling caused overfitting. Furthermore, we believe that it is important to create a model that maintains the real-world ratio of gastric cancer incidence. Therefore, we proposed a deep recurrent neural network with eight hidden layers to capture the features of the minority class; the resulting model demonstrated a sensitivity of 0.5 and a specificity of 0.98. We are currently attempting to improve its sensitivity by increasing the number of patients and by combining other data sources to extend their features with related chronic diseases. In addition, we are developing stacking ensemble learning using SVMs and random forests. As future work, we plan to utilize H. pylori infection information and consider genomic analysis.