Article

Research on Medical Text Classification Based on Improved Capsule Network

Henan Grain Big Data Analysis and Application Engineering Research Center, Henan University of Technology, Zhengzhou 450000, China
* Author to whom correspondence should be addressed.
Electronics 2022, 11(14), 2229; https://doi.org/10.3390/electronics11142229
Submission received: 26 May 2022 / Revised: 9 July 2022 / Accepted: 14 July 2022 / Published: 17 July 2022
(This article belongs to the Section Artificial Intelligence)

Abstract

In the medical field, text classification based on natural language processing (NLP) has shown good results and has great practical application prospects of clinical and medical value, but most existing research focuses on English electronic medical record data, and there is less research on natural language processing tasks for Chinese electronic medical records. Most current Chinese electronic medical records are unstructured texts, which generally have low utilization rates and inconsistent terminology, often mingling patients’ symptoms, medications, diagnoses, and other essential information. In this paper, we propose a Capsule network model for electronic medical record classification, which combines LSTM and GRU models and relies on a unique routing structure to extract complex Chinese medical text features. The experimental results show that this model outperforms several baseline models, achieving an F1 value of 73.51% on the Chinese electronic medical record dataset, at least 4.1% better than the other baseline models.

1. Introduction

Text classification is one of the classical tasks in natural language processing and has a wide range of applications across industries. With the gradual improvement of hospital informatization in China, Chinese electronic medical record text data have grown explosively. Most of these records are unstructured text, so studying the classification of Chinese electronic medical records becomes very meaningful [1]. Clinical trials are scientific studies conducted with human volunteers, also known as subjects, to determine the efficacy, safety, and side effects of a drug or treatment; they play a crucial role in advancing medicine and improving human health. Subjects may be patients or healthy volunteers, depending on the purpose of the trial. Subject recruitment for clinical trials is generally done by manually comparing medical record forms against clinical trial screening criteria [2], which can be time-consuming and inefficient [3]. Therefore, a natural language processing and information extraction system has excellent practical application and clinical medical value for subject screening in clinical trials [4,5]. We aim to solve the medical short-text classification problem, and the experimental dataset is selected from the China Conference on Health Information Processing (CHIP2019) evaluation data. As shown in Table 1, the input is a series of descriptive sentences of Chinese clinical trial screening criteria, and the output is the specific screening-criteria category returned for each clinical trial record.
In recent years, deep learning methods have gradually replaced traditional machine learning methods in text classification tasks due to their good adaptability and high accuracy. The Skip-gram and Continuous Bag-of-Words (CBOW) models proposed by Mikolov et al. [6], together with the concept of word vectors, have had a profound impact on the field of NLP. At present, the commonly used text classification models are mainly Convolutional Neural Network (CNN) [7] models, Long Short-Term Memory (LSTM) [8,9] models, and Bidirectional Encoder Representations from Transformers (BERT) [10] models.
These models work well for general text classification, but they face some limitations in Chinese medical text classification: (1) Chinese electronic medical records contain many professional terms [11]; (2) medical record texts contain many character forms, including Chinese and English abbreviations, Arabic numerals, scientific notation, etc.; (3) texts in different categories overlap to a certain degree [12]. Fortunately, the unique structure of the Capsule network can overcome these problems. Sabour et al. [13] first proposed the Capsule network, which changes the input of neurons from scalars to vectors and enhances the network’s ability to integrate features by adjusting parameters through dynamic routing. Hinton et al. [14] proposed a matrix Capsule network using Expectation-Maximization routing (EM routing), and their experiments show that the Capsule network can effectively learn the changes of objects viewed from different viewpoints in image data. In essence, the Capsule network implements the idea of a clustering algorithm inside the model. Our improved Capsule network applies the same idea to the classification of text and, combined with Gated Recurrent Unit (GRU) [15] and LSTM networks respectively, achieves good results. Therefore, to solve the above problems, we propose a study of Chinese medical text classification based on the Capsule network. The main contributions of this paper are as follows:
1.
We propose an improved Capsule network model based on the features of Chinese medical text classification. The unique network structure and powerful feature extraction capability of the Capsule network enable us to extract the features of complex medical texts;
2.
Combined with the initial processing of medical text by the Long Short-Term Memory (LSTM) network, the Capsule network achieves better performance, with at least a 4.1% improvement in F1 value compared with other baseline models.

2. Related Work

2.1. Text Classification

The main deep-learning-based natural language processing techniques build on the neural Skip-gram and CBOW models proposed by Mikolov et al. [6] and the accompanying concept of word vectors. Kim proposed TextCNN [16], which has a strong ability to extract shallow text features and runs fast, giving it a wide range of applications. However, its feature extraction in long sentences (feature extraction converts arbitrary data, such as text or images, into numerical features usable for machine learning, allowing computers to better understand the data) relies mainly on the filters, which have limited long-distance modeling ability and are insensitive to word order. Recurrent neural networks (RNN) [15] are a natural technical solution for the dynamic input sequences prevalent in NLP. Until 2013, RNNs were still considered difficult to train; they were soon superseded by the classical LSTM, which proved more resilient to the vanishing and exploding gradient problems. Another variant is the Gated Recurrent Unit (GRU) [15], which merges the forget gate and input gate into a single update gate but otherwise differs little from the LSTM. Burns et al. [17] classified biomedical texts containing information on molecular interactions by combining CNN models and LSTM models incorporating attention [18] mechanisms. Luo et al. [19] introduced an attention mechanism into a Bi-LSTM-CRF fusion model, which can obtain contextual representations of words across the full text, and applied the model to chemical named entity recognition, pre-training word vectors and LSTM models on biomedical texts. Amin et al. [20] detected and classified dengue tweets using recurrent neural networks with TF-IDF embeddings. Li et al. [21] proposed an improved model, LS-GRU, for the medical text classification task, which achieves good results building on the older models.
Two annotators with experience in biomedical informatics research annotated the screening criteria statements according to the defined annotation rules; annotation consistency was then calculated for each category using Cohen’s kappa score, and the overall consistency score on the CHIP dataset was 0.992. Different semantic types of screening criteria have counterparts in different medical data and play an important role in clinical medical studies. Correctly identifying the semantic categories of screening criteria in clinical trials is the basis and support for these studies.

2.2. Capsule Network

In computer vision, the low-level features of an image are contour, edge, color, texture, and shape features. Low-level features carry less semantic information, but their target localization is accurate. The high-level semantic features of an image correspond to what we actually perceive: for example, from the low-level features extracted from a face, such as the outline, nose, and eyes, the high-level feature that emerges is the face itself. High-level features carry richer semantic information, but their target localization is coarser.
Traditional CNNs [7] find it difficult to account for the spatial relationships between low-level features because of their convolution operations, so CNNs use pooling to mitigate this drawback. Pooling reduces the complexity of the convolution operations and captures the invariance of local features, but it cannot represent the positional relationship between high-level and low-level features [13]. The capsules in a Capsule network learn to recognize the presence of visual entities and encode their attributes as vectors. Whereas neurons in a CNN operate independently, the Capsule network represents each capsule as a vector using a nonlinear function called squash. The Capsule network is one of the neural networks introduced in recent years to overcome the shortcomings of convolutional neural networks, representing the relationship between part and whole in vector form. It can represent images in terms of the intensity of feature responses and characterize the orientation, location, and other information of image features. Dynamic routing determines the strength of the connection between high-level and low-level features through coupling coefficients. The coupling coefficient measures the similarity between low- and high-level features, and through this process the capsules learn the attributes of entities.
Since Sabour et al. [13] first proposed the Capsule network, researchers have continued to explore Capsule networks for text classification. Zhao et al. [22] explored Capsule networks with dynamic routing for text classification and proposed three strategies to stabilize the dynamic routing; their Capsule networks show a significant improvement when transferring from single-label to multi-label text classification. Other researchers [23] proposed a single-model Capsule network that beat other architectures with a total ROC AUC of 98.46, and showed that problems arising during extensive data preprocessing and augmentation can be tackled using Capsule networks. Yang et al. [24] investigated the transfer capability of Capsule networks for text classification in transfer learning scenarios and proposed an iterative adaptation algorithm to extend Capsule networks to cross-domain text classification. Extensive experiments on six text classification benchmarks demonstrate the effectiveness of Capsule networks in general text classification.

3. Model

3.1. Model Structure

The first layer of the model converts the text into vectors through pre-processing. The second layer processes the resulting vectors with a bidirectional LSTM network. Since the dataset consists of short texts, the processing length is set to 30, which covers most of the texts; to avoid overfitting, 50% of the weights are randomly zeroed (dropout). The third layer places the Capsule network after the bidirectional LSTM network and outputs the final result. The model structure is shown in Figure 1, and a minimal code sketch follows.
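To make this pipeline concrete, here is a minimal PyTorch sketch of the three layers described above; it is not the authors’ released implementation. The sequence length (30), dropout (0.5), hidden size (128), and routing iterations (2) come from Sections 3.1 and 4.3, while the embedding and capsule dimensions and all identifier names are illustrative assumptions. The CapsuleLayer it uses is sketched in Section 3.2.

```python
import torch
import torch.nn as nn

class BiLSTMCapsuleClassifier(nn.Module):
    """Sketch of the three-layer pipeline: embedding -> bidirectional LSTM -> capsules."""

    def __init__(self, vocab_size, num_classes=44, embed_dim=300,
                 hidden=128, capsule_dim=16, routing_iters=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.dropout = nn.Dropout(0.5)  # 50% of weights zeroed, per Section 3.1
        # CapsuleLayer is sketched in Section 3.2 below.
        self.capsules = CapsuleLayer(in_dim=2 * hidden, num_out=num_classes,
                                     out_dim=capsule_dim,
                                     routing_iters=routing_iters)

    def forward(self, token_ids):       # token_ids: (batch, 30), padded/truncated
        x = self.embed(token_ids)       # (batch, 30, embed_dim)
        x, _ = self.bilstm(x)           # (batch, 30, 2 * hidden)
        x = self.dropout(x)
        caps = self.capsules(x)         # (batch, num_classes, capsule_dim)
        return caps.norm(dim=-1)        # capsule length = class presence probability
```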

3.2. Capsule Network Structure

The working principle of a neural capsule is shown in Figure 2 and can be briefly summarized in four steps: matrix transformation, input weighting, weighted summation, and nonlinear transformation. The network contains multiple capsule layers, each grouped into many sets of vector neurons, so that each neuron has both a length and a direction. For text, the length measures the probability that each word is present, and the direction represents the attributes of that word (location, semantics, context, etc.). In Figure 2, the underlying capsules $v_1$ and $v_2$ are transformed into $u_1$ and $u_2$ by the transformation matrices $W_1$ and $W_2$:
$$u_i = v_i \cdot W_i$$
After determining $u_i$, the second stage, weight assignment, is performed to compute the node $s_r$, where $r$ is the iteration index:
$$s_r = \sum_i c_{ir} \cdot u_i$$
where $c$ is the coupling coefficient, updated iteratively by the dynamic routing algorithm; the logits $b_i^r$ inside the softmax are initialized to 0, $k$ indexes the input neurons, and the coupling coefficient $c_{ir}$ is calculated as:
$$c_{ir} = \frac{\exp\left(b_i^{r-1}\right)}{\sum_k \exp\left(b_k^{r-1}\right)}$$
The output $a_r$ of the Squash activation function used by the Capsule network is calculated as:
$$a_r = \frac{\lVert s_r \rVert^2}{1 + \lVert s_r \rVert^2} \cdot \frac{s_r}{\lVert s_r \rVert}$$
The first factor of this activation function scales the input vector $s_r$, and the second factor is the unit vector of $s_r$; this not only preserves the direction of the input vector but also compresses its norm into $[0, 1)$, which effectively avoids the problem of gradient explosion. $b_i^r$ is updated by measuring the agreement between the output $a_r$ of the current layer and the prediction $u_i$:
$$b_i^r = b_i^{r-1} + a_r \cdot u_i$$
The coupling coefficients $c$ are updated by dynamic routing throughout the network; this process traces out a route whose endpoint is the optimal parameters being sought.
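These routing equations map directly to code. Below is a minimal sketch of a capsule layer with dynamic routing, under two stated assumptions: a single transformation weight tensor is shared across input positions, and the softmax normalizes over the input capsules as in the coupling-coefficient equation above (Sabour et al. [13] instead normalize over the output capsules).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    """Squash activation: keeps the direction of s, compresses its norm into [0, 1)."""
    norm_sq = (s * s).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

class CapsuleLayer(nn.Module):
    """Capsule layer with dynamic routing, following the equations above."""

    def __init__(self, in_dim, num_out, out_dim, routing_iters=2):
        super().__init__()
        self.num_out = num_out
        self.routing_iters = routing_iters
        # Transformation matrices W_i; one weight tensor shared across input
        # positions is an assumption (the paper does not specify the sharing).
        self.W = nn.Parameter(0.01 * torch.randn(num_out, in_dim, out_dim))

    def forward(self, v):  # v: (batch, num_in, in_dim)
        # u_i = v_i . W_i -> prediction vectors, shape (batch, num_out, num_in, out_dim)
        u = torch.einsum('bni,oid->bond', v, self.W)
        b = torch.zeros(v.size(0), self.num_out, v.size(1), device=v.device)
        for _ in range(self.routing_iters):
            c = F.softmax(b, dim=2)                      # c_ir over input capsules
            s = (c.unsqueeze(-1) * u).sum(dim=2)         # s_r = sum_i c_ir * u_i
            a = squash(s)                                # a_r = squash(s_r)
            b = b + torch.einsum('bond,bod->bon', u, a)  # b_i^r = b_i^(r-1) + a_r . u_i
        return a  # (batch, num_out, out_dim)
```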

4. Experiments

4.1. Dataset

The dataset for CHIP2019 evaluation task 3 was derived from the real clinical trial screening criteria on the website of the Chinese Clinical Trial Registry, which publishes open and transparent clinical trial registration information for scientific research. Screening criteria are generally unstructured free-text data of varying lengths used to describe information about patients eligible for a particular clinical trial, such as age, gender, and disease. The dataset was summarized by hierarchical clustering and manual induction into seven topics and 44 semantic categories, and descriptive information and annotation rules were defined for each category. The final published dataset of CHIP2019 evaluation task 3 includes 44 semantic category definitions and 38,341 screening criteria, including 22,962 items in the training set and 7682 items in the test set. The dataset is characterized by an uneven distribution of category sizes: the most numerous categories, such as Disease, comprise approximately 22% of the dataset, while the least numerous, such as Ethnicity, contain only about 0.06%. More details can be found in Table 2.

4.2. Evaluation Criteria and Parameter

The evaluation metrics include the macro average precision, macro average recall, and macro average F1 value, and the final ranking is based on the macro average F1 value. Assuming there are $n$ categories $C_1, \dots, C_i, \dots, C_n$, with precision $P_i$ and recall $R_i$ for each category, the macro average precision, macro average recall, and macro average F1 are given by the following equations:
$$\text{macro average precision} = \frac{1}{n} \sum_{i=1}^{n} P_i$$

$$\text{macro average recall} = \frac{1}{n} \sum_{i=1}^{n} R_i$$

$$\text{macro average } F1 = \frac{1}{n} \sum_{i=1}^{n} \frac{2 \times P_i \times R_i}{P_i + R_i}$$
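As an illustration, these macro-averaged metrics take only a few lines of NumPy. The sketch below scores a category as zero when it has no predicted or true instances, a convention the formulas above do not spell out.

```python
import numpy as np

def macro_scores(y_true, y_pred, n_classes=44):
    """Macro-averaged precision, recall, and F1 over all classes."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        p = tp / (tp + fp) if tp + fp > 0 else 0.0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0
        f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
        precisions.append(p); recalls.append(r); f1s.append(f1)
    return np.mean(precisions), np.mean(recalls), np.mean(f1s)
```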

4.3. Experimental Results

In this paper, five different models were compared on the clinical trial screening short-text dataset, and the experiments calculated the average precision, recall, and F1 values over the 44 categories. The optimal classification results of the different models are shown in Figure 3.
As can be seen from Figure 3, the Capsule+LSTM model works best among the five models, with an average F1 value of 73.51%. Overall, the CNN [25] performed better than GRU and LSTM in this experiment because of the shorter texts. The results of LSTM [26] and GRU [27] were expected to be comparable, but the F1 value of the GRU network was 4.5% higher than that of the LSTM network. One possible reason is that the experimental texts are short, so the gating units of the two networks do not reach their best outputs; on longer texts, the results of the LSTM and GRU networks might improve and the gap might narrow. After combining the LSTM and GRU networks with the Capsule network, both models achieve good results, with F1 values improving by 5.46% and 10.92%, respectively, which means the Capsule network does play a significant role in the Chinese medical text classification task.
The ANOVA test results are shown in Table 3. The p-value for Capsule+GRU is 1.76 × 10⁻¹⁴ and the p-value for Capsule+LSTM is 2.99 × 10⁻¹⁷, both far below 0.05, so there is a significant difference between the groups. This shows that the Capsule network performs significantly better than the other baseline models. In addition, the average F1 value of the Capsule network is higher than that of the other models, which also indicates that the Capsule network is stable.
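For reference, such a one-way ANOVA can be computed with scipy.stats.f_oneway. The per-run F1 scores below are hypothetical placeholders (the paper does not publish its raw runs); four groups of ten runs are chosen to match the degrees of freedom in Table 3 (3 between groups, 36 within), although the paper does not state exactly which model groups entered each test.

```python
from scipy.stats import f_oneway

# Hypothetical per-run F1 scores -- placeholders only, not the paper's data.
f1_lstm    = [0.621, 0.628, 0.619, 0.625, 0.631, 0.623, 0.627, 0.620, 0.626, 0.624]
f1_gru     = [0.668, 0.672, 0.665, 0.670, 0.674, 0.667, 0.671, 0.669, 0.673, 0.666]
f1_cnn     = [0.678, 0.681, 0.676, 0.683, 0.679, 0.680, 0.677, 0.682, 0.675, 0.684]
f1_capsule = [0.733, 0.736, 0.731, 0.738, 0.735, 0.734, 0.737, 0.732, 0.739, 0.730]

f_stat, p_value = f_oneway(f1_lstm, f1_gru, f1_cnn, f1_capsule)
print(f"F = {f_stat:.2f}, p = {p_value:.2e}")  # p << 0.05 -> the group means differ
```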
Figure 4 shows the training curves of the Capsule network, including the accuracy curve and the loss function curve.
We set the number of dynamic routing iterations to 2, the learning rate to 0.001, and the dropout to 0.5. Table 4 shows the comparison of different hidden layer sizes for Capsule+LSTM.
We set the number of dynamic routing iterations to 2, the hidden layer size to 128, and the dropout to 0.5. Table 5 shows the comparison of different learning rates for Capsule+LSTM; the best settings are collected into a configuration sketch below.
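Collecting the best settings from these sweeps in one place, a training setup might look like the sketch below, reusing the BiLSTMCapsuleClassifier sketched in Section 3.1. The vocabulary size and the choice of Adam as the optimizer are assumptions; the paper states neither.

```python
import torch

# Best-performing settings from Tables 4 and 5 (dropout is fixed at 0.5
# inside the model sketch in Section 3.1).
config = dict(routing_iters=2, hidden=128, learning_rate=1e-3,
              max_seq_len=30, num_classes=44)

model = BiLSTMCapsuleClassifier(vocab_size=20000,  # hypothetical vocabulary size
                                num_classes=config["num_classes"],
                                hidden=config["hidden"],
                                routing_iters=config["routing_iters"])
optimizer = torch.optim.Adam(model.parameters(), lr=config["learning_rate"])
```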
The F1 values of some individual categories, shown in Table 6, are also noteworthy.
In general, each model performs well in some categories and poorly in others. The GRU model achieves the best result in one category, the LSTM in four categories, the CNN in ten categories, and Capsule+GRU and Capsule+LSTM in fifteen and twenty categories, respectively. There is a relationship between text length and model performance. The category Multiple has the longest average text length, 42.67, and the Capsule+LSTM model performs best there with an F1 value of 68.92%. The category Ethnicity has the shortest average text length, 8.8, and both the Capsule+GRU and Capsule+LSTM models achieve the best result with an F1 value of 100%.
The best-performing model among the three baselines is the CNN, which achieves the best results in several categories. In the category Gender, the average text length is 10.1, and the CNN achieves the best F1 value of 94.74%. In the category Education, the average text length is 14.38, and both the CNN and Capsule+LSTM achieve the best F1 value of 94.74%. In the category Bedtime, the average text length is 14.36, and both the CNN and Capsule+GRU achieve the best F1 value of 53.33%. Both the CNN model and the Capsule network perform well on short sentences, whereas LSTM and GRU do not. This indicates that the Capsule and CNN networks have advantages in processing shorter texts, and that GRU and LSTM can make up for their disadvantage on short sentences when combined with the Capsule network.
Among the three baseline models, LSTM performs better on longer sentences. In the category Blood Donation, with an average sentence length of 27.11, LSTM and Capsule+GRU obtain the highest F1 value of 87.50%. In the category Diagnostic, with an average sentence length of 30.34, LSTM achieves the highest F1 value of 75.62%, only 0.16% higher than Capsule+GRU (see Table 6). Compared with the CNN, the LSTM and Capsule networks perform better on longer texts. This also shows that the Capsule network can inherit the ability of LSTM and GRU to handle longer texts.
In the category Address, Capsule+LSTM achieves the highest F1 value, 77.78%; the texts there mostly contain the keyword “reside”. Similarly, in the category Therapy or Surgery, Capsule+LSTM also achieves the highest F1 value; the keywords “surgery” and “treatment” appear several times in the texts. However, the Capsule network did not achieve good results in the categories Disease and Symptom: analysis of the texts shows that the characteristic words do not appear as frequently there as in other texts. The Capsule network therefore behaves similarly to the other baseline models when keywords are inconspicuous or absent. Enhancing the recognition of inconspicuous text features is one direction for future improvement of the Capsule network.

5. Conclusions

The Capsule network model constructed in this paper improved the F1 value by at least 4.1% compared with other models on the CHIP2019 medical short-text classification task. The experiments demonstrate the feasibility and effectiveness of fusing GRU and LSTM with the Capsule network model for medical short-text classification, providing a reference for future medical text classification tasks and contributing ideas to the construction of knowledge graphs in medical fields. However, the dataset used in this paper is not comprehensive enough, which limits the model’s performance to some extent. In the future, the dataset can be expanded, and we aim to design a better network model for medical classification tasks based on the characteristics of the dataset.

Author Contributions

Writing—original draft preparation, resources, Q.Y.; Conceptualization, methodology, M.Z.; software, L.L.; writing—review and editing, P.L.; supervision, Q.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62073123, and Major Public Welfare Project of Henan Province, grant number 201300311200.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors choose not to disclose the data.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Jasmir, J.; Nurmaini, S.; Malik, R.F.; Tutuko, B. Bigram feature extraction and conditional random fields model to improve text classification clinical trial document. Telkomnika 2021, 19, 886–892. [Google Scholar] [CrossRef]
  2. Hao, T.; Rusanov, A.; Boland, M.R.; Weng, C. Clustering clinical trials with similar eligibility criteria features. J. Biomed. Inform. 2014, 52, 112–120. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Thadani, S.R.; Weng, C.; Bigger, J.T.; Ennever, J.F.; Wajngurt, D. Electronic screening improves efficiency in clinical trial recruitment. J. Am. Med. Inform. Assoc. 2009, 16, 869–873. [Google Scholar] [CrossRef] [PubMed]
  4. Gulden, C.; Kirchner, M.; Schüttler, C.; Hinderer, M.; Kampf, M.; Prokosch, H.U.; Toddenroth, D. Extractive summarization of clinical trial descriptions. Int. J. Med. Inform. 2019, 129, 114–121. [Google Scholar] [CrossRef] [PubMed]
  5. Wu, H.; Toti, G.; Morley, K.I.; Ibrahim, Z.M.; Folarin, A.; Jackson, R.; Kartoglu, I.; Agrawal, A.; Stringer, C.; Gale, D.; et al. SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research. J. Am. Med. Inform. Assoc. 2018, 25, 530–537. [Google Scholar] [CrossRef] [Green Version]
  6. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
  7. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
  8. Tai, K.S.; Socher, R.; Manning, C.D. Improved semantic representations from tree-structured long short-term memory networks. arXiv 2015, arXiv:1503.00075. [Google Scholar]
  9. Mousa, A.; Schuller, B. Contextual bidirectional long short-term memory recurrent neural network language models: A generative approach to sentiment analysis. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain, 3–7 April 2017. [Google Scholar]
  10. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  11. Li, T.; Zhu, S.; Ogihara, M. Using discriminant analysis for multi-class classification: An experimental investigation. Knowl. Inf. Syst. 2006, 10, 453–472. [Google Scholar] [CrossRef]
  12. Huang, C.C.; Lu, Z. Community challenges in biomedical text mining over 10 years: Success, failure and the future. Brief. Bioinform. 2016, 17, 132–144. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  14. Hinton, G.E.; Sabour, S.; Frosst, N. Matrix capsules with EM routing. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  15. Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473. [Google Scholar]
  16. Kim, Y. Convolutional Neural Networks for Sentence Classification. arXiv 2014, arXiv:1408.5882. [Google Scholar]
  17. Burns, G.A.; Li, X.; Peng, N. Building deep learning models for evidence classification from the open access biomedical literature. Database 2019, 2019, baz034. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  18. Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489. [Google Scholar]
  19. Luo, L.; Yang, Z.; Yang, P.; Zhang, Y.; Wang, L.; Lin, H.; Wang, J. An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition. Bioinformatics 2018, 34, 1381–1388. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  20. Amin, S.; Uddin, M.I.; Hassan, S.; Khan, A.; Nasser, N.; Alharbi, A.; Alyami, H. Recurrent neural networks with TF-IDF embedding technique for detection and classification in tweets of dengue disease. IEEE Access 2020, 8, 131522–131533. [Google Scholar] [CrossRef]
  21. Li, Q.; Li, Y.K.; Xia, S.Y.; Kang, Y. An Improved Medical Text Classification Model: LS-GRU. J. Northeast. Univ. Nat. Sci. 2020, 41, 938. [Google Scholar]
  22. Zhao, W.; Ye, J.; Yang, M.; Lei, Z.; Zhang, S.; Zhao, Z. Investigating capsule networks with dynamic routing for text classification. arXiv 2018, arXiv:1804.00538. [Google Scholar]
  23. Srivastava, S.; Khurana, P.; Tewari, V. Identifying aggression and toxicity in comments using capsule network. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), Santa Fe, NM, USA, 25 August 2018; pp. 98–105. [Google Scholar]
  24. Yang, M.; Zhao, W.; Chen, L.; Qu, Q.; Zhao, Z.; Shen, Y. Investigating the transferring capability of capsule networks for text classification. Neural Netw. 2019, 118, 247–261. [Google Scholar] [CrossRef]
  25. Guo, B.; Zhang, C.; Liu, J.; Ma, X. Improving text classification with weighted word embeddings via a multi-channel TextCNN model. Neurocomputing 2019, 363, 366–374. [Google Scholar] [CrossRef]
  26. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef] [Green Version]
  27. Sachin, S.; Tripathi, A.; Mahajan, N.; Aggarwal, S.; Nagrath, P. Sentiment analysis using gated recurrent neural networks. SN Comput. Sci. 2020, 1, 74. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Model Structure.
Figure 2. Capsule Network Model.
Figure 3. Comparison of improved Capsule networks with different baseline models.
Figure 4. Accuracy curve and loss function curve.
Table 1. Example of dataset.

| ID | Input | Input (in English) | Category |
|----|-------|--------------------|----------|
| S1 | 全麻手术患者 | Patients undergoing general anesthesia. | Therapy or Surgery |
| S2 | 过去4周内服用催眠或镇静药、精神类药物 | Use of hypnotic or sedative drugs, or psychotropic drugs within the past 4 weeks. | Pharmaceutical Substance or Drug |
| S3 | 血糖< 2.7 mmol/L | Blood glucose < 2.7 mmol/L | Laboratory Examinations |
Table 2. Details of dataset.

| Category | Count (Ratio) | Min Length | Max Length | Average Length |
|---|---|---|---|---|
| Disease | 5127 (22.33%) | 3 | 213 | 23.90 |
| Multiple | 4556 (19.84%) | 7 | 342 | 42.09 |
| Therapy or Surgery | 1504 (6.55%) | 5 | 159 | 21.67 |
| Consent | 1319 (5.74%) | 4 | 112 | 19.10 |
| Diagnostic | 1233 (5.37%) | 7 | 194 | 29.54 |
| Laboratory Examinations | 1142 (4.97%) | 5 | 174 | 33.36 |
| Pregnancy-related Activity | 1026 (4.47%) | 7 | 186 | 20.52 |
| Age | 917 (3.99%) | 5 | 67 | 13.27 |
| Pharmaceutical Substance or Drug | 877 (3.82%) | 6 | 238 | 31.32 |
| Risk Assessment | 708 (3.08%) | 8 | 195 | 23.66 |
| Allergy Intolerance | 668 (2.91%) | 4 | 76 | 21.28 |
| Enrollment in other studies | 514 (2.24%) | 9 | 58 | 22.48 |
| Researcher Decision | 464 (2.02%) | 12 | 225 | 27.35 |
| Compliance with Protocol | 370 (1.61%) | 5 | 67 | 19.35 |
| Organ or Tissue Status | 358 (1.56%) | 6 | 100 | 17.18 |
| Sign | 286 (1.25%) | 4 | 65 | 19.86 |
| Addictive Behavior | 272 (1.18%) | 3 | 133 | 23.94 |
| Capacity | 168 (0.73%) | 6 | 303 | 21.48 |
| Life Expectancy | 166 (0.72%) | 9 | 30 | 15 |
| Symptom | 154 (0.67%) | 5 | 144 | 23.39 |
| Neoplasm Status | 131 (0.57%) | 6 | 69 | 22.48 |
| Device | 129 (0.56%) | 7 | 71 | 21.35 |
| Special Patient Characteristic | 104 (0.45%) | 4 | 43 | 15.28 |
| Non-Neoplasm Disease Stage | 103 (0.45%) | 6 | 57 | 20.81 |
| Data Accessible | 71 (0.31%) | 8 | 169 | 23.15 |
| Encounter | 66 (0.29%) | 8 | 61 | 21.33 |
| Diet | 61 (0.27%) | 11 | 111 | 37.07 |
| Smoking Status | 54 (0.24%) | 6 | 123 | 27.93 |
| Literacy | 52 (0.23%) | 8 | 37 | 20.75 |
| Oral related | 51 (0.22%) | 4 | 78 | 23.75 |
| Healthy | 39 (0.17%) | 6 | 77 | 22.05 |
| Address | 31 (0.14%) | 9 | 35 | 17.10 |
| Blood Donation | 31 (0.14%) | 10 | 56 | 30.00 |
| Gender | 30 (0.13%) | 4 | 32 | 9.7 |
| Receptor Status | 28 (0.12%) | 9 | 56 | 23.68 |
| Nursing | 22 (0.10%) | 12 | 39 | 18.36 |
| Exercise | 21 (0.09%) | 10 | 60 | 26.48 |
| Education | 19 (0.08%) | 11 | 37 | 16.79 |
| Disabilities | 17 (0.07%) | 8 | 58 | 24.41 |
| Sexual related | 17 (0.07%) | 6 | 57 | 30.71 |
| Alcohol Consumer | 17 (0.07%) | 17 | 104 | 56.65 |
| Bedtime | 14 (0.06%) | 5 | 53 | 20.29 |
| Ethical Audit | 12 (0.05%) | 10 | 21 | 14.5 |
| Ethnicity | 13 (0.06%) | 5 | 15 | 8.70 |
Table 3. ANOVA test results.

| Model | Source of Difference | SS (Sum of Squared Deviations) | df (Degrees of Freedom) | MS (Mean Square) | F (Effect Term/Error Term) | p-Value | F Crit |
|---|---|---|---|---|---|---|---|
| Capsule + GRU | Between groups | 0.05109 | 3 | 0.01703 | 63.79289 | 1.76 × 10⁻¹⁴ | 2.86626 |
| | Within groups | 0.00961 | 36 | 0.00026 | - | - | - |
| Capsule + LSTM | Between groups | 0.07403 | 3 | 0.02467 | 96.17230 | 2.99 × 10⁻¹⁷ | 2.86626 |
| | Within groups | 0.00923 | 36 | 0.00025 | - | - | - |
Table 4. Comparison of different sizes of hidden layer.

| Size | Precision | Recall | F1 |
|---|---|---|---|
| 128 | 80.46% | 70.55% | 73.51% |
| 246 | 78.58% | 70.64% | 73.36% |
| 512 | 73.24% | 67.91% | 68.93% |
| 1024 | 76.95% | 69.29% | 71.42% |
Table 5. Comparison of different learning rates.

| Learning Rate | Precision | Recall | F1 |
|---|---|---|---|
| 0.0001 | 81.31% | 69.14% | 73.04% |
| 0.0003 | 77.57% | 67.48% | 70.83% |
| 0.001 | 80.46% | 70.55% | 73.51% |
| 0.003 | 57.86% | 52.65% | 53.96% |
Table 6. Performance of different models in different categories.

| Category | GRU | LSTM | CNN | Capsule+GRU | Capsule+LSTM |
|---|---|---|---|---|---|
| Multiple | 68.00% | 68.82% | 68.75% | 67.89% | 68.92% |
| Ethnicity | 66.67% | 66.67% | 57.14% | 100.00% | 100.00% |
| Gender | 78.26% | 85.71% | 94.74% | 82.35% | 85.71% |
| Bedtime | 28.57% | 12.50% | 55.33% | 55.33% | 30.77% |
| Blood Donation | 71.43% | 87.50% | 80.00% | 87.50% | 80.00% |
| Diagnostic | 73.87% | 75.62% | 74.63% | 75.46% | 74.91% |
| Address | 73.68% | 60.87% | 60.00% | 63.16% | 77.78% |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
