Using Text Mining and Data Visualization Approaches for Investigating Mental Illness from the Perspective of Traditional Chinese Medicine

Background and Objectives. Anxiety and depressive disorders are the most prevalent mental disorders, and due to the COVID-19 pandemic, more people are suffering from anxiety and depressive disorders, and a considerable fraction of COVID-19 survivors have a variety of persistent neuropsychiatric problems after the initial infection. Traditional Chinese Medicine (TCM) offers a different perspective on mental disorders from Western biomedicine. Effective management of mental disorders has become an increasing concern in recent decades due to the high social and economic costs involved. This study attempts to express and ontologize the relationships between different mental disorders and physical organs from the perspective of TCM, so as to bridge the gap between the unique terminology used in TCM and a medical professional. Materials and Methods. Natural language processing (NLP) is introduced to quantify the importance of different mental disorder descriptions relative to the five depots and two palaces, stomach and gallbladder, through the classical medical text Huangdi Neijing and construct a mental disorder ontology based on the TCM classic text. Results. The results demonstrate that our proposed framework integrates NLP and data visualization, enabling clinicians to gain insights into mental health, in addition to biomedicine. According to the results of the relationship analysis of mental disorders, depots, palaces, and symptoms, the organ/depot most related to mental disorders is the heart, and the two most important emotion factors associated with mental disorders are anger and worry & think. The mental disorders described in TCM are related to more than one organ (depot/palace). Conclusion. This study complements recent research delving into co-relations or interactions between mental status and other organs and systems.


Introduction
The World Health Organization (WHO) has published alarming data regarding mental disorders, in which depression is listed as one of the main causes of disability worldwide [1,2]. Anxiety and depressive disorders are the most prevalent mental disorders in 2019, affecting 1 in 8 people, or 970 million people worldwide. Due to the COVID-19 pandemic, there were significantly more people in 2020 who suffered from anxiety and depressive disorders as compared to previous years. Increasing numbers of studies have shown that a considerable fraction of COVID-19 survivors have a variety of neuropsychiatric problems that continue or even manifest months after the initial infection. With new variants and lineages continuously evolving, the number of people infected by the virus is expected to break records at an unprecedented rate. As such, the number of people suffering from anxiety and depressive disorders and other neuropsychiatric problems are also expected variants and lineages continuously evolving, the number of people infected by the virus is expected to break records at an unprecedented rate. As such, the number of people suffering from anxiety and depressive disorders and other neuropsychiatric problems are also expected to reach record highs. Management of these mental disorders are of great urgency and importance, as the expected economic and societal costs may be greater than preliminary forecasts.
Preceding research in the early 20th century on mental disorders had focused on interpreting the relations between emotions and the human body, such as analyzing the manifestation of emotions through observable body reactions [3][4][5][6]. However, the rise of psychiatric research can be traced back to the 19th century, in which Western biomedicine started to emphasize the importance of the brain in the proper or improper functioning of mental activity, and neurology and psychiatry were combined into a single science. This was totally different from TCM, which views the body as an intricately woven system that has to be considered holistically and cannot be compartmentalized. However, the terminology used in TCM involves idioms that only trained professionals can understand, and not the modern scientific language that is typically seen, therefore preventing non-TCM trained professionals from understanding or utilizing the wealth of information in the TCM literature. This study seeks to bridge the gap such that non-TCM professionals may also understand basic TCM teachings in mental disorders. TCM considers mental disorders in three aspects, namely recognition, manifestation, and response. This combination is important as TCM views mental disorders as a result of disharmony between the somatic and/or spiritual entities.
Within the framework of TCM, this is understood as an imbalance of Yin and Yang, Vital Energy ("Qi," " 氣 "), and Blood ("Shie," " 血 "). TCM uses pattern (or syndrome) differentiation to diagnose diseases, which considers signs and symptoms of the disease, the environment, and the patient's profile, allowing personalized medical treatment. Taking depression as an example, TCM believes that depression is caused by an imbalance of different organs and Qi. Mental-like disorders have long been mentioned in the TCM literature, such as the Huangdi Neijing (Inner Canon of the Yellow Emperor) [7], the earliest Chinese medical classic that was written between the Warring States Period and the Han Dynasty (from 475 BC to AD 220). This ancient Chinese medical classic presents theories regarding mind, body, causations of disease, reasoning behind affiliating mental emotions, observed symptoms, and physical organs. The mental-like symptoms listed in this classic demonstrate that ancient Chinese physicians were aware of mental disorders, their diagnosis, and management. In this research, we investigate the correlation between mental disorders, symptoms, and organs within the Huangdi Neijing through text mining and data visual analytics. The observations and results are interpreted from the perspective of TCM and Western medicine.
The remainder of the work is organized as follows. Section 2 describes our proposed approach in detail. Section 3 presents the experiment results and discussions, and we also detail the process of ontology construction. Finally, we conclude our work in Section 4.

Data source and collection
The spirit-mind theory in the Huangdi Neijing (HDNJ) is the foundational source of ancient Chinese Medical Psychology. It elaborates the impact of emotions on physical and mental health. The contents of the HDNJ are expressed in archaic Chinese, and the phrasings of mental disorder symptoms lack consistency, which makes it difficult to analyze. In order to successfully achieve the purpose of this work, words having psychological meaning descriptions are identified through manual review and are then classified; in all, 65 mental disorder descriptions are identified. We then further classified them into nine categories according to the seven emotions theory [7], and referred to [8] for the specification of various symptoms. The nine categories are: anger (nu, 怒), happiness (shi, 喜 "), and Blood ("Shie", " cina 2023, 59, x FOR PEER REVIEW 2 of 13 variants and lineages continuously evolving, the number of people infected by the virus is expected to break records at an unprecedented rate. As such, the number of people suffering from anxiety and depressive disorders and other neuropsychiatric problems are also expected to reach record highs. Management of these mental disorders are of great urgency and importance, as the expected economic and societal costs may be greater than preliminary forecasts. Preceding research in the early 20th century on mental disorders had focused on interpreting the relations between emotions and the human body, such as analyzing the manifestation of emotions through observable body reactions [3][4][5][6]. However, the rise of psychiatric research can be traced back to the 19th century, in which Western biomedicine started to emphasize the importance of the brain in the proper or improper functioning of mental activity, and neurology and psychiatry were combined into a single science. This was totally different from TCM, which views the body as an intricately woven system that has to be considered holistically and cannot be compartmentalized. However, the terminology used in TCM involves idioms that only trained professionals can understand, and not the modern scientific language that is typically seen, therefore preventing non-TCM trained professionals from understanding or utilizing the wealth of information in the TCM literature. This study seeks to bridge the gap such that non-TCM professionals may also understand basic TCM teachings in mental disorders. TCM considers mental disorders in three aspects, namely recognition, manifestation, and response. This combination is important as TCM views mental disorders as a result of disharmony between the somatic and/or spiritual entities. Within the framework of TCM, this is understood as an imbalance of Yin and Yang, Vital Energy ("Qi," " 氣 "), and Blood ("Shie," " 血 "). TCM uses pattern (or syndrome) differentiation to diagnose diseases, which considers signs and symptoms of the disease, the environment, and the patient's profile, allowing personalized medical treatment. Taking depression as an example, TCM believes that depression is caused by an imbalance of different organs and Qi. Mental-like disorders have long been mentioned in the TCM literature, such as the Huangdi Neijing (Inner Canon of the Yellow Emperor) [7], the earliest Chinese medical classic that was written between the Warring States Period and the Han Dynasty (from 475 BC to AD 220). This ancient Chinese medical classic presents theories regarding mind, body, causations of disease, reasoning behind affiliating mental emotions, observed symptoms, and physical organs. The mental-like symptoms listed in this classic demonstrate that ancient Chinese physicians were aware of mental disorders, their diagnosis, and management. In this research, we investigate the correlation between mental disorders, symptoms, and organs within the Huangdi Neijing through text mining and data visual analytics. The observations and results are interpreted from the perspective of TCM and Western medicine.
The remainder of the work is organized as follows. Section 2 describes our proposed approach in detail. Section 3 presents the experiment results and discussions, and we also detail the process of ontology construction. Finally, we conclude our work in Section 4.

Data source and collection
The spirit-mind theory in the Huangdi Neijing (HDNJ) is the foundational source of ancient Chinese Medical Psychology. It elaborates the impact of emotions on physical and mental health. The contents of the HDNJ are expressed in archaic Chinese, and the phrasings of mental disorder symptoms lack consistency, which makes it difficult to analyze. In order to successfully achieve the purpose of this work, words having psychological meaning descriptions are identified through manual review and are then classified; in all, 65 mental disorder descriptions are identified. We then further classified them into nine categories according to the seven emotions theory [7], and referred to [8] for the specification of various symptoms. The nine categories are: anger (nu, 怒), happiness (shi, 喜 "). TCM uses pattern (or syndrome) differentiation to diagnose diseases, which considers signs and symptoms of the disease, the environment, and the patient's profile, allowing personalized medical treatment. Taking depression as an example, TCM believes that depression is caused by an imbalance of different organs and Qi. Mental-like disorders have long been mentioned in the TCM literature, such as the Huangdi Neijing (Inner Canon of the Yellow Emperor) [7], the earliest Chinese medical classic that was written between the Warring States Period and the Han Dynasty (from 475 BC to AD 220). This ancient Chinese medical classic presents theories regarding mind, body, causations of disease, reasoning behind affiliating mental emotions, observed symptoms, and physical organs. The mental-like symptoms listed in this classic demonstrate that ancient Chinese physicians were aware of mental disorders, their diagnosis, and management. In this research, we investigate the correlation between mental disorders, symptoms, and organs within the Huangdi Neijing through text mining and data visual analytics. The observations and results are interpreted from the perspective of TCM and Western medicine.
The remainder of the work is organized as follows. Section 2 describes our proposed approach in detail. Section 3 presents the experiment results and discussions, and we also detail the process of ontology construction. Finally, we conclude our work in Section 4.

Data Source and Collection
The spirit-mind theory in the Huangdi Neijing (HDNJ) is the foundational source of ancient Chinese Medical Psychology. It elaborates the impact of emotions on physical and mental health. The contents of the HDNJ are expressed in archaic Chinese, and the phrasings of mental disorder symptoms lack consistency, which makes it difficult to analyze. In order to successfully achieve the purpose of this work, words having psychological meaning descriptions are identified through manual review and are then classified; in all, 65 mental disorder descriptions are identified. We then further classified them into nine categories according to the seven emotions theory [7], and referred to [8] for the specification of various symptoms. The nine categories are: anger (nu, 怒), happiness (shi, 喜), worry & think (yousz, 憂思), sadness (bai, 悲), fear & fright (jingkung, 驚恐), sleep disorder (bumei, 不寐), drowsiness (shrshuei, 嗜睡), forgetfulness (jianwang, 健忘), and hallucination (huanjiue, 幻覺). The classifications were discussed and verified by three TCM practitioners, each with more than 10 years of clinical experience.
The dataset was collected from the HDNJ according to descriptions of mental disorders' symptoms based on TCM teachings. Subsequently, the data and the collected mentions of the visceral systems are classified as different mental disorder symptoms and different viscera accordingly. The data assembled contained multiple groups of passages referring to different categorical mental disorder symptoms with reference to the five depots of liver, heart, spleen, lung, and kidney, and two of the six palaces, stomach and gallbladder. Among the six palaces, these two are considered to be connected to emotions, with the remaining four (small intestine, large intestine, urinary bladder, and triple energizer) to not be found or too infrequently mentioned to be considered. In view of these outcomes, the present study's analysis involved the five depots and two of the palaces, gallbladder and stomach. Next, we analyzed the co-occurrence between descriptions of mental disorders and the nine mental disorder categories mentioned above. After removing duplicate passages, 273 out of 1521 descriptions from the HDNJ were analyzed in this study.

Text Mining for Correlation Analysis
We analyze the text of the HDNJ to construct an explanatory model of mental disorders. For this purpose, we investigate the correlation between mental disorders, symptoms, and organs (depots/palaces) within the book. We first detect the correlation degree between mental disorders and symptoms. HDNJ is initially decomposed into a set of candidate clauses, each of which is likely to mention a relationship between symptoms and mental disorders. To generate candidate clauses, the clinician compiled a list for a symptom lexicon corresponding to the nine mental disorders. A clause is identified as a candidate clause if it contains a word or phrase from the symptom lexicon. Next, based on the generated candidate sentences, we calculate the correlation between the mental disorder and symptoms. To compute this correlation, an advantageous feature selection method, the log likelihood ratio (LLR), is used to differentiate word co-occurrence. With a dataset containing positive instances, LLR implements Equation (1) to estimate the likelihood with the assumption that the occurrence of a symptom in the clauses of mental disorder is not random.
where MI denotes the set of mental disorder clauses in the dataset; N(MI) and N(¬MI) are the numbers of positive and negative mental disorder clauses, respectively; and N(S∧MI) is the number of positive mental illness clauses containing the symptom lexicons SL. The probabilities p(SL), p(SL|MI), and p(SL|¬MI) are estimated using maximum likelihood estimation, and a symptom lexicon with a large LLR value is thought to be closely associated with the mental illness. Lastly, depending on their LLR values, we rank the symptom lexicons and extract the top 10 to consolidate into a list of the mental disorder symptoms. Furthermore, we investigated the correlation between symptoms and organs. Based on our observation, symptoms and organs frequently co-occur in a sentence, and for this reason, we adopted a sentence as a unit for exploring the correlation between symptoms and organs, instead of a clause. It is worth noting that multiple organs are usually mentioned in a sentence, which means we need to identify the specific organ(s) related to a particular symptom expression. The addressed issue is an illustration of coreference resolution, which is an essential element in natural language understanding required in numerous advanced natural language processing tasks. To correctly identify the organ in a sentence, we utilized the closest-first algorithm to link the closest antecedent to the anaphora [9]. We examined the contexts between an organ, o i , and a symptom, s j , in a sentence extracted from a candidate clause. Only if there are no other organ mentions occurring between o i and s j are they considered positive instances (i.e., symptom s j belongs to organ o i ); otherwise, sentences are classified as negative instances. For instance, the sentence "The liver stores the blood, and the blood houses the ethereal soul. Vacuity of livervital energy (Qi) begets fear, while repletion begets anger" ( 肝藏血, 血舍魂, 肝氣虛則恐, 實則怒) is classified as a positive instance for the symptom-organ pair liver-fear ( 肝-恐). However, in the clause "the five essences can be incorporated: happiness is begot when incorporated in the heart; sadness is begot when incorporated in the lung; worry is begot when incorporated in the liver; dread is begot when incorporated in the spleen; fear is begot when incorporated in the kidney; these are known as the five incorporations, which occur due to the vacuity of the depots" ( 五精所並: 精氣並於心則喜, 並於肺則悲, 並於肝則憂, 並於脾則畏, 並於腎則恐, 是謂五並, 虛而相併者也), there are two depots (spleen " 脾" and kidney " 腎") that are mentioned between the depot of liver ( 肝) and the symptom of fear ( 恐). We therefore recognize this sentence to be a negative instance. By examining all symptom-organ pairs in the HDNJ, we obtain a sentence set S = {s 1 , . . . , s m }, which contains 388 sentences that express relationships between symptoms and depots and palaces. After this step, the correlation degree is estimated using the LLR feature selection approach. We can further visualize the relationship between mental disorders, symptoms, and the depots/palaces to better comprehend mental disorders from the perspective of TCM.

Results
We analyze relationships between mental disorder descriptions and illness categories, as well as between illness and the depots and palaces according to TCM. We utilize normalized LLR values for the frequency of co-occurrences between the nine illness categories and the seven organs, which included the five depots (heart, lung, liver, spleen, and kidney) and two palaces (stomach and gallbladder). Table 1 shows the correlation between the mental disorders and their top 10 descriptions. As shown in Table 2, the diverse descriptions of mental symptoms can be classified into emotions. Using the Heart as an example, the description "upset" corresponds to worry & think; "heart feel sad", "keeps feeling sad," and "not happy at first," "easy to be sad" can correspond to sadness; "laughing and can't stop laughing" can correspond to happiness; while "fear and be alert to be caught" corresponds to fear & fright. We further categorize descriptions into nine mental disorders, and then calculate the distribution for organs. Appendix A shows the complete list of symptoms and their corresponding mental disorders. To better understand the mental disorders mentioned in HDNJ, we constructed an association network to visualize the relationship using Tables 1 and 2. The node and edge size are determined by the degree of the node and LLR value between two nodes, respectively. Figure 1 presents this network, in which green nodes represent the mental disorders, orange nodes represent the depots/palaces, and purple nodes represent the symptoms. As shown in Figure 1, fear & fright had a high co-occurrence with fright and fear, and these symptoms are highly associated with the liver, kidney, heart, and gallbladder. In addition, we observe that anger is highly associated with the liver and the lung. Worry & think has a high co-occurrence with frustration and worry, and these symptoms are highly associated with the spleen, liver, and lung. Therefore, worry & think is highly associated with the spleen, liver, and lung. This implies that anxiety is related to liver, spleen, heart, and gallbladder, and thus appears to be influenced by multiple organs. Furthermore, depression is often fostered by emotional fluctuations, which are typically related to the liver. Liver vital energy ("ganqi", " 肝氣") is responsible for the depressive episodes. Mania ("kuang", " 狂") is found to be associated to the liver and lung. The reversed flow of vital energy is the general cause of mania.

Discussion
Figure 2 displays the distribution for organs (depots and palaces). The percentages of various illnesses in relation to the organs are seen. The heart is noticeably prevalent. Anger is highly related to the lung and liver, worry & think is related to the spleen, sad- Data visualization of the mental disorder network of Huangdi Neijing suggests that mental disorders are related to multiple organs. Different illnesses have common accompanying symptoms, which can also be seen in the field of psychiatry, where anxiety and depression often have common clinical symptoms. Figure 2 displays the distribution for organs (depots and palaces). The percentages of various illnesses in relation to the organs are seen. The heart is noticeably prevalent. Anger is highly related to the lung and liver, worry & think is related to the spleen, sadness and forgetfulness are related to the heart, happiness is related to the lung and heart, fear & fright is related to the gallbladder, sleep disorder and drowsiness are related to the stomach, and hallucination may be associated with the heart. Two main factors of mental disorders are anger and worry & think. Figure 3 shows the ratio of the top 10 descriptions of the seven organs. The heart, liver, lung, and kidney have a high ratio of other symptoms not included in the top 10 descriptions. This explains why, in Table 2, the heart and kidney did not have any descriptions about anger in the top 10, while Figure 3 still shows the distribution of anger in these two depots (20%; 26%).   In biomedicine, the heart is an organ that pumps blood throughout the body, providing oxygen and nutrients for bodily functions. Meanwhile, in TCM, the heart is responsible for the regulation of mental activities [10][11][12]. TCM views the heart as being in control of blood circulation, regulating blood vessels, and storing the spirit ("shen", "神"), which implies consciousness and other cognitive functions [10][11][12][13]. Internal organs rely on the heart and blood for sufficient nourishment. The heart is comprised of heart qi, heart yang, heart blood, and heart yin. A dysfunction in the heart may result in poor circulation and, thus, generates symptoms such as coldness of limbs, and a white or purple complexion [10]. Moreover, previous studies [14,15] have evidenced that heart problems may double the risk of cognitive impairment without memory impairment [16,17]. Previous studies have also suggested the possibility of comorbidity between cardiovascular diseases and mental disorders. In [18,19], the authors demonstrated that chronic mental disorders In biomedicine, the heart is an organ that pumps blood throughout the body, providing oxygen and nutrients for bodily functions. Meanwhile, in TCM, the heart is responsible for the regulation of mental activities [10][11][12]. TCM views the heart as being in control of blood circulation, regulating blood vessels, and storing the spirit ("shen", " 神"), which implies consciousness and other cognitive functions [10][11][12][13]. Internal organs rely on the heart and blood for sufficient nourishment. The heart is comprised of heart qi, heart yang, heart blood, and heart yin. A dysfunction in the heart may result in poor circulation and, thus, generates symptoms such as coldness of limbs, and a white or purple complexion [10]. Moreover, previous studies [14,15] have evidenced that heart problems may double the risk of cognitive impairment without memory impairment [16,17]. Previous studies have also suggested the possibility of comorbidity between cardiovascular diseases and mental disorders. In [18,19], the authors demonstrated that chronic mental disorders reduce life span by speeding up exposure to etiological factors that contribute to cardiovascular diseases. The authors of [20][21][22] analyzed the interactions between cardiovascular medication and mental disorders. This study once again reiterates the heart to be the dominant organ related to mental disorders. Western medicine believe that the human brain governs psychological activities; however, TCM holds the view that the heart governs the blood circulation and, therefore, is responsible for the balance of cognitive activities. The influence of the heart on the cognitive activities may be due to the requirement of good blood circulation to maintain the functioning of organs.

Discussion
It is interesting to note that the liver had a variety of descriptions in fear & fright and anger. In TCM, the liver is in charge of determining the dispersion in all depots and palaces, including the smooth flow of vital energy, blood, and fluid in all directions within the body. Therefore, diseases arise when emotions, especially anger and fear & fright, become too intense or prolonged. These overly intense or prolonged emotions are risk factors of the dysfunction of vital energy, interrupting the balance of yin-yang in the corresponding organs, and eventually bringing about various diseases. In Western medicine, the liver plays an important role in digestion, metabolism, detoxification, and the production of bile. Some studies have implied that, for depression, the liver is presumed to be related to the neuroendocrine system [23,24]. In addition, in Figure 2 the gallbladder seems to have high specificity with fear & fright and anger. The liver and gallbladder seem to be highly associated with these two mental disorders, and future studies may venture into this area.
According to the HDNJ, the spleen and stomach are highly associated with sleep disorder. Recent studies have shown that many gastrointestinal diseases can lead to poor sleep [25][26][27], and cytokines, such as interleukin-1 and interleukin-6 [25,28,29], have been shown to be associated with sleep dysfunction. Interaction between the brain and the gastrointestinal system is of importance in the regulation of the digestive tract and the function of the gut immune system. Animal studies have also found that seemingly insignificant microbiome and unrelated organs, such as the intestines and the brain, can be connected through the gut-brain axis [29,30] and affect the mood of mice, causing anxiety-like or depression-like symptoms [31]. Sleep disorder is closely related to mental health. Many patients with mental disorders also have sleep disorder. In TCM, sleep disorder is closely related to the function of the spleen and stomach. The authors of [7] indicate that "stomach disharmony leading to restless sleep" is an important reason behind sleep disorder. This new thinking may be of benefit to gastroenterologists as addressing the relationship between sleep disorder and gastrointestinal diseases may lead to better patient care.
In Ots's research [32], out of 243 cases at a TCM clinic, 106 patients were identified as suffering from some type of mental disorder, with approximately two-thirds of all mental health issues directly or indirectly influenced by the liver or the heart, indicating that emotional changes and organ malfunctions may influence each other. Furthermore, in TCM, when a patient is suffering from psychological issues (e.g., depression), they usually manifest as physiological dysfunction of the related organ (e.g., chest tightness with obstructed breathing due to excessive worrying, which therefore hurts the lung). TCM identifies the imbalance of the internal organs by observing changes in the external manifestation. As shown in Figure 2, mental disorders are related to multiple organs rather than a single one (e.g., anger can impact the lung (39%), the liver (38%), the gallbladder (33%), the heart (26%), and the kidney (20%). This encourages clinicians to consider the mutual influence of various organs during management of mental disorders.

Conclusions
This study provides insight into the TCM's mental disorder ontology, which presents a different perspective to Western medicine, allowing non-trained professionals to understand the terminology used in TCM. Psychologizing and somatization are often culturebound narratives. Further study into the psychopathology models of TCM may help broaden the definition of normalcy in Western biomedicine, which can in turn facilitate greater communication and collaboration. This can ultimately improve public health and provide better care to patients. A comprehensive understanding of the integration of the features and diagnostic approaches of Western and TCM remains incomplete. This may improve the clinical management of different mental disorder factors through application of both TCM and Western concepts. The integration and implementation of TCM and Western approaches for specific types of mental disorders can be a fruitful direction.
The present study has various restrictions. Firstly, the descriptions of mental disorders are from a single literature source with finite bibliographical data. It is necessary to broaden the dataset using other classical texts. Secondly, the current data only allow for learning of the relationship between mental disorder descriptions and the depots/palaces. Research needs to be carried out to focus on the associations between cognitive activities and body variations, together with physical symptoms and vital energy movements. In conclusion, extensive studies are needed to promote the integration of TCM and other contemporary concepts to improve the management and treatment of psychosocial disorders and mental disorders in the future.