A Sequential Emotion Approach for Diagnosing Mental Disorder on Social Media

Abstract: Mental disorders affect numerous individuals; however, mental health care remains in a passive state in which only a minority of individuals actively seek professional help. With the rapid development of social networks, the individuals accustomed to expressing their raw feelings on social media include patients who are suffering great pain from mental disorders. To distinguish individuals who merely feel sad from those who have mental disorders, the symptoms of mental disorder are taken into consideration. These symptoms constantly arise as a regular pattern, such as a shifting of emotions or the repetition of one representative emotion during a certain time. We propose a Mental Disorder Identification Model (MDI-Model) to identify the four most commonly occurring mental disorders in the world: anxiety disorder, bipolar disorder, depressive disorder, and obsessive-compulsive disorder (OCD). The MDI-Model compares the sequential emotion patterns of users with those of mental disorders to detect users who are at high risk. Tweets of users diagnosed with mental disorders were analyzed to evaluate the accuracy of the MDI-Model; furthermore, the tweets of users from six different occupations were analyzed to verify the precision and predict the tendency of mental disorder among the different occupations. Results show that the MDI-Model can efficiently diagnose users with high precision across different mental statuses: the severe, moderate, and mild stages of mental disorder, the tendency of mental disorder, and the mentally healthy status.


Introduction
With the rapid development of modern technology, social networks have rapidly merged into our social life. Absorbing information subjectively and one-sidedly from social media can magnify negative emotions, leading to a so-called "fake-depression" that continues to develop and damages mental health. This phenomenon is prevalent among both the millennial generation and children [1]. Due to a lack of proper mental health care and confusion between general feelings and mental issues, a large number of users continue to suffer from mental disorders. Statistics released by the World Health Organization (WHO) [2] indicate that the number of individuals diagnosed with mental disorders is progressively increasing. In 2015, the number of people diagnosed with depressive disorders exceeded 300 million, which is 4.4% of the world's population. Mental disability accounts for a larger part of global disability than physical disability: depression is ranked first (7.4%), while anxiety disorder is ranked sixth (3.4%). Patients with mental disorders have a mortality rate that is 2.22 times higher than that of healthy individuals, while inpatients have significantly higher mortality rates compared with outpatients due to proper treatment intervention in advance [3]. This is a worldwide phenomenon, especially in low-income countries, where people burdened with heavy stress tend to have a higher rate of mental disorder [4], while the immigration rate is relatively higher in low-income countries due to incomplete public services. Additionally, research [5] indicates that the prevalence of lifetime mental disorder decreases from generation to generation in immigrant populations.
The causes of mental disorder can be diverse: poverty, discrimination, unemployment, death of loved ones, use of drugs and alcohol, and so on [2]. Mental disorder also leads to a higher suicide rate compared to other non-physical causes. According to western psychopathological studies [6,7], a large proportion of people who attempt suicide are diagnosed with mental disorders. However, attempted suicide is not merely related to a single particular disorder, but to an overall tendency in psychopathology. Meanwhile, a proper psychopathological intervention can efficiently reduce the attempted suicide rate [8]. Measures must be taken to prevent this kind of situation from becoming worse. It is difficult to make a self-diagnosis merely based on the possible triggers that cause mental disorder; however, the symptoms of mental disorder that are revealed as early-stage signs can be captured in advance to make a diagnosis. The symptoms of disorder, especially of mental disorder, always arise as a regular pattern, such as the shifting of emotions or the repetition of one representative emotion during a certain time. Stress is strongly associated with anxiety and depressive symptoms; in contrast, positive emotions can enhance resilience and protect the individual from developing a clinical level of anxiety and depression [9]. The term "emotional problems" constantly appears in clinical science, where it refers to diverse clinical symptoms and syndromes. Emotion dysregulation, which refers to problematic and uncontrolled emotion generation, is the main cause of emotional problems. Due to emotion dysregulation, individuals with emotional problems are unable to cope with negative emotions, which accumulate and result in mental disorder. Identifying the dysregulated emotions and coping with a proper regulation process can be the key in psychopathology [10].
Sentiment analysis is commonly used in natural language processing to analyze the sentiment and emotion components of text. Machine learning and emotion lexicons are the two most general approaches to sentiment analysis [11].
In this paper, we present the Mental Disorder Identification Model (MDI-Model), which can identify the intensity of four common mental disorders (anxiety disorder, bipolar disorder, depressive disorder, and OCD) from the sequential emotion patterns of individuals on social media. We refined the eight basic emotions of the Plutchik Wheel into 21 generalized representative emotions by using cluster analysis with authoritative mental disorder criteria and blogs from the admissive psychology community, in order to enhance the precision of further identification. Next, the intensity sequential emotion patterns were obtained by using association rules to precisely identify mental disorders on social media. The pattern approach used to identify mental disorder is shown in Figure 1.

Related Work
Mental disorder has a significant impact on a wide range of researchers, including psychiatrists, sociologists, and medical professionals; as a result, rigorous diagnostic criteria for mental disease have been published. Spitzer et al. [12] proposed the definition and criteria of mental disorder. Mental disorder is diagnosed when four criteria are met, which include three key concepts: consequences; organismic dysfunction; and implicit call. The four criteria define which scenario should be diagnosed as a mental disorder and the boundaries of mental disorder identification. Clark et al. [13] introduced three current approaches to classifying mental disorder: the International Classification of Diseases, Tenth Revision (ICD-10), which contains the 22 main categories of disease; the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), which contains the 23 main categories of mental disorders; and the Research Domain Criteria (RDoC), whose aim is to rebuild a new taxonomy of mental disorders with bioinformatics.
With the rapid growth of social networks, how to use mental disorder criteria to detect mental disorder symptoms, especially on social media, has become the primary issue. Fried et al. [14] proposed a network approach to mental disorders that takes the symptoms of mental disorder as features of a network model, which can capture the intensity and variation in mentality so that professionals and patients can identify them instantly. Shuai et al. [15] proposed a machine learning framework with multi-source learning based on the Tensor model, exploiting features from social interactions and personal profiles such as online activity, offline duration, and social searching. As a result, risky users can be automatically identified. Dham et al. [16] proposed depression identification from natural language text, facial expression, and audio, based on support vector machines (SVM) and neural network classification algorithms. Experimental results indicate that people with depression show a relative change in speech pattern, facial expressions, and head movement.
Emotions among mental disorder symptoms are rather regular. Jazaieri et al. [17] suggested that negative emotions can be detected constantly due to the problematic emotional reactivity and dysregulation that exist in social anxiety disorder. Pe et al. [18] proposed an emotion network for detecting the emotion density in major depressive disorder. Results show that people with major depressive disorders are more resistant to changing their negative emotions.
A general method for solving emotion-oriented problems in natural language processing is sentiment analysis. A survey conducted by Hussein [19] concluded that sentiment recognition is domain-dependent, which causes uneven accuracy across different fields. Rout et al. [20] proposed a model using an unsupervised approach and a lexicon-based approach, which were applied with multinomial naive Bayes and support vector machines to analyze the sentiment and emotion of social media text in certain domains such as movie reviews and consumer reviews. The model distinguished four categories: positive, neutral, negative, and irrelevant. Results showed that the accuracy of the unsupervised approach and the lexicon-based approach could reach 80.68% and 75.20%, respectively. Subramaniyaswamy et al. [21] used a lexicon approach to sentiment analysis to predict possible future violent events from tweets. However, they only analyzed three sentiments: positive, negative, and neutral. Altenburg et al. [22] applied sentiment analysis to tweets about health insurance, calculated the association between words, then used a sentiment lexicon to perform the sentiment analysis. Results showed that the more sensitive an individual is, the more health care is considered. Jabreel et al. [23] proposed a multi-label sentiment analysis on social media with deep learning. They restructured the multi-label problem as a single binary problem, then applied deep learning. Results showed that the system could distinguish eight different emotions with a high score on the emotion classification problem. However, these methods or models only analyzed the non-emotion factors or a single emotion of the mental disorder symptom, while research on sequence-oriented emotion shifting in mental disorders has yet to be conducted.
Previous works have verified that the mental disorder symptoms tend to appear as a group of emotions or constantly remain as a particular emotion, which means that analysis on the sentiment level should be further conducted on the emotion level. Therefore, emotions with subtle shifting as a sequential pattern can better interpret the symptoms among mental disorder, which can be the breakthrough to identify mental disorders.

Motivation
The general diagnosis of mental disorder commonly uses questionnaires to make diagnosis of mental status; however, these questionnaires can be subjective and one-sided, in which people might not be completely honest due to trust issues, or act irrationally through a momentary weakness of emotion. We think that long-time observations of an individual's mental state can be helpful to better understand their emotions. We want to comprehensively understand how individuals express their real and daily feelings. We chose social media to enlarge our corpus, which is more related to people's daily life.
With the rapid growth of the physical world, the importance of mental health care has been underestimated. Social networks have become a way of forming attachments to others; people regularly use social media to express their feelings, yet only a few realize that they require professional mental health care. Social media has provided a platform for users to expose their feelings. Sentiment analysis is commonly adopted in emotion-oriented problems, but it merely analyzes three sentiment polarities: positive, neutral, and negative. However, patients with mental disorders generally suffer from the dysregulation of emotions, which refers to unexpected emotion shifts or the intense occurrence of a representative emotion. Therefore, the precision of sentiment analysis is insufficient for identifying mental disorders, as the complex symptoms are constantly revealed as diverse forms of emotions. The sequential emotion pattern is a straightforward revelation of mental disorder symptoms. Our specialty is natural language processing, and we aim to apply this model to artificially intelligent robotics and homecare to better interact with the user's emotions. The application of sentiment analysis and sequential emotion is shown in Figure 2. The aim of the Mental Disorder Identification Model (MDI-Model) we propose is three-fold: prevention of mental issues when the tendency occurs; intervention for mental disorders at an early stage; and treatment for mental disorders when diagnosed. To achieve this, we propose a sequential emotion analysis algorithm to interpret authoritative criteria into sequential emotion patterns. The flowchart of the MDI-Model with sequential emotion analysis is shown in Figure 3.
To identify whether individuals had a mental disorder based on sequential emotion pattern of users via social media, we first extended eight basic emotions in the Plutchik Wheel by fitting a precise description about mental disorders from professionals with generally common expressed emotions. Second, we created an emotion lexicon with synonym similarity calculation to expand emotion words in each category of representative emotion. Third, based on the frequency of each sequential emotion pattern, the pattern intensity of the mental disorder was calculated with weights to highlight the importance of authoritative criteria. Therefore, individuals with a high intensity and matching rate of mental disorder can then be detected in the MDI-Model.

Sequential Emotion Analysis
In this section, the sequential emotion analysis (SEA) algorithm is proposed to analyze the generalized representative emotions and the identification rules of mental disorder symptoms. By analyzing the emotion patterns of mental disorders, both as described by professionals and as experienced by individuals who suffer from them, we redefined the Plutchik Wheel into 21 generalized representative emotions through unsupervised cluster analysis. Furthermore, the intensity of each identification rule was quantified. The sixteen parameters of the SEA algorithm are listed in Table 1.

Data Preprocessing
In this section, data preprocessing was conducted to remove the irrelevant data from the corpus in order to minimize the detachment between the commonly expressed emotions from Twitter and the emotion related symptoms of mental disorder from the authoritative criteria.
First, data were acquired from five different sources: 125,559 blogs related to mental issues from the admissive psychology community; 1,048,575 tweets from Sentiment140 [24], an open source Twitter dataset that is classified into two-dimensional sentiments (happiness, sadness); 80 authoritative criteria of four commonly occurring mental disorders worldwide (anxiety disorder, bipolar disorder, depressive disorder, OCD); 167,072 tweets from 8649 users who were diagnosed with one or more of the mentioned mental disorders; and 164,625 tweets from users in six different types of occupations (waiters, reporters, engineers, travelers, musicians, and comedians) who are burdened with different levels of stress.
Second, data cleaning was undertaken. To remove the irrelevant words that would affect the performance of disorder identification, the operation of noise reduction was conducted both on psychology blogs corpus and Sentiment140 corpus by using emotion lexicon of National Research Council (NRC) [25], an English lexicon of emotion words based on the Plutchik Wheel, to filter out the irrelevant non-emotional words among the corpus. Some emotional words such as "sick" can be described as physically or mentally ill; on the other hand, it is widely used as slang to express feelings of admiration on Twitter. To avoid misjudgment on ambiguous words, these emotion words were considered for removal, along with prepositions, articles, and pronouns that were excluded in the NRC emotion lexicon.
Third, data transformation was conducted. An advanced Term Frequency-Inverse Document Frequency (TF-IDF) was proposed to vectorize the emotion words among the corpus. TF-IDF is a vectorization method that calculates the importance of single words in the corpus by assigning a high value to it when it occurs frequently in one document, but also infrequently among all documents. It can filter out the most common words, but leave relatively important ones. It is suitable in our scenario where the adverbial might not commonly appear in the syntax structure. However, if the word appears in the title or is pointed out in the first and last sentence of each paragraph, then it should be considered as much more important, even if the frequency is less than that of the others. Therefore, we proposed an advanced TF-IDF algorithm that assigned another weight to the word that occurred constantly in the title of the blogs and also for those that occur relatively frequently between the first or last sentence of each paragraph.
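The weighting idea described above can be illustrated with a minimal sketch. The source does not give the exact weights or the formula of the advanced TF-IDF, so the multiplicative boosts `title_boost` and `edge_boost`, the smoothed IDF, and the input layout are all assumptions made for illustration only.

```python
import math
from collections import Counter


def weighted_tfidf(docs, title_boost=2.0, edge_boost=1.5):
    """Position-weighted TF-IDF (illustrative; boost values are assumptions).

    docs: list of dicts with
      "title": list of tokens
      "paragraphs": list of paragraphs, each a list of sentences,
                    each sentence a list of tokens
    Returns one {word: score} dict per document.
    """
    n_docs = len(docs)
    doc_tokens = [
        [tok for para in doc["paragraphs"] for sent in para for tok in sent]
        for doc in docs
    ]
    # Document frequency: in how many documents each word appears.
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))

    vectors = []
    for doc, tokens in zip(docs, doc_tokens):
        tf = Counter(tokens)
        title_words = set(doc["title"])
        # Words in the first or last sentence of any paragraph.
        edge_words = set()
        for para in doc["paragraphs"]:
            if para:
                edge_words.update(para[0])
                edge_words.update(para[-1])
        vec = {}
        for word, count in tf.items():
            idf = math.log(n_docs / df[word]) + 1.0  # smoothed IDF (assumption)
            weight = 1.0
            if word in title_words:
                weight *= title_boost
            if word in edge_words:
                weight *= edge_boost
            vec[word] = (count / len(tokens)) * idf * weight
        vectors.append(vec)
    return vectors


docs = [
    {"title": ["fear"],
     "paragraphs": [[["fear", "pain"], ["loss", "loss"], ["hope"]]]},
    {"title": ["joy"],
     "paragraphs": [[["joy", "hope"], ["fun"]]]},
]
vectors = weighted_tfidf(docs)
plain = weighted_tfidf(docs, title_boost=1.0, edge_boost=1.0)
```

In this toy input, "loss" is more frequent than "fear" in the first document, so plain TF-IDF ranks it higher; with the position boosts, "fear" (in the title and in the first sentence) overtakes it.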

Generalized Multidimensional Emotion Lexicon
In this section, a generalized multidimensional emotion lexicon was created from the groups obtained by the cluster analysis of vectorized emotion words, and 21 representative emotions were generated from these groups. Each representative emotion was then expanded by means of a synonym similarity calculation. The 21 generalized representative emotions are shown in Table 2. First, the topic model latent Dirichlet allocation (LDA), a generative statistical model also known as a three-layer hierarchical Bayesian model, was selected to generalize diverse topics that contained lists of emotion words. The clustering of words depends on how often they occur together across a large quantity of documents. Due to the quantity of each dataset, the number of topics for the psychology blogs was set to 15, while the number of topics for Sentiment140 was set to 20. For example, in the psychology blogs, "happiness", "creative", "happy", "fun", "enjoy", etc. are listed as topic one; "pain", "fear", "shame", "loss", "depression", etc. are listed as topic two, and so on. In Sentiment140, "good", "love", "hope", "happy", "fun", etc. are listed as topic one; "die", "sadly", "miserable", "stress", "fear", etc. are listed as topic two, and so on. By fitting these two results, we obtained 21 topics of emotion words clustered together. Then, the basic emotions in the Plutchik Wheel were redefined into 21 generalized representative emotions: Joy, Intimacy, Respect, Confidence, Concentration, Anxiety, Insecurity, Fear, Surprise, Pain, Despair, Tired, Shame, Disgust, Anger, Manic, Passion, Gratitude, Hope, and Relaxation. Under each topic, the 20 most frequently occurring words were stored as the description of that topic.
Second, to enlarge the emotion lexicon, synonyms of the emotion words were taken into account. We adopted WordNet, a lexical database in which each English word is divided into separate meanings, each with a list of synonyms in the form of words or phrases shared under the same category, to obtain the potential emotion words or phrases for building the emotion lexicon. Taking each of the 20 emotion words under each representative emotion as the baseline, all synonyms were obtained iteratively. Then, the duplicated words and phrases were removed. As a result, approximately 16,000 potential words and phrases were obtained.
Third, to build up the emotion lexicon, the descriptions of each representative emotion were used as the criteria. We calculated each potential word or phrase with the synonym similarity to each description emotion word under each particularly representative category, then took the average similarity under each category as the intensity support of the potential word or phrase to this category. We removed the potential ones with an intensity support of less than 0.7, then appended the potential word or phrase to the max intensity support category. Meanwhile, we set the intensity support of each description emotion word under each category as 1. As a result, a generalized multidimensional emotion lexicon of words and phrases with an amount of 12,009 was acquired.
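The assignment rule in this third step can be sketched as follows. The synonym similarity measure itself is not specified in the text, so `similarity` is a hypothetical callable supplied by the caller, and the toy scores below are invented purely to exercise the code.

```python
def build_lexicon(descriptions, candidates, similarity, threshold=0.7):
    """Assign each candidate to the category with the highest average similarity.

    descriptions: {category: [description emotion words]}
    candidates:   iterable of potential words or phrases
    similarity:   callable (candidate, description_word) -> float in [0, 1];
                  a placeholder for the synonym similarity used in the paper
    Returns {word: (category, intensity_support)}.
    """
    lexicon = {}
    # Description words belong to their own category with intensity support 1.
    for category, words in descriptions.items():
        for word in words:
            lexicon[word] = (category, 1.0)

    for cand in candidates:
        if cand in lexicon:
            continue
        best_category, best_support = None, 0.0
        for category, words in descriptions.items():
            # Average similarity to the description words of this category.
            support = sum(similarity(cand, w) for w in words) / len(words)
            if support > best_support:
                best_category, best_support = category, support
        # Candidates whose best support is below the threshold are removed.
        if best_category is not None and best_support >= threshold:
            lexicon[cand] = (best_category, best_support)
    return lexicon


# Toy similarity table (invented values, for illustration only).
toy_scores = {
    ("glad", "happy"): 0.9, ("glad", "fun"): 0.8,
    ("glad", "pain"): 0.1, ("glad", "loss"): 0.1,
    ("odd", "happy"): 0.3,
}


def toy_similarity(a, b):
    return toy_scores.get((a, b), 0.0)


descriptions = {"Joy": ["happy", "fun"], "Pain": ["pain", "loss"]}
lexicon = build_lexicon(descriptions, ["glad", "odd"], toy_similarity)
```

Here "glad" averages about 0.85 against the "Joy" descriptions and is kept, while "odd" never reaches 0.7 and is dropped.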

Sequential Emotion Pattern Intensity Quantification
In this section, the sequential emotion pattern intensity quantification was conducted. First, we used an association rule-based machine learning method to discover the sequential emotion pattern among the authoritative criteria of mental disorder and the sequential emotion pattern among mental health blogs from the psychology community. Then, we quantified the intensity of each sequential emotion pattern to the labeled mental disorder.
Step 1: Append the mental disorder label to a total of 1300 blogs about each type of mental disorder from psychology blogs, then separate those documents according to their labels as four disorder corpuses, the same as the 80 criteria of mental disorders. Then, segregate the article level disorder corpuses into a sentence level to better adapt with the data type in Twitter, where tweets are limited within 140 characters.
Step 2: The frequent-pattern growth (FP-growth) algorithm, an association rule learning method, is selected on the mining pattern within each disorder corpus. FP-growth discovers frequent sequential emotion pattern sets with support, where the occurrence of each pattern among the corpus is greater than the minimal support threshold. First, it lists the frequent one emotion pattern, then uses it to generate the frequent two emotions pattern, which continues until no other sequential emotion pattern meets the standard. The support of the frequent one pattern is regarded as the intensity support psi to the representative emotion. Add the intensity support of each occurring emotion in the sequential emotion pattern as the intensity support to it. The disorder label of each pattern as li is inherited from the disorder corpus label.
Step 3: Highlight the significance of the criteria sequential emotion pattern with the weight 1.5 to psi, then integrate discovered patterns from the same labeled corpuses to obtain a sequential emotion pattern intensity set.
Step 4: Find the potential sequential emotion patterns from the rest of the psychology blog corpus, which reveals the common emotion shift tendency. Frequent potential patterns are also discovered by analyzing the probability of sequential emotion pattern shifting based on the confidence, which means that when the antecedent frequent potential pattern as apai appears, the probability that the consequent as apci will also occur. An association rule will be discovered when the confidence is greater than the minimal confidence threshold.
Step 5: Integrate the association rules into the sequential emotion pattern intensity set with the confidence as the probability of sequential emotion pattern shifting. Select the apai as the potential pattern ppi when the apai does not exist in the sequential emotion pattern intensity set and the apci appears in the pattern intensity set. The intensity support ppsi of ppi is equal to the confidence of the associated rule, aci, multiplied by the intensity support of the consequent, psapci. The disorder label of the potential pattern is inherited from the consequent as plapci. The equation is as follows: $pps_i = ac_i \times ps_{apc_i}$. When potential patterns with the same disorder label occur, select the maximum ppsi for ppi. Append the potential patterns to the sequential emotion pattern intensity set. The process of the sequential emotion pattern intensity quantification is shown in Figure 4.
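The arithmetic of Steps 2, 3, and 5 can be sketched in a few lines. This is not an FP-growth implementation: the frequent patterns and association rules are taken as given, and the data layout (tuples of emotion names, a dictionary keyed by pattern and label) is an assumption chosen for illustration.

```python
from collections import Counter


def emotion_support(sentences):
    """Support of each single emotion: fraction of sentences containing it."""
    counts = Counter()
    for emotions in sentences:
        counts.update(set(emotions))
    return {emotion: c / len(sentences) for emotion, c in counts.items()}


def pattern_intensity(pattern, support, from_criteria=False, criteria_weight=1.5):
    """Step 2: sum the supports of the emotions in the pattern.
    Step 3: multiply by 1.5 if the pattern comes from the criteria."""
    total = sum(support.get(emotion, 0.0) for emotion in pattern)
    return total * criteria_weight if from_criteria else total


def add_potential_patterns(intensity_set, rules):
    """Step 5: derive potential patterns from association rules.

    intensity_set: {(pattern, label): intensity}
    rules: list of (antecedent, consequent, confidence)
    """
    known_patterns = {pattern for pattern, _ in intensity_set}
    merged = dict(intensity_set)
    for antecedent, consequent, confidence in rules:
        if antecedent in known_patterns:
            continue  # only antecedents absent from the set become potential
        for (pattern, label), ps_consequent in intensity_set.items():
            if pattern != consequent:
                continue
            pps = confidence * ps_consequent  # pps_i = ac_i * ps_apc_i
            key = (antecedent, label)
            # Same pattern and label: keep the maximum intensity support.
            merged[key] = max(merged.get(key, 0.0), pps)
    return merged


sentences = [["Pain", "Despair"], ["Pain"], ["Despair", "Tired"], ["Pain", "Despair"]]
support = emotion_support(sentences)
base = pattern_intensity(("Pain", "Despair"), support)
weighted = pattern_intensity(("Pain", "Despair"), support, from_criteria=True)

intensity_set = {(("Pain", "Despair"), "depressive"): base}
rules = [
    (("Tired",), ("Pain", "Despair"), 0.6),
    (("Tired",), ("Pain", "Despair"), 0.4),
]
merged = add_potential_patterns(intensity_set, rules)
```

With these toy sentences, "Pain" and "Despair" each have support 0.75, so the pattern intensity is 1.5 (2.25 when it comes from the criteria), and the potential pattern ("Tired",) inherits the "depressive" label with intensity of about 0.9 from the stronger of the two rules.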

Sequential Emotion Analysis Algorithm
The sequential emotion analysis (SEA) algorithm can efficiently detect the emotion words in the corpus based on the generalized multidimensional emotion lexicon. It then generates the intensity sequential emotion patterns and transforms each potential sequential emotion pattern to the next emotion shift, with the probability as the intensity of the past pattern. The pseudo code of SEA is in Algorithm 1.

Mental Disorder Identification Model
In this section, we propose the MDI-Model, based on the SEA algorithm, to identify users with mental disorders and predict the tendency of mental disorder. The MDI-Model contains the sequential emotion detection (SED) algorithm and the bidimensional hash search (BHS) algorithm. We ran several training processes with samples chosen from Twitter until the parameter thresholds met the criteria and provided a proper and validated result. The parameters defined within the MDI-Model are listed in Table 3; they include the intensity support of the jth pattern in the dth disorder; ms(d, j), the maximum intensity support of the dth disorder, attained at the jth pattern; and g(d, j), the intensity of the jth pattern in the dth disorder when it is greater than RoId.

Intensity Calculation
In this section, the parameters of the MDI-Model indicate whether the user is identified as currently suffering from a mental disorder or has the tendency of having a mental issue, and if so, which stage the user is at. The criteria are designed rigorously with vertical calculation, which calculates the intensity in one particular disorder and horizontal comparison, indicating whether the user is suffering from multiple mental disorders or which particular disorder the user is likely to be at risk. The definitions of the parameters are listed as follows:

Mental Status of Identification
In this section, the mental statuses of identification are divided into seven statuses, based on the intensity. The seven mental health statuses are shown in Table 4.

Table 4. The seven mental health statuses.
Statuses: Definition
Status1: In severe stage of mental disorder
Status2: In moderate stage of mental disorder
Status3: In mild stage of mental disorder
Status4: In severe tendency of mental disorder
Status5: In moderate tendency of mental disorder
Status6: In mild tendency of mental disorder
Status7: Mentally healthy

The criteria of the seven mental statuses are defined as follows: Status1: Iff
For example, if the MDI-Model detected three sequential emotion patterns from the user, namely "Sadness", "Despair, Pain, Hope, Concentration, Sadness", and "Anxiety, Intimacy, Trust, Pain, Confidence, Tired", then the MDI-Model will identify whether each single pattern is matched in each mental disorder, and the diagnosis will be made based on vertical calculation and horizontal comparison identification.
The identification was based on four parameters: the maximum intensity support of the mental disorder as ms(d, j); the ratio of the matched patterns as RoMd; the average intensity above the relevance of intensity as AoId; and the significance of disorder as SoDd. The thresholds of the support intensity of mental disorder, the ratio of the matched patterns, the relevance of intensity, the average of the intensity above the relevance of intensity, and the significance of disorder in the MDI-Model are described as follows: ms (thr), RoM(thr), RoI(thr), AoI(thr), SoD(thr).
The identification process of the MDI-Model is based on vertical calculation and horizontal comparison identification, as shown in Figure 5.
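As a rough illustration of the vertical calculation for one disorder, the sketch below computes three of the named quantities from a user's matched patterns. The formal definitions and thresholds are not reproduced in this text, so every formula here is an assumption inferred from the parameter names only: RoM is taken as matched patterns over detected patterns, AoI as the mean of the intensities above the relevance threshold RoI, and ms as the maximum matched intensity. SoD is omitted because its definition cannot be inferred.

```python
def disorder_parameters(matched_intensities, n_detected, roi_threshold):
    """Vertical calculation for a single disorder (formulas are assumptions).

    matched_intensities: intensity supports of the user's patterns that were
                         found in this disorder's pattern intensity set
    n_detected:          number of sequential emotion patterns detected
                         from the user
    roi_threshold:       relevance-of-intensity threshold, RoI(thr)
    """
    if not matched_intensities or n_detected == 0:
        return {"ms": 0.0, "RoM": 0.0, "AoI": 0.0}
    above = [x for x in matched_intensities if x > roi_threshold]
    return {
        "ms": max(matched_intensities),                    # maximum intensity support
        "RoM": len(matched_intensities) / n_detected,      # ratio of matched patterns
        "AoI": sum(above) / len(above) if above else 0.0,  # average intensity above RoI
    }


params = disorder_parameters([0.9, 0.4, 1.2], n_detected=4, roi_threshold=0.5)
```

A horizontal comparison would then call this function once per disorder and compare the resulting values across the four disorders.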

Sequential Emotion Detection (SED) Algorithm
The Sequential Emotion Detection (SED) algorithm can detect sequential emotion from the user, and the diagnosis is made based on several parameters. The pseudo code of the SED is shown in Algorithm 2.

Bidimensional Hash Search Algorithm for Model Optimization
In this section, model optimization was conducted. The sequential emotion pattern intensity set consisted of 97,672 patterns, so a linear search is rather time-consuming and inefficient in the MDI-Model. We therefore propose an optimized search algorithm based on the hash search algorithm: the bidimensional hash search (BHS) algorithm. Each intensity pattern has two identifying attributes: the quantity of emotions in the pattern as qi, and the initial character of the pattern as ci.
First, as the maximum of qi was 12 and the number of possible values of ci was 14, we generated a 12 × 14 bidimensional hash table in which each sequential emotion pattern is projected into a particular block by combining the two identifying attributes qi ∩ ci as the input key to that block.
Second, as a hash table maps one key to one value, we generated another hash table inside the former one, forming a bidimensional hash table that holds all attributes of each pattern in the sequential emotion pattern set. As a result, a two-part key can be used to obtain one particular value in the pattern.
The bidimensional hash search (BHS) algorithm can reduce the time consumption of searching and can reduce the processing time from 0.8 s in identifying a single user in the linear search algorithm to 0.2 s, which can significantly increase the efficiency of identifying mental disorder.
The pseudo code of BHS is shown in Algorithm 3.

6: Set li as the value to "label", PISi as value to "itemset", psi as value to "intensity" in htv;
7: Add to BHT;
8: end for
9: Calculate length of SEP as n;
10: for j = 0, j < n, j++ do
11: Acquire qj ∩ cj as key;
12:
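The two-level lookup described in this section can be sketched with nested Python dictionaries. This is an illustration rather than the authors' Algorithm 3: the tuple key `(q, c)` stands in for qi ∩ ci, the field names "label", "itemset", and "intensity" follow the pseudo code fragment, and storing a list per pattern (so that one pattern can carry several disorder labels) is an assumption.

```python
from collections import defaultdict


def block_key(pattern):
    """Key of the outer table: (number of emotions, initial character).
    Assumes a non-empty pattern of non-empty emotion names."""
    return (len(pattern), pattern[0][0])


def build_bht(pattern_intensity_set):
    """pattern_intensity_set: iterable of (label, pattern, intensity),
    where pattern is a tuple of emotion names."""
    bht = defaultdict(dict)
    for label, pattern, intensity in pattern_intensity_set:
        # Inner table for this block, keyed by the full pattern.
        entries = bht[block_key(pattern)].setdefault(pattern, [])
        entries.append({"label": label, "itemset": pattern, "intensity": intensity})
    return bht


def lookup(bht, pattern):
    """Return all stored entries for the pattern, or an empty list."""
    return bht.get(block_key(pattern), {}).get(pattern, [])


patterns = [
    ("depressive", ("Pain", "Despair"), 1.5),
    ("anxiety", ("Pain", "Despair"), 0.8),
    ("bipolar", ("Manic",), 0.7),
]
bht = build_bht(patterns)
hits = lookup(bht, ("Pain", "Despair"))
```

Each lookup touches only the block that shares the pattern's length and initial character, instead of scanning the whole pattern set.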

Experiment
In this section, the performance of the SEA algorithm was evaluated against LSTM [26] and SeNTU [27]. Meanwhile, the precision of the MDI-Model was evaluated across the four mental disorders, and the distribution of each state of mental disorder was analyzed for users with one of the mental disorders and for users in six different occupations. Tweepy, a Python library for the Twitter API, was used to collect the diagnosed-oriented dataset and the occupation-oriented dataset. We only used the data for academic research and strictly followed Twitter's policy. Aside from the data from Twitter, we also acquired 125,559 blogs related to mental issues from the admissive psychology community; 1,048,575 tweets from Sentiment140 [24], an open source Twitter dataset that is classified into two-dimensional sentiments (happiness, sadness); and 80 authoritative criteria of four commonly occurring mental disorders in the world (anxiety disorder, bipolar disorder, depressive disorder, OCD), all of which are public data.

Evaluation of Sequential Emotion Analysis Algorithm
In this section, the evaluation was conducted by applying "Precision", "Recall", and the "F1-Measure" to evaluate our proposed sequential emotion analysis algorithm.
As shown in Table 5, our proposed method provided better performance. With higher "Precision", "Recall", and "F1-Measure", our method could better detect emotions in natural language text due to the redefined representative emotions and the extended emotion lexicon. Because it combines text from professional psychologists and from general individuals, the emotion lexicon can cover a broader range of emotion words.
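For reference, the three measures used in this evaluation follow their standard definitions, which can be computed as in the sketch below. Treating detected and annotated emotions as sets is an assumption about the evaluation setup, and the example labels are invented.

```python
def precision_recall_f1(predicted, actual):
    """Standard precision, recall, and F1 over two label collections."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: emotions detected by a method vs. annotated emotions.
p, r, f1 = precision_recall_f1(
    predicted={"Pain", "Fear", "Joy"},
    actual={"Pain", "Fear", "Despair", "Shame"},
)
```

Two of the three detected emotions are correct and two of the four annotated emotions are found, giving a precision of 2/3, a recall of 1/2, and an F1 of 4/7.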

Evaluation of the Mental Disorder Identification Model
In this section, the diagnosed-oriented dataset contained the diagnosed users and the 40 latest posted tweets of each account, with posting times ranging from 2015 to 2019. We filtered the diagnosed-oriented dataset by using the keywords "diagnosed with", "I have", "I had", and "I used to have" plus the name of each of the four types of mental disorders. The occupation-oriented dataset, on the other hand, integrated six different types of occupation: waiter, reporter, engineer, musician, comedian, and traveler. We used the keywords "I'm a", "I work as", and "My job is" plus each occupation name to obtain the users. For each account, the 40 latest posted tweets were collected, with posting times ranging from 2012 to 2019. Once we had the keyword-matched users, we filtered out those users whose tweets were all posted within a week, and all the interaction tweets that contained the character "@" were also filtered out. The details of the two types of Twitter data are shown in Table 6.
First, we conducted an evaluation on the diagnosed-oriented dataset. The accuracies, defined as the ratio of the confirmed users and the users with a tendency of mental disorder to all users in each category, were calculated separately. It can be observed that the MDI-Model can efficiently identify users who have a mental disorder or the tendency of a mental disorder with high accuracy. The result of this evaluation is listed in Table 7. Furthermore, we analyzed the proportion of the confirmed users and the users with a tendency. It can be observed that in the diagnosed-oriented dataset, the proportion of confirmed users was nearly twice as large as that of users with a tendency. The proportion of the confirmed mental disorder statuses indicates that more than half of the confirmed users were in a severe stage and that 33% of the confirmed users were at an early stage of mental disorder.
Moreover, for those who have a tendency of mental disorder, the severe tendency reached up to 99%, which indicates that individuals with a severe tendency show more detectable sequential emotion patterns that match the symptoms of mental disorder. These statistics strongly support the view that the users selected by keyword search are truly suffering from mental disorders. The proportion of identified users in different statuses is shown in Figure 6.
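The collection rules described for the two datasets (keyword matching, keeping the 40 latest tweets, dropping accounts whose tweets all fall within a week, and dropping tweets containing "@") can be illustrated with a minimal Python sketch. The function names, the (timestamp, text) tweet representation, the order in which the rules are applied, and the treatment of a span of exactly seven days are all assumptions made for illustration; they are not taken from the original implementation.

```python
from datetime import datetime, timedelta

# Keyword prefixes and disorder names as listed in the text.
PREFIXES = ["diagnosed with", "I have", "I had", "I used to have"]
DISORDERS = [
    "anxiety disorder",
    "bipolar disorder",
    "depressive disorder",
    "obsessive-compulsive disorder",
]


def matches_keywords(text, prefixes, names):
    """Return True if the text contains any '<prefix> <name>' phrase (case-insensitive)."""
    lowered = text.lower()
    return any(
        f"{prefix.lower()} {name.lower()}" in lowered
        for prefix in prefixes
        for name in names
    )


def clean_user_tweets(tweets, max_tweets=40, min_span=timedelta(days=7)):
    """Apply the per-account filters to a list of (datetime, text) pairs.

    Returns the kept tweets, or None if the account is dropped.
    Assumed order: keep the latest tweets, check the time span, then
    remove interaction tweets containing '@'.
    """
    latest = sorted(tweets, key=lambda t: t[0], reverse=True)[:max_tweets]
    if not latest:
        return None
    # Drop accounts whose collected tweets were all posted within a week.
    if latest[0][0] - latest[-1][0] <= min_span:
        return None
    # Drop interaction tweets (mentions and replies).
    return [tweet for tweet in latest if "@" not in tweet[1]]
```

A keyword-matched account would first pass `matches_keywords` on at least one tweet and then be reduced with `clean_user_tweets`; accounts for which the latter returns `None` are excluded from the dataset.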

Occupation-Oriented Dataset Evaluation
Second, we conducted an evaluation on the occupation-oriented dataset. We calculated the rate at which users in each of the six occupations were predicted to have a tendency of mental disorder. We chose these occupations to provide a supplementary evaluation of our model. If the model considered a normal user who complains slightly about his/her daily life or work to be a mental disorder patient, it would indicate that our model is overfitting. To exclude this possibility, we randomly selected three stressful jobs: "waiter", "reporter", and "engineer". As a control, we also selected "traveler", "comedian", and "musician", because the general nature of their work makes them more likely to post positive content on social media to attract public attention.
Table 8 lists the number of users who were detected as having a potential tendency of mental disorder. Both people who are diagnosed with a mental disorder and people in stressful occupations generally posted negative content on social media, yet our model gave a lower prediction rate on the occupation-oriented dataset than on the diagnosed-oriented dataset. This indicates that our model was not overfitting, and that constantly feeling negative is not the sole essential component for our model to identify a potential mental disorder tendency; rather, the connection between shifting emotions is what matters. Additionally, the results show that people who were constantly exposed to a stressful working environment, such as waiters, reporters, and engineers, tended to have a higher risk of mental disorders. In contrast, public figures such as comedians and musicians showed fewer detectable signs of mental disorder, which we attribute to their social influence restraining them from posting negative information on social media. Meanwhile, travelers had the lowest prediction rate of mental disorder because they generally shared passionate and delighted information on social media.
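The accuracy on the diagnosed-oriented dataset and the prediction rate on the occupation-oriented dataset are both described as the share of users in a group who are identified as confirmed or tendentious. A minimal Python sketch of that ratio follows; the status labels ("confirmed", "tendency", "healthy"), the function names, and the toy input are hypothetical and do not reproduce the values in Tables 7 and 8.

```python
from collections import Counter

# Hypothetical status labels; a user counts as detected when the model
# outputs either a confirmed disorder or a tendency.
DETECTED = {"confirmed", "tendency"}


def detection_rate(statuses):
    """Ratio of confirmed and tendentious users to all users in one group."""
    if not statuses:
        raise ValueError("group has no users")
    counts = Counter(statuses)
    return sum(counts[label] for label in DETECTED) / len(statuses)


def rates_by_group(groups):
    """Compute the detection rate for each group (disorder category or occupation)."""
    return {name: detection_rate(statuses) for name, statuses in groups.items()}


if __name__ == "__main__":
    # Toy labels for illustration only, not the paper's data.
    toy = {
        "waiter": ["tendency", "healthy", "healthy", "confirmed"],
        "traveler": ["healthy", "healthy", "healthy", "tendency"],
    }
    for group, rate in rates_by_group(toy).items():
        print(f"{group}: {rate:.2f}")
```

Applying the same function to both datasets makes the comparison direct: a lower rate for the occupation groups than for the diagnosed groups is what the argument against overfitting relies on.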
We further analyzed the mental disorder distribution of detected users among the different occupations.
It can be observed that anxiety disorder tendency takes up the largest proportion in the mental disorder distribution, and is particularly pronounced in stressful working environments. In contrast, anxiety disorder tendency is significantly lower among individuals who are immersed in a more relaxing working environment. OCD tendency is rather high in musicians, possibly due to the perfectionism commonly seen in this industry, which can be related to early signs of OCD. The distribution of mental disorder tendency among the six occupations is shown in Figure 7.
We tried to understand this phenomenon through group psychology. An individual is more likely to follow users from his/her own field on social media because of shared interests; therefore, if a few of them complain about their work or life, people who face the same work content or life situation will relate to them and develop different levels of concern. This is why people in stressful occupations show more obvious tendencies of anxiety disorder. It is unlikely that people with a potential tendency of anxiety disorder more often choose these occupations; rather, the environment clearly has some effect on their mental status.

Discussion
From the experimental results, we can observe a regular pattern: individuals who are diagnosed with mental disorders, or who are about to meet the criteria of a mental disorder, commonly express the same emotion pattern. Those who are at a high risk of being diagnosed with a mental disorder can be efficiently identified by the MDI-Model. Valuable applications such as early prediction [28] and intervention can then be made in time.
The sequential emotion pattern is suitable for identifying mental disorders because, apart from the factors of environment and personality, these disorders are mainly characterized by shifting between, or repeating, one or a few distinguishable emotions [29,30]. Research shows that emotion from social media can better interpret the symptoms of mental disorder [31][32][33][34]. Our aim was to generalize the professional criteria for mental disorders into a more normalized expression in people's daily life. The key ingredient is "Emotion", which at some level can reflect the mental status of the individual [35]. We are more interested in how emotions as a whole can reveal patterns of mental disorder, in order to validate whether there is a connection between sequential emotions and mental disorder.
Our specialty is natural language processing, and sentiment analysis is closely related to this area [36][37][38]. Consequently, when factors related to mental disorder, such as the environment, personality, and other causative factors, appear only as non-text data, our model does not take them into account.

Conclusions
In this paper, we learnt from the diagnosis of mental disorder that emotions occur in a certain pattern. We proposed a Mental Disorder Identification Model (MDI-Model) based on a sequential emotion analysis algorithm to identify the intensity of the mental disorder. To obtain the sequential emotion pattern intensity set, authoritative criteria of mental disorders and professional blogs about mental health from the psychology community were acquired. After extracting the sequential emotion patterns related to mental disorder from the specialists and from individuals on social media, a multidimensional set of representative emotions was redefined to capture subtle shifts in emotion. Based on the multidimensional representative emotions, an emotion lexicon was generated and extended to better acquire the sequential emotion pattern.
By linking the knowledge of authoritative professionals with the everyday expressions of the general public, the MDI-Model can efficiently identify the intensity of mental disorder in users from social media who are diagnosed with mental disorders. Follow-up mental health care can then be initiated to cope with mental issues.
Further study can be carried out with sentiment analysis to better interpret emotion in more complex structures, and a more rigorous intensity quantification can also be considered; for example, the use of punctuation and adverbs can enhance the intensity of the expressed emotion. In addition to the relationship between emotion and mental disorder, the personality one holds, which strongly reflects the environment in which one grew up, also greatly affects mental health status. How to analyze the personality of an individual via social media to better diagnose one's mental health status and respond with proper therapy can be considered in future work.

Conflicts of Interest:
The authors declare no conflicts of interest.