Graph Analysis of Verbal Fluency Tests in Schizophrenia and Bipolar Disorder

Verbal Fluency Tests (VFT) are one of the most common neuropsychological tasks used in bipolar disorder (BD) and schizophrenia (SZ) research. Recently, a new VFT analysis method based on graph theory was developed. Interpreting spoken words as nodes and every temporal connection between consecutive words as edges, researchers created graph structures, allowing the extraction of more data from participants’ speech, called Speech Graph Attributes (SGA). The aim of our study was to compare speech graphs, derived from Phonemic and Semantic VFT, between SZ, BD, and healthy controls (HC). Twenty-nine SZ patients, twenty-nine BD patients, and twenty-nine HC performed Semantic and Phonemic VFT. Standard measures (SM) and 13 SGA were analyzed. SZ patients’ Semantic VFT graphs showed lower total word count and correct responses. Their graphs presented fewer nodes and edges, higher density, and smaller diameter, average shortest path (ASP), and largest strongly connected component than the HC group. SM did not differentiate BD and HC groups, whereas BD patients’ Semantic VFT graphs presented smaller diameter and ASP than HC. None of the parameters differentiated BD and SZ patients. Our results encourage the use of speech graph analysis, as it reveals verbal fluency alterations that remained unnoticed in the routine comparisons of groups with the use of SM.


Introduction
Schizophrenia (SZ) and bipolar disorder (BD) are severe mental disorders that share common symptom dimensions, neurophysiology, and genetics, and their treatment strategies are similar [1][2][3]. Both disorders are characterized by cognitive dysfunctions involving alteration of the structure of language. Verbal Fluency Tests (VFT) are one of the most common neuropsychological tasks used in BD and SZ research. These tests evaluate the ability to produce a correct sequence of spoken words during a limited time interval [4,5]. During VFT, participants are asked to name as many words as possible starting with a specific letter (Phonemic VFT), or belonging to a specific category, e.g., animals (Semantic VFT). Those tasks deliver information about the integrity of lexico-semantic memory and Brain Sci. 2022, 12, 166 2 of 11 the ability to recall items from it, self-monitoring, inhibition of responses in adequate situations, and effortful self-initiation [6,7].
Studies indicate that BD and SZ patients present deficits during VFT [8]. Among SZ patients, deterioration in semantic fluency is more common and severe than in phonemic fluency [9,10]. A meta-analysis has shown that semantic VFT performance varies along the SZ continuum [11]. Patients with recent-onset psychosis, patients with chronic SZ, and first-degree relatives present a significantly lower number of correct responses than HC. Moreover, patients with chronic SZ show more non-perseverative errors [11]. In the case of BD, a meta-analysis revealed that patients present verbal fluency deficits with a medium effect size [12]. Unlike in SZ, there was no significant difference between semantic and phonemic VFT. Also, there was no significant effect of mood state when examined across tasks; however, in the case of semantic VFT, there was a significant difference in effect size, indicating greater impairment in euthymic patients [12]. The first quantitative review comparing those two disorders showed that BD patients outperform the SZ group in both Phonemic and Semantic VFT [13]. However, a further systematic review indicated comparable neurocognitive impairments in both groups [14]. The majority of recent studies show no differences between SZ and BD [15][16][17][18], though one shows milder verbal fluency deficits in BD than in SZ [19].
Generally, VFT are interpreted using only counts of spoken words, e.g., word count, number of correct words (productivity score), repetitions (perseverative errors), and non-perseverative errors [11]. Reducing the result to a single characteristic is an important limitation of a neuropsychological test in clinical practice and research. As a consequence, many authors have considered additional aspects of VFT in their studies, adopting a qualitative approach for a better understanding of the organization of semantic memory. Among these are clustering and switching scores [11]. These variables were introduced after the observation that, during VFT, respondents generate a sequence of words that can be grouped into semantic subcategories called clusters, and change from one subcategory to another; these changes are called switches [11,20]. It has been shown that SZ patients demonstrate a smaller cluster size and fewer instances of switching than HC [11]. The observed deficits may represent a degraded semantic store with fewer category exemplars available. It has been suggested that such impairments may be storage-related [11,21,22,23]. A study using cluster analysis has shown that BD patients' strategies for categorization in semantic memory may be less related to their knowledge compared to HC [24]. Patients present a reduced number and aberrant clustering of produced words. Sung et al., 2013, showed less coherent clustering of semantic exemplars in BD patients [25]. It has been proposed that those deficits may indicate impaired semantic activation/inhibition, insufficient spreading across the semantic network, or impairment regarding the control of these functions [12,25].
In recent years, Mota et al., 2012, developed an innovative method of speech analysis using mathematical graph theory [26]. Interpreting spoken words as nodes, and every temporal connection between consecutive words as edges, researchers created graph structures, allowing the extraction of more data from participants' speech. Mota et al., 2014, released software called SpeechGraphs, which generates graph models with specific Speech Graph Attributes (SGA) from a set of written words, e.g., the graph's density, diameter, and characteristics of loops between repeated words [27]. Transforming spoken words into a network enables an analysis of their hidden features, which can help in understanding the dynamics and organization of cognitive processes [5]. This approach was successful in differentiating SZ and BD patient groups through analysis of participants' dream reports [27] or transcribed interviews with the patients [26]. Graph analysis of VFT has been used in only a single study, which showed that this method successfully differentiated groups of patients with Alzheimer's Disease, mild cognitive impairment, and healthy controls (HC) [5]. This study inspired us to use this approach in groups of SZ and BD patients.
The aim of our study is to compare the properties of speech graphs, derived from Phonemic and Semantic VFT, between SZ, BD, and HC. We hypothesize that SGA can differentiate the abovementioned groups. In contrast to the correct speech graph described by Bertola et al., 2014 [5], we hypothesize that, due to the suspected higher number of errors and repetitions and lower word count, BD and SZ patients will produce non-linear networks. Compared to HC, patients should present lower numbers of nodes, words, and edges; indicators of recurrence (i.e., presence of parallel edges, repeated edges or loops); and the presence of strongly connected components. Patients' graphs should present higher density, increased clustering coefficient (CC), and large distances. We also hypothesize that BD and SZ will present more profound deficits in Semantic than in Phonemic VFT measures.

Participants
Eighty-seven participants were recruited to this study: twenty-nine BD patients, twenty-nine SZ patients, and twenty-nine HC. All groups were matched in terms of age and gender (Table 1). A consensus diagnosis was made by two experienced psychiatrists according to DSM-5 and ICD-10 criteria. Inclusion criteria for patients were symptomatic remission (PANSS score of 3 or less on all of its items) and treatment with antipsychotic drugs from the group of dibenzoxazepines (olanzapine, quetiapine, clozapine). BD patients were in euthymia, classified as <11 points in the Montgomery-Asberg Depression Rating Scale [28], and <5 points in the Young Rating Scale for Mania [29]. We selected patients treated with the antipsychotics from the dibenzoxazepine group to provide a relative pharmacological homogeneity across patient groups. In the case of BD patients, the use of lamotrigine and valproic acid was also accepted. Exclusion criteria involved a history of alcohol or drug abuse according to the DSM-5 criteria for substance use disorder; severe, acute, or chronic neurological and somatic diseases; and treatment other than that mentioned above. HC consisted of mentally healthy volunteers recruited from the researchers' social network. This group did not meet any of the exclusion criteria for patients. All participants signed written informed consent prior to the assessment. The study was approved by the Bioethics Committee of the Jagiellonian University Medical College in Cracow.

Fluency Tests
Semantic and Phonemic VFTs were used. During Semantic VFT, participants were asked to produce as many words as possible from the category of animals, without repetitions, in one minute. During Phonemic VFT, the category of words beginning with the letter "K" was used, with the same instructions as above. Word sequences were recorded and transcribed. The following standard measures were calculated separately for Semantic and Phonemic VFT: word count, total number of correct words, total number of errors, and total number of repetitions.
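The four standard measures can be illustrated with a minimal sketch. The scoring rules below are assumptions made for illustration only: the function name `standard_measures`, the `valid_words` set of accepted responses, and the rule that any already-spoken word counts as a repetition are hypothetical and are not taken from the study protocol.

```python
def standard_measures(words, valid_words):
    """Count the four standard VFT measures for one transcribed sequence.

    `words` is the transcribed response sequence; `valid_words` is a
    hypothetical set of responses accepted for the task (assumption).
    """
    seen = set()
    correct = errors = repetitions = 0
    for raw in words:
        word = raw.strip().lower()
        if word in seen:
            repetitions += 1   # perseverative error: word already produced
        elif word in valid_words:
            correct += 1       # new word that fits the category
        else:
            errors += 1        # non-perseverative error: outside the category
        seen.add(word)
    return {
        "word_count": len(words),
        "correct": correct,
        "errors": errors,
        "repetitions": repetitions,
    }


if __name__ == "__main__":
    animals = {"dog", "cat", "horse"}
    print(standard_measures(["dog", "cat", "dog", "table", "horse"], animals))
```

Under these assumed rules, every response falls into exactly one of the three categories, so the word count equals the sum of correct words, errors, and repetitions.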

Graph Analysis
Transcribed word sequences from Semantic and Phonemic VFT were analyzed with the use of SpeechGraphs software [27]. This tool represented words as speech graphs: every word in a sequence was presented as a node (N), and the temporal link between words as an edge (E) (Figure 1).
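The word-to-graph mapping can be sketched in a few lines of Python. This is not the SpeechGraphs implementation: the function names are hypothetical, and the attribute definitions used here are assumptions that may differ from those in the software. Density is taken as 2E/(N(N-1)), and diameter and ASP are computed over directed shortest paths between reachable node pairs.

```python
from collections import deque


def build_speech_graph(words):
    """Return (nodes, edges): unique words and one directed edge per transition."""
    tokens = [w.strip().lower() for w in words]
    nodes = set(tokens)
    edges = list(zip(tokens, tokens[1:]))  # links between consecutive words
    return nodes, edges


def shortest_path_lengths(nodes, edges):
    """Breadth-first search from every node along directed edges."""
    successors = {n: set() for n in nodes}
    for src, dst in edges:
        successors[src].add(dst)
    lengths = {}
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in successors[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        for target, d in dist.items():
            if target != start:
                lengths[(start, target)] = d
    return lengths


def global_attributes(words):
    """Nodes, edges, density, diameter, and ASP under the assumed definitions."""
    nodes, edges = build_speech_graph(words)
    n, e = len(nodes), len(edges)
    lengths = shortest_path_lengths(nodes, edges)
    return {
        "nodes": n,
        "edges": e,
        # Assumed density definition: 2E / (N(N-1)).
        "density": 2 * e / (n * (n - 1)) if n > 1 else 0.0,
        # Longest shortest path among reachable pairs.
        "diameter": max(lengths.values(), default=0),
        # Mean shortest path length among reachable pairs.
        "asp": sum(lengths.values()) / len(lengths) if lengths else 0.0,
    }


if __name__ == "__main__":
    print(global_attributes(["dog", "cat", "horse", "cow", "pig"]))
    print(global_attributes(["dog", "cat", "horse", "dog", "cow"]))
```

Both example sequences contain five words. The first is a linear chain without repetitions; in the second, the repeated word closes a loop, which under these definitions yields fewer nodes, a higher density, and a smaller diameter and ASP.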


Statistical Analysis
Demographic variables were compared with χ² and ANOVA tests (Table 1). One-way ANOVA with a factor of group (SZ, BD, and HC), followed by Tukey's HSD post-hoc tests, was used to compare SGA parameters and standard measures with normal distribution and equal variances. Standard measures and SGA parameters that did not meet the assumptions for ANOVA were compared with the use of the Kruskal-Wallis test (Tables 2 and 3). For the non-continuous variables (RE, PE, L1, L2, L3, and CC), Fisher's tests were used because the expected values were <5. A series of logistic regressions adjusted for age and gender was used to evaluate associations between SGA parameters and diagnoses. The area under the receiver operating characteristic curve (AUC) was used to estimate the classification quality of the above-mentioned variables between SZ, BD, and HC groups. Quality was considered excellent when AUC was higher than 0.8, good when AUC ranged from 0.6 to 0.8, and poor when AUC was smaller than 0.6.
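The AUC thresholds above can be made concrete with a small sketch. It simplifies the analysis in one important way: the AUC is computed here for a single, unadjusted attribute using the rank-based (Mann-Whitney) formulation, not from the age- and gender-adjusted logistic regressions used in the study. The function names and the example values are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive case scores higher than a random
    negative case, with ties counted as one half (rank-based AUC)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def auc_quality(value):
    """Map an AUC value onto the labels used in the text."""
    if value > 0.8:
        return "excellent"
    if value >= 0.6:
        return "good"
    return "poor"


if __name__ == "__main__":
    # Hypothetical graph density values, for illustration only.
    patients = [0.30, 0.25, 0.40, 0.22]
    controls = [0.20, 0.24, 0.18, 0.26]
    value = auc(patients, controls)
    print(value, auc_quality(value))
```

For an attribute that is expected to be lower in patients, such as diameter or ASP, this formulation returns a value below 0.5; swapping the two groups (or taking 1 minus the result) gives the corresponding classification quality.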

Standard Measures
Analysis of Phonemic and Semantic VFT with the use of standard measures revealed only two statistically significant differences. SZ patients showed lower word count (p = 0.01) and lower total number of correct words in Semantic VFT than the HC group (p = 0.006; Tables 2 and 3). There were no statistically significant differences between BD and HC groups in terms of standard measures.

SGA Comparisons
Results of SGA analyses are presented in Tables 2 and 3. SZ patients' Semantic VFT graphs showed lower total word count (p = 0.01), fewer nodes (p = 0.003) and edges (p = 0.010), higher density (p = 0.005), and smaller diameter (p < 0.001) and average shortest path (p < 0.001) than the HC group.
BD patients' Semantic VFT graphs presented smaller diameter (p = 0.015) and average shortest path (p = 0.024) than HC. Phonemic VFT SGA comparisons revealed no statistically significant differences between BD and HC groups.
There were no statistically significant differences between SZ and BD patients in terms of Phonemic and Semantic VFT SGA. We also found no significant differences between SZ, BD, and HC groups in terms of Phonemic VFT SGA.
An increased density of the Semantic VFT graph was associated with a higher probability of having a diagnosis of SZ, rather than being in HC group (OR = 1.02, p = 0.020).
Decreased Phonemic VFT graph diameter (OR = 0.85, p = 0.030) and ASP (OR = 0.61, p = 0.030) were associated with a higher probability of being in the SZ group, rather than the HC group.
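The odds ratios reported above can be read through the standard logistic model. The following is a sketch under the assumption of a single SGA predictor x entered together with age and gender; the coefficient symbols are notation introduced here for illustration and do not appear in the original analysis.

```latex
\[
\log\frac{P(\mathrm{SZ}\mid x)}{P(\mathrm{HC}\mid x)}
  = \beta_0 + \beta_1 x + \beta_2\,\mathrm{age} + \beta_3\,\mathrm{gender},
\qquad
\mathrm{OR} = e^{\beta_1}.
\]
```

Under this reading, an OR below 1 means that the odds of belonging to the SZ group fall as the attribute increases. For example, OR = 0.61 for ASP corresponds to the odds being multiplied by 0.61 for each one-unit increase in ASP, or equivalently by about 1/0.61 ≈ 1.64 for each one-unit decrease.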

Discussion
To the best of our knowledge, this is the first study applying graph theory-based analysis to compare SZ, BD, and HC groups' Semantic and Phonemic VFT performance. The comparison of the three groups in terms of standard measures revealed differences in only two variables. SZ patients showed a decreased word count and lower total number of correct words in Semantic VFT than the HC group. An analysis of speech graphs revealed the presence of variables reflecting the complex disturbances of verbal fluency, differentiating patients and HC groups. We have also found that none of the Phonemic VFT parameters were able to differentiate BD, SZ, and HC groups.
In line with our hypotheses, SZ patients' Semantic VFT graphs presented a lower number of nodes and edges, as well as disturbances within global attributes: higher density, smaller diameter, and decreased ASP, compared to HC. A decreased number of nodes and edges corresponds directly to the lower word count, including the lower number of correct responses. Alterations of global attributes indicate that patients generated non-linear speech graphs, with shorter paths from the first word to the last one, and with additional, unnecessary connections between produced words. Speech graph analysis enabled us to demonstrate the presence of qualitative differences in Semantic VFT that would otherwise go unnoticed in the routine comparison of groups using standard measures. Our results are consistent with other studies using different analysis methods applied to the semantic networks produced by individuals with SZ, such as multidimensional scaling and clustering techniques [21,30,31]. These studies indicate a lack of organization and a reduction in the size of the lexicon in the patient group.
The mechanisms of semantic fluency disturbances in SZ are not clear. Data suggest that decreased production of category exemplars may reflect an increase in the time that it takes SZ patients to move among related nodes in a semantic network, which should be proximal in semantic space [9]. This may be associated with failures in spreading activation across associative connections [9,30,32,33]. Also, it has been suggested that patients may produce fewer words because they may allocate more time to inhibiting incorrect responses or monitoring otherwise problematic ones [9,34]. Our results provide further support for deficits of the semantic store in SZ. Interestingly, Bertola et al., 2014, have shown that the same SGA differentiated patients with mild cognitive impairment, Alzheimer's disease, and the control group [5]. They have shown that, with decreasing functional performance and increasing cognitive impairment, Semantic VFT graphs became denser, with a smaller diameter and ASP, and with fewer nodes and edges [5]. Future studies should evaluate the association between semantic networks and cognitive functions and functional activity measures in SZ. We have shown that SZ patients present alterations within speech graphs derived from Semantic VFT and preserved Phonemic VFT networks, which supports observations of a disproportionate impairment of semantic verbal fluency in SZ [10]. A meta-analytic review by Henry and Crawford (2005) showed that SZ patients were more impaired in semantic relative to phonemic verbal fluency, suggesting that, in addition to general retrieval difficulties, SZ is associated with compromises to the semantic store [35].
Unlike standard measures, SGA analysis showed differences between BD and HC groups. BD patients differed significantly in the diameter and ASP of the speech graphs obtained from Semantic VFT. Thus, they produced non-linear, less direct, and poorer networks with significantly smaller diameters. Although there were no significant differences between SZ and BD performance, fewer SGA distinguished the BD group from HC than distinguished the SZ group from HC. This may be associated with the observation that, in our study, contrary to the SZ group, BD patients did not differ from the HC group in terms of standard measures. Despite that, SGA made it possible to distinguish between the two groups, suggesting that, in the case of BD patients, speech graph analysis may be more sensitive in differentiating patients from HC than comparisons based solely on productivity and error scores. Diameter and ASP values were also shown to be the most prevalent differences across the healthy elderly, mild cognitive impairment group, and the Alzheimer's disease group [5], and may be the most defined characteristic associated with general cognitive deficits. It has been shown that working memory, executive functioning, and processing speed scores are related to the Semantic and Phonemic VFT output of BD patients [36]. We encourage future studies to use SGA in order to evaluate associations between verbal fluency and other cognitive domains in BD. Contrary to the results of the recent meta-analysis, we have shown no differences in terms of Phonemic VFT between BD and HC groups [12]. Semantic VFT scores may present greater sensitivity to cognitive impairment in semantic memory, organization, and retrieval [35]. Given the fact that, in our study, BD patients did not differ from HC in terms of standard measures, our results suggest that semantic VFT scores may be more sensitive than phonemic VFT scores in detecting verbal fluency deficits in the group of relatively well performing BD individuals.
In our study, SZ and BD patients did not present significant alterations of the recurrence attributes, e.g., repeated or parallel edges, and cycles between nodes. The recurrence attributes are related to the number of repetitions during VFT, which did not distinguish between the three examined groups. Surprisingly, this simple standard measure of repetitions is rarely analyzed in the literature. In the recent meta-analysis of SZ VFT by Tan et al., 2020, the perseverative/repeat scores were reported in only 9 out of 48 studies [11]. The number of repetitions did not differentiate healthy controls from patients with chronic SZ [23,[37][38][39]. Galaverna et al., 2014, showed a higher number of perseverative errors in patients with chronic SZ [40]. In contrast, Kosmidis et al., 2005, showed a lower number of repetitions in the SZ group. Interestingly, in the BD literature, the types of errors during VFT are not evaluated [41]. In a recent meta-analysis performed by Raucher-Chéné et al., 2016, perseverative errors were not analyzed [12]. Bertola et al., 2014, comparing patients with mild cognitive impairment and Alzheimer's disease, showed that even groups that did not differ in the number of perseverations did differ in the occurrence of loops of three nodes [5]. This suggests that speech graphs' recurrence attributes may constitute a more sensitive measure of perseveration than the total number of repetitions. It was suggested that impairments in the central executive and episodic buffer functions of working memory could explain the repetition of words during VFT [5]. Given the fact that both BD and SZ patients show deficits of executive functions, including working memory impairments [8], we suggest using speech graphs' recurrence attributes to measure perseverations during verbal fluency tasks in these disorders.
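The difference between a raw repetition count and loop-based recurrence attributes can be shown with a short sketch. The definitions are assumptions made for illustration and may differ from those implemented in SpeechGraphs: transitions are collapsed into a binary directed adjacency matrix, L1 counts self-loops, and L2 and L3 count directed cycles of two and three distinct nodes via the traces of the squared and cubed matrix (with self-loops removed). All function names are hypothetical.

```python
def adjacency(tokens):
    """Binary directed adjacency matrix over unique words (first-seen order)."""
    index = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
    n = len(index)
    matrix = [[0] * n for _ in range(n)]
    for src, dst in zip(tokens, tokens[1:]):
        matrix[index[src]][index[dst]] = 1  # repeated transitions collapse to 1
    return matrix


def matmul(x, y):
    """Plain square-matrix product."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


def recurrence_attributes(words):
    """Repetition count plus loops of one, two, and three nodes (assumed definitions)."""
    tokens = [w.strip().lower() for w in words]
    a = adjacency(tokens)
    # Drop self-loops so closed walks of length 2 and 3 are genuine cycles.
    b = [[0 if i == j else v for j, v in enumerate(row)] for i, row in enumerate(a)]
    b2 = matmul(b, b)
    b3 = matmul(b2, b)
    return {
        "repetitions": len(tokens) - len(set(tokens)),
        "L1": trace(a),         # word immediately followed by itself
        "L2": trace(b2) // 2,   # each 2-cycle is counted once per node
        "L3": trace(b3) // 3,   # each 3-cycle is counted once per node
    }


if __name__ == "__main__":
    print(recurrence_attributes(["dog", "cat", "dog", "horse", "cow"]))
    print(recurrence_attributes(["dog", "cat", "horse", "dog", "cow"]))
```

Both example sequences contain exactly one repetition, yet the first forms a loop of two nodes and the second a loop of three nodes. A comparison based only on the repetition count would treat them as identical, whereas the loop counts separate them.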
None of the standard measures or SGA analyzed in this study were able to differentiate BD and SZ patients. The absence of differences between the groups, established by the new graph theory-based approach, corroborates the results of a qualitative review and recent studies indicating a comparable level of verbal fluency impairments in both clinical groups [14][15][16][17][18]. Our results are in line with the "schizophrenia-bipolar disorder boundary" hypothesis, suggesting that verbal fluency deficits may represent one of the intermediate phenotypes in both SZ and BD [1].
We are aware of the limitations of our study, such as relatively small groups, the fact that groups of patients were not drug-naïve, and our lack of ability to evaluate potential effects of medication on VFT results due to the small subject sample. To eliminate a possible bias of our findings related to antipsychotic treatment, we only recruited patients taking antipsychotic drugs from the group of dibenzoxazepines, and therefore provided a relative pharmacological homogeneity across patient groups.

Conclusions
In conclusion, our study, for the first time, compared BD, SZ, and HC in terms of VFT performance with the novel graph analysis approach. The application of this method revealed that, in the SZ group, semantic verbal fluency deficits are not limited to the productivity scores. Our study showed a complex picture of semantic network disturbances in this clinical group, indicating that SZ patients' speech graphs present a higher density; smaller diameter, ASP, and largest strongly connected component (LSC); as well as fewer nodes and edges compared to HC. In the case of BD patients, SGA were able to distinguish patients from HC, despite the lack of differences between the two groups in standard measures. Interestingly, none of the variables characterizing verbal fluency performance were able to differentiate SZ and BD. Our results encourage the use of speech graph analysis, as it reveals verbal fluency alterations that remained unnoticed in the routine comparisons of groups with the use of standard measures. As these additional calculations place no extra burden on patients' time, future studies regarding verbal fluency deficits in SZ and BD should include the evaluation of SGA, in order to better understand this issue.