Development of Computer-Aided Semi-Automatic Diagnosis System for Chronic Post-Stroke Aphasia Classification with Temporal and Parietal Lesions: A Pilot Study

Survivors of either a hemorrhagic or ischemic stroke tend to acquire aphasia and experience spontaneous recovery during the first six months. Nevertheless, a considerable number of patients sustain aphasia and require speech and language therapy to overcome the difficulties. As a preliminary study, this article aims to distinguish aphasia caused by a temporoparietal lesion. Typically, temporal and parietal lesions cause Wernicke's aphasia and Anomic aphasia. Differential diagnosis between Anomic and Wernicke's aphasia has become controversial and subjective because Wernicke's aphasia closely resembles Anomic aphasia during recovery. Hence, this article proposes a clinical diagnosis system that incorporates normal coupling between the acoustic frequencies of speech signals and the language ability of temporoparietal aphasias to delineate classification boundary lines. The proposed inspection system is a hybrid scheme consisting of automated components, such as confrontation naming and repetition, and a manual component, such as comprehension. The study involved 30 participants clinically diagnosed with temporoparietal aphasias after a stroke and 30 participants who had experienced a stroke without aphasia. The plausibility of accurate classification of Wernicke's and Anomic aphasia was confirmed using the distinctive acoustic frequency profiles of selected controls. The accuracy of the proposed system and algorithm was confirmed by comparing the obtained diagnosis with the conventional manual diagnosis. Although this preliminary work only distinguishes between Anomic and Wernicke's aphasia, the developed algorithm-based inspection model could be a worthwhile step towards objective classification of other aphasia types.


Introduction
Aphasia is a language disorder mostly acquired after a stroke. Broca's, Wernicke's, and Anomic aphasia are the most common subtypes of aphasia [1]. Broca's aphasia is a non-fluent aphasia type caused by a lesion in the frontal lobe. As a preliminary work, this study considered Anomic and Wernicke's aphasia. Hence, this article demonstrates a hybrid software solution that incorporates acoustic frequencies of speech signals to determine and distinguish aphasia. In fact, the use of automatic speech recognition (ASR)-free techniques improves generalizability, since acoustic frequencies are not language dependent. The developed system consists of three diagnosis components, i.e., confrontation naming, single word repetition, and comprehension analysis. The hybrid system automates the naming and repetition tasks. Meanwhile, quantitative measurements of the patients' comprehension levels were taken into account for the comprehension analysis. Mathematical relationships introduced among the identified parameters derive a score for each assessment component. Accordingly, the proposed algorithm determines the presence of aphasia and/or a differential diagnosis for each participant. Finally, a diagnosis report is generated for the reference of speech and language pathologists (SLPs). As a pilot study, this work aims to distinguish aphasia occurring with temporoparietal lesions, namely Anomic aphasia and Wernicke's aphasia. The conceptual breakthrough of the proposed work lies in distinguishing two types of aphasia using acoustic frequencies of pathological speech production together with comprehension analysis. To the best of our knowledge, previous works have not developed a language-independent hybridized software solution to classify subtypes of aphasia using acoustic frequencies of speech signals.

Operational Procedure of the Developed Aphasia Classification System
The implementation of the proposed aphasia classification system was initiated by compiling a manual assessment. The manual speech assessment was developed to assess confrontation naming, single word repetition, single word comprehension, and simple command comprehension tasks. In the manual assessment, a live model carried out the aforementioned tasks. The manual assessment was tested with 5 neurologically healthy adults and 5 adults who had experienced a stroke but had been assessed as non-aphasic, in order to validate the assessment materials, tasks, and instructions. Accordingly, the materials, tasks, and instructions used in the manual assessment were replicated in the software solution. This replication was intended to keep the hybrid aphasia assessment tool free from ambiguities. The assessment procedures were conducted by a speech and language pathologist certified by the medical council of Sri Lanka. The study followed a non-invasive, non-destructive, and non-harmful procedure, which only recorded human speech samples via a microphone. The ethical committee of the National Hospital Sri Lanka (Colombo) approved the study, and speech sample recording was performed in accordance with the guidelines of the ethical committee of the National Hospital Sri Lanka (Colombo) and the Institutional Care and Ethical Use Committees of Sri Lanka Institute of Information Technology, Sri Lanka and Kyungil University, Korea. All participants were included with the approved consent of the participant and/or guardian. Informed consent covered the participant's approval to participate in the language assessment procedure, to share demographic information, to share clinical history, and to have assessment speech samples audio recorded.
Speech recognition is highly influenced by the frequency and the amplitude of a signal, which are the main components of an acoustic signal. However, these two components alone are insufficient to analyze a pathological speech signal and suggest a differential diagnosis. Thus, the proposed system requires additional parameters that facilitate differential diagnosis of aphasia. The analysis focused on the number of pathological speech errors, the cues given, and the time taken to respond. The cues given are categorized into phonemic cues and semantic cues; these cues, as well as the test words, were included in our laboratory-customized user manual. In order to determine a comprehensive relationship between language skills and errors, each error type and cue was handled separately. The clinical diagnosis was taken as the baseline to evaluate the accuracy of the developed system. All participants were clinically diagnosed, and the clinical diagnosis was blinded to the assessor who evaluated the participants using the proposed algorithm, and vice versa. The proposed workflow began with the manual assessment compilation. Manual assessment materials were validated with a pilot test involving 5 neurologically healthy adults and 5 non-aphasic stroke patients. In the software design process, influential parameters were defined and assigned scores to derive the mathematical model. Simultaneously, a testing sample and a control sample were identified, and a clinical diagnosis was performed to serve as the baseline for system validation. The materials and instructions of the manual assessment and the mathematical model were implemented as a software solution to evaluate participants. Finally, the system-generated diagnosis was compared with the clinical diagnosis to validate the accuracy of the system. Figure 1 outlines the aforementioned workflow of the proposed hybrid aphasia inspection tool.
Moreover, the numerical algorithm as well as the automated software interface were developed using MATLAB software (MathWorks, MA, USA).
Appl. Sci. 2020, 10, x FOR PEER REVIEW 4 of 16

Graphical Description of the System Overview
Elicited speech samples were recorded directly on the laptop using a headset-mounted microphone. Each response from the confrontation naming and single word repetition tasks was recorded separately. Pre-recorded words were presented to the subject in the word repetition task. The auditory models were presented via external speakers connected to a laptop. Initially, a trial test for the confrontation naming and single word repetition tasks was conducted in order to introduce the instructions. The pictures and words included in the trial test were not included in the actual assessment. For a better diagnosis, all attempts of a participant were recorded to determine the nature of word elicitation. The assessment was conducted twice, once using the manual assessment and once using the software, and speech samples from both instances were recorded and analyzed. Data obtained from all participants were evaluated using the algorithm. According to the scores obtained in each task, the proposed hybridized aphasia diagnosis process objectively derives the potential diagnosis. Figure 2 depicts the overview of the software architecture. The system consists of confrontation naming, repetition, and comprehension components. The confrontation naming task consists of 30 pictures. In confrontation naming, a single picture stimulus is presented to the participant at a time. The participants were asked to identify the picture and elicit the corresponding word. The repetition task consists of 15 single words. The participants were instructed to repeat the presented word. Elicited speech signals for the confrontation naming and repetition tasks were recorded and stored in .wav format. The comprehension task consists of two subcomponents, i.e., single word comprehension and simple command comprehension. The single word comprehension task contains 10 frequent words. The participants were asked to identify the target object among pictures presented on the laptop screen.
The simple command comprehension task evaluates the ability to follow simple instructions. Each instruction consists of one or more objects, sequences, and actions to be performed. In order to avoid the subjective perception of SLPs, the objects, actions, and sequences are included in the assessment. The implemented software autonomously processes input speech signals to evaluate acoustic frequencies in the confrontation naming and repetition tasks. Initially, the analog speech signal was digitized by sampling it repeatedly and encoding the results into a set of bits. Subsequently, the acoustic parameters of the speech signals were determined by computing mel-frequency cepstral coefficients (MFCCs). The similarity level with the template was calculated using dynamic time warping (DTW). Although the comprehension task was not automated in this release and uses inputs from an SLP, all the parameters and desired responses are readily available in the assessment manual to avoid subjectivity and ambiguity. Upon completion, the respective scores, i.e., naming score, repetition score, and comprehension score, were calculated using acoustic frequencies and SLP input.
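The template comparison described above can be illustrated with a minimal DTW sketch in Python (the study itself used MATLAB). It assumes that each signal has already been converted into a sequence of feature frames, for example MFCC vectors; the function name `dtw_distance`, the Euclidean frame distance, and the normalization by the summed sequence lengths are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def dtw_distance(template, response):
    """Length-normalized DTW cost between two feature sequences.

    Each input is a sequence of frames (for example, one MFCC vector per
    frame). A 1-D input is treated as a sequence of scalar frames.
    """
    t = np.asarray(template, dtype=float)
    r = np.asarray(response, dtype=float)
    if t.ndim == 1:
        t = t[:, None]
    if r.ndim == 1:
        r = r[:, None]
    n, m = len(t), len(r)

    # Pairwise Euclidean distance between every template/response frame.
    cost = np.linalg.norm(t[:, None, :] - r[None, :, :], axis=2)

    # Accumulated cost with the standard step pattern
    # (insertion, deletion, match).
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )

    # Normalization by n + m is an assumption made here so that words of
    # different durations are comparable.
    return acc[n, m] / (n + m)
```

A slower production of the same frame sequence, such as `dtw_distance([0, 1, 2, 3], [0, 0, 1, 1, 2, 2, 3, 3])`, yields a cost of zero, which is why DTW suits responses whose timing differs from the auditory model.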

Algorithm of the Developed Inspection Procedure
Confrontation naming is a key component in neurological assessments, since word retrieval stages are commonly affected by inappropriate neurological changes [17]. All responses for the confrontation naming task were audio recorded separately in .wav format with the subject's consent. The auditory model provided instructions and cues for the confrontation naming task. Accordingly, the confrontation naming score was calculated considering the time taken to produce the word, formant frequencies, and pathological speech characteristics. Formants are the frequency peaks that carry a high degree of energy in the spectrum. In general, formants are prominent in vowels.
$Sp_{NA}$ is the achieved score for the confrontation naming task. $F_i$ denotes the average of the corresponding formant frequency achieved for all target words. In this study, we considered the first four formant frequencies. Moreover, the evaluation considers the average time taken for the confrontation naming task, which is denoted by $T_{avg}$. The quantitative measurements of the speech characteristics were taken into account by calculating $Sp_e$, $Sp_{pa}$, $Sp_c$, and $Sp_r$ as described below.
Each speech characteristic was evaluated based on its potential impact on the language skills of a person. Hence, the identified pathological speech characteristics were used to calculate speech errors, partial attempts, cues, and responses, which are denoted by $Sp_e$, $Sp_{pa}$, $Sp_c$, and $Sp_r$, respectively. $Sp_e$ evaluates the number of occurrences of circumlocution, elicitation of an unrelated word, and absence of any response. If a participant uses many words to express the target word [18], this is counted as circumlocution. Whenever the participant produces a word that has neither a phonetic nor a semantic relation to the target word, it is counted as an unrelated word. A no-response scenario is encountered when the participant does not attempt to elicit the target word at all. $Sp_{pa}$ counts the partial attempts made by a participant. In phonemic attempts, phonemes are produced instead of the target word. Semantic attempts rely on the verbal meaning of the target word [5], e.g., "eat" instead of "plate". Cueing comes in two types: (1) auditory cues (phonemic) and (2) the verbal meaning of the target word (semantic). Whenever a participant produces a related word or produces the target word after the expected time, the response is taken into account in calculating $Sp_r$.
$$Sp_{pa} = \frac{\text{Phonemic attempts} + \text{Semantic attempts}}{4} \tag{4}$$
$$Sp_{c} = \frac{\text{Phonemic cues} + \text{Semantic cues}}{4} \tag{5}$$
The calculation process is designed to reduce the achieved score as the number of speech errors increases. In addition, the impact of each error is considered in order to obtain a more realistic confrontation naming score. For example, a phonemic attempt has less impact than circumlocution, whereas no responses and unrelated words are highly influential for a diagnosis of aphasia.
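As a worked illustration of Equations (4) and (5), the following Python sketch computes the partial-attempt and cue terms from raw counts. It assumes that the division by 4 applies to the sum of the two counts; the function names are hypothetical and do not come from the original MATLAB implementation.

```python
def partial_attempt_score(phonemic_attempts, semantic_attempts):
    """Sp_pa, Equation (4): summed partial attempts weighted by 1/4.

    Assumption: the divisor 4 applies to the sum of both counts.
    """
    return (phonemic_attempts + semantic_attempts) / 4


def cue_score(phonemic_cues, semantic_cues):
    """Sp_c, Equation (5): summed cues weighted by 1/4 (same assumption)."""
    return (phonemic_cues + semantic_cues) / 4


# Illustrative counts only, not study data.
sp_pa = partial_attempt_score(phonemic_attempts=3, semantic_attempts=1)
sp_c = cue_score(phonemic_cues=2, semantic_cues=4)
```

With these illustrative counts, four partial attempts contribute 1.0 and six cues contribute 1.5, so each individual attempt or cue weighs a quarter of a unit in these terms.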
As a prominent language feature, repetition plays a vital role in the differential diagnosis of aphasia subtypes. In this study, we propose a quantitative differential diagnosis tool for Anomic aphasia and Wernicke's aphasia. Whereas Wernicke's aphasia presents deficits in the ability to repeat spoken language, Anomic aphasia demonstrates significantly preserved repetition skills. DTW was used to measure the similarity between the repeated word and the presented auditory model in order to determine the nature of the repetition automatically. Herein, we defined similarity threshold values to differentiate the nature of the repetition. Accordingly, a repetition score was calculated as given below.
$$Sp_{rep} = \text{Not repeated} + \frac{\text{Phonemic attempts}}{2} + \frac{\text{Partial attempts}}{4}$$
$Sp_{RA}$ is the achieved score for the single word repetition task. $F_i$ is the average of the corresponding formant frequency for the 15 words used in the repetition task. Similar to confrontation naming, $T_{avg}$ indicates the average time taken to repeat a single word. $Sp_{rep}$ is the score corresponding to the nature of the repetition, which is determined by the similarity index of the produced speech signal.
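The $Sp_{rep}$ term can be sketched in Python as follows. The weights (1, 1/2, 1/4) follow the expression above, whereas the threshold values, the category labels, and the assumption that a phonemic attempt is less similar to the auditory model than a partial attempt are hypothetical placeholders; the thresholds actually used in the study are not reported here.

```python
from collections import Counter

def classify_repetition(distance, accurate_max=0.2, partial_max=0.5,
                        phonemic_max=0.8):
    """Map a normalized DTW distance to a repetition category.

    All threshold values are hypothetical and only illustrate the idea of
    threshold-based categorization.
    """
    if distance <= accurate_max:
        return "accurate"
    if distance <= partial_max:
        return "partial"
    if distance <= phonemic_max:
        return "phonemic"
    return "not_repeated"


def repetition_error_score(categories):
    """Sp_rep = not repeated + phonemic attempts / 2 + partial attempts / 4."""
    counts = Counter(categories)
    return (counts["not_repeated"]
            + counts["phonemic"] / 2
            + counts["partial"] / 4)


# Illustrative distances for five repeated words (not study data).
labels = [classify_repetition(d) for d in (0.1, 0.9, 0.6, 0.3, 0.4)]
sp_rep = repetition_error_score(labels)
```

Accurate repetitions contribute nothing to $Sp_{rep}$, so the term grows only with failed or incomplete repetitions.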
Competency in auditory comprehension is widely used as a key language feature to classify aphasia. The comprehension assessment is not fully automated in this software solution. Nevertheless, the system uses quantitative observations entered manually by the SLP to evaluate the auditory comprehension level, since this level is essential for classifying aphasia into subtypes. The entire comprehension component covers 10 objects in single word comprehension, 20 objects in simple command comprehension, 15 action sequences, and 15 actions. The score is calculated using the following relationship.
$$\text{Comprehension Accuracy } (Sp_{CA}) = I_{obj} + I_{seq} + I_{act} \tag{9}$$
$I_{obj}$ indicates the number of accurately identified objects, while $I_{seq}$ and $I_{act}$ denote the numbers of accurate sequences and actions, respectively. Finally, the conceptual algorithm evaluates the scores obtained for the naming, repetition, and comprehension tasks to determine a potential differential diagnosis of aphasia. The algorithm for the differential diagnosis process is presented in Figure 3. The scheme identifies Anomic aphasia and Wernicke's aphasia by considering the strengths and weaknesses of the performed language skills.
For evaluation, we considered right-handed (RH) and non-right-handed (NRH) participants with a left hemisphere lesion (LHL) or a right hemisphere lesion (RHL).
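The comprehension score in Equation (9) is a plain sum of SLP-entered counts, which the following Python sketch makes explicit. It assumes that $I_{obj}$ pools the 10 single word objects and the 20 simple command objects (a maximum of 30), with 15 sequences and 15 actions; the function name and the range checks are illustrative additions.

```python
# Maximum counts per category, taken from the task description.
# Pooling both object sets into one limit of 30 is an assumption.
MAX_OBJECTS = 10 + 20
MAX_SEQUENCES = 15
MAX_ACTIONS = 15


def comprehension_accuracy(i_obj, i_seq, i_act):
    """Sp_CA = I_obj + I_seq + I_act, with basic validation of SLP input."""
    limits = (
        ("objects", i_obj, MAX_OBJECTS),
        ("sequences", i_seq, MAX_SEQUENCES),
        ("actions", i_act, MAX_ACTIONS),
    )
    for name, value, limit in limits:
        if not 0 <= value <= limit:
            raise ValueError(f"{name} must be between 0 and {limit}")
    return i_obj + i_seq + i_act


# Illustrative SLP input (not study data).
sp_ca = comprehension_accuracy(i_obj=24, i_seq=9, i_act=12)
```

Under these assumptions, a participant with fully intact comprehension would score 60.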

Figure 3. Algorithm for the differential diagnosis of Anomic and Wernicke's aphasia.

Patient Information
After validating the manual assessment, 30 participants [female (n = 13), Anomic (n = 18), Wernicke's (n = 12), RH (n = 24), NRH (n = 6)] who had experienced a single hemisphere stroke [LHL (n = 28) and RHL (n = 2)] at least 12 months earlier were recruited for the evaluation. The inclusion criterion was the absence of any psychiatric or neurological disorder other than the stroke condition. Selected participants also had to be able to follow simple commands. Simultaneously, 30 participants who had also experienced a single hemisphere stroke (similar to the test group) but had been assessed as non-aphasic were evaluated as the control cohort. In general, adults with aphasia are prone to motor speech disorders, i.e., dysarthria [19] and apraxia of speech (AOS) [20], which commonly co-occur with aphasia and neurodegenerative disorders. To select participants clinically diagnosed with either Anomic aphasia or Wernicke's aphasia, the Frenchay Dysarthria Assessment (FDA) [21] and Apraxia Battery for Adults (ABA) [22] tests were performed to exclude patients with dysarthria and/or apraxia. In fact, these two disorders can influence the objective diagnosis of aphasia based on ASR-free properties. Therefore, participants with dysarthria and apraxia will be evaluated in future work to optimize the accuracy of the proposed algorithm. Table 1 presents the demographic data and initial clinical diagnosis of the participants.

Confrontational Naming Analysis
Data obtained through the proposed system for the naming, repetition, and comprehension tasks from non-aphasic, Anomic, and Wernicke's participants were analyzed to determine the distinguishable characteristics of each sample. Figure 4 depicts the confrontation naming performance of the three samples. The average time taken to elicit each word by each sample is illustrated in Figure 4a. Considering the total word list, non-aphasic participants took 2.51 s on average to elicit a target word. In contrast, the aphasic sample (Anomic and Wernicke's) took 10.73 s on average to elicit a target word. The system obtained the first four formant frequencies ($F_1$-$F_4$) of the entire confrontation naming task (n = 30 words) for each participant. Upon completing the total word list, averages for each formant frequency were obtained for each participant.
Consequently, the derived averages were analyzed separately according to the clinical diagnosis, in order to obtain average formant frequency values for the non-aphasic, Anomic, and Wernicke's samples. Accordingly, the formant frequency variation for the aforementioned samples is presented in Figure 4b, which depicts a higher formant frequency average for Anomic aphasia and a lower formant frequency average for Wernicke's aphasia than for the non-aphasic sample. Further, a paired t-test confirmed that the formant frequencies of the Anomic [t(17) = 1.9977, p < 0.05] and Wernicke's [t(11) = 2.5642, p < 0.05] samples are significantly different from those of the non-aphasic sample. Figure 4c presents the average number of pathological speech characteristics encountered in the Anomic and Wernicke's samples, since the control sample was free from these characteristics. Comparatively, Anomic participants had 15% fewer pathological characteristics than Wernicke's participants. Upon completion of the confrontation naming task, the mathematical module calculates $Sp_{NA}$ for each participant. The algorithm uses the naming score to perform a differential diagnosis of Anomic and Wernicke's aphasia. The Anomic sample's mean $Sp_{NA}$ value was always higher than the mean $Sp_{NA}$ of the Wernicke's sample. Thus, the proposed algorithm can accurately distinguish the naming performance of patients with Anomic and Wernicke's aphasia. A summarized analysis of confrontation naming performance is given below in Table 2.
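For readers unfamiliar with the statistic, the following Python sketch shows how a paired t value and its degrees of freedom are obtained. The reported degrees of freedom (17 and 11) equal the group sizes minus one (18 Anomic and 12 Wernicke's participants). How the aphasic and non-aphasic values were paired is not described in this section, so the pairing and the numbers in the example are purely synthetic, and the function name is hypothetical.

```python
import math
import statistics

def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a paired t-test.

    t = mean(d) / (stdev(d) / sqrt(n)), where d holds the pairwise
    differences and n is the number of pairs.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Synthetic values only, chosen for easy hand checking (not study data).
t_value, dof = paired_t_statistic([2, 4, 6, 8], [1, 2, 3, 4])
```

With n pairs the test has n - 1 degrees of freedom, which is consistent with t(17) for 18 participants and t(11) for 12 participants.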

Repetition Analysis
Similar to the confrontation naming task, aphasic participants showed significant difficulty in repeating words compared to non-aphasic participants. Time taken for word repetition, formant frequency variation, and nature of repetition were analyzed separately for each participant. Figure 5 illustrates the repetition task performance obtained through the proposed system. On average, non-aphasic participants repeated a target word within 2.02 s, whereas aphasic participants took 9.16 s. Figure 5a presents each sample's average time taken for all target words in the repetition task. It is clearly visible that Anomic participants encountered less difficulty than Wernicke's participants. As described in Section 3.1, a formant frequency analysis was conducted for the repetition task. Figure 5b depicts the variations of average formant frequencies in the non-aphasic, Anomic, and Wernicke's samples. A paired t-test on formant frequency confirmed a significant difference in both the Anomic [t(17) = 1.8858, p < 0.05] and Wernicke's [t(11) = 2.3615, p < 0.05] samples compared to the non-aphasic sample. As shown in Figure 5c, Anomic participants demonstrated superior repetition ability compared with Wernicke's participants. Upon completion of the repetition task, the mathematical module calculates $Sp_{RA}$ for each participant. Since the Anomic sample's mean $Sp_{RA}$ is always higher than that of the Wernicke's sample, the algorithm uses the repetition score to quantitatively differentiate Anomic aphasia from Wernicke's aphasia. A summarized analysis of repetition performance is given below in Table 3.

Comprehension Analysis
Comprehension analysis consists of single word comprehension and simple command comprehension tasks. Figure 6 illustrates the comprehension levels of the Anomic and Wernicke's participants in both tasks; non-aphasic participants are not shown, since they demonstrated 100% accuracy in both tasks. Figure 6a illustrates the number of accurate responses by Anomic and Wernicke's participants for each target word. The results show that the Anomic participants have a higher level of single word comprehension than the Wernicke's participants. The simple command comprehension task separately analyzes participants' comprehension of objects, sequences, and actions, and the results are presented in Figure 6b-d, respectively. Consistent with single word comprehension, Anomic participants also achieved a higher level of comprehension in the simple command task. A summary of the comprehension task is given in Table 4. Although the comprehension analysis was not automated, all evaluation points are built into the assessment, which prompts the SLP to input only quantitative observations, e.g., the number of objects, sequences, and actions identified. Hence, SLP input is not subjective and remains consistent throughout. A performance summary of the comprehension task is given below in Table 5.
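The quantitative SLP input described above could be tallied along the following lines. This is only a sketch under stated assumptions: the function name `comprehension_summary`, the dictionary layout, and the use of percentage accuracy per category are hypothetical and are not taken from the study, whose scoring relationship is defined by its mathematical model.

```python
CATEGORIES = ("objects", "sequences", "actions")


def comprehension_summary(correct, total):
    """Return percentage accuracy per category from SLP-entered counts.

    `correct` and `total` map each category name to the number of items
    correctly identified and the number of items presented, respectively.
    Percentage accuracy is an illustrative choice (assumption).
    """
    summary = {}
    for name in CATEGORIES:
        c, t = correct[name], total[name]
        if t <= 0 or not 0 <= c <= t:
            raise ValueError(f"invalid counts for {name}: {c} of {t}")
        summary[name] = 100.0 * c / t
    return summary


# Hypothetical counts entered by the SLP for one simple command task.
entered = {"objects": 3, "sequences": 1, "actions": 2}
presented = {"objects": 4, "sequences": 2, "actions": 2}
print(comprehension_summary(entered, presented))
```

Restricting the input to counts, rather than free-text judgments, is what keeps this manual component consistent across assessors.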
of repetition in aphasic subjects.

Table 4. Comprehension task summary (number of objects, sequences, and actions to comprehend).

Discussion
Experts became interested in automating speech and language assessments owing to the technological boost in ASR. In general, ASR performance is tightly coupled with the quality of the input speech signal [23]. Hence, the applicability of ASR to pathological speech signal evaluation has become controversial despite its benefits. Nevertheless, ASR-free features of speech signals are considered promising alternatives for evaluating pathological speech signals [16]. Although many insightful works have been performed in the domain of autonomous language skills evaluation [9,11-13,23-25], there is still considerable room for research. Hence, we incorporated ASR-free features of pathological speech to develop a quantitative and objective aphasia assessment tool that differentially diagnoses Anomic and Wernicke's aphasia. To this end, we proposed a mathematical model to evaluate participants' performance levels in ASR-free features. Subsequently, scores obtained from the mathematical model were analyzed using the diagnosis algorithm. As a result, an objective diagnosis [9] can be made using quantitative observations of pathological speech signals, instead of relying upon subjective and qualitative evaluations by SLPs. In general, the majority of standardized aphasia assessments follow a manual procedure conducted by an SLP. Conventional aphasia assessments consume a considerable amount of time and rely explicitly on expert knowledge, cultural background, and clinical experience [26]. In this work we incorporated a live auditory model and a recorded auditory model. The average assessment duration using the live auditory model was 396.2 seconds for the naming task and 199.8 seconds for the repetition task. The recorded auditory model took only 375.6 seconds and 180.7 seconds, respectively.
A paired sample t-test confirmed a significant difference in assessment duration for both the naming [t(29) = 10.66, p < 0.05] and repetition [t(29) = 11.12, p < 0.05] tasks with the aphasic sample. Hence, we can claim that using a recorded model improves the time efficiency of the assessment process, while mitigating the potential bias of the SLP. Figure 7 depicts the assessment time variation between the live and recorded auditory models for the naming and repetition tasks. Moreover, hands-on experience in clinical assessment procedures significantly influences the diagnosis. Generally, SLPs diagnose language disorders by adhering to a standardized assessment kit. Nevertheless, most SLPs in non-English speaking countries follow non-standardized assessments because standardized kits lack cultural appropriateness and differ in language. Non-standardized assessments do not consider normative data. Instead, they determine isolated skill levels of a person regardless of other contributing factors [27], which reduces the benefits of the assessment procedure. Furthermore, the manual assessment process has multiple weaknesses. The major drawback of the manual assessment is the subjectivity of the diagnosis process.
It is said to be subjective since the outcome depends heavily on the experience, exposure, and perception of the SLP. Moreover, it has been suggested that a computerized approach could be used to minimize the subjectivity of the manual assessment procedure [9]. The participants of this study were clinically diagnosed with Anomic aphasia or Wernicke's aphasia according to the Boston classification. The results shown in Figure 4 confirm the word finding difficulties of the participants diagnosed with both Anomic and Wernicke's aphasia. Moreover, the score obtained from the relationship among timing, formant frequencies, and pathological speech characteristics gave a significant boundary for differential diagnosis, as shown below in Table 6. Anomic participants showed minimal impairment in the repetition of words. In contrast, repetition ability was significantly affected in Wernicke's aphasia. Hence, the near-normal repetition ability in Anomic aphasia is widely used as a characteristic for differential diagnosis [1]. Evaluation scores obtained through the mathematical model identify the boundary values corresponding to Anomic and Wernicke's aphasia, as shown below. Anomic aphasia is characterized by a relatively good comprehension level in general conversation. However, difficulties appear in the comprehension of grammatically complex sentences [28]. Although visual comprehension is comparatively preserved, the lesion site of Wernicke's aphasia severely affects the verbal comprehension level [29]. By considering the number of correctly identified objects, actions, and sequences, the proposed assessment tool quantitatively evaluates the auditory comprehension level of participants. Average scores obtained by each sample for each component are given in Table 6 with the corresponding standard deviations. A naming score of less than 500 was used as the boundary to separate aphasic subjects (Anomic and Wernicke's) from non-aphasic subjects.
Similarly, a repetition score of less than 500 was used to distinguish Wernicke's aphasia from Anomic aphasia. Furthermore, classification accuracy was confirmed through the comprehension score. A comprehension score of less than 30, together with significantly low repetition and naming skills, indicates the presence of Wernicke's aphasia. Hence, we can claim that the values presented in Table 6 justify the constraint selection and conditions used in the diagnosis algorithm. Since all the experimental assessments and verifications were based on the 60 participants observed in this study, broader assessments covering multiple aspects have to be performed to further enhance the accuracy of the study.
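The boundary values described above can be read as a simple decision rule, sketched below. The cutoffs (500 for naming, 500 for repetition, 30 for comprehension) come from the discussion; the function name `classify`, the `Diagnosis` record, and in particular the handling of cases where the comprehension score disagrees with the repetition-based label are assumptions made for illustration, and this is not the study's diagnosis algorithm itself.

```python
from dataclasses import dataclass

NAMING_CUTOFF = 500         # below this: aphasic (Anomic or Wernicke's)
REPETITION_CUTOFF = 500     # below this: Wernicke's rather than Anomic
COMPREHENSION_CUTOFF = 30   # below this: supports Wernicke's aphasia


@dataclass(frozen=True)
class Diagnosis:
    label: str
    comprehension_consistent: bool  # assumption: flags cases for SLP review


def classify(naming, repetition, comprehension):
    """Apply the score boundaries in order: naming, then repetition.

    The comprehension score is used only as a confirmation step.
    Marking a disagreement as inconsistent is an illustrative choice.
    """
    if naming >= NAMING_CUTOFF:
        label = "non-aphasic"
    elif repetition < REPETITION_CUTOFF:
        label = "Wernicke's aphasia"
    else:
        label = "Anomic aphasia"
    low_comprehension = comprehension < COMPREHENSION_CUTOFF
    consistent = low_comprehension == (label == "Wernicke's aphasia")
    return Diagnosis(label, consistent)


# Hypothetical scores, for illustration only.
print(classify(naming=300, repetition=600, comprehension=60))
print(classify(naming=200, repetition=150, comprehension=10))
```

Keeping the comprehension check separate from the label makes it easy to route disagreeing cases to the SLP rather than forcing an automatic decision.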

Conclusions
Extensive advancements in technology have inspired the adoption of computerized software solutions to diagnose speech and language disorders in clinical settings. However, this notion is still in its infancy due to language barriers and expensive software solutions. Herein, we proposed a generalizable semi-automated system and algorithm to diagnose the presence of aphasia regardless of language restrictions, while alleviating the inefficiencies and subjectivity of the manual diagnosis process. Effective and efficient detection will lead to a better understanding of the efficacy of treatments and can help improve the quality of life of post-stroke patients suffering from aphasia. The obtained results were extensively analyzed to derive the relationships among the acoustic frequencies, accuracy of word retrieval, performance efficiency, and other related speech characteristics. The proposed scheme differentially diagnoses two subtypes of aphasia, namely Wernicke's aphasia and Anomic aphasia. Multi-dimensional evaluation is an essential requirement for identifying subtypes of aphasia. The fundamental scope of the study was to differentially diagnose Anomic and Wernicke's aphasia by evaluating the major language evaluation components, namely confrontation naming, repetition, and comprehension. Therefore, in future work the developed algorithm will be extended to evaluate the correlation of other language components (fluency, spontaneous speech, reading, and semantic comprehension), characteristics of dysarthria and AOS, and other speech defects such as stuttering. This extension is intended to ensure the accuracy and efficiency of the proposed algorithm while enabling it to diagnose the remaining subtypes of aphasia.